CSCI 4150: Introduction to Artificial Intelligence, Fall 2003
This web tester runs two sets of tests. The first set uses the a7example.scm file that you have; the second set uses randomly generated utilities, rewards, and transition probabilities. The code you submit should not have any discount factor or exploration function in place: it should simply pick the action with the greatest expected utility.
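For concreteness, here is a minimal sketch of that greedy selection. The procedures get-actions, get-next-states, get-transition-prob, and get-utility are hypothetical placeholders for whatever model tables your own code maintains; they are not part of the a7header.scm interface.

    ;; Expected utility of taking ACTION in STATE under the learned model.
    (define (expected-utility state action)
      (apply + (map (lambda (next)
                      (* (get-transition-prob state action next)
                         (get-utility next)))
                    (get-next-states state action))))

    ;; Pick the action with the greatest expected utility (no discount
    ;; factor, no exploration).  Assumes STATE has at least one action.
    (define (best-action state)
      (let ((actions (get-actions state)))
        (let loop ((rest (cdr actions))
                   (best (car actions))
                   (best-eu (expected-utility state (car actions))))
          (if (null? rest)
              best
              (let ((eu (expected-utility state (car rest))))
                (if (> eu best-eu)
                    (loop (cdr rest) (car rest) eu)
                    (loop (cdr rest) best best-eu)))))))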
After running the random-player for a while to learn a model of the transition probabilities and rewards, we should be able to just call your value-iteration procedure. When it finishes (hopefully without taking too much time!), it should have learned the hit utilities.
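Purely as an illustration (not the required interface), an undiscounted value iteration over the learned model might look like the sketch below. It reuses the placeholder accessors from the sketch above, plus hypothetical all-states, get-reward, and set-utility! procedures.

    ;; One Bellman backup: U(s) <- R(s) + max_a sum_s' P(s'|s,a) U(s').
    (define (bellman-backup s)
      (let ((acts (get-actions s)))
        (if (null? acts)
            (get-reward s)              ; terminal state: utility = reward
            (+ (get-reward s)
               (apply max
                      (map (lambda (a) (expected-utility s a)) acts))))))

    ;; Sweep all states, updating utilities in place, until the largest
    ;; change in any utility falls below EPSILON.
    (define (value-iteration epsilon)
      (let sweep ()
        (let ((delta 0))
          (for-each (lambda (s)
                      (let ((new (bellman-backup s)))
                        (set! delta (max delta (abs (- new (get-utility s)))))
                        (set-utility! s new)))
                    (all-states))
          (if (> delta epsilon)
              (sweep)))))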
After running the random-player for a while to learn a model of the transition probabilities and rewards, we should be able to just call your policy-iteration procedure. When it finishes (hopefully without taking too much time!), it should have learned the hit utilities.
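Again only as a sketch under the same placeholder interface, policy iteration could alternate a simplified (iterative) policy evaluation with greedy improvement, keeping the policy as an association list from states to actions.

    ;; Simplified policy evaluation: with each state's action fixed by
    ;; the policy, sweep U(s) <- R(s) + sum_s' P(s'|s,a) U(s') until the
    ;; largest change falls below EPSILON.
    (define (evaluate-policy! policy epsilon)
      (let sweep ()
        (let ((delta 0))
          (for-each
           (lambda (entry)
             (let* ((s (car entry))
                    (a (cdr entry))
                    (new (if a
                             (+ (get-reward s) (expected-utility s a))
                             (get-reward s))))    ; terminal state
               (set! delta (max delta (abs (- new (get-utility s)))))
               (set-utility! s new)))
           policy)
          (if (> delta epsilon)
              (sweep)))))

    ;; Alternate evaluation and greedy improvement until no action changes.
    (define (policy-iteration epsilon)
      (let ((policy (map (lambda (s)
                           (let ((acts (get-actions s)))
                             (cons s (and (pair? acts) (car acts)))))
                         (all-states))))
        (let improve ()
          (evaluate-policy! policy epsilon)
          (let ((changed? #f))
            (for-each
             (lambda (entry)
               (if (cdr entry)                    ; skip terminal states
                   (let ((greedy (best-action (car entry))))
                     (if (not (equal? greedy (cdr entry)))
                         (begin (set-cdr! entry greedy)
                                (set! changed? #t))))))
             policy)
            (if changed? (improve) policy)))))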
The rl-strategy procedure will be modified from what you turned in for problem 1 so that it uses your exploration function. The td-player procedure is a procedure of zero arguments that defines a player, akin to random-player in the a7header.scm file (see the sketch after the next paragraph).
We should be able to initialize the tables and then call (play-match 10000 (td-player)), and it should play the game and learn utilities simultaneously.
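In other words, the test session might look like the sketch below. Here make-player and initialize-tables! are hypothetical names standing in for however a7header.scm actually constructs players and however your own code resets its tables.

    ;; A zero-argument player constructor in the style of random-player.
    ;; make-player is a placeholder; mirror the structure that
    ;; random-player actually uses in a7header.scm.
    (define (td-player)
      (make-player 'td-player rl-strategy))

    ;; Typical test session:
    ;; (initialize-tables!)               ; reset your utility/model tables
    ;; (play-match 10000 (td-player))     ; plays and learns at the same time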
This will simply check that everything the save-learning procedure saves is still there. You may also add a modified rl-strategy procedure to this file if you want us to use it to test how well your learned player performs.
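If you do include one, a purely greedy evaluation version might be as simple as the sketch below, assuming the same hypothetical single-argument signature used in the sketches above (your actual problem 1 signature may differ).

    ;; Evaluation-only strategy: exploit the learned utilities, with no
    ;; exploration and no further learning updates.
    (define (rl-strategy state)
      (best-action state))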