Assignment 4 information

Connect four tournament results

Revised tournament

There were a few problems with the original tournament:

One thing that looks like a problem, but actually isn't, is that many of the logs from the original tournament are "truncated": they don't show the final winning board or the line saying who won the game.
This was not a problem in my Scheme code, because when run manually, it does print everything correctly. (See below on how you can download this code and try it yourself.) I actually don't know what caused this; presumably it was some funny problem with an output buffer not being flushed to a file. It was not a problem with my program crashing.
However, it turns out that there was a bug in my Scheme program that played two students' evaluation functions against each other. This was a bug that I had actually already fixed for the Problem 4 web tester but somehow the fix didn't make it into the Connect 4 tournament code. The version of the c4tourney.com file available below now contains this fix.

There was not time to run another round robin tournament, so I ran a limited tournament:

I selected 8 entries semi-randomly based on the rankings from the online tournament. I picked one player from the players ranked 1-5, one from the players ranked 6-10, and so on. The eighth player was picked randomly from players ranked 36-44. (Although there were 46 entries in the tournament, somehow, there were only 44 ranked players in the standings, excluding our sample players of course.)
These "reference" players were: maimad, rogerm, scavem, rogerl, johnsd8, yeungc2, holdeg2, celayj.
For each entry, I ran a match with all 8 of these reference players. A match consists of 2 games where the players switch sides for the second game. Of course, I didn't play the reference players against themselves.
Here are the results. This is a list sorted by number of points, where a win counts as 1, a loss as -1, and a draw as 0.
I computed a score from this tournament: 20 points for submitting a working player, plus 1 to 20 points for performance, based on the standings in the tournament.
Your score for Problem 5 is the maximum of your score for the revised tournament and for the original tournament.
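As a minimal sketch of the point counting and the "maximum of the two tournaments" rule described above: the procedure names below are hypothetical and are not taken from the tournament code, and the scores passed to problem5-score are made-up inputs.

```scheme
;; Hypothetical helpers, not part of c4tourney.com.
;; A game result is one of the symbols win, loss, or draw.
(define (result-points result)
  (cond ((eq? result 'win) 1)
        ((eq? result 'loss) -1)
        (else 0)))                      ; a draw counts as 0

;; Sum the points over a list of game results.
(define (tally results)
  (if (null? results)
      0
      (+ (result-points (car results))
         (tally (cdr results)))))

;; Problem 5 score: the better of the two tournament scores.
(define (problem5-score revised-score original-score)
  (max revised-score original-score))

(tally '(win win loss draw))            ; 1 + 1 - 1 + 0 = 1
(problem5-score 31 23)                  ; 31
```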

Original tournament

This was a round robin tournament, and there were 46 entries. For each pair of students, we played two games, so both players had a chance to be X and O. That's 90 games for each student and a total of 2070 games which took several computer-days to run. This tournament used the regular alpha-beta minimax search since the version I had written to choose among the optimal moves had a bug in it.
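The game counts above can be checked with a few lines of Scheme. This is only an arithmetic sketch; the procedure names are made up for illustration.

```scheme
;; Each entry meets every other entry, games-per-pair times.
(define (games-per-entry entries games-per-pair)
  (* (- entries 1) games-per-pair))

;; There are n(n-1)/2 unordered pairs of entries.
(define (total-games entries games-per-pair)
  (* (quotient (* entries (- entries 1)) 2)
     games-per-pair))

(games-per-entry 46 2)                  ; 45 opponents * 2 = 90
(total-games 46 2)                      ; 1035 pairs * 2 = 2070
```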

It turns out that many students' players had bugs in them. I'm not exactly sure how they got by the pretesting. So I added some error trapping mechanisms to the tournament program. If an error was generated during a call to your evaluation function, a random valid move was picked on your behalf.

Here are the results:

Sorted by wins (ties broken by losses)
Sorted by score. This score is a sum over all your games of 1 for a win, -1 for a loss, and 0 for a draw.

We'll be emailing the transcripts of all your games soon.

Problem 5 was scored as follows. Any student that had uploaded a valid player by the assignment deadline received 5 points. The remaining 35 points were determined from the tournament results. These were based on an average of your ranking from the two versions of the results above. For each ranking, the top player received 35 points, the bottom player received 5 points. Players in the middle received points linearly proportional to their ranking (not wins or score). Students that tied in a ranking received the same score for that ranking. Points from the two rankings were averaged. For example, if you were exactly in the middle of one ranking, you would have gotten 20 points, and if you were exactly 2/3 in the ranking from the bottom, you would have gotten 25 points from that ranking. Averaged and rounded, this would come out to 23 points.
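The worked example above can be reproduced with the following sketch. It assumes that a ranking is expressed as the fraction of the way up from the bottom (0 for the bottom player, 1 for the top player), which is one reading of the example; the procedure names are hypothetical. Note that the example rounds 22.5 up to 23, whereas Scheme's built-in round sends an exact half to the nearest even integer, so the sketch rounds half up explicitly.

```scheme
;; Points from one ranking: 5 at the bottom, 35 at the top, linear between.
(define (ranking-points fraction-from-bottom)
  (+ 5 (* 30 fraction-from-bottom)))

;; Round half up; (round 45/2) would give 22, not 23.
(define (round-half-up x)
  (floor (+ x 1/2)))

;; Average the points from the two rankings, then round.
(define (tournament-points fraction-a fraction-b)
  (round-half-up (/ (+ (ranking-points fraction-a)
                       (ranking-points fraction-b))
                    2)))

(ranking-points 1/2)                    ; 20
(ranking-points 2/3)                    ; 25
(tournament-points 1/2 2/3)             ; 45/2 rounds up to 23
```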

Announcements

I've been receiving several reports about students having the correct sequence of states but getting the move wrong on the problem 4 submission. The thing is that these students all report a different move as the "correct" move! I'll see what I can do to investigate on the road...
[10/26] At this point, I just don't have time to fix the bugs in the feature detectors in the connect4.scm file before I leave town in a matter of hours, and the soonest I could then upload changes would be on Tuesday the 30th. So, in the interest of stability and confusion minimization, I will be leaving this file as it is. You can then depend on this bug, or you can take the code and modify it yourself.
[10/24] There was a typo in the preamble text for the problem 4 web tester. It says "115 states evaluated" and "best move in column 1". These are actually incorrect. It should be 107 states and a best move in column 6. This was corrected in the preamble text just before 5pm. The solutions actually used to compare your answer were the correct ones, though.
If you got 115 states and column 1, you may have done the same thing that I initially did which was to switch the order of the current-player and max-player arguments to the evaluation function in the ab-min-player procedure.
[10/23] Problem 4 web tester is back up.
[10/23] There was a problem with the problem 4 web tester. It had the wrong answer for the test. We'll have a corrected version up soon. We will retest the code that was submitted to the old version of the web tester and email you the results, deleting a submission if appropriate. Stay tuned for details.
[10/21] Updated the submission policy on the submission page to clarify how Problem 5 (the Connect 4 evaluation function) will be handled.
[10/19] Alpha-beta minimax oracle is up (to use for Problem 5 to test your c4-eval function).
[10/18] I've started to put answers to student questions for this assignment on a web page (see above).
[10/18] IMPORTANT ANNOUNCEMENT: for problem 4, I must require that you evaluate nodes in the order that they are given to you by the c4-children procedure, i.e. the first child in the list should be evaluated first, then the second child, and so on. This way, your procedure will evaluate the same nodes as my solutions...
[10/18] A reminder of announcements in class today:
- For problem 4, you must implement the standard alpha-beta minimax search, not the version that finds all optimal moves. Otherwise, your submission will lose points because it will evaluate too many nodes.
- Your alpha-beta minimax implementation must be able to work with any evaluation function. (I will test it using my own.)
- For Problem 5, you must have an evaluation function uploaded and accepted by the web tester by the deadline (10am 10/25). You then have until the following Thursday (10am 11/1) to improve your evaluation function and play it against other students' evaluation functions. We will take the current evaluation function at that time for each student and run a round-robin tournament, the results of which will be used to determine the "performance" portion of Problem 5.
- For problem 5, your evaluation function will be run using my version of alpha-beta minimax which chooses randomly among the optimal moves, i.e. problems 4 and 5 are not dependent upon each other.
[10/18] Most of the web testing stuff is up for problem 5.
[10/17] The procedure to get the children for Nim is called nim-children, which appears once correctly and once incorrectly (as nim-gc) on page 2 of the assignment.
[10/16] I've put some hints in the hints section below.
[10/15] I revised the "submission page". There has been one change to the electronic submission policy (possibility of "submission penalties" for when we must manually intervene when you should have known better). Mostly, however, I felt that a few issues needed to be clarified based on student questions.
[10/15] I switched the a4code.com file from the web server to our ftp server. If you had trouble downloading it, try again with the new link. (This is only a "feature" on Windows machines.)
[10/12] The web tester and oracles for problems 1 and 2 are up. There is also a "web tester tester" page and a "syntax check" page. See Assignment 4 submission page for details.
[10/4] Support code Version 1.1 is out. I'll have more information about support code and such up later, but you can at least get started with this.

Support code

There are two files containing support code for this assignment:

a4code.com (Version 1.1 released 10/4)
This file of compiled Scheme code contains the procedures that implement Nim, allow you to play against the computer for Nim and Welter's game, and some support code for Connect 4.
connect4.scm (Version 1.1 released 10/4)
This file of Scheme source code contains the remaining support code for Connect 4, in particular, the state representation and feature detectors.

Links

Here are a few links for Connect 4 Java applets:

Some of these might actually be the same applet...

Hints

Problem 1

[10/18] If you get a feedback message about having the "correct value" (or "correct game value") but an "inconsistent node", this means:
- the value you returned (i.e. a +1 or -1 indicating that MAX either will win or lose the game assuming MAX and MIN play optimally) is correct, but
- either (1) the leaf node does not have this value, or (2) at some ancestor node (of this leaf node), MAX or MIN has a move which will lead to the opposite outcome, so the game you returned is not an "optimal" game.
I suggest you try drawing out the game tree and figuring out what the paths are for optimal games and checking whether you've in fact returned one of them.
I suggest that your max-player and min-player procedures return the same information as the minimax procedure does.
You will find it easier to build the leaf node on the way down the game tree rather than on the way back up.

Problem 4

[10/18] The provided inf:<, inf:<=, inf:>, inf:>=, inf:=, inf:max, and inf:min procedures extend the real numbers to include the symbols pos-infinity (which is greater than any real number) and neg-infinity (which is less than any real number). For example:
```
(inf:> 7 2)                          ==>  #t
(inf:> 7 'pos-infinity)              ==>  #f
(inf:= 'neg-infinity 'neg-infinity)  ==>  #t
(inf:max 8 'neg-infinity)            ==>  8
(inf:min 8 'neg-infinity)            ==>  neg-infinity
```
Note that the inf:max and inf:min procedures, unlike their regular counterparts, only take 2 arguments.
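Because inf:max takes exactly two arguments, taking the maximum of a whole list means folding it over the list. The sketch below shows one way to do that. To keep it self-contained it defines stand-ins named my-inf:> and my-inf:max, whose behaviour is an assumption based only on the examples above; with the support code loaded you would use the provided inf:max instead.

```scheme
;; Stand-in for the provided inf:> (assumed behaviour, see the examples).
(define (my-inf:> a b)
  (cond ((eq? a 'pos-infinity) (not (eq? b 'pos-infinity)))
        ((eq? a 'neg-infinity) #f)      ; nothing is below neg-infinity
        ((eq? b 'pos-infinity) #f)
        ((eq? b 'neg-infinity) #t)
        (else (> a b))))                ; both are ordinary numbers

;; Stand-in for the provided two-argument inf:max.
(define (my-inf:max a b)
  (if (my-inf:> a b) a b))

;; Fold the two-argument max over a non-empty list.
(define (my-inf:max-list items)
  (let loop ((best (car items)) (rest (cdr items)))
    (if (null? rest)
        best
        (loop (my-inf:max best (car rest)) (cdr rest)))))

(my-inf:max-list '(3 neg-infinity 8 5)) ; 8
```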
[10/18] Here's an example create-c4-player procedure:
```
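;; Build a Connect 4 player: a procedure of (board player) that returns a move.
(define (create-c4-player eval-fn depth-cutoff)
  (define (ab-minimax board player)
    ;; ab-max returns (value best-move); the player needs only the move.
    (second (ab-max board
                    player
                    eval-fn
                    'neg-infinity   ; initial alpha
                    'pos-infinity   ; initial beta
                    0               ; starting depth
                    depth-cutoff)))
  ab-minimax)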
```
In my implementation, the convention I use is that the ab-max and ab-min procedures return a list where the first element is the value of the subtree and the second element is the best move. Since the player function must return a move, it only takes the second element of this list.
You can use whatever convention you want in what your ab-max and ab-min procedures return so long as the player function returns a valid move, i.e. an integer between 1 and 7 inclusive.
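To illustrate the (value best-move) convention, here is a minimal alpha-beta sketch on a toy game tree. It is not the assignment solution and makes several simplifying assumptions: a node is either a number (a leaf and its static value) or a list of child nodes, a move is the 1-based position of a child, there is no depth cutoff, and large finite numbers stand in for the neg-infinity and pos-infinity symbols. All names starting with toy- are made up for this example. Children are visited in list order, which is what makes the count of evaluated leaves reproducible.

```scheme
(define leaves-evaluated 0)

;; Static evaluation of a leaf; also counts how many leaves were visited.
(define (toy-eval leaf)
  (set! leaves-evaluated (+ leaves-evaluated 1))
  leaf)

;; MAX node: raise alpha, stop as soon as alpha >= beta.
(define (toy-ab-max node alpha beta)
  (if (number? node)
      (list (toy-eval node) #f)         ; a leaf has no move
      (let loop ((children node) (move 1) (alpha alpha) (best-move #f))
        (if (or (null? children) (>= alpha beta))
            (list alpha best-move)
            (let ((value (car (toy-ab-min (car children) alpha beta))))
              (if (> value alpha)
                  (loop (cdr children) (+ move 1) value move)
                  (loop (cdr children) (+ move 1) alpha best-move)))))))

;; MIN node: lower beta, stop as soon as alpha >= beta.
(define (toy-ab-min node alpha beta)
  (if (number? node)
      (list (toy-eval node) #f)
      (let loop ((children node) (move 1) (beta beta) (best-move #f))
        (if (or (null? children) (>= alpha beta))
            (list beta best-move)
            (let ((value (car (toy-ab-max (car children) alpha beta))))
              (if (< value beta)
                  (loop (cdr children) (+ move 1) value move)
                  (loop (cdr children) (+ move 1) beta best-move)))))))

(define toy-tree '((3 12 8) (2 4 6) (14 5 2)))

(toy-ab-max toy-tree -1000 1000)        ; (3 1): value 3, best move 1
leaves-evaluated                        ; 7 of the 9 leaves
```

In this run the second subtree is cut off after its first leaf (2), because MAX already has a move worth 3. Visiting the children in a different order would change which leaves get evaluated.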

Problem 5

[10/18] I'd encourage you to write your own "feature detectors" for your evaluation function. To do it efficiently, you'll have to learn how to use regular expressions to search for patterns in the board; you can take a look at the feature detectors that I've provided in the connect4.scm file. There are definitely some useful features that you might want to detect that my provided procedures do not.
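As a rough illustration of what a feature detector does, the sketch below counts how often a pattern occurs in one line of the board. It rests on assumptions that may not match connect4.scm: a row is taken to be a string such as ".XXX.O." (with "." for an empty cell), and plain substring comparison is used in place of regular expressions so that the code runs in any Scheme. The name count-pattern is made up for this example.

```scheme
;; Count occurrences of pattern in row; overlapping matches are counted.
(define (count-pattern pattern row)
  (let ((plen (string-length pattern))
        (rlen (string-length row)))
    (let loop ((start 0) (count 0))
      (cond ((> (+ start plen) rlen) count)   ; pattern no longer fits
            ((string=? pattern (substring row start (+ start plen)))
             (loop (+ start 1) (+ count 1)))
            (else (loop (+ start 1) count))))))

;; Example feature: three X's in a row with an empty cell on one side.
(define (open-threes row)
  (+ (count-pattern "XXX." row)
     (count-pattern ".XXX" row)))

(count-pattern "XX" "XXX.XX.")          ; 3
(open-threes ".XXX.O.")                 ; 2
```

A real detector would also have to scan columns and diagonals, and weigh the counts for both players.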