OkonomiDev
Friday, November 08, 2002
      ( 11:26 AM ) Matt  

Machine Learning


A lot of research has been done, and is still being done, on machine learning. The journal Artificial Intelligence has recently published some papers on the state of the art in computer go algorithms. (See "Computer Go" by Martin Muller, Artificial Intelligence volume 134, 2002.) I'm sorry I can't give you a URL; the material is not publicly available.


The best computer go algorithms these days have a lot of very go-specific knowledge built into them. The aforementioned paper summarizes some of it. I am searching for an alternative approach to a good go AI: using a really powerful neural net. Using neural nets is nothing new. There have been numerous papers and senior projects done on the application of neural nets to go. None of them have been very successful. One approach, a hybrid of hard-coded go knowledge and a neural net, is called NeuroGo. Perhaps something like this has the potential to be very successful, but I'd like to get away from the hard-coded knowledge entirely.


The chance of a neural net succeeding at playing go is pretty small, but I'd still like to try. One way of training a neural net is by competitive play. You play the neural net against another strategy, rewarding it for good moves and punishing it for bad ones. (This can be achieved by a technique called Temporal Difference Learning. There are lots of papers on this.) It has been shown that neural nets learn fastest when competing against strategies of similar or slightly higher strength. To begin the training of my neural nets, I plan to use the crawler strategies described in my November 5 posting. These are weak enough to train neural nets that start with no skill, and nondeterministic enough to prevent the neural net from learning to win by a set sequence.
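To make the temporal difference idea a bit more concrete, here is a minimal sketch in Python. It is not my go code and there is no neural net in it: the "game" is an assumed toy (a random walk over five positions, where walking off the right end counts as a win and the left end as a loss), and the values are kept in a plain table. The function name and parameters are made up for illustration. The point is only the update rule: each position's value is nudged toward the value of the position that follows it.

```python
import random


def td0_random_walk(episodes=5000, alpha=0.05, n_states=5, seed=0):
    """Tabular TD(0) on a toy 'game': a random walk over n_states positions.

    Walking off the right end is a win (reward 1), off the left end a loss
    (reward 0). Returns the learned win-probability estimate per position.
    """
    rng = random.Random(seed)
    values = [0.5] * n_states  # initial guess for every position
    for _ in range(episodes):
        state = n_states // 2  # every episode starts in the middle
        while True:
            next_state = state + rng.choice((-1, 1))
            if next_state < 0:
                target = 0.0  # game over: loss
            elif next_state >= n_states:
                target = 1.0  # game over: win
            else:
                target = values[next_state]  # bootstrap from the next position
            # TD(0) update: move the estimate toward the target.
            values[state] += alpha * (target - values[state])
            if next_state < 0 or next_state >= n_states:
                break
            state = next_state
    return values


if __name__ == "__main__":
    for position, value in enumerate(td0_random_walk()):
        print(position, round(value, 2))
```

For this toy the true win probabilities are 1/6, 2/6, ..., 5/6 from left to right, so the learned values should land near those. In a go program the table would be replaced by a neural net, and the reward would only arrive at the end of the game.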


Another method of training neural nets is to have them predict the moves in professional games. The hope is that after studying professional games, the net will produce similar moves, and these moves will be good. The usual result of this training is that the net produces moves which look like professional moves, but which miss some critical point of play. Imagine two programs, one which simply plays the moves from professional game A, and another which plays the moves from professional game B. The two programs play nice-looking moves, but they are not playing the same game. Even a neural net trained on hundreds or thousands of professional games will still make this mistake.
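A deliberately crude sketch of this failure mode, in Python. Instead of a neural net it uses a frequency table (an assumption made purely to keep the example short), and the only "context" it sees is the previous move. The function names and the tiny game records are invented for illustration. It learns to echo whatever professionals most often played next, with no idea what is actually happening on the board.

```python
from collections import Counter, defaultdict


def train_move_predictor(games):
    """Count which move followed which in a list of recorded games.

    Each game is a list of move labels such as "D4". The previous move
    (None at the start of a game) is the only context that is recorded.
    """
    table = defaultdict(Counter)
    for moves in games:
        previous = None
        for move in moves:
            table[previous][move] += 1
            previous = move
    return table


def predict(table, previous):
    """Return the move most often seen after `previous`, or None if unseen."""
    if previous not in table:
        return None
    return table[previous].most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical, heavily shortened game records.
    games = [
        ["D4", "Q16", "D16"],
        ["D4", "Q16", "Q4"],
        ["D4", "Q16", "D16"],
    ]
    table = train_move_predictor(games)
    print(predict(table, None))   # most common opening move
    print(predict(table, "Q16"))  # most common reply to Q16
```

Every move it suggests was played by a professional, yet nothing ties the suggestion to the position it is being asked about. A net trained on real games sees far more context than this, but the same weakness shows up whenever the position differs in some way that matters from the ones it was trained on.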


A third method of training, competitive co-evolution, is similar to the first. Two neural nets, or groups of neural nets, play competitively against each other. If neural net A is stronger than neural net B, then B will learn from A, and eventually reach, and hopefully surpass, A. Then A will learn from B. This learning can be enhanced by having two groups, called species, each of which breeds only amongst itself.
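Here is a rough Python sketch of the two-species loop, under heavy simplifying assumptions. Each "net" is reduced to a single skill number, and a game is replaced by an assumed logistic win probability, so this shows only the shape of the loop, not anything about go. All names and parameters are hypothetical.

```python
import math
import random


def win_probability(a, b):
    """Toy stand-in for a game: chance that skill a beats skill b."""
    return 1.0 / (1.0 + math.exp(b - a))


def next_generation(species, opponents, rng, mutation=0.1):
    """Keep the half that scores best against the other species, then breed."""
    def fitness(skill):
        # Expected number of wins against every member of the other species.
        return sum(win_probability(skill, opp) for opp in opponents)

    ranked = sorted(species, key=fitness, reverse=True)
    parents = ranked[: len(species) // 2]
    children = []
    while len(parents) + len(children) < len(species):
        # Breeding stays inside the species: both parents come from it.
        mother, father = rng.choice(parents), rng.choice(parents)
        children.append((mother + father) / 2 + rng.gauss(0.0, mutation))
    return parents + children


def coevolve(size=20, generations=30, seed=0):
    """Evolve two species side by side, each scored only against the other."""
    rng = random.Random(seed)
    species_a = [0.0] * size
    species_b = [0.0] * size
    for _ in range(generations):
        new_a = next_generation(species_a, species_b, rng)
        new_b = next_generation(species_b, species_a, rng)
        species_a, species_b = new_a, new_b
    return species_a, species_b


if __name__ == "__main__":
    a, b = coevolve()
    print(round(max(a), 2), round(max(b), 2))
```

Because skill is a single number here, ranking by results against the other species is trivially the same as ranking by skill. With real nets the fitness would have to come from actually playing games, and that is where the choice of opponents starts to matter.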


A combination of these three types of learning is probably critical. How much of which sort of learning is optimal is unknown. Perhaps even the sequence of learning styles matters. This is one of the first things I plan to investigate.
