UCT algorithm to beat Go master
Mogo and the UCT algorithm beat a human opponent at master level in the game of Go for the first time on March 22, 2008 in a tournament organized by the French Federation of Go.
The game of Go
It is a game of
Chinese origin which is played on a board named goban of 19x19 cases, but
that is often reduced to 13x13 or 9x9 cases. The players have white or black
stones they pose at the vacant intersections of lines of the board, alternately,
in order to encircle the stones of the opponent to score points.
In this historical duel against the computer, the goban used was 9x9.
The game of Go resembles to Othello that be played on a 8x8 cases board. In Othello stones surrounded take color of stones which surround them and therefore change their owner.
Othello software with the computer as opponent exist for a long time and they use simple algorithms such as alpha-beta.
The Go game has nothing to do with Gomoku which is derived from the FiveInLine game. This a 3x3 boxes and the aim is to create a diagonal or line or three pieces of the same colour.
Mogo and the UCT algorithm
In 1998, the Deeper Blue program from IBM has fought for the first time a chess master, Gary Kasparov. However, the game of Go is more difficult to program because the number of moves to predict is virtually unlimited.
The best Go program is currently Mogo, developed by INRIA, CNRS and Paris-Sud University. This program uses the UCT algorithm to browse the tree of possible moves.
UCT, Upper bound for Confidence Tree is derived from UCB, Upper
Confidence Bounds and includes the use of Monte Carlo as evaluation
function.
The tree is crawled from the root, that is the current position of stones
and is driven by the UCT algorithm until a terminal node is reached.
At this point an evaluation function is applied that is named Monte-Carlo.
The UCT algorithm travels several times the same branches, but they are evaluated every time and therefore it will tend to go the most promising branches.
Monte-Carlo
Applied to Go, Monte-Carlo simulation consist to fill randomly the board with stones and count the points for each player, to give a value to this terminal node.
The name Monte-Carlo refers to chance, which is the basis of the method used.