Wednesday, February 21, 2007

Algorithm helps computers beat humans at Go



Algorithm helps computers beat humans at Go

Computers can beat some of the world's top chess players, but the most powerful machines have failed at the popular Asian board game Go, in which human intuition has so far proven key. Two Hungarian scientists have come up with an algorithm that helps computers pick the right move in Go, played by millions around the world, in which players must capture spaces by placing black-and-white marbles on a board in turn.
"We are not far from reaching the level of a professional Go player," Levente Kocsis of the Hungarian Academy of Sciences computing lab Sztaki said.
The 19x19 grid board that top players use is still hard for a machine to use, but the new algorithm is promising because it makes better use of the growing power of computers than did earlier Go software.
"Programs using this method immediately improve if you use two processors instead of one, say, which was not typical for earlier programs," Kocsis said.
Whereas a chess program can evaluate a scenario by assigning numerical values to pieces--9 to the queen and 1 to a pawn, for example--and to the tactical worth of their position, that technique is not valid for Go.
In Go, all marbles have an identical value, and scenarios are more complex, so the computer has to think about all potential moves through the end of the game and emulate the outcome of each alternative move.
Even the most powerful computers have failed at that task, but Kocsis and colleague Csaba Szepesvari have found a way to help computers focus on the most promising moves, using an analogy with slot machines in a casino.
Punters will find that certain one-armed bandits in a casino appear to pay more on average than others, but an intelligent player should also try machines that have so far paid less, in case they are hiding a jackpot, Kocsis said.
The key is to find the balance between the two sorts of machine.
Go software using a similar method, called UCT (Upper Confidence bounds applied to Trees), does not have to scan all possible outcomes of a game and can quickly find the best mix of scenarios to check.


"This bandit algorithm has proven advantages," Kocsis said.
The possible outcomes of a game are like branches of a tree, and earlier Go programs, unable to scan all branches, picked some at random and tried to find the best move from that sample.
The UCT method (PDF) helps a computer decide which routes are most worth investigating. Programs based on it have consistently won games against most other machines, according to Kocsis.

No comments: