Computer Go demystified: An interview with Martin Müller

go game 300x300 picture

The board game Go.

A lot of Go players are interested in computer Go.

And, especially after the recent Man vs Machine match between John Tromp and Zen19, many people have questions about it.

I had questions too and I also didn’t know how to answer some of the questions that readers were asking at Go Game Guru.

I’m certainly no expert on computer Go, but I talked to someone who is: Martin Müller from the University of Alberta in Canada. Here’s what Martin had to say…

An interview with Martin Müller

David Ormerod: To start with please tell us a bit about yourself and your research interests. How did you learn of Go and how did you become involved in computer Go?

martin mueller computer go picture

Martin Müller, computer Go expert.

Martin Müller: I am a professor in the Department of Computing Science at the University of Alberta in Edmonton, Canada.

My research interests are in heuristic search, studying how to solve large, complex problems using computer searches.

The main application areas studied in my research group are games such as Go, and automated planning.

In recent years, Monte Carlo search methods have been our main focus – both for games and for planning. As part of my game-related activities, I am the leader of the team developing the open source software Fuego, which was the first program to defeat a top professional in an even game on 9×9.

I learned Go when I was 15 years old and played a lot in my teens and early twenties. I am a 5, 6 or 7 Dan amateur player, depending on the country. My biggest success was probably taking 2nd place at the US Go congress open tournament in 1998.

I became interested in computer Go as an undergraduate in my home country of Austria, through my supervisor. This was around 1985. I have stayed with the topic ever since, doing a Diploma thesis, a PhD and a few postdocs, before getting my current job.

What’s Monte Carlo?

Most people with any interest at all in computer Go know that the strongest programs these days use a ‘Monte Carlo’ algorithm, but many people don’t know much more about it than that.

Could you briefly explain where the term Monte Carlo came from and what it means in this context?

The term Monte Carlo refers to an affluent suburb of Monaco which is famous for its Casino. Monte Carlo methods use statistics collected from randomized simulations as a way to analyze complex systems which are too hard to ‘solve’ by other means.

They were first developed for nuclear physics and atomic bomb research in the 1940s. Nowadays they are very widely used, but their application to games such as Go took off just a few years ago.

Now that computers are powerful enough, Monte Carlo methods are used across a wide variety of disciplines.

For example, I’ve used them at work to help with risk analysis. It’s often difficult to explain to people why this approach works though, because it seems so counterintuitive at first.

Do you have a good analogy to explain how a large enough number of random simulations can provide a useful answer to a question?

Statistical sampling, which is at the core of Monte Carlo methods, is a very powerful technique.

For example, think about opinion polls. Any single random person who you ask about their opinion may be completely crazy, but if you ask one thousand people, who are carefully selected to represent the overall population, then you get quite a good idea of the general mood and can use that to make informed decisions.

This is why we keep getting more and more of those pesky phone calls doing surveys at dinner time!

How computer Go programs improved

It’s been more than five years since UCT (an extension of Monte Carlo search) was first applied to Go, but the strongest programs were still at the kyu level not that long ago (at least on 19×19 boards).

In contrast, the strongest programs these days are dan level and they seem relatively sharp, even in tactical situations.

To what extent do they make use of heuristics for shape, tesuji, life and death, the opening and so on?

Many programs use learned local patterns such as 3×3 for simple shape, and they modify the playouts to avoid some bad tactical moves.

Also, when there is a single important fight going on, the full board search will be able to analyze it quite deeply, and do well in the tactics. The problems start when there are several fights going on at the same time.

For the opening, some programs simply use large scale patterns to imitate popular openings played by human experts. But usually those are not absolute rules. These moves simply get a bonus, but the search can override them. So it is better than the hard coded ‘expert systems’ of the 1980s.

What other changes and improvements have helped computers get to their current mid-dan level on larger boards since then?

I think many factors are involved. Better patterns and rules as above, better search, better parallel scaling, several years of testing, debugging and tuning the programs, and better hardware all help.

What are the pros and cons of combining a knowledge based approach with a Monte Carlo approach?

Pros: it can patch up the weaknesses in the simulations to some degree. It helps the search focus on ‘probably good’ moves.

Cons: you lose speed, and may introduce other biases in the search.

Computers vs humans in the game Go

What are the top Go programs currently good or bad at compared to human players of a similar level?

For example, they seem to be better at fast games (which surprises many people). They also seem to have problems with wasting ko threats and with fighting ko in general.

Do you have any theories about why this is?

The programs are generally good in overall balance and counting. They know what it takes to win and will not lose quietly or be overly aggressive when they are ahead. They will also avoid most blunders and never get tired.

There are many situations where humans have specialized knowledge, but current simulations are not good enough to solve them.

If these remain out of reach of the search, then the programs get into trouble. Examples are seki, semeai, ko, nakade, or situations with several weak-looking but settled groups of stones on the board.

And yes, the problem of wasting ko threats is particularly annoying for the programmer.

Will computer Go strength scale?

Since the strength of modern Go programs scales much better than it did before, does that mean it’s just a matter of waiting for hardware to improve enough to challenge professional Go players?

Or do you expect the current approach to plateau at some point, until a new breakthrough is made?

Experience in many other games tells us that steady progress in algorithms combined with faster hardware will eventually be enough. I don’t think that Go is ‘special’ in that regard.

Still, I would like to see one or two more ‘big ideas’ emerge in order to make faster progress. Maybe one more big idea is enough to seriously challenge humans on 19×19.

computer go processing power 550x412 picture

A graph showing how computer Go strength scales with processing power. The x-axis represents the number of simulations (using a logarithmic scale) the y-axis is ELO rating.

(The graph above was created by Don Dailey in 2008.)

The future

What sort of problems or improvements are Go developers and researchers currently working on?

Many people are working on practical, step by step improvements such as better knowledge for playouts, or better parallel scaling.

Personally, I am very interested in adding a third dimension, local search, in addition to global (full board) search and simulations. This is what I do as a human player, and it allows me to keep track of several simultaneous fights.

However, it is a very difficult problem and it’s hard to come up with effective algorithms that work in a similar way.

What do you think we can learn, or have learned, from AI research into Go, that can be used in other areas?

The main surprise for me is how it works so well, with comparatively little domain-specific knowledge. I believe this is very encouraging news for people working on other hard problems where domain knowledge is difficult to get or to formalize.

In the last few years we have seen an explosion of new applications of Monte Carlo search methods, not just in other games, but also in areas such as energy management, optimization and automated planning.

What can people (particularly young people) do if they have an interest in computer Go and want to learn more about it?

There is a lot of information out on the internet, but it is not particularly well organized and much of the material is outdated or of poor quality.

The best way to learn would be to study with one of the active research groups, such as the MoGo group in Paris Sud, the Dutch groups in Maastricht and Tilburg, or of course our own group at the University of Alberta.

Other ways to get started are to write your own program, or study and contribute to the open source programs such as Fuego, Pachi or Orego.

There is a computer Go mailing list which keeps archives of previous discussions. Finally, there are some overview articles about computer Go, and lots of research papers.

Thanks for taking the time to talk to us today.

You are very welcome.

For further reading you might also want to check out Martin’s computer Go blog, results and analysis from Zen19′s recent match with John Tromp and Chris Ball’s recent article: Computers are [now] very good at the game of Go, which was published last week.

As usual, please feel free to leave comments and questions below.

About David Ormerod

David likes teaching, learning, playing and writing about the game Go. He's taught hundreds of people to play Go, including many children at schools in Australia. In 2010 David was the Australian representative at the 31st World Amateur Go Championships. He's a 5 dan amateur Go player and is the editor of Go Game Guru. You can find David on Google+ and follow Go Game Guru on Facebook, Google+ or Twitter.

Comments

  1. Hi David, it is a good interview. Thanks a lot.

  2. Thank you very much for a very informative piece of writing. The interview dispelled a lot of the misconceptions I had about computer go in an easy-to-understand manner. Great read.

  3. Nice interview, David. I didn’t know anybody working on computer go was such a strong player. A question I would ask is, have we learned anything about go from the way computers play it?

    I don’t know a whole lot about chess. But my impression is that computers beating humans at chess has had a less than salutary effect on the game. Computers make winning moves that humans don’t understand, and humans are reduced to blindly imitating them. I hope this doesn’t happen to go, but I suppose it is inevitable. “And he died with a hammer in his hand…..”

    • If you knew more about chess you would not have this impression. Chess engines have evolved from mere punching bags to valuable tools to test new ideas, prove that such move that was discarded by the engine is in fact better and so on…

      Chess is still very creative however lots of wrong moves do not make it to the tournament hall because they have been punished at home by the computer. Novelties are rare (though not so rare) but they tend to be powerful moves, then people set to counter them etc.

      For mere amateur such ad myself they are extremely valuable to understand the game. Evaluation graphs allow me to know when I made a blunder, enabling me to try to work out where I went wrong. If I fail I can display all the lines that prove me wrong. I can try variations etc.

      So my experience tends to show me that strong engines have a rather salutory effect on the game.

  4. Rodney Topor says:

    The interview was good, but Müller didn’t really explain how Monte Carlo search is actually used in Go programs, so I found it less interesting than I had hoped.

    • I can second that. I am not a mathematician, but a general idea of how it works would be interesting. Here is my maybe very wrong example to give an idea of what I mean, comments are welcome, of course.

      Say, the program selects in a position 10 moves that could be good, and does that too for the follow-up moves. These moves are randomly played 1.000 times, the move that gets the best result is chosen.

      Very wrong and too simple, I’m sure, but this kind of description could be interesting. I tried to Google this method, and I am lost within one paragraph.

      Does the program select in a given position choose the same move every time?

      By the way, quite an interesting interview, well chosen questions.

      Kind regards,
      Paul

    • David Ormerod says:

      I see, perhaps I should have asked more about that. I didn’t because that could be a whole separate article on its own.

      My understanding of the basic idea is that it generates a large game tree (you could imagine it to be sort of like the game tree you see in some sgf editors) in memory by playing out lots of random games with branching from the current position. The challenge in Go is evaluating the position, but if you play to the end of the game, you can evaluate by simply counting the score (and computers are very good at that).

      If you do enough simulations you’ll see that some moves lead to more wins than others, even with random playouts. Then, with that win/loss information, you can use an algorithm to aggregate the probability of winning back up the tree for each node until you get back to the current position (at the root of the tree). Then the computer just chooses a move with a high probability of winning.

      Because the playouts are random and the number of simulations is finite, the probabilities could be slightly different each time you evaluated the same position, so Paul, I think that would mean the computer wouldn’t always play the same move when there are a few different moves of similar value. It would depend on the position though, sometimes there’s only one move, like in a life and death problem :).

      I don’t know enough about how Monte Carlo methods are specifically used in Go to write a whole article myself, but if somebody does and wants to write something about it here, just let me know.

      You could also have a look at this: http://teytaud.over-blog.com/article-35709049.html – it has pictures if you’re having trouble visualizing what I’m talking about. I believe Teytaud is involved in Mogo.

  5. Very interesting! As a sample, I have used Monte Carlo methods for a toy raytracer: instead of sending 1 ray per point (as in standard raytracing), you send a random number of rays to each point (with a maximum of 10, for example)… The results are much better than with 1 ray-point, and is much faster than sending 10 for each point. In the raytracing world this was incredibly improved by the Metropolis-Hastings method, which was the one used in particle physics in the 50s.

    Just my small 2 cents of information!

    Ruben

  6. Éverson says:

    Nice interview! I am glad to know that computers go softwares are improving. I heard that playing against a “bot” is bad, but I think that playing against these new “dan bots” can help someone to improve. While you are loosing to a bot, it means that you have “something” to learn from it. The moment that you start winning it is the time to stop playing against the bot because now you learned how to defeat it, and besides humans, bots don’t learn from its mistakes (yet). ^__^

    • David Ormerod says:

      Yes, I’d agree that you could learn a lot by playing some programs now, particularly if they’re stronger than you. They play a pretty decent game these days. Sometimes I play them on 9×9 for fun and it’s quite challenging now. Don’t forget to play people too though :).

  7. Hello David,

    nice interview, can I translate it to russian and publish on http://weiqi.crimea.ua?

    • David Ormerod says:

      Yes, no problem translating it (or any other articles). Have a look at the creative commons license information at the very bottom of the page and please give credit by linking back to the original article when you do a translation.

  8. Wang Tsung says:

    What are the strongest commerical and non-commercial Go programs available at this time?

    • David Ormerod says:

      Zen (aka Tencho no Igo) would appear to be the strongest commercial program at the moment. We’ll have the English version in our Go shop as soon as it’s available. Apart from that I’ve heard that Crazystone and The Many Faces of Go are also quite strong, but there could be others too.

      Martin’s project, Fuego, is one of the strongest open source programs. Mogo is also strong, as far as I know it’s mainly used for research and hasn’t been commercialized, but you used to be able to download an older version of the binaries.

  9. thanks! it’s always interesting to know something behind the scene of computer Go

  10. Has anyone ever tried the HTM-paradigm by Jeff Hawkins from Numenta on Go?

    That would be my question to an expert in the field.

    • David Ormerod says:

      I’m not sure about that Dieter, but Martin said I could follow up with him if there are some good questions from readers, so I’ll ask him for you in about a week and publish an update in this article.

      As far as I know (as you’d probably know too) neural networks and things were being tried with Go (without much success) around the time when the Monte Carlo breakthrough happened. I guess that the focus shifted to what was working after that.

  11. Very interesting article.
    Thank you

  12. David, Thanks for explaining how the Monte Carlo method is used in Go programs. I still don’t understand why it’s as effective as it is, but the evidence is there.

    • David Ormerod says:

      I think most people are surprised at how well it works Rodney. I know I am.

      Even Martin said in the interview, “The main surprise for me is how it works so well, with comparatively little domain-specific knowledge.”

      The idea sounds crazy, but it works…

  13. balakirev says:

    It does seem a bit overwhelming at first, that I am so much weaker than some random simulations. However, thinking of it a bit more it really does make sense.

    Still, I really wonder if with all those billions of simulations that have been played out since the beginning of the monte carlo search method being applied to go, has any program in any game actually stumbled across “perfectly play” unknowingly? Or at least made a kifu worthy of a top pro?

  14. Lewis Truong says:

    Hey Dave, it’s been such a long time. How are you going buddy? How do I get into contact with you again? Not sure if you remember me? You taught me and my little brother (Matthew)

Speak your mind