After Lee lost the first three games, his chance of winning the five-game match had evaporated.
His revised goal, and the hope of millions of his fans, was that he might succeed in winning at least one game against the machine before the match concluded.
However, his prospects of doing so appeared to be bleak, until suddenly, just when all seemed to be lost, he pulled a rabbit out of a hat.
And he didn’t even have a hat!
Lee Sedol won game four by resignation.
A new game plan
Having lost the match already, the heavy burden of expectation had been lifted from Lee’s shoulders and he was now able to play more freely.
Once again, Lee reviewed the previous game well into the night with other top professional players, looking for a chink in AlphaGo’s considerable armor.
On Sunday, March 13, 2016, he arrived at the Four Seasons hotel ready to do battle with AlphaGo once again.
The game plan they came up with appeared to be a type of ‘amashi strategy’, which is among the more extreme styles of play (but is still a valid approach to the game).
To put it simply, it was close to being the opposite of Lee’s strategy from the previous day.
In game three, Lee (as Black) developed a large sphere of influence in the opening, provoking AlphaGo to dive in and bear the brunt of a severe attack.
While this was a good strategy for Black, AlphaGo managed the position surprisingly well. So well, in fact, that it was quickly able to shift out of a defensive stance and counter-attack.
In contrast, Lee’s strategy in game four was to take hard profit (territory) in the corners and on the sides of the board, allowing AlphaGo to develop influence over the center in compensation.
While Lee (playing White) was squirreling away ‘cash’, AlphaGo was making a promising but uncertain investment in the future potential of the top and the center of the board.
A future which Lee was planning to bet against.
Crashing the market
So why did Lee choose this plan?
Well, we know from the first three games that AlphaGo is very good at estimating its probability of winning.
It appears to be able to do this even more accurately than the best human players.
With the help of this skill, AlphaGo seems to be able to manage risk more precisely than humans can, and is completely happy to accept losses as long as its probability of winning remains favorable.
The Japanese have a name for this style of play, as it closely resembles the prevalent style of Japanese professionals over the previous few decades.
They call it ‘souba’ Go, which means something like ‘market price’.
The essence of the idea is that you accept whatever seems to be the fair value of a position, rather than wagering the whole game on a single, complicated negotiation.
This typically leads to a drawn-out game where you seek to get the slightly better end of the deal in the majority of trades, and end up with a winning position in the endgame.
As John Fairbairn has pithily put it, it’s like trying to win by arbitrage.
Of course, stock traders don’t stand a chance of beating modern trading algorithms at their own game, so we shouldn’t expect Go players to do so either.
What Lee and his friends had realized was that they needed to completely upend the market.
Going all in
With his cash firmly secured under his mattress, Lee invaded Black’s large sphere of influence deeply with White 40, brazenly daring AlphaGo to attack him.
Continuing up to 46, Lee lightly scattered stones throughout AlphaGo’s potential territory at the top.
He was not intending to save any particular stone. He only sought to flexibly establish a presence in this space and tank AlphaGo’s earlier investment.
This came across as somewhat unreasonable, as AlphaGo had paid Lee good money for that potential!
However, this was what Lee wanted — to force an all or nothing battle where AlphaGo’s accurate negotiating skills were largely irrelevant.
AlphaGo didn’t have much choice in the matter. Either it could fight back and seek to gain compensation by attacking, or it could lose.
Attack light stones on a large scale
If AlphaGo had attacked any of Lee’s scattered stones directly, he would have been delighted.
It would be easy for a player like Lee to dodge any straightforward attacks, and leave AlphaGo with relatively little to show for its efforts.
A better strategy is to attack indirectly, threatening to surround all of the invading stones on a large scale and swallow them whole.
This puts more pressure on the opponent to defend somehow and was exactly what AlphaGo did with the shoulder hit at Black 47.
This move leaned against White’s stone at R11, while eyeing White’s stones at the top indirectly. It was similar to a strategy Lee had tried on the previous day.
In the moves that followed, Black sought to get in front of White, preventing Lee’s group on the right side from connecting up with and rescuing its allies in the center.
This tactic was a success for AlphaGo up to Black 67, at the cost of an acceptable four-stone sacrifice on the right side.
AlphaGo took the lead with the knight’s move at Black 69.
A brilliant refusal to trade
Rather than completing the transaction, by playing 70 at O6, Lee immediately reduced Black’s potential territory in the center with White 70.
AlphaGo defended firmly with Black 71, which appeared to say “I’m winning,” but Lee probed its weaknesses fiercely from 72 to 76.
Finally, as commentators were lamenting that the game seemed to be decided already, Lee unleashed a brilliant tesuji at White 78 – the only move that would keep him in contention.
AlphaGo failed to play the best response with Black 79, and its stocks suddenly crashed to pennies on the dollar.
Tesuji are the close-range tactics of Go. If you imagine Go as a mental martial art, tesuji are the techniques of hand-to-hand combat. They are clever moves which contribute to Go’s beauty as an art form.
Strong players can usually spot a tesuji in the blink of an eye, through years of training and experience. So, I imagine, can AlphaGo’s policy network; otherwise it wouldn’t be able to play so well.
But not all tesuji are equal. Some can be found by most players. Others are so rare, so exquisite, that even the majority of professionals don’t see them.
The move Lee played was the latter kind.
Most other professionals who were commenting on the game live didn’t see the move.
Gu Li 9p, a top Chinese professional who is Lee Sedol’s friend and fierce rival, was commenting on the game in China.
Gu described White 78 as the “hand of god,” and claimed that he didn’t see the move coming. Neither, it would appear, did AlphaGo.
AlphaGo blows a gasket
After 79, Black’s territory at the top collapsed in value.
White’s invading stones had managed to escape through a hidden tunnel, and White 92 made miai of H14 and J10 (meaning White could play one or the other).
This was when things got weird. From 87 to 101 AlphaGo made a series of very bad moves.
We’ve talked about AlphaGo’s ‘bad’ moves in the discussion of previous games, but this was not the same.
In previous games, AlphaGo played ‘bad’ (slack) moves when it was already ahead. Human observers criticized these moves because there seemed to be no reason to play slackly, but AlphaGo had already calculated that these moves would lead to a safe win.
The bad moves AlphaGo played in game four were not at all like that. They were simply bad, and they ruined AlphaGo’s chances of recovering.
They’re the kind of moves played by someone who forgets that their opponent also gets to respond with a move. Moves that trample over possibilities and damage one’s own position, achieving less than nothing.
The game continued for another 80 or so moves, but it ended with AlphaGo’s eventual resignation.
A jubilant Lee Sedol had scored his first win against the machine.
What happened to AlphaGo
The question on many observers’ minds right now must be, what happened?
AlphaGo’s lead developer, David Silver, attended the post-game press conference, but he didn’t explain the problem in detail.
Silver and Demis Hassabis (DeepMind’s CEO) said that they would have to wait until they returned to their office in the UK, after the match, to analyze it properly.
We, in turn, will have to wait until they do so before we find out what happened.
Some educated guesses
We can’t yet know for sure what went wrong, but we can make some guesses.
We’ve seen these kinds of moves before, played by other Go AIs. They seem to be a common failure pattern of AIs which rely on Monte Carlo Tree Search to choose moves.
AlphaGo is greatly superior to that previous generation of AIs, but it still relies on Monte Carlo for some aspects of its analysis.
My theory is that when there’s an excellent move, like White 78, which just barely works, it becomes easier for the computer to overlook it.
This is because sometimes there’s only one precise move order which makes a tesuji work, and all other variations fail.
Unless every line of play is considered, it becomes more likely that the computer (and humans for that matter) will overlook it.
This means that approximation techniques used by the computer (which work really well most of the time) will see many variations that lead to a win, while missing the needle in the haystack that leads to a near-certain loss.
I’ve seen this happen many times when I played games against Monte Carlo bots for fun.
After I played a tesuji which the computer apparently didn’t see, it suddenly started making bizarre moves and wrote off the game.
It’s not possible to make this happen in every game, but it happens often enough that Younggil and I began to describe it as “going crazy” in honor of the computer Go program Crazy Stone.
I’m not sure why exactly this happens, and I hope the AlphaGo team or more knowledgeable readers will chime in with their own thoughts.
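To make the needle-in-a-haystack idea concrete, here is a toy sketch in Python. It is not AlphaGo’s algorithm, nor even a full Monte Carlo Tree Search: it uses flat, uniformly random playouts as a deliberately simplified stand-in, and the tiny game tree (moves "A" and "B", 20 replies each) is invented purely for illustration. Move A looks great on average but has exactly one refutation. Move B looks like a coin flip on average but is actually a forced win.

```python
import random

# Toy game tree (entirely hypothetical, not a real Go position).
# A leaf is an int: 1 means the computer (Black) wins, 0 means it loses.
# An inner node is a list of child nodes, one per legal move.

def build_toy_tree(n_replies=20):
    # Move A: White has n_replies answers, and exactly one of them
    # (the "tesuji") refutes Black. Every other answer loses for White.
    move_a = [0] + [1] * (n_replies - 1)
    # Move B: whatever White answers, Black then has one winning
    # follow-up and one losing follow-up.
    move_b = [[1, 0] for _ in range(n_replies)]
    return move_a, move_b

def exact_value(node, black_to_move):
    """Minimax: both sides always pick their best move."""
    if isinstance(node, int):
        return node
    values = [exact_value(child, not black_to_move) for child in node]
    return max(values) if black_to_move else min(values)

def rollout_estimate(node, n_playouts, rng):
    """Average result of uniformly random playouts from this node."""
    wins = 0
    for _ in range(n_playouts):
        current = node
        while not isinstance(current, int):
            current = rng.choice(current)
        wins += current
    return wins / n_playouts

def compare_moves(n_playouts=2000, seed=0):
    rng = random.Random(seed)
    move_a, move_b = build_toy_tree()
    report = {}
    for name, node in (("A", move_a), ("B", move_b)):
        report[name] = {
            # White is to move after Black has played A or B.
            "rollout": rollout_estimate(node, n_playouts, rng),
            "exact": exact_value(node, black_to_move=False),
        }
    return report

if __name__ == "__main__":
    for name, stats in compare_moves().items():
        print(name, stats)
```

In this toy, random playouts rate move A at roughly 0.95 and move B at roughly 0.5, so a sampler would prefer A, even though exact search shows A loses to the single correct reply and B wins by force. Real tree search methods concentrate their effort on promising replies and so are far harder to fool, but the sketch shows why a move that "just barely works" is the kind most likely to slip through.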
Win small or lose big
As we’ve discussed before, the algorithms which guide computer Go players seek to maximize the probability of winning. The margin of victory or defeat is irrelevant.
This leads to a behavior where computers usually “win small, or lose big”. When computers are behind, they take risks in an attempt to catch up, sometimes crazy risks which make it easier to shut them out of the game.
For the most part though, this is the behavior you want to see. Computers will never lose quietly like humans sometimes do.
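The difference between maximizing the probability of winning and maximizing the margin can be sketched in a few lines of Python. Everything here is hypothetical: the candidate names, the probabilities and the point margins are made up to illustrate the idea, and nothing is taken from AlphaGo itself.

```python
# Each hypothetical candidate move maps to a list of (probability, margin)
# outcomes. The margin is in points from the computer's side, so a
# positive margin is a win and a negative margin is a loss.

AHEAD = {
    "safe":   [(0.9, 0.5), (0.1, -0.5)],     # usually wins by half a point
    "greedy": [(0.7, 20.5), (0.3, -10.5)],   # wins big, but loses more often
}

BEHIND = {
    "quiet":  [(0.05, 0.5), (0.95, -2.5)],   # loses narrowly almost always
    "gamble": [(0.15, 0.5), (0.85, -30.5)],  # slightly better odds, huge losses
}

def win_probability(outcomes):
    """Total probability of the outcomes where the margin is positive."""
    return sum(p for p, margin in outcomes if margin > 0)

def expected_margin(outcomes):
    """Probability-weighted average margin in points."""
    return sum(p * margin for p, margin in outcomes)

def pick(candidates, score):
    """Return the name of the candidate with the highest score."""
    return max(candidates, key=lambda name: score(candidates[name]))

if __name__ == "__main__":
    print("ahead, by win probability:", pick(AHEAD, win_probability))
    print("ahead, by expected margin:", pick(AHEAD, expected_margin))
    print("behind, by win probability:", pick(BEHIND, win_probability))
    print("behind, by expected margin:", pick(BEHIND, expected_margin))
```

When ahead, the win-probability chooser takes the "safe" half-point win (win small). When behind, it takes the "gamble", even though that move loses by far more on average (lose big). A margin-based chooser would do the opposite in both cases.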
The very bad moves, however, may be caused by something like the horizon effect.
When the computer’s prospects of winning suddenly plummet, after the opponent plays an unexpected move, there isn’t any variation that reliably leads to a win.
What Monte Carlo AIs sometimes appear to do in these kinds of situations is play meaningless sente exchanges which don’t achieve anything except to defer the loss until later on. To push it over the horizon, perhaps? Or, to win if the opponent fails to answer correctly.
This theory may be totally wrong, and is based on experience with the previous generation of Go programs.
I don’t know for sure whether it applies to AlphaGo at all. If you have a better explanation, or care to speculate, please let us know.
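For readers unfamiliar with the horizon effect, here is a minimal Python sketch of the classic version, which comes from fixed-depth search rather than Monte Carlo methods. It is an assumption-laden toy, not a model of AlphaGo: the point values, the number of available forcing exchanges and the search depths are all invented. The computer faces an unavoidable loss of 10 points, and each pointless forcing exchange wastes 1 more point while postponing the loss by one move.

```python
LOSS = 10        # hypothetical: points lost when the doomed group finally dies
DELAY_COST = 1   # hypothetical: points wasted by each pointless forcing exchange

def search(delays_left, depth):
    """Return (score, move) as seen by a search limited to `depth` moves.

    At the horizon (depth == 0) the static evaluation only counts losses
    that have already happened, so a pending loss is invisible.
    """
    if depth == 0:
        return 0, None
    # Option 1: accept the loss right now.
    best = (-LOSS, "accept")
    # Option 2: play a forcing exchange and look one move further.
    if delays_left > 0:
        future, _ = search(delays_left - 1, depth - 1)
        delayed = -DELAY_COST + future
        if delayed > best[0]:
            best = (delayed, "delay")
    return best

def play_out(delays_left, depth):
    """Re-run the limited search before every move until the loss is accepted."""
    moves, score = [], 0
    while True:
        _, move = search(delays_left, depth)
        moves.append(move)
        if move == "accept":
            return moves, score - LOSS
        score -= DELAY_COST
        delays_left -= 1

if __name__ == "__main__":
    print("shallow search:", play_out(delays_left=3, depth=2))
    print("deep search:   ", play_out(delays_left=3, depth=10))
```

With a shallow horizon the toy player burns two exchanges before conceding, finishing 12 points down instead of 10, because each delay makes the loss temporarily disappear from view. With a deep enough search it concedes immediately. Whether anything similar happened inside AlphaGo is, as noted above, only speculation.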
Can Lee Sedol do this again?
Certainly it’s possible, but it may be very difficult.
You need a special kind of position to pull this kind of stunt, and such situations don’t arise in every game.
AlphaGo may also be more resistant to this problem than previous Go AIs, because it also uses its policy network to select moves.
However, it must be pointed out that the policy network failed to properly score White 78 in the first place, probably because it was such an unusual move.
We will have to see what happens on Tuesday when Lee plays the final game of the match. If anyone can repeat this feat, he can.
Brief analysis of game four
Here is An Younggil 8p’s preliminary analysis of game four. Further game commentary will be posted over the coming week.
The opening up to Black 11 was exactly the same as in game two. Lee thought the opening in game two was good for White, so he chose this path once again.
White 12 was played to compel Black to extend at the bottom. If Black doesn’t play at 13, White’s pincer at K4 will be more powerful than in game two.
Black 23 and 25 were interesting, and White 26 to 39 was part of Lee’s strategy to take solid territory during the opening.
White dives in
White’s invasion at 40 seemed to be premature, and the sequence from Black 47 to 53 took control of the game.
Fighting spirit motivated White 62 and 64, but Black took the lead up to 69.
Lee Sedol’s brilliant move
After some preparation from 70 to 77, White 78 was a brilliant move which seemed to provoke a strange miscalculation on AlphaGo’s part.
White 82 was an excellent followup, but Black 83 to 85 were also well timed exchanges.
Black 87, 89, 97 and 101 were incomprehensible, and Black 105 was the last losing move. It should have been at J12 to capture White’s stones and redress the balance of territory.
If Black had played 87 around J14, the game would still have been playable.
A dramatic reversal
The game was reversed by White 92, and White established a clear lead with 110.
White 126 was a very big move which helped to ensure White’s advantage.
Black’s endgame from 131 to 141 was perfect, and the game became a bit closer again.
However, White 144 was big, and Lee Sedol’s endgame was accurate under the time pressure.
Lee Sedol’s masterpiece
This game was a masterpiece for Lee Sedol and will almost certainly become a famous game in the history of Go. After his brilliant move at 78, Lee’s play was perfect.
Lee entered byo-yomi (one minute per move) at move 90, but his moves were calm and solid and his mental state was as hard as rock. It was very impressive to watch.
Lee’s strategy in this game ended up working well, and it looks like he found at least one of AlphaGo’s weaknesses.
Let’s see what happens in game five!
Don’t miss the last game
The fifth and final game will be played on Tuesday March 15.
Can Lee Sedol pull off another win?