Lee Sedol is playing brilliantly! #AlphaGo thought it
was doing well, but got confused on move 87. We
are in trouble now...
Mistake was on move 79, but #AlphaGo only came to
that realisation around move 87
When I say 'thought' and 'realisation' I just mean the
output of #AlphaGo value net. It was around 70% at
move 79 and then dived on move 87
Lee Sedol wins game 4!!! Congratulations! He was
too good for us today and pressured #AlphaGo into
a mistake that it couldn’t recover from
I'll risk the assumption that somebody from the DeepMind team is reading this.
Guys, please publish charts of the win probability AlphaGo estimated over time during these games. A heatmap showing which moves it considered best for both sides during the games would also be cool, but that's surely more time-consuming to prepare.
It would be great to be able to have such things for top pro tournaments in the future.
Actually, it's a tradition that both players replay the game after the match and discuss the good moves, the bad moves, and what they were thinking during the game.
I felt so sorry for Lee Sedol when I saw him lose the second match, facing an empty chair, and he could only ask one of his friends to review the game.
yeah, not to belabor it, but if you can do something like be a champion in go or chess, chances are you have the mental skillset to do something exponentially more lucrative.
This is not true. Scientific studies have shown that skills in abstract strategy games are so specific to the game that they do not transfer to other tasks. The urban legend about Chess grandmasters all being geniuses is false, in other words. They're actually worse at most jobs than people who've been doing those jobs for a while, because they lack the experience, and they aren't more predisposed to having a genius intellect that would let them overcome that.
A Chess or Go champion is probably doing the single-most lucrative activity that they are capable of.
Of course chess grandmasters aren't all-around geniuses, but many traits that are prerequisites for a successful chess player certainly are translatable into careers in other fields, and correlate with above average brainpower (good memory, discipline, long attention span, spatial intelligence etc.)
Botvinnik was an accomplished engineer, Euwe had a PhD in mathematics, Anatoly Karpov is a millionaire, and interestingly enough a lot of recognized chess players had careers in music (like Taimanov or Smyslov)...
Being a grandmaster surely requires an above-average intellect; there is no such thing as a chess savant. While it's an urban legend that the IQ of Kasparov (arguably the greatest chess player ever) was in the ballpark of 190, he did clock in at 135. Such a result is not unheard of, yet it still places him in the top 1% or so.
There's a whole body of research out there, but I just scratched the surface of it. Here's the first result that comes up from a random Google search: https://psy.fsu.edu/faculty/ericsson/ericsson.exp.perf.html The references should be worth exploring, and I bet there's more recent research that expands on it as well. Here's a quote from the paper:
> In a recent review, Ericsson and Lehmann (1996) found that (1) measures of general basic capacities do not predict success in a domain, (2) the superior performance of experts is often very domain specific and transfer outside their narrow area of expertise is surprisingly limited and (3) systematic differences between experts and less proficient individuals nearly always reflect attributes acquired by the experts during their lengthy training.
I will say this for Chess or Go grandmasters: They have drive and dedication (and in some cases, compulsiveness). That alone would probably allow them to do better than the average person in another field if they had pursued that field from the get-go. Also, I'd caution you against relying on hand-picked anecdotes; I could just as easily pick out a bunch of Chess players who weren't good for anything else. You'd need broad-based statistics.
Thanks for the link. Well, of course I just quoted a few examples without pretending it's a legitimate piece of statistics. However, there haven't been all that many world chess champions so far (and Botvinnik, Smyslov, Euwe, and Karpov belong in that category), and thus a few PhDs or successful intellectual careers is already enough to show that something clearly sets this bunch apart from the general public. You're less likely to find a plumber or a shelf-stacker in that bunch (no disrespect to people in these professions). Whatever causes that is another matter. I agree with you that drive and dedication - and not necessarily some innate talents - could be the crucial factor in this.
I'm not sure you're disagreeing with the person you're responding to. Just because the skills involved are very specific doesn't imply that the people with those skills are only capable of obtaining that one skill.
What the person you're responding to means is that if they have the mental capacity (and also discipline) to play the game at such a high level, they could probably also excel in other professions if their goal was to make money instead of doing something they loved.
Of course we can debate whether they would achieve such success if they were doing something they didn't enjoy as much, but I think most people would argue that most chess or go champions could make more money, or in other words do more "lucrative" activities, if they chose to do them instead of dedicating all their time to the game.
You are relying too much on the concept of a generalized mental capacity, which many researchers (maybe even most) would say does not exist. You seem to be assuming that grandmasters are just amazingly smart and they happen to apply those smarts to one particular skill, but what's closer to the truth is that they happen to be particularly amazing at that one skill, in a way that doesn't really correlate strongly with having amazingly high skill in other areas.
If you took the people who would have grown up to be chess grandmasters, intercepted them just before they started spending significant amounts of time studying chess, and had them instead spend all that time studying programming or physics or something, how do you think they would rank up?
You seem to be arguing against the idea, "A chess grandmaster is a genius and therefore can walk into Google and immediately start doing more and better work than most of their senior programmers". I don't think anyone believes this (correct me if I'm wrong).
I think a more serious idea is, "A chess grandmaster is a genius and therefore learns faster and has a higher performance ceiling and such than most people, and if they spent a couple of years learning to program, they could become an entry-level Google programmer, after which they would rise more quickly than most hires, and eventually would outperform most of Google's senior programmers."
By the way, I think most of the best chess players were extreme chess prodigies. (Just looked at Kasparov, Karpov, Shirov, Kramnik, Anand, and Carlsen's Wiki pages; all but Kramnik had the year listed, and they all became grandmasters around age 17-19, except Carlsen, who was around 14. Kramnik's page mentioned winning a gold medal for the Russian team at age 16, and that he wasn't a grandmaster when selected for the team and this was unusual.) I think this is consistent with them being highly gifted children, who choose to spend their time doing chess.
>I don't think anyone believes this (correct me if I'm wrong).
While not exactly what you wrote, people do think that someone who is a genius at one subject can become a genius in another area with less work than it took for someone in either field to become a genius there originally, especially if society groups the areas together (so a sports star becoming a master programmer is far less likely to be believed than a chess master becoming a master programmer).
Though being able to compel yourself to do great things seems like a completely different skill set from being able to do great things on someone else's dime or on a team.
For example, my brilliant programmer friend that can't hold a job. Or the artist that can only paint their own inspirations. Or the savant that doesn't get along with anybody.
I dunno, Bobby Fischer was kinda deranged, for instance, and that often hurts outcomes in otherwise "lucrative" positions. Incidentally, he is credited with raising chess player compensation through his demands.
Maybe not a leadership position, but I've worked on the engineering side of quant work and am familiar enough with fairly advanced Go / Chess players (e.g. national youth champions for age brackets in the US) to know that most of them could make enough to retire off one year's salary during the boom years of algorithmic trading (~2003-2007ish).
I'm not sure how you define "deranged" (AFAIK, that's not a medically defined term within the DSM-IV), but most of those brilliant people end up being a little 'off'. My father was an academic; one of the people he went to graduate school with was working in Boston while I was a child. He was doing absolutely groundbreaking work, but he's so difficult to collaborate with (think: the mannerisms of Richard Stallman) that he's been floating around universities until his welcome wears out. He can figure out remarkable things in higher-level computational chemistry, but he can't really figure out humans. Had he decided during the 80s to work at Renaissance instead of pursuing academic research, he almost certainly would be worth in the low hundreds of millions.
Well, defensive end Olivier Vernon just signed a deal with the New York Giants which will pay him an average annual salary of around 17 million dollars. And there was a huge signing bonus as well.[1]
NFL seasons are 16 regular season games, plus 4 pre-season games. And then there are the playoffs, which a given team might or might not make or advance in.
All told, given that the signing bonus is amortized over all the games he plays, the annual salary, etc., I think it would be fair to say that Vernon will make around a million dollars a game.
Aside: Vernon isn't necessarily "the best" DE in the NFL, but due to market forces and the way things work with the salary cap, free agency rules, etc., the contract he just signed is one of the largest for a defensive player in the league. QB's tend to make even more, but I can't recall a really high profile QB who has signed a big deal recently.
Osweiler, the QB from the Broncos, just signed a huge deal with the Texans... $18mil/year, but I haven't heard how much of that is guaranteed, so I'm not sure.
A really big issue with NFL contracts is this "guaranteed money" thing... I believe the NFL is the only major US sports league that gives player contracts without it, so you have to take those salary numbers with a grain of salt.
Best is around 10x that, but at the cost of their body and long-term health. Nobody is playing professional football|basketball|baseball in their 50s, and even just 40 is pushing it, whereas pro go players can play into their 60s and beyond.
I.e. something like [1] for every move. I was a bit disappointed to learn, on first reading, that such a table is only available for that particular move.
Off-topic: we (the human species) have built an AI that can master the game of Go, but we don't yet have the intelligence to publish charts correctly on the web. That blur of a JPG is half a megabyte!
Yeah, honestly that would be much better presented as an SVG, since it's mostly solid colors and geometric shapes. Every browser supports them, and all the big image editors as well; the trouble is that the rest of our software and culture hasn't caught on. Why am I taking bitmap screenshots of websites when they're almost completely text? We need to start building a tools ecosystem around vector graphics. An extra bonus would be a machine-readable representation of the board as well. We live in a digital age, but it's almost like we're cavemen still dealing with the modern equivalent of analog formats. (And at least analog has fidelity.)
Screenshots might not be the best example, since there may be some "shiny cute effects" going on. However, reading your idea sparked another one in my head: screenshots shouldn't be restricted to the same rendering resolution as the rest of the system; the screenshot resolution should be configurable.
I don't require the screenshot to be instantaneous, I require it to appear instantaneous. In that sense, if the whole rendering pipeline is working to give me a framerate of 60 fps, then I could spend ten times as long rendering a screenshot without that delay being noticeable. Also, why on earth does Windows (I don't know the behaviour on Linux/Mac) apply ClearType to a screenshot? That has always bugged me; there are some situations where you tolerate it and others where it hurts.
NN is (or was) also used in Neapolitan comedy, derived from earlier use throughout the Roman Empire and the Middle Ages.
The "figlio di NN", or "son of NN" means someone who was found and adopted (typically by nuns) and whose parents were unknown.
NN can be used in general for people whose origin is uncertain. I think in this specific case it's a bit misleading - although the Wikipedia article seems to suggest that NN can also be used as a synonym for "unknown", from a historical perspective that is a bit incorrect.
The literal translation from Latin creates some confusion if you didn't know the context.
Not so coincidentally,
"This was when things got weird. From 87 to 101 AlphaGo made a series of very bad moves." [1]
The bad moves in the eyes of humans could be risky bets or the horizon effect. [1] [2] AlphaGo's use of deep neural nets (value networks) to evaluate board positions should significantly help counter the horizon effect. But since move 78 by Lee Sedol, which turned the situation around, was unexpected even by some top pros (Gu Li referred to it as the 'hand of god' [1]), the patterns that follow it are likely rare among possible game states and therefore not strongly embedded in the value networks, leading to AlphaGo's loss.
I hope the DeepMind team will help enlighten us on this in the near future.
78 was hard, but not impossible: if you watch the AGA commentary of the game, they had two pros on at the time 78 happened, and they found 78 as the best answer a few minutes before Lee did, expecting AlphaGo to answer with a stronger, yet still not good enough, 79 that left the game even instead of basically lost. Then they were elated about how AlphaGo seemed to have failed to read the whole thing: instead of making preparatory moves for a nasty ko fight, the best option at the time, AlphaGo just took a route that provided almost no compensation.
If anything, it seems to me that AlphaGo's problem here might be time management: seeing a really scary situation, Lee Sedol just sank many minutes into reading the problem, going pretty much all the way to byoyomi time. A human, after seeing something like that, would figure out that their assessment of the situation and their opponent's is very different, and spend a lot of budget trying to figure out what was wrong. AlphaGo just didn't see the problem, and didn't budget its time to analyze the position to death. It moved slower than before, but not really that much slower, and ended up making moves a kyu player could see were terrible.
Either way, I'd love to see Deepmind giving us all a good postmortem of the 70-100 range of moves.
I concur that better time management may have made a difference.
Beyond that, I think that AlphaGo may still be missing a type of component. From the descriptions of it, the policy network generates possible moves from board positions, and the value network evaluates the probability of desirable outcomes. How this differs from human play is that strategic assessment and planning are implicit in the middle layers rather than a 'conscious' element of searching and decision making. I'm not saying that this is a necessary component, as AlphaGo has already done exceptionally well. I do believe this kind of 'middle-out' processing, producing and evaluating strategic concepts, could make it better at handling unusual circumstances. Being trained on high-amateur and pro games, it will respond best to the most conventional games of those types; the more unconventional the game becomes, the worse it would fare in terms of efficiency of move generation and choice of which moves to evaluate.
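As a rough illustration of how those two networks can combine during search - a simplified, PUCT-style selection rule in the spirit of the AlphaGo paper, with placeholder names and constants rather than DeepMind's actual implementation:

    import math

    def puct_score(q_value, prior, parent_visits, child_visits, c_puct=1.0):
        # q_value: mean evaluation of this child from the value net / rollouts
        # prior:   policy-network probability assigned to this move
        # The exploration bonus shrinks as a child accumulates visits.
        exploration = c_puct * prior * math.sqrt(parent_visits) / (1 + child_visits)
        return q_value + exploration

    # The move explored next is the argmax of this score over children, so
    # the policy net focuses the search while the value net grades outcomes.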
I got something different from the AGA commentary. The pros did consider 78, but came to the conclusion that it didn't work! I'm just a 1kyu player but I too can't see any way to make 78 work after Black plays 79 as an atari at L10, as suggested by the pros. So I'd be very interested to see some analysis that shows how White could get a fair result after Black 79 at L10.
It would be interesting to see a neural net evolve to take into account player state, not just game state. I play chess, and nowhere near professional level, so I don't know if this anecdote is valuable, but if I see my opponent looking at a particular area of the board, I tend to take a second look.
I suppose beating humans isn't AlphaGo's primary motive though - learning to play a perfect game of Go in general is probably more difficult than playing the perfect game against a particular person.
Having the AI use eye-tracking to predict an opponent's moves is a truly terrifying thought. Just by tracking the eye with millisecond precision you could probably work out the strategy they were reading. It's so game-breaking that it shouldn't be allowed as an input, and indeed, is easily defeated with reflective sunglasses anyway (like a lot of pro Poker players wear).
The AI player can't give up information in this manner because it lacks eyes, so I'd say that it should not be able to use this information from the human player.
This actually came up in the commentary. Michael Redmond mentioned that some amateurs watch to see where their opponent is looking, but he called it merely a "trick", and he said it is not useful in professional play.
That type of metagame seems useful though, in the sense that you can start to sense the state of mind of your opponent. If they are working harder than you think they need to, you might get away with a riskier move for instance.
Professionals are understandably hesitant to call AlphaGo's plays mistakes after the first few games, because AlphaGo had played a few moves that seemed like obvious mistakes (the eyestealing tesuji under the avalanche and the fifth line shoulder hit) but later proved to be very strong moves. Remember that when go is your career, your reputation is based partly on your ability to analyze games accurately, so calling a move a mistake and finding out later that it was strong would be a big embarrassment.
AlphaGo enlarged a group that was at risk of atari (loss to the opponent) instead of correctly trying to divert attention elsewhere on the board, starting a new fight, or simply writing it off. It basically added stones to a group that was almost certainly dead and allowed the opponent to punish it heavily, thus giving away more stones than it had to.
OT: What's with using monospace for quotes? That breaks line wrapping and makes it hard to read on mobile or small screens. I don't get why people do it.
"People" don't do it; HN does it. Indented lines in a post are always monospaced, on the assumption, I suppose, that they're likely to be code examples. There's no way to turn this off AFAIK.
I know the formatting system. My question is why users are formatting their text (e.g. with 4 spaces leading each line) such that HN renders it as monospace, which then forces a sideways scroll on mobile and small screens. That seems to make it harder to read for some, with no corresponding upside.
I think mostly people forget that > works and that therefore you can do
> quotes like this
since it's markdown - often you get commentish things that allow limited HTML plus indented code blocks, and people get used to using the latter as the only thing that works everywhere
not really: http://i.imgur.com/uxqERat.png (you have to be very careful about what indentation level you're at, how long your lines are, etc, etc, things that people just don't do most of the time)
It's also customary when quoting large sections of text. I'm not sure if this carried over from email, or from academic texts (where if your quote is more than a line or two, it needs to be formatted differently).
Doubtful; I don't think comparing such small W/L distributions will be illustrative.
On the other hand, the Nature paper shows the single 8-GPU machine performs similarly to the 64-GPU cluster, but the larger clusters perform a comfortable margin better. [0]
A single machine winning many games relative to the distributed version really just says that the value/policy network is more important than the Monte Carlo tree search. The main difference is the number of tree search evaluations you can do; it doesn't seem like they have a more sophisticated model in the parallel version. The figure suggests that there are systematic mistakes the single 8-GPU machine makes compared to the distributed 280-GPU machine, but MCTS can smooth some of the individual mistakes over a bit.
Probably better. If we assume a uniform distribution across possible win rates of Sedol vs AlphaGo, then update it with Bayes' rule, we get a 33% chance that Sedol wins the next match.
That's not factoring in other information, like Sedol now being familiar with AlphaGo's strategies and improving his own strategies against it.
So there is a good chance he is now evenly matched with AlphaGo, and likely much better than the single machine version.
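For the curious, that 33% is just Laplace's rule of succession: a uniform Beta(1, 1) prior over Sedol's per-game win rate, updated on his 1 win in 4 games, gives a Beta(2, 4) posterior whose mean is the predictive probability of a win. A quick check:

    # Uniform prior over the win rate, updated on Sedol's 1 win in 4 games.
    wins, games = 1, 4
    alpha, beta = 1 + wins, 1 + (games - wins)   # posterior is Beta(2, 4)
    print(alpha / (alpha + beta))                # 0.333... -> the ~33% above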
There's a bit of sleight of hand in this statistic - yes, they can do runtime on a single machine, but it took the compute power of a small country to train the neural nets that are loaded onto that one machine.
That's not really sleight of hand. Lots of things take more energy to produce than to run. It's like saying a 400W electric motor can put out as much power as a fairly fit human, but it's 'sleight of hand' because it took a whole factory to make the motor.
You're conveniently forgetting that this "AI" is a representation of tens of millions of amateur games, which is far more than a few decades in total time. Not that Lee Sedol needs someone like me to defend him, but remarks like yours are very misleading.
A human is far, far, leagues more efficient at learning than today's AIs. These AIs require millions of hours' worth of human play data to even come close to competing at the level of an expert who achieved the same, and arguably far more, in a "few decades".
If you look at DeepMind's numbers, AlphaGo is meant to win more than 50% of the time against expert human players. Right now, my opinion is that its actual win rate is less than 50%.
We're in the very infancy of the field though. Give it a year and it's likely you'll have a neural-net-based Go program that runs on a single machine that can defeat every single human professional. Human Go players have been getting better for millennia; don't assume that a program that's only two years old has no room for improvement left.
It took some serious hardware for Deep Blue to defeat Garry Kasparov. Now there are smartphone apps with that same level of Chess-playing skill. And if anything the AlphaGo approach is more amenable to running on lower-specced hardware without requiring help from Moore's Law (because you simply train it better).
It is believed that Lee Sedol now has a really good ability to read AlphaGo's moves (see: Michael Redmond and the AGA game 4 commentary). Game 5 is going to be very interesting. I'm pretty sure the DeepMind team is just as excited. After game 4, they were noticeably far more ecstatic in the post-game press conference.
Hassabis and Silver kind of reminded me of developers given the details of a bug that was notoriously difficult to find.
edit: I can't wait for the reviews of the entire 5-game series. If a book came out I'd very likely buy it. A book discussing both Go and the AlphaGo AI at the same time, written by people at the top of their respective fields, would be amazing.
It feels really weird to see someone being showered with congratulations for beating a computer program. What exactly is he being congratulated for? For probably triggering and then capitalizing on a bug in AlphaGo's AI? For showing that human resolve, perseverance and a "fighting spirit" can trump a flawed AI, at least until the AI gets fixed? For giving DeepMind extremely valuable test data that will only accelerate the refinement of AlphaGo's AI? For helping to advance an amoral field of study that can potentially delegitimize everything that currently makes humans unique and extraordinary?
As noted by the live commentator, this has the opportunity to be the third revolution of the game of Go, opening again new ways of playing for the whole of humanity.
Imagine what AI, in different fields, means for humanity if it has so much to teach us, just by being able to "think" differently. I sure hope that one day they start writing philosophy and, doing so, potentially legitimize everything that currently makes humans unique and extraordinary.
(He also discussed it during the actual game with more details, but I cannot find the exact time again.)
Edit: I can't find the game 3 commentary, but during game 4 he comes back to it again with interesting details on why this is a great opportunity to inspire human players: https://youtu.be/yCALyQRN3hw?t=7097
Basically, once in ancient Japan and once more recently, a player of Chinese origin in Japan - both incredibly strong players - surprised everyone with never-before-seen moves that were subsequently integrated into modern game theory. The hope is that the same could happen here.
Perhaps you'd like a shot at explaining why AI is an "amoral field of study", and why it could "delegitimize everything that currently makes humans unique and extraordinary"?
Dolphins are about as intelligent as us, too. Are dolphins amoral? Do they delegitimize Beethoven, Tesla, Gödel, Einstein, and da Vinci?
> Dolphins are about as intelligent as us, too. Are dolphins amoral? Do they delegitimize Beethoven, Tesla, Gödel, Einstein, and da Vinci?
If we invented dolphins, and they began to replace and displace us, then maybe? I don't think it is about the intelligence, but in how we use it of course. Dolphins don't really have any control of my life or those around me.
That sounded a lot more paranoid than I meant, I actually agree with what I think you are saying
I think Bud wanted to make a different point, i.e. that AI doesn't delegitimize us (whatever that even means). Dolphins were just an example. He could have used Neanderthals, or aliens.
Frankly, many pre-modern, pre-civilization humans were assholes. Read studies or accounts of life in relatively isolated tribal societies, and hair-curlingly awful accounts of violent and sometimes institutionalised abuse are uncomfortably common. Leave the safety catches off basic human urges for long and it can get pretty ugly.
Yes, I know, it's just completely insane to declare that we've somehow qualitatively moved past our brutish origins as long as torture, rape, and murder remain de facto extensions to politics.
I have no reason to doubt that quantitatively we have made some progress - although Pinker's arguments are most definitely not universally accepted in the scientific community.
So I suppose the fact that the opponent is a computer program should not be factored into the reaction to Sedol's win at all.
I hope your reductionist explanation is not accurate because it would imply that we are already so highly conditioned to machines and to AI that this match is thought to be no different from a match between two humans.
The machine beat him 3 times and was more or less expected to win again. It didn't, which would imply that Sedol played an exceptionally good game. Seems pretty obvious to me.
You're acting like following arbitrary sets of rules to maximize a value function isn't something that computers excel at. The only thing that was holding up computers for Go was simply time; there were too many possibilities to consider. Humans are really, really good at heuristically trimming possibilities, but sometimes to the detriment of finding maxima.
Fan Hui, the professional player they beat late last year, has already improved his game by quite a bit from occasional matches with AlphaGo (DeepMind hired him as a consultant to test it). He's won every single game in the last European championship, and moved from around the 600th-ranked player in the world to around 300th.
>For helping to advance an amoral field of study that can potentially delegitimize everything that currently makes humans
Or, writing from the point of view of our mechanical successors to this world, for helping to advance a highly ethical field that could exterminate that genocidally murderous evolutionary abomination that was the human race. Who incidentally thought they were extraordinary, but couldn't even play Go.
I don't think this comment should be so shouted down. Beyond the normal politeness that typically comes with these events, it does feel strange to shower congratulations on Sedol's victory.
And for a game that was thought to be many decades away in terms of computer capabilities, this probably should be a time to think of the possible consequences of AI improvement and take a close look at it.
I think (or at least hope) most of the downvotes have nothing to do with congratulations, and only refer to the part where AI is called "an amoral field of study that can potentially delegitimize everything that currently makes humans unique and extraordinary".
That line of reasoning would be worth a laugh, if it wasn't so widespread in the general population.
>For giving DeepMind extremely valuable test data that will only accelerate the refinement of AlphaGo's AI?
You're treating Lee Sedol as if he is a fixed dataset to be trained on. Why can't he also be an "NN" who can "refine" his own AI[0] and thus be harder for AlphaGo to compete with?
You're putting too much faith in the machine and dropping the person, that might not be completely fair.
> For helping to advance an amoral field of study that can potentially delegitimize everything that currently makes humans unique and extraordinary?
If you care about being unique and extraordinary more than about reason, knowledge, truth, the observable reality, and the search for what it really means to be sentient, then and only then you may call AI "amoral".
Also, you are being racist against artificial sentient beings, and being racist is hopefully not what makes humans extraordinary.
If it's true that AlphaGo started making a series of bad moves after its mistake on move 79, this might tie into a classic problem with agents trained using reinforcement learning, which is that after making an initial mistake (whether by accident or due to noise, etc.), the agent gets taken into a state it's not familiar with, so it makes another mistake, digging an even deeper hole for itself - the mistakes then continue to compound. This is one of the biggest challenges with RL agents in the real, physical world, where you have noise and imperfect information to confront.
Of course, a plausible alternate explanation is that AlphaGo felt like it needed to make risky moves to catch up.
The same happens to people, especially people that study theory. You can totally throw them off their game by making a non-standard move, even a relatively bad one as long as it breaks their existing pre-conceived notions about how the game should progress.
Of course against a really strong player you're going to get beaten after that but a weak player strong on theory will have a harder time.
Well, also, the machine could study his style of play, whereas he was playing a stranger - one whose style changed with each game thanks to humans hand-altering the logic. Not really fair. At least, different from how humans play chess.
To be fair, Lee Sedol was also changing style during the game. The opening for game 4 is heavily based on the one for game 2, except when he made a mistake.
What you are talking about here is called "label bias". [2] It is present only if training is done badly.
When you have a game of Go, or a Super Mario level, you don't want to make your decisions by just checking the local features and acting on them, because it can be the case that, by compounding errors, you end up in a state you never saw, and all of the future decisions won't be good.
One can avoid these situations by training jointly over the whole game.
For example, maximum entropy models can work for decision-making problems, but their training leaves them in a "label bias" state, because the training tries to minimize the loss of local decisions instead of trying to minimize the future regret of the current local decision.
The solution to these label bias problems is Conditional Random Fields, or Hidden Markov Models. You could accomplish the same with Recursive Neural Networks if you trained them properly. For example, there is no search part in RNNs (no Monte Carlo tree search, or dynamic programming [Viterbi] as in CRFs or HMMs), but they are completely adequate for decision-based problems (sequence labeling etc.). Why is that the case? Because search results are present in the data; there's no need to search if you can just learn to search from the data.
If DeepMind open-sourced the hundreds of millions of games that AlphaGo played, it would be quite possible to train a model that wouldn't need Monte Carlo search and would work quite well, because you would train the model to make local decisions that minimize future regret, not to minimize its local loss. [1]
The only reason why reinforcement learning is used is because there are too few human games of Go available for the model to generalize well. Reinforcement learning can be used in the setting of joint learning because you play out the whole game before you do the learning. This means that you can try to learn a classifier that will minimize the regret by making a proper local decision. Although, as far as I know, and can see from the paper, they didn't train AlphaGo jointly over the game sequence.
But! Now they have a lot of data and they can repeat the process.
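To make the local-vs-joint distinction concrete, here's a toy sketch (emphatically not AlphaGo's actual objective - the states, moves, and games below are made up): a "local" objective scores each move against its label in isolation, while a "joint" objective scores the whole move sequence by the game's outcome, so a locally plausible move that leads to a loss is still penalized.

    import numpy as np

    # Two toy "games": sequences of (state, move) pairs plus a final result.
    games = [
        {"moves": [(0, 1), (1, 0), (2, 1)], "won": True},
        {"moves": [(0, 1), (1, 1), (2, 0)], "won": False},
    ]

    # Tiny tabular policy: P(move | state) over 3 states x 2 moves.
    policy = {s: np.array([0.5, 0.5]) for s in range(3)}

    def local_loss(policy, games):
        # Per-decision cross-entropy: each move is an independent label,
        # regardless of how the game ended.
        return sum(-np.log(policy[s][m]) for g in games for s, m in g["moves"])

    def joint_loss(policy, games):
        # Sequence-level objective: log-probability of the whole trajectory,
        # signed by the outcome, so credit flows from the final result back
        # to every move in the sequence.
        total = 0.0
        for g in games:
            logp = sum(np.log(policy[s][m]) for s, m in g["moves"])
            total += -logp if g["won"] else logp
        return total

    print(local_loss(policy, games), joint_loss(policy, games))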
I don't think we're talking about the same concept. I'm not familiar with the concept of label bias, and the literature I'm familiar with has not referred to label bias as the problem I'm talking about. Also, I'm not sure how a problem with probabilistic graphical models translates to the neural net policies of AlphaGo.
I fail to see how a "per-state normalization of transition scores" translates to there being a bias in value networks towards states with fewer outgoing transitions.
Yes, the "label bias" is more of a structured learning / joint learning term that is present in natural language processing. But reinforcement learning suffers only if you do the learning to minimize local loss of the decision (label) - if you try to build a classifier that minimizes its loss on local decisions, instead on sequence of decisions.
Their value policy network isn't trained jointly and can compound errors. There are approaches with deep neural networks that don't have a joint training but work pretty well. The reason is that networks have a pretty good memory/representation and by that they avoid much of the problems. But for huge games like Go it is quite possible that more games need to be played for these non-structured models to work well.
Again, I don't think we're talking about the same concept. I also fail to see how training over an entire trajectory is going to help you with trajectories you've never seen. Also, these nets are definitely trained with discounted long-term rewards.
They train using trajectories but train them to guess the trajectory locally, not globally. Discounted long-term rewards are just a hack; they aren't joint learning.
The concept of label bias, or decision bias, is a joint/structured learning concept. It is a machine learning concept; it has nothing to do with the application. There are training regimes with a mathematical guarantee that the local decisions will minimize the future regret.
Joint learning is done not on the whole permutation but on the Markov chain of decisions, which is sometimes a good enough assumption. For example, the value/policy network of AlphaGo is precisely a Markov chain: given a state, tell me which next state has the highest probability of victory. The search then tries to find the sequence of moves that will maximize that probability, and then it makes the best local decision (one move). It works like limited-depth min-max or beam search. They do rollouts (play out the whole game) to train the value network, but it is now a question whether they train it to minimize the local loss of the decisions made, or to minimize the future regret of a local decision. As I've stated before, minimizing joint loss over the sequence versus minimizing local loss over each decision made is exactly what determines whether there will be bias or not.
The whole point of reinforcement learning is to create a large enough dataset to overcome the trajectories-not-seen problem. Training models to play Go is an entirely different kind of problem.
Now when they have hundreds of millions of meaningful games they can skip the reinforcement learning and just learn from the games.
An illustration of the "label bias" problem is available in one of the sources I referenced; terms like compounding errors and unseen states are there. "Label bias" is present only in discriminative models, not generative ones. Which means that AlphaGo, being a discriminative model, can suffer from label bias if it wasn't trained to avoid it.
Yes, I'm pretty sure we're not talking about the same thing. I'm precisely talking about the trajectories not seen problem. Nothing is going to save you from the fact that the net has not seen a certain state before.
That's not really a problem. Given a large enough dataset you want to generalize from it - there are always states not present in the dataset - the whole point is now to extract features out of your dataset to allow generalization on unseen states. Seeing all of the Go games isn't possible.
The compounding errors problem that stems from decision bias isn't because you haven't seen the trajectory, it is because the model isn't trained jointly.
We're talking about the same thing. You just aren't familiar with the difference between joint-learning discriminative models and local decision classifiers (maximum entropy Markov models vs conditional random fields - or recursive CNNs trained on a joint loss over the sequence vs recursive CNNs trained to minimize the loss of all local decisions).
In the case of Go, one would try to minimize the loss over the whole game of Go, or over the local decisions made during the game of Go. The latter will result in decision bias - that will lead to compounding errors. The joint learning has a guarantee that the compounding error has a globally sound bound. (proofs are information theory based and put mathematical guarantees on discriminative models applied to sequence labelling (or sequence decision making))
edit:
Check out the lecture below; around the 16-minute mark it has a Super Mario example and describes exactly the problem you mentioned. The presenter is one of the leading figures in joint learning.
It is a completely supervised learning problem. But look at reinforcement learning as a process that has to include a step of generating meaningful games from which a model can learn. After you have generated a bazillion meaningful games, you can discard the reinforcement and just learn. You now try to get as close to the global "optimal" policy as you can, instead of trying to go from an idiot player to a master.
Of course, the data will have flaws if your intermediate model plays with a decision bias. So, instead of training the intermediate to have a bias, train it without :D
Yes, Hal Daume is referring to the issue I brought up. I'm not interpreting his comments as referring to issues with training the model jointly - he's referring to exactly what I'm describing - never having even seen the expert make a mistake. The only solution is to generate trajectories more intelligently (which is in line with Daume's comments).
Yes, it is true. In the case of Super Mario he does the learning by simulating level-K BFS from positions that resulted in errors (unseen states) and thus minimizes the regret for the next K moves.
Although, if you check out his papers, regarding the problems I've talked about: even when you have more than enough data, and when you know you should be able to generalize well, you can still get subpar performance if you don't optimize jointly. The AlphaGo model isn't optimized jointly, but its power mostly lies in the extreme representational ability of deep neural networks.
Both are plausible for humans as well, I would say (in a more general sense). Certainly mistake spirals from not quite knowing how to deal with the consequences of the first mistake have happened to me personally, and I recall descriptions of people in poverty having to play more aggressively to get a shot at the part of civilization they want to be in, though unfortunately I don't have a source.
It is just a complex system leaving one of its basins of attraction to try to reach another one (probably because an evaluation of risk deemed it necessary).
And for a complex system, transitions happen to be sensitive to initial conditions and quite unpredictable.
AI cannot do smooth transitions because they lack the intuition of what smooth means, and that's how to win against them.
1) identify a basin of attraction (the appearance of a bounded domain of evolution in phase space);
2) set the AI in a well known basin by tricking it;
3) unbalance the AI by throwing garbage behaviour at it that kicks it out of the basin in a random direction;
4) let the human win in the chaos that ensues.
Of course it is better done with software to help you: a real-time phase-space analyser.
The point is that, as in a lot of domains, constructing an AI requires more energy than building software to sabotage it.
But once you have the framework of thought to win against one AI, you can beat them all.
There is a possibility that the AlphaGo team, knowing that it has already won a decisive victory, wanted to spare the champion a crushing 5-0 defeat and so commanded the AI to lose on purpose.
The AlphaGo team has stated that they no longer have the ability to make changes to the AI used for the match. The rules of the exhibition match prohibit that.
They maybe could have anyway, but it would be cheating: just the same as if they'd Mechanical Turk'ed it by e.g. having Ke Jie actually choose the moves to play.
Yesterday I kind of entertained the thought that they could have tried to make Alphago waste a move somewhere at the beginning of the game to give Sedol the equivalent of a stone advantage and see how Alphago could handle that. But it's clear that any move like that would have been immediately detected by the professionals who can read the game as I read the morning newspapers, so no.
I don't know why you're being downvoted. The fact that the human won makes it so much more interesting now! It surely is the best outcome possible both for human Go players and for Google as well (having won the first 3 already).
In the post-game press conference I think Lee Sedol said something like "Before the matches I was thinking the result would be 5-0 or 4-1 in my favor, but then I lost 3 straight... I would not exchange this win for anything in the world."
Demis Hassabis said of Lee Sedol: "Incredible fighting spirit after 3 defeats"
I can definitely relate to what Lee Sedol might be feeling.
Very happy for both sides: both for the fact that people designed algorithms that can beat top pros, and for the human strength displayed by Lee Sedol.
Yep! As someone who was rooting for DeepMind, I like this result, for two reasons: Lee Sedol earned it - he's behaved like a true sportsman all the way - and it gives us some interesting information (yesterday we only had a lower bound on AlphaGo's strength; today we also have an upper bound).
>yesterday we only had a lower bound on AlphaGo's strength; today we also have an upper bound
I think it's premature; establishing bounds with a good confidence interval requires tens or hundreds of games. Specifically, a 3:2 result would be really inconclusive.
The first data point tells you more about bounds than any individual subsequent data point. Knowing that a human can defeat AlphaGo is enormously important.
To use an analogy, having confirmation of contact by even a single alien species would be hugely important, way more so than exactly nailing down the number of alien species. Knowing that something is even possible is oftentimes the most important aspect that needs to be ascertained, and contact (or a win, in Lee's case) does that unequivocally.
At the time chess was "the" problem, humans could beat computers from time to time too. Now chess is solved. The same will likely happen to go in the years to come. But the point of this was not go itself. It was a demonstration of the potential capabilities of neural networks and other machine learning algorithms.
Chess isn't solved. It's just that computers have gotten sufficiently better at it than humans that humans don't have a chance.
If Chess were truly solved, then you wouldn't be able to make a new AI program that could do better than even odds against the existing ones. But that's not the case, and incremental advancements in Chess-playing programs are made all the time. There are even tournaments where Chess programs play each other. If Chess were solved, such a thing wouldn't make any sense, just like how there are no Tic-Tac-Toe tournaments because that game is solved.
Even that isn't strong enough. The game is deterministic.
If chess were _solved_ we'd know a strategy to allow one of the players (likely white) to always win, or for either of them to always force a draw. (and we'd know which of these strategies were possible for chess).
Consider, say there is a first move white could choose such that no matter what moves black makes, white will win. Then the first would be true. Instead consider, that there is no such move, and any first move has choices where either could win-- if some of those are ones which would force a black win, then the first is again true but for black. Otherwise, a draw can always be forced. These are the only possible outcomes for a solved game of chess.
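To make "solved" concrete, here's a toy solver for a much smaller deterministic game standing in for chess: from n stones, each player removes 1 or 2, and whoever takes the last stone wins. Solving it means computing, for every position, whether the player to move can force a win - the analogue of knowing chess's game-theoretic value.

    from functools import lru_cache

    @lru_cache(maxsize=None)
    def to_move_wins(n: int) -> bool:
        if n == 0:
            return False  # no stones left: the previous player just won
        # Winning iff some legal move leaves the opponent in a losing position.
        return any(not to_move_wins(n - k) for k in (1, 2) if k <= n)

    print([to_move_wins(n) for n in range(10)])
    # Losing exactly when n % 3 == 0: the complete solution to this game.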
I've read some articles in the Korean press suggesting the AlphaGo team picked Lee as their match, and not Ke Jie (currently #1 ranked, 19 years young), because there are a lot more public records of Lee's play over his much longer pro career (nearly 20 years now) - thus more material for AlphaGo to train with and against.
So it's not totally arrogant of Ke Jie to suggest he could beat AlphaGo. AlphaGo has not much 'experience' in dealing with Ke Jie.
And Lee's winning of game 4 shows a human is still indeed more capable than any AI. He basically reprogrammed his game on his own. Sorta.
That was specifically discussed in the after-match conference today. Demis said that it wasn't trained specifically against Lee but was trained against strong amateurs on the Internet initially, before playing against itself. He said that even if they'd wanted to, AlphaGo requires a much larger number of games than are available from Lee Sedol to train on, so it wouldn't have been possible anyway.
Ke Jie just happens to be #1 right now. Lee Sedol is more impressive in that he's been a world-class Go player for much longer. Twenty years from now, if Ke Jie is as dominant as Lee Sedol is now, then he'd be a good pick then. But either way it's not a big deal; there's not that much daylight between the two, so picking the more famous and well-known one to go up against is a safe bet.
Personally I want to see a discussion game with the top Go experts (including both Lee Sedol and Ke Jie amongst others) competing against the next version of AlphaGo, in a game with much longer time limits.
I don't think top pros view it that way. He is of a new generation that play thousands of games over the internet; that study the game socially, while Lee is much more a loner. Whether Ke Jie will dominate as much as Lee did will say more about the strength of his competitors than his strength vs Lee.
Another reason could be that Lee is a much bigger name in go than Ke Jie. Sure, Ke Jie is stronger now but Lee was a dominant/top player for a long long time.
Many of the reinforcement learning techniques require data sets in the billions to make much sense of it all. This is likely the reason why the AI had to play against itself so much, because there simply wasn't enough data in general, even including all recorded games.
So Lee's games are sort of a drop in the bucket as far as the performance of the AI goes.
Interesting point here: if you are the best Go player in the world, and you play against an AI that can beat you, it seems really possible that it makes you that much better. It would be interesting to see what the win ratio would be after match 10, match 50 or so on.
My friends and I (many of us are enthusiastic Go lovers/players) have been following all of the games closely. AlphaGo's mid game today was really strange. Many experts have praised Lee's move 78 as a "divine-inspired" move. While it was a complex setup, in terms of the number of searches I can't see it being any more complex than the games before. Indeed, because it was very much a local fight, the number of possible moves was rather limited. As Lee said in the post-game conference, it was the only move that made any sense at all, as any other move would quickly prove to be fatal after half a dozen or so exchanges.
Of course, what's obvious to a human might not be so at all to a computer. And this is the interesting point that I hope the DeepMind researchers will shed some light on for all of us after they dig out what was going on inside AlphaGo at the time. We'd also love to learn why AlphaGo seemed to go off the rails after this initial stumble and made a string of indecipherable moves thereafter.
Congrats to Lee and the DeepMind team! It was an exciting and I hope informative match to both sides.
As a final note: I started following the match thinking I am watching a competition of intelligence (loosely defined) between man and machine. What I ended up witnessing was incredible human drama, of Lee bearing incredible pressure, being hit hard repeatedly while the world is watching, sinking to the lowest of the lows, and soaring back up winning one game for the human race.. Just incredible up and down in a course of a week. Many of my friends were crying as the computer resigned.
Given that 78 was a "divine-inspired" move, and Demis tweeted that AlphaGo's mistake was move 79, my guess is that move 78 was not in AlphaGo's search tree at all. It had pruned that move as being too "unlike what a human pro would do", and had not considered play beyond that point. Then it had to suddenly backtrack, drop its entire search tree, and start rebuilding from scratch. It took only a couple of minutes to do so, but that's still not enough time to build up the depth and breadth of tree that it had been constructing up to that point in the game. So it ended up playing a suboptimal move simply because Lee Sedol played a move that was simultaneously brilliant/strong and novel/unexpected, and it had to respond somehow.
The problem may well have occurred before then, too. If move 78 was incorrectly pruned from the tree for being unlikely, then AlphaGo had many turns in which it should have been considering counter-moves to move 78 (and thus prevent the situation that allowed move 78 to even happen in the first place), but it didn't.
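If that's what happened, it matches how a generic MCTS engine handles opponent moves (a sketch of the general technique, not AlphaGo's code): when the opponent's move is already in the tree, the matching subtree and all its accumulated statistics are kept; when the move was pruned or never expanded, search restarts from an empty root.

    class Node:
        def __init__(self):
            self.children = {}  # move -> Node
            self.visits = 0     # rollouts accumulated through this node

    def advance_root(root: Node, opponent_move) -> Node:
        child = root.children.get(opponent_move)
        if child is not None:
            return child    # reuse the subtree: its statistics carry over
        return Node()       # unexpected move: rebuild the tree from scratch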
Just letting others know, this expression is a rather common way of saying a single move that changed the course of the game. There were "divine-inspired moves" that AlphaGo made in the first three games too.
> AlphaGo's mid game today was really strange. Many experts have praised Lee's move 78 as a "divine-inspired" move.
Add to that the moves where AlphaGo basically threw away stones by adding to formations that were going to be removed from the board. Even I, a complete, lousy amateur, could see that they were a mistake.
To be fair, those moves were made when AlphaGo was already behind. It's just not any good at dealing with being that far back. The AI has no concept of what to do while behind: what a human would do is go for positions that are very complicated, making the chances of sloppy play much higher. Instead, it makes moves that can be answered in only one or two ways, and that are very easy to read for even an amateur human.
Training an AI to play well in a bad situation would require it to train in ways very different from the AlphaGo vs AlphaGo training that it spent a lot of time doing. And why do that, instead of trying to make itself good while the game is even, or when it's winning?
It's a bit like how training in chess to play pro games differs from training to hustle amateurs in the park: you are not making the best move, but the good move that will confuse the opponent the most. You are trying to exploit a bad opponent: very different play.
Toward the end AlphaGo was making moves that even I (as a double-digit kyu player) could recognize as really bad. However, one of the commentators made the observation that each time it did, the moves forced a highly-predictable move by Lee Sedol in response. From the point of view of a Go player, they were non-sensical because they only removed points from the board and didn't advance AlphaGo's position at all. From the point of view of a programmer, on the other hand, considering that predicting how your opponent will move has got to be one of the most challenging aspects of a Go algorithm, making a move that easily narrows and deepens the search tree makes complete sense.
Or maybe it was just the computer version of grasping at straws. None of the future gamestates looked good, so it ended up picking whatever could at least theoretically lead to a comeback, even if that would require Lee Sedol to miss a completely obvious move.
This looks like a possibility to me. I'm not sure that AlphaGo has any idea how strong an opponent it is facing at any given moment. It's played games against venues and it's played games against professionals. But if a set of complex advanced moves is still likely to lead to a narrow defeat, it might well pick some stupid moves that in theory could allow it to win back a dominant position if the opponent plays badly. Since it doesn't have a model of the competence of its opponent, that might appear to be a viable strategy, because against some opponents, and in some past games it's played, that could work.
This is really interesting, because forming a model of our opponent and tailoring our strategies appropriately is fundamental to how humans approach competitions.
> Toward the end AlphaGo was making moves that even I (as a double-digit kyu player) could recognize as really bad.
Maybe its training was focused on winning, not on losing narrowly. So as soon as it became obvious it couldn't win, it was just making silly moves, because that was a less-researched scenario.
Some people, when they see they can't win, make silly moves just for fun.
They're not really silly moves, it's just that in order to get into a good state for AlphaGo, Lee Sedol would have to miss the obvious response to those moves.
This is not the reason. Monte Carlo programs typically play bad-looking moves in way-ahead / way-behind situations. In these situations "bad" moves will sometimes not alter the win probability they measure, so they don't know how to distinguish a more natural move from one that looks bad. This was mentioned in the commentary when one of the DeepMind guys came in.
Go programmers have taken various steps to mitigate this behavior, such as dynamically adjusting komi to trick the engine into thinking it is a closer game, but I don't know if AlphaGo uses any such technique.
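For those unfamiliar with dynamic komi, a minimal sketch of the idea (thresholds and names here are illustrative, not from any particular engine): when the measured win rate saturates near 0 or 1, shift the komi the engine evaluates against so positions score as closer, restoring a gradient between natural and bad-looking moves.

    def adjusted_komi(base_komi: float, win_rate: float, step: float = 2.0) -> float:
        # Far ahead: pretend we owe more points, so the game looks closer.
        if win_rate > 0.9:
            return base_komi + step
        # Far behind: pretend we owe fewer points, for the same reason.
        if win_rate < 0.1:
            return base_komi - step
        return base_komi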
I'm curious - is this the same thing that it was doing in the matches that it won? People have been talking about how it seemed to be toying with its opponent towards the end of those matches, making moves that didn't gain it much, but maybe it just doesn't actually understand what moves to play some of the time and it simply hasn't mattered before.
I doubt it's considering that directly. What I'm guessing (without having really studied the paper in depth) is that if you have two board positions which would score identically in the end, but where one has fewer remaining liberties than the other (i.e. one is deeper into the tree), then it will be scored more highly.
In other words, the humans commentating the game were evaluating the moves as non-sensical because the outcome (AlphaGo plays here so Lee plays here) is a foregone conclusion and doesn't change the human evaluation of the board position. What it does do is remove uncertainty (AlphaGo plays here, but Lee screws up and plays somewhere else). In their evaluation, humans tend to value that uncertainty (i.e. counting on the possibility of a mistake), but I'd guess that AlphaGo penalizes the uncertainty (i.e. known board positions are scored higher than potential board positions), leading it to over-value simple advancement of the board in the end-game.
The bigger the game board, the more possible states, the better chance a human player has against the AI. Is this the same heuristic in play here - Play a game with more unknowns, gain an edge?
From my limited knowledge of the game, a few of the moves that Lee made before AlphaGo "lost its mind" were a tad on the aggressive side. The conventional wisdom in Go is to prefer more conservative moves (increasingly so as the game progresses). Usually, if your opponent is being overly aggressive then you want to play more conservatively and wait for them to make a mistake, but in AlphaGo's case, it attempted to match Lee's aggressiveness move-for-move, and Lee was able to capitalize.
If I had to guess (and this is pure speculation), AlphaGo has no concept of waiting for its opponent to make a mistake. Instead, it assumes its opponent will continue to make the best possible follow-ups, and so AlphaGo feels overly compelled to "keep up". In this case, that did it in.
If this is what happened, then yes, I would expect Lee to be able to capitalize.
> If I had to guess (and this is pure speculation), AlphaGo has no concept of waiting for its opponent to make a mistake. Instead, it assumes its opponent will continue to make the best possible follow-ups
One of the DeepMind guys just confirmed that this is how AlphaGo operates in the press conference.
Which makes sense. It doesn't even have a concept of a human opponent or human-style mistakes - it plays against Lee just like it plays against itself when training.
This is by design and acknowledged by the developers - in fact it plans only for what it judges to be optimal opponent moves. A modification might be to keep in reserve a set of options for non-optimal opponent play, although in actual practice the simplicity of the design here (maximize probability of winning and plan only for optimal opponent moves) may be one reason why it plays as well as it does.
In that case, theoretically at least, could we train AlphaGo by getting top Go players to play many games against each other where one or both players make very aggressive moves when reasonable?
Well, that's where AlphaGo and the progress in Go AI that it represents is so exciting! The game of Go is so fluid with such a huge number of possible positions that players tend to adopt certain styles of play en masse. I've heard it said that you can identify a Go player's mentor or "house" just by the style of play they use.
This has also resulted in larger shifts in playing style over time. Studying very old (and I mean very old...700+ years old) games can be entertaining and even educational in the abstract, but you won't want to directly adopt the style of play because the game has evolved.
It's already been mentioned a couple of times that AlphaGo almost certainly represents just such a shift. Top players will learn from it, and I'd even be willing to bet they will beat it with some regularity once they do!
Ultimately, what sets apart Go geniuses is their ability to play creatively in the face of seemingly insurmountable challenges. So the big question is how "creative" AlphaGo can be. Is it merely synthesizing strong play from known positions? Can it introduce novel strategies? And if it does, will it be able to adjust as other Go masters adjust to it and bring their own brand of creativity to play?
To answer your original question, this very well could introduce a new era of more aggressive play to the world of Go. Only time will tell...
AlphaGo will learn from any new styles and apply them effectively without mistakes from fatigue or inattention.
This incarnation of AI is not creative; it won't generate new play styles. That is still the domain of top human players, for now. But it will ruthlessly learn and adopt any new and improved strategies. That's really the point to take away from its success so far.
I suppose that's possible. However Demis Hassabis has said many people have noted that AlphaGo makes human-like moves. He commented that it made sense since AlphaGo taught itself from the games of human players.
Yes, the starting kernel of its learning this time was a collection of human games. This has caused the Policy Network (which says "given this board state, these moves are worth investigating in more detail") to be biased towards more human-like moves.
But they're already working on a new version of AlphaGo which isn't trained on any human data at all. It starts by making truly random moves and improves from there. This will require much more processing time and probably an order of magnitude more "self-play", but it will probably result in truly novel strategies that aren't part of the current human metagame.
It could also be that human-looking moves are already close to the best possible. Neither AlphaGo nor human players can play god's hand, but they're approaching the same location.
Is Go-Space understood well enough to know either way?
The OMG-AI people claim that AGI would be dangerous because it would reliably innovate in new spaces and out-predict humans.
So a true super-AGI would make go moves that were unexpected and incomprehensible with some percentage of misleading fake-outs, but it would still win most or all of the time.
If the human exploration of Go-Space is close to the god's hand bounds, this can't be true.
My intuition (and it's really only just that) is that Go space is large enough that AI would be able to outplay humans while still not even beginning to approach "perfect" play. If so, then I would also expect that humans should be able to follow the lead of AI into new areas of Go space, and outplay the AI (at least until the AI has a chance to learn and catch up).
We'll know if this is the case in a couple of years, if the competition between human and AI goes back-and-forth (unlike Chess, where after AI was good enough to beat humans, it could do so reliably).
Either way, it's interesting to note that AlphaGo had literally thousands of games to learn from to find weaknesses in human play, but Lee Sedol seems to have only needed 3 before he was able to find weaknesses in AlphaGo's play.
> Either way, it's interesting to note that AlphaGo had literally thousands of games to learn from to find weaknesses in human play, but Lee Sedol seems to have only needed 3 before he was able to find weaknesses in AlphaGo's play.
To be fair, we can't know how many games Sedol played in his own head to figure this out.
The crucial play here seems to have been Lee Sedol's "tesuji" at White 78. From what I understand, the term in Go means something like "clever play": sneaking up on your opponent with something they did not see coming. The DeepMind CEO confirmed that the machine actually missed the implications of this move, as the calculated win percentage did not shift until later.
https://twitter.com/demishassabis/status/708928006400581632
Another interesting thing I noticed while catching the endgame is that AlphaGo actually used up almost all of its time. In professional Go, once each player uses their original (2 hour?) time block, they have 1 minute left for each move. Lee Sedol had gone into "overtime" in some of the earlier games, and here as well, but previously AlphaGo still had time left from its original 2 hours. In this game, it came quite close to using overtime before resigning, which it does when the calculated win percentage falls below a certain threshold.
Tesuji isn't a trick play, it's more like a power play. Each player can read out how a fight is going and see their line far into the future. Two professionals will pick two lines, two suji, which are in balance and push up against one another tightly.
A tesuji is a part of the line which is suddenly showy or strong. It could mean a failure for the opponent if they had not taken enough of an advantage in the struggle to this point or if they do not have a counter tesuji available.
Indeed, that might be the design of a set line: one side continually loses ground to the other forcing the other to take these small advantages all so that the first side has an opportunity to play a tesuji and return to balance. Many such lines are canonicalized ("joseki") and known to any professional. Moreover, professionals regularly identify potential tesuji and expect their opponents to as well.
I suspect that in almost all cases it has already converged on a move well before sixty seconds, so if forced by time pressure to choose a move before it has achieved the confidence it's looking for, it will probably still do the right thing most of the time (especially in the endgame, when the branching factor is cut down enormously). It just uses the additional time because it has it. I would love to see a graph of AlphaGo's thinking: its certainty over time, how long it took to come up with the move it ultimately ended up playing, and how that certainty evolved over its thinking period.
Another way to look at this is just how efficient the human brain is for the same amount of computation.
On one hand, we have racks of servers (1920 CPUs and 280 GPUs) [1] using megawatts (gigawatts?) of power, and on the other hand we have a person eating food and using about 100W of power (when physically at rest), of which about 20W is used by the brain.
And don't forget, the human grows organically from a complex system of DNA that also codes for way more than playing Go! And is able to perform a lot of tasks very efficiently, including cooperating together on open ended activities.
We can estimate the power consumption: 1920 CPUs as in cores, or physical packages? If the latter they are ~100W each, so that's 192kW; if it's the former, depending on how many cores per package, a fraction of that. The GPUs are likely to be counting physical packages (and not cores, which is much more) and they draw 300-400W each, for a total of ~300kW. Add a bit of overhead and I'd say 500kW (half a megawatt) is a good rough estimate.
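For what it's worth, the arithmetic, with the per-part wattages above as rough assumptions:

    cpu_w, gpu_w = 100, 350      # assumed per-package draw, in watts
    cpus, gpus = 1920, 280       # figures quoted for the cluster
    kw = (cpus * cpu_w + gpus * gpu_w) / 1000
    print(kw)          # 290.0 kW for the compute alone
    print(kw * 1.7)    # ~493 kW once cooling and other overhead are added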
The figures quoted for the cluster are about 25 times higher than those quoted for a single machine, so I would guess the cluster consists of 25 machines. 20 kW, or 166 amperes at 120 volts, per machine seems a bit high to me.
A single A15 core at around 1 GHz has more GFLOPS than Deep Blue had across its whole system (11.38 GFLOPS).
1920 CPUs (a 4-core Haswell from 2013 is around 170 GFLOPS) and 280 GPUs (a previous-gen Nvidia K-series card peaks at around 5200 GFLOPS): that's 1,782,400 GFLOPS, or around 150,000x more processing power. If they were running latest-gen hardware, they would be closer to 200,000x faster.
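Checking that arithmetic (all figures approximate, as above):

    haswell, k_gpu, deep_blue = 170, 5200, 11.38   # GFLOPS per unit
    total = 1920 * haswell + 280 * k_gpu
    print(total)               # 1782400 GFLOPS
    print(total / deep_blue)   # ~156,600x Deep Blue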
Given that Moore's law is slowing down and the size of the system, we're a long way from considering putting that in a smartphone.
People are focusing way too much on the current hardware that AlphaGo happens to be running on.
AlphaGo is still a very new program (two years since inception). It will get significantly better with more training, or, equivalently, it will stay at the same strength while running on much less hardware.
Don't read too much into what one particular snapshot in its development cycle looks like. Humanity has had hundreds of millions of years to maximize the efficiency of the brain. AlphaGo has had two years. It's not a fair comparison, and more importantly, it's not instructive as to what the future potential of AI algorithms looks like.
The developers probably use whatever window manager they want. I'm personally a fan of i3 also.
But there is nothing wrong with keeping the frontend machine used in this Go match on the default Ubuntu desktop environment, since its only purpose is to play Go with a graphical user interface anyway.
AlphaGo's weakness was stated inadvertently in the press conference: it considers only the future opponent moves it deems most profitable for the opponent. This leaves it with glaring blind spots when it hasn't prepared for lines that surprise it. Lee Sedol has now learned to exploit this fact in a mere 4 games, whereas the NN requires millions of games to train on in order to alter its playing style. So Lee only needs to find surprising and strong moves (no small feat, but also the strong suit of Lee's playing style generally).
You are saying its weakness is that it makes the mistake of not looking at all the profitable moves.
The counterargument, for the sake of argument, goes: he was just lucky to find a local maximum (?) outside the search space (?) by chance, rather than learning in a few days the universal function (?) that the NN thinks solves Go, or at least one fixed point (?) of it [i.e. the surprisingly wrong expectation].
We were discussing the probability that Sedol would win this game. Everyone, including me, bet 90% that no human would ever win again, let alone this specific game: http://predictionbook.com/predictions/177592
I tried to estimate it mathematically: using a uniform distribution across possible win rates, then updating the probability of different win rates with Bayes' rule. You can do that with Laplace's rule of succession. I got a 20% chance that Sedol would win this game.
However, a uniform prior doesn't seem right. Eliezer Yudkowsky often argues that AI is likely to be much better than humans, or much worse. The probability of it landing at exactly the same skill level is pretty implausible. That argument seemed right, but I wasn't sure how to model it formally, so 90% "felt" right. Clearly I was overconfident.
So for the next game, if we use Laplace's rule again, we get a 33% chance that Sedol will win. That's not factoring in other information, like Sedol now being familiar with AlphaGo's strategies and improving his own against it. So there is some chance he is now evenly matched with AlphaGo!
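For anyone checking the numbers: Laplace's rule of succession says that after s wins in n games, with a uniform prior on the win rate, the probability of winning the next game is (s + 1) / (n + 2). That reproduces both figures above:

    def laplace(wins, games):
        # P(win next game) = (wins + 1) / (games + 2) under a uniform prior
        return (wins + 1) / (games + 2)

    print(laplace(0, 3))   # before game 4: Sedol 0-3 -> 0.2
    print(laplace(1, 4))   # before game 5: Sedol 1-3 -> 0.333...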
I look forward to many future AI-human games. Hopefully humans will be able to learn from them, and perhaps even learn their weaknesses and how to exploit them.
Depending on how deterministic AlphaGo is, you could perhaps even play the same sequence of moves and win again. That would really embarrass the Google team. I hear they froze AlphaGo's weights to prevent it from developing new bugs after testing.
The argument that AI is either much better or much worse does not apply here. It's not an accident that they chose this point in time to play against Lee Sedol instead of 10 years ago or 10 years in the future. They chose this point in time because they thought that they have a reasonable chance of winning.
Also, he won with white but he will play with black next time, so playing the same sequence of moves can't happen. Additionally, even if the AI didn't incorporate any randomness in the opening (I think it does) it may choose different moves if it gets a different amount of time to think, so Lee Sedol would have to play his moves at exactly the same time as the last game. A couple of seconds deviation only has to lead to a different move in one of the 80 or so moves before the mistake was made to invalidate this strategy.
> It's not an accident that they chose this point in time to play against Lee Sedol instead of 10 years ago or 10 years in the future. They chose this point in time because they thought that they have a reasonable chance of winning.
Most importantly, they chose this point in time because they know that there are several other AI research teams that are also on the right track (including at Facebook). Like circumnavigating the globe or landing on the Moon, 90% of the benefit of it is lost if someone else does it before you. So you don't wait for 100% certainty of winning -- you are maximizing for the chance of winning first, not winning for certain.
Given that, it seems reasonable that they would go after Lee Sedol when they were sure they were better than him, but not too much better than him. So a non-5-0 outcome is, in hindsight, not horribly surprising.
That kind of makes sense, however the point was more that AI progress isn't gradual and continuous. It's (sometimes) rapid and has discontinuities.
AlphaGo was an entirely new method, using deep convolutional neural networks as a move generator, so there was no guarantee it would be just slightly better than previous Go algorithms; it could easily have been far above humans.
Likewise this is also the first time Go AI has been given Google scale resources. Both in terms of a team of the best researchers working full time on it, and their computing power. Whereas previous Go projects were all hobbyist things.
And Google didn't wait 10 years to challenge Sedol. The match was arranged only a few months at most after they started developing it.
There are a lot of cases where humans are actually really close to optimal (one example is racing lines taken by F1 drivers being within tenths of a percent of perfect, don't have the link any more though sorry). In this case there are diminishing returns at play, and an AI which is a lot "stronger" than a human might still produce very similar results.
According to the commentary of both streams I was watching, after losing an important exchange in the middle (apparently move 79 https://twitter.com/demishassabis/status/708928006400581632) it seems AlphaGo sort of bugged out and started making wrong moves on an already dead group on the right side of the board. After that it kept repeating similar mistakes until it resigned a lot of moves after. But the game was already won for Lee Sedol after that middle exchange. It was really interesting seeing everyone's reactions to AlphaGo's bad moves though.
I would avoid thinking of this like a traditional computer program that just "bugged out" due to a glitch or a problem in the software. More accurately, it failed to account for the implications of a move on the board and therefore focused its attention in the wrong place. This happens often in games: I can imagine a chess master playing an amateur moving his knight or bishop into a position that doesn't immediately seem threatening to the amateur, so the amateur responds by moving a pawn somewhere else on the board, when the correct move would have been to attempt to counter the threat. As should be well known by now, in Go you cannot examine the implications of all possible moves, so some things will be missed. In this case, it seems something important was missed at the time, and the implications were not realized by the program until 10 or 15 turns later.
I mean that it bugged out because the moves it made after missing the exchange in the middle were moves that were obviously wrong, not in a "maybe it's up to something" kind of way, but in an objectively 100% bad kind of way. Even if you can't analyze all possibilities AlphaGo made a number of moves that made absolutely no sense at all even to way lower level players.
> a number of moves that made absolutely no sense at all even to way lower level players.
Another possibility is that it was looking way deeper than anybody, and there was a 1% chance of turning the whole game around with those seemingly bad moves. But Lee Sedol blocked that deep move in a way nobody was able to see.
He's right. The 9d commentators just laughed them off. They were the kinds of moves a 25k might play hoping that his opponent would make a silly mistake. That's exactly what AlphaGo was trying to do. The moves it played did have a "slight" chance of working, if Lee Sedol had responded incorrectly, but that would have never happened. They were the kinds of moves that are insulting when an opponent plays them against me and I'm just a weak amateur player. The moves clearly had no depth to them, and everyone that understands the game agrees on that.
> it seems AlphaGo sort of bugged out and started making wrong moves
Didn't they say that it's not considered a "bug" but rather how AlphaGo "thinks"? "when it's winning it doesn't care about how much it's winning, and when it's losing it doesn't care how bad it's losing"
"Attempting a comeback" is a completely different task that usual, though.
When you're winning, a good move has a mathematical definition; it's a move that, given optimal play by both sides, will lead to victory for you. Computers aren't powerful enough to be able to calculate exactly what moves those are, but it's at least well-defined in a way that they know what they're looking for.
When you're losing, there are zero moves that, given optimal play by both sides, will lead to victory for you (so all moves are equally "bad" in a mathematical sense). Instead, "attempting a comeback" involves hoping your opponent will mess up in some way, so in that sense, what constitutes a good move isn't mathematical but more about predicting how your opponent thinks and where they might mess up.
AlphaGo has mostly trained by playing itself, so the ways it thinks its opponent might mess up are probably completely different from how an actual human messes up.
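A toy illustration of the "all losing moves tie" point, with a made-up game tree as nested lists (leaves are terminal values from the engine's perspective; this has nothing to do with AlphaGo's actual evaluation):

    def minimax(node, our_turn):
        if isinstance(node, (int, float)):   # leaf: +1 we win, -1 we lose
            return node
        vals = [minimax(child, not our_turn) for child in node]
        return max(vals) if our_turn else min(vals)

    # In a lost position, every candidate move evaluates to -1 against
    # best play, so picking among them is an arbitrary tie-break:
    moves = [[-1, -1], [-1, -1], [-1, -1]]
    print([minimax(m, False) for m in moves])   # [-1, -1, -1]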
I'm speculating that the reinforcement learning phase reinforced all the best winning strategies, but had few examples of weak positions out of which the AI had to fight.
AlphaGo lost half the games it played against itself, so it's not like it doesn't have millions of training examples. However maybe it didn't learn very well how to recover once it's losing, but rather concentrated on learning how to avoid that in the first place.
> But the game was already won for Lee Sedol after that middle exchange.
Myungwan Kim, in his commentary, estimated the game to be worse for White even after 79, had Black not destroyed its opportunity for fighting the ko at M-13 - https://www.youtube.com/watch?v=SMqjGNqfU6I&t=1h40m1s
That was really cool! It seemed that after the brilliant play in the middle, the most probable paths to winning required Lee Sedol to make mistakes impossibly bad for a professional, a prior that AlphaGo doesn't incorporate. I've heard the training data was mostly amateur games, so perhaps the value/policy networks were overfit? Or maybe greedily picking the highest probability, common with tree-search approaches, is just suboptimal?
Failure to generalize is not always caused by overfitting. Even if there is no overfitting, deep neural networks seem to learn a surprisingly discontinuous function. Consequently, in rare cases, they can misclassify things with great confidence. [0]
From the cited paper,
> [Experimental results] suggest that adversarial examples are somewhat universal and not just the results of overfitting to a particular model or to the specific selection of the training set.
Anyway, Monte Carlo tree search is bad at losing positions. In general, you want to delay the impending catastrophe as long as possible instead of making stupid moves that make your position worse and worse. However, MCTS uses random rollouts to the end of the game, which sometimes makes it difficult to ascertain whether the inevitable doom is near or far.
Also, MCTS converges very, very slowly and is likely to miss a unique, single winning continuation.
I think it is probably a combination of both AlphaGo's value network failing to realize the good position of Lee Sedol after his brilliant play, and the MCTS failing to spot the unique winning sequence for Lee, that caused it to make the mistake. But we should probably wait for official analysis from the Deepmind team to see what exactly went wrong.
I'm hoping someone creates adversarial formations for AlphaGo if Google ever releases their model :)
One thing I've been pondering is if many adversarial samples exist. The board is rather low dimensional (19 x 19) and discrete. While certainly a massive state space, one of the suggestions for why adversarial images work is that the real number line is incredibly dense.
For example, our possible Go board space is 2^(log2(3) * 19^2) for Go but 2^(24 * 28^2) for greyscale [0, 1]-normalized single-precision float imagery for MNIST. That's an exponentially bigger space (something like 1e5490 times bigger!), and it gets only larger if you train with double precision, have larger images, add multiple color channels, add more nonlinear layers, etc.
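The arithmetic, for anyone checking (only the gap between the two exponents matters):

    from math import log2, log10

    go_bits = log2(3) * 19**2    # ~572 bits: empty/black/white per point
    mnist_bits = 24 * 28**2      # 18816 bits under the assumptions above
    print((mnist_bits - go_bits) * log10(2))   # ~5492, i.e. ~1e5492x bigger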
I think it's more that the value network includes moves which look plausible, but won't concentrate >99% probability on the answer to a forcing move. A human has a heuristic - "I must play here or else I lose" - but AG assumes its opponent might play anywhere that the ANN calls reasonable.
If the value/policy model is predictive with a dataset containing only amateur games, but fails to generalize to unseen data with professional games, that seems like a case of overfitting to a dataset only containing amateur games. In this case the expected value network may be different for amateur games than professional games.
Sorry, I'm being a little academic. Overfitting is when the model fits to noise or error. Overfitting is not synonymous with "inability to generalize beyond the train and test distribution."
For all we know, AlphaGo has perfectly fit amateur games, but professional games are on a whole different level.
It all depends on how you define your data sets, I suppose. IGS games include some professional games which, if it is the case that AG is perfectly fit to the amateur mode, were smoothed away.
Again, the only way to tell if overfitting specifically (and not other factors that are more likely) is the issue is performance on a held out test set.
That's the only empirical way, yes. We can also just talk about what it would mean in theory, though. In this case, we'd say that AlphaGo is well trained to the training data set sampling distribution but that may be far from the actual world game distribution.
You are implying overfitting is the only reason a model ever performs poorly. That is not true. Just because a model does poorly does not mean overfitting must be the issue.
The value/policy model includes a few hundred thousand amateur games, and a few hundred million games of self-play. Once AlphaGo beat Fan Hui those would have been games of self-play versus the equivalent of a professional. So overfitting is probably not a problem. I think it's a basic incentive mismatch - MCTS algorithms tend to like close games, whereas humans will try crazy moves when losing to throw off their opponent.
I wouldn't be surprised if, in a month, Lee Sedol was able to beat AlphaGo in another match. This is what happened in chess. The best computers were able to beat the best humans, until the best humans learned how to play anti-computer chess. This bought them a year or so more, until computers finally dominated for good.
Right now I don't know if I'm more impressed by AlphaGo's artificial intelligence or its artificial stupidity.
Lee Sedol won because he played extremely well. But when AlphaGo was already losing it made some very bad moves. One of them was so bad that it's the kind of mistake you would only expect from someone who's starting to learn how to play Go.
AlphaGo kept making bad moves in such a way that the rest of the game became more and more predictable: each of the moves Lee Sedol made could be described as the only obvious one.
On the surface, as an analogy, it sounds like investors in financial markets capitulating: selling at a loss for a riskier outcome. In hindsight these are almost always bad moves, but at the time it feels right because it removes risk. Investors are losing, and then in capitulating they make even worse moves, like selling at the market bottom.
Am I right in assuming that if they played another game (AlphaGo black and Lee Sedol white), Lee Sedol could pressure AlphaGo into making the same mistake again?
This is an interesting question - if Lee Sedol simply replays the game exactly, does he repeat a win?
I think the answer is most likely not: the Monte Carlo tree search is randomized, so AlphaGo's responses to Sedol may not be exactly the same, and Sedol would not be able to repeat the exact same line of play.
They stated before game 3 that AG had been locked down a while before these games to test the code for bugs and issues. I also think they don't want some of the implications associated with DeepBlue where people modified code even during the games.
After AlphaGo won the first three games, I wondered not if the computer had reached and surpassed human mastery, but instead how many orders of magnitude better it was. Given today's result, it may be only one order, or even less. Perhaps the best human players are relatively close to the maximum skill level for go, and that the pros of the future will not be categorically better than Lee Sedol is today.
Exactly what heuristics would they use to know how an omniscient being would play? Unless there are some strong arguments behind it, it sounds like arrogant BS.
Maybe they're extrapolating from relaxed instances that they've solved? If you know that humans can play pretty much optimally on, say, 11x11, and you know how much performance drops off with each expansion of the board?
Yudkowsky argues that on the scale of intelligence, Einstein and a village idiot are basically right next to each other [1]. So once artificial intelligence gets close to matching the village idiot, it is not far from completely thrashing Einstein.
Now if that same picture held for Go, then a situation like this would seem to be impossible. Either the computer should be much worse than a human player, or much better. It would be an incredible coincidence that, at the end of six months of training, the computer happened to be of comparable skill to humans.
For the game of Go, at least, Yudkowsky is wrong. What other aspects of intelligence are this way? Yudkowsky's picture seems appealing, but perhaps it is wrong for many areas of intelligence.
AlphaGo is not an AI in the sense meant by Yudkowsky, I believe. He speaks more of a recursively self-improving AI, an AI which is capable of upgrading itself to be faster and more intelligent.
In the linked article, Yudkowsky even says "On the right side of the scale, you would find Deep Thought—Douglas Adams's original version, thank you, not the chessplayer." The implication is clear that these programs playing chess/Go are nothing like what he is talking about - general AI.
Or so I assume, from my less-than-complete understanding of Yudkowsky's writings.
Only if there's actual room to be much better. It's quite possible that 9p players are already pretty close to perfect play. Maybe a computer can get slightly closer still to that perfect play, but it's never going to be better than perfect play. There's not always room for improvement.
>> It would be an incredible coincidence that, at the end of six months of training, the computer happened to be of comparable skill to humans.
I don't think it's that incredible - By 18 years old a significant proportion of high school students know more about chemistry than the best scientists up to 1800 did, combined.
There are a lot of human games for AlphaGo to look at, but if it is to exceed the human level of play, it'll have to figure out how to do that by itself. Look at human-level games, learn to play at human level.
It's quick to get to the edge of human knowledge, and slower to go beyond it.
Yeah, I was waiting for AlphaGo to connect up 10 seemingly unrelated moves into some amazing unpredictable shape that wins the game. But now I think that maybe some humans actually have the ability to play a near-perfect game of Go. Maybe the most skilled human players have already nearly reached the peak of what is possible in a Go game.
The commentary from Redmond was illuminating for me. In his estimation (having played against Monte Carlo programs before), at that point the computer just plays very low-probability moves with extremely high payoffs if the other player makes a grievous mistake. So the moves looked terrible, since the 'retort' is obvious, but if for some reason LSD didn't play the obvious counter, it would catapult AlphaGo back into the lead. Of course, since LSD is an expert, he never missed the retort, but it's very interesting behavior nonetheless.
I was hoping Lee Sedol would be able to win at least one; humans can 'learn fast'. We can't learn 24/7 in parallel like a computer, but we do seem to have quite a talent for learning 'situational' things very quickly, quite likely because our very survival depended on it in the deep, distant past. :-)
I don't believe AlphaGo had the time to do any additional training between matches. So effectively Lee has the ability to 'learn his opponent' while AlphaGo cannot until the entire match set is over, because of how long it would take to do additional training.
It's been said elsewhere that AlphaGo studied such a large corpus of matches that even every game Lee ever recorded wouldn't be enough to bias it. I presume this also means that adding the last 3 games to that corpus would still not be enough to affect AlphaGo meaningfully.
The author thinks that Lee Sedol was able "to force an all or nothing battle where AlphaGo’s accurate negotiating skills were largely irrelevant."
[...]
"Once White 78 was on the board, Black’s territory at the top collapsed in value."
[...]
"This was when things got weird. From 87 to 101 AlphaGo made a series of very bad moves."
"We’ve talked about AlphaGo’s ‘bad’ moves in the discussion of previous games, but this was not the same."
"In previous games, AlphaGo played ‘bad’ (slack) moves when it was already ahead. Human observers criticized these moves because there seemed to be no reason to play slackly, but AlphaGo had already calculated that these moves would lead to a safe win."
Which, I add, is something that human players also do: simplify the game and get home quickly with a win. We usually don't give up as much as AlphaGo does (pride?), but it's not fundamentally different.
"The bad moves AlphaGo played in game four were not at all like that. They were simply bad, and they ruined AlphaGo’s chances of recovering."
"They’re the kind of moves played by someone who forgets that their opponent also gets to respond with a move. Moves that trample over possibilities and damage one’s own position — achieving less than nothing."
And those moves unfortunately resemble what beginners play when they stubbornly cling to the hope of winning, because they don't realize the game is lost, or because they haven't yet played enough games to stop expecting the opponent to make impossible mistakes. At pro level those mistakes are beyond impossible.
Somebody asked an interesting question during the press conference about the effect of those kind of mistakes in the real world. You can hear it at https://youtu.be/yCALyQRN3hw?t=5h56m15s It's a couple of minutes because of the translation overhead.
Lee Sedol definitely did not look like he was in top form there. I would say (as an amateur) his play in Game 2 was far better. It was the funky clamp position that perhaps forced AlphaGo to start falling apart this game. [0]
I wonder if Lee Sedol can find a way to replicate that in Game 5.
At the end, Lee asked to play black in the last match, and the DeepMind guys agreed. He feels that AlphaGo is stronger as white, so he views it as more worthwhile to play as black and beat AlphaGo.
Given that Lee was black in games one (due to luck of the draw) and three, I expected he would be in game five as well. Perhaps the system permits the loser to choose?
Normally the color for the 5th game is chosen by the result of the 4th game, but Lee asked to change that rule and make him black, so he can have a shot at beating AlphaGo as black.
Is there a source for the rules on colour choice? I had thought it would be another nigiri, as in the 1st game. Choosing it based on the 4th game's result would imply Lee Sedol would have been given white. Since he won with white, isn't that an unfair advantage in general? For instance, if it were 2-2 after the 4th game.
On a tangential note, apparently AlphaGo has been added to http://www.goratings.org/, though its current rating of 3533 looks off. Shouldn't it be much higher?
Probably not. Nine games of data is not really that strong of evidence. The algorithm probably gives each player a prior, such as assigning one fake loss, in order to limit the maximum ranking of players with few games and no losses. This prior is discussed in the paper on the ranking algorithm used here: http://www.remi-coulom.fr/WHR/WHR.pdf
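A toy version of that "fake loss" prior (the real WHR algorithm in the linked paper is Bayesian and considerably more involved; this just shows the capping effect):

    def capped_win_rate(wins, losses, virtual_losses=1):
        # one fake loss keeps a near-unbeaten record from implying perfection
        return wins / (wins + losses + virtual_losses)

    print(capped_win_rate(8, 1))   # an 8-1 record estimates 0.80, not 0.89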
The post-match conference analysis with Lee Sedol and the DeepMind CEO, going over the different aspects of the game, is beautiful to watch. There seems to be a sense of sincerity from each side, rather than a greed to win.
Now that we have two data points to interpolate between, expectations are down to roughly the best human competency in Go, using distributed computation. Also, from move 79 to 87 the machine wasn't able to detect its weak position, which shows a weakness. Now Lee can try an aggressive strategy, creating multiple hot points of attack to defeat his opponent. The human player is showing the power of intelligence.
Wow! Incredible! Now we know that they have a chance against each other. I would say that this was a very major point... otherwise we wouldn't know whether AlphaGo's powers have progressed to the point where no one can ever beat it. Now I take what Ke Jie said much more seriously: http://www.telegraph.co.uk/news/worldnews/asia/china/1219091...
This game is a great example for the people who said that AlphaGo wasn't making mistakes when it had the better position - it was lowering its margin because it only looks at winning probability.
AlphaGo made a mistake, realized it was behind, and crumbled, because now all moves are "mistakes" (they all lead to a loss), so any of them is as good as any other.
I'm very surprised and glad to see humans still have something against AlphaGo, but ultimately these kinds of errors might disappear if AlphaGo trains for 6 more months. It made a tactical mistake, not a theoretical one.
That doesn't make sense to me. Even if the objective function is win probability, it's used to order all potential moves. Thus given a menu of bad options, it should choose the least-bad one, not start choosing at random.
AlphaGo only has an approximation of win probability - more precisely, of "win probability playing against itself". That works well in an even match against humans, but when it is far behind, AlphaGo's win probability estimate is not very good.
Has anyone heard the crazy theory that alphago bugged out because of daylight savings (the cutover happened mid-game)? Anyone know the exact time at which alphago made its first wonky move?
Lee Sedol doesn't have RAM that can be crammed with faithfully recalled gigabytes of information, and that allow exhaustive, yet precise searching of vast information spaces. The amount of short-term information Sedol can remember perfectly is very small by comparison, and doing so requires a lot of concentration and effort.
Secondly, the faculty with which Lee Sedol plays Go wasn't designed for the exclusive task of playing Go. Without having to load a different program, Sedol's brain can do many other things well.
AlphaGo obviously made mistakes in game 4 under the pressure of LSD's brilliant play. I'd like to know whether the "dumb moves" are caused by the lack of pro data or by some more fundamental flaw in the algorithm/methodology. AlphaGo was trained on millions of amateur games, but if Google/DeepMind builds a website where people (including pro players) can play against AlphaGo, it would be interesting to see who improves faster.
My guess is that Sedol won because he introduced sufficient complexity through cutting points and numerous black groups (see the image). Since AlphaGo uses Value and Policy networks to determine the hot spots to analyse using Monte Carlo tree searches, by making a game rife with lots of simultaneous fights, Sedol dodged the one-two punch of Value and Policy networks combined with MCTS.
In other words, if Sedol can make over a dozen points of interest on the board, AlphaGo cannot deeply assess them all. In the image, there are at least 13 interesting moves and cuts plus up to 15 groups (depending if lone stones are considered groups by AlphaGo). I suspect that this position was far more complex than at any point during any of the three previous games.
It might also explain the meltdown of playing out an unfavourable ladder (the P10 group, as P8 is another possible move).
Eventually, math wins. There will come a point where humans cannot make the game sufficiently complex to beat a domain-specific machine intelligence (such as AlphaGo).
Apparently AlphaGo made two rather stupid moves on the sidelines, judging from the commentary. That is incidentally the kind of edge case one would expect machine learning against itself to be bad at learning, since there is a possibility that AlphaGo simply learns to avoid such situations. It will be interesting to see if top players are able to exploit such weaknesses once AlphaGo is better understood by high-level Go players.
From my perspective as a weak player, those stupid moves are the sorts of moves I would make if I already knew I was losing (but not by a huge margin, otherwise resign) but I was in byo-yomi overtime. It settles already dead stones and gives me a bit more time to keep searching for a better move that could maybe turn the game around, and if the opponent fails to make the obvious response then I can turn the game around from their blunder. The only odd thing is AlphaGo makes these moves without having entered byo-yomi time.
Thanks, this just raises further questions. I have a lot to learn before I can call myself even a weak player, but it does sound like AlphaGo regressed to the level of an amateur, which needs an explanation by itself.
I think esturk meant that AlphaGo is barely beatable now, so by the time anyone else gets a chance, it will have improved in the meantime and even a stronger human player won't be able to beat it.
By that measure, Fan Hui beat AlphaGo before Lee Sedol did, since he was playing an earlier and somewhat weaker version of more or less the same complete, distributed system.
Lee is 3rd ranked as of now. He was 1st for almost 10 years before that.
Ke Jie is 19 years old. Lee has been a pro for nearly 20 years. Lee is not old, but certainly not young. Go prodigies seem to peak in their early 20s, much like mathematicians.
My personal hypothesis for why Magnus Carlsen is the youngest chess champion of all time is that with the rise of online play and computer play, the age at which a player's blend of crystallized and working intelligence peaks has dropped.
This has obvious implications for the recent rise of machines immediately correcting human mistakes in mathematics and physics.
It's worth mentioning that while 79 is where Black goes bad, not everyone is sure that 78 actually works for White (http://lifein19x19.com/forum/viewtopic.php?f=15&t=12826). I'm sure we'll eventually get a more complete analysis.
I was hoping to see how AlphaGo would play in overtime. Now I'm curious: does it know how to play in overtime? Can the system evaluate how much time it can give itself to 'think' about each move, or does that fall into halting-problem territory, so it was programmed to evaluate its probability of winning given the 'fixed' time it had left?
It's called scoreboarding. You start coming up with solutions, ranking them, and putting the best one up on the scoreboard. When you run out of time, you just go with what you've got. Pretty much what humans do.
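Something like this minimal "anytime" loop (stand-in Python; random sampling takes the place of a real search iteration, and the move list and evaluator are made up):

    import random, time

    def anytime_best_move(moves, evaluate, budget_s):
        deadline = time.monotonic() + budget_s
        best, best_score = None, float("-inf")
        while time.monotonic() < deadline:
            m = random.choice(moves)     # one iteration of 'search'
            s = evaluate(m)
            if s > best_score:
                best, best_score = m, s  # update the scoreboard
        return best                      # play whatever is on it when time is up

    print(anytime_best_move(list(range(361)), lambda m: -abs(m - 180), 0.05))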
I'm sure there are many levels of watchdogs in this program.
> This was when things got weird. From 87 to 101 AlphaGo made a series of very bad moves.
It seems to me that these bad moves were a direct result of AlphaGo's min-maxing tree search.
According to @demishassabis' tweet, it had the "realisation" that it had misestimated the board situation at move 87. After that, it made a series of bad moves, but it seems to me those moves were made precisely because it couldn't come up with any better strategy - the min-max style of traversing the play tree expects that your opponent responds the best they possibly can, so the moves were optimal in that sense.
But if you are the underdog, it doesn't suffice to play the "best" moves, because the best moves might be conservative. With that playing style, the only way you can make a comeback is to wait for your opponent to "make a mistake", that is, to stray from the series of best moves you are able to find, and then capitalize on that.
I don't think AlphaGo has the concept of betting on the opponent making mistakes. It always just tries to find the "best play in game" with its neural networks and tree search, in terms of maximising the probability of winning. If it doesn't find any moves that would raise the probability, it picks one that lowers it as little as possible. That's why it picks uninteresting sente moves without any strategy: it just postpones the inevitable.
If you're expecting the opponent to play the best move you can think of, expecting mistakes is simply not part of the scheme. In this situation, it would actually be profitable to exchange some "best-of-class" moves for moves that aren't quite as excellent, but that are confusing, hard to read, and make the game longer and more convoluted. Note that this totally DOESN'T work if the opponent is better at reading than you, on average - it will make the situation worse. But I think AlphaGo is better at reading than Lee Sedol, so it would work here. The point is to "stir up" the game, so you can unlock yourself from your suboptimal position and enable your better-on-average reading skills to work for you.
It seems to me that skilful humans play with another evaluation function in addition to the "value" of a move: how confusing, "disturbing" or "stirring" a move is, given the opponent's skill. Basically, that's the thing you'd need in order to skilfully assess your chances of pulling off an OVERPLAY. And overplay may be the only way to recover when you are in a losing situation.
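As a purely hypothetical sketch (nothing like this is known to be in AlphaGo; the function, weights, and numbers are all made up), that second evaluation function could enter as an extra term:

    def overplay_score(move_value, p_opponent_error, comeback_value):
        # plain value, plus (chance this opponent mishandles the move)
        # times (how much a mishandling is worth to us)
        return move_value + p_opponent_error * comeback_value

    print(overplay_score(-0.90, 0.0, 0.0))   # "best" move in a lost game: -0.90
    print(overplay_score(-0.95, 0.2, 1.5))   # trappy overplay: -0.65, preferred

The point is that a slightly worse but confusing move can outrank the "best-of-class" move once you are losing anyway, but only if the opponent-error estimate is realistic.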
It would be interesting to see how AlphaGo would play with a handicap – because the skill of overplaying is required from the start in that situation. It might actually suck at handicap games, at the moment.
Would it not be beneficial for the DeepMind team to open at least the non-distributed version to the public, to allow for training against more players? I was surprised to learn that the training set was strong amateur internet play; why not train on the database of historical pro games?
That's just the initial training set, to get things started.
Alphago learns mostly from playing against itself. (And in the future they are planning to remove the crutch of starting with human generated data entirely.)
As a human, I'm pulling for the human. As a computer programmer, I'm pulling for the human. As a romantic, I'm pulling for the human. As a fan of science fiction, I'm pulling for the human. To me it will matter even if all he can pull off is a 3-2 loss rather than a 4-1 loss.
I wonder what the chances are of a cosmic ray or some stray radiation causing AlphaGo to have problems. It's quite a rare event, but when you have 1920 CPUs and 280 GPUs, it might up the probability enough to be something you have to worry about.
I am super excited about all the progress AI has made in AlphaGo, but a part of me feels kind of relieved that humans won at least one match. :) Sure, it won't last for long.
I wonder: if Lee Sedol were to start as white again and follow the exact same starting sequences, would AlphaGo's algorithms follow the exact same moves as they did before?
It seems Lee Sedol fares better in the late game and endgame than AlphaGo. Makes one wonder whether Lee might have won the earlier games had he pushed on to the late-game stages.
I don't think so. Everyone seems to agree that MCTS only gets stronger in the endgame, and that AlphaGo started making weird moves towards the end because it thought it was already losing, which by definition means there's no strategy leading to a winning position.
...However if it's true that it couldn't find a way out, shouldn't its probability of winning have hit ~0% much earlier? There was still a long time between when it started acting strangely and when it resigned.
Way to go, humans. (I felt that AlphaGo was unbeatable and a milestone in computing overthrowing organic brains... I gave in to the buzz a bit prematurely.)
Lee Sedol won by playing brilliantly, keeping his cool, and not making any mistakes. AlphaGo, on the other hand, went bonkers, especially towards the end, but it got into bad territory not because of silly mistakes; it was pushed there by brilliant play from Lee Sedol.
Could it possibly be that both of the mistakes were bugs? Perhaps it suggested a non-sensical position such as (25.23, 13.15), and it was snapped to (19, 13) :D
I don't think they were bugs in the traditional sense. I think AlphaGo picked moves to try to maximize the probability of winning, and at some point the only way to win was for the opponent to make a suboptimal response. I remember reading somewhere that most of its training data is from amateur games. The model doesn't have a prior that AlphaGo is playing a professional who won't make a bad response. It probably would have resigned a lot earlier with that prior :)
Another thing to keep in mind is that AlphaGo has no "memory", so every turn it looks at the board fresh. This means that if the probabilities are very close, it could jump around a bit - whether from numerical noise in floating-point calculations, model errors, or just tiny differences in probability - making its behavior appear erratic and quick to change "strategy".
I think Lee Sedol won the game earlier by destroying AlphaGo's territory in the center. The commentator (Michael Redmond) was quite impressed with what he did there.
Sometimes it's just not possible to stop the snowball rolling downhill. You can't always turn a retreat into an advance or a flanking move; sometimes the first step backwards just turns into a full-on rout.
I got the impression (possibly incorrectly) that AlphaGo was trying to throw curve-balls and be 'unexpected' in a way that might have 'forced' a mistake it could exploit.
The game seemed to be going in AlphaGo's favour when it was half way through. Black (AG) had secured a large area on the top that seemed nearly impossible to invade.
It was amazing to see how Lee Sedol found the right moves to make the invasion work.
This makes me think that if the time for the match were three hours instead of two, maybe a professional player would have enough time to read the board deeply enough to find the right moves.
The thing is, Lee Sedol didn't "find the right moves"; the wedge at L11 shouldn't have worked. If black's 79th move had been at L10 instead of K10, Sedol would likely have resigned on the spot.
I was on the other stream; Myungwan Kim and Hajin Lee went way deeper than Redmond typically goes (since they have access to an SGF editor instead of a clumsy demo board, and they aren't performing for the camera as much). They seemed pretty confident in their conclusion that L10 killed white.
Just from Redmond's commentary, killing white in the center was just one of the outcomes that AlphaGo was impelled to achieve. [For just an arbitrary example:] if AlphaGo killed white in the center but lost the bottom and the top right, it could still have lost. (And Sedol had to achieve multiple objectives as well, of course.)
Sure, maybe AlphaGo missed a winning move. But the situation was fluid both strategically and tactically, which might have been why the machine chose its losing move, and moreover why I don't think it was just a matter of the machine failing to find a kill - especially since I think computers have been able to beat humans on pure tesuji for a while now.
Ah! This matches up with Demis' tweet that black's 79th move was later determined to be a mistake. In MCTS you wouldn't normally go back to moves you've already played unless it's an incomplete information game. However I guess for reinforcement learning (even if it's not actually done during these matches), you would go back and update the estimated values of moves already played, which explains how they know that.
Demis clarified what he meant by that in a subsequent tweet – it wasn't that AlphaGo re-evaluated move 79, it's that the winrate only plummeted after move 87, which was beyond the point of no return: https://twitter.com/demishassabis/status/708934687926804482
I would love to hear from the team for sure, but I suspect that they have, by now, fed the game state as of move 78 into AlphaGo, had it look exhaustively at possibilities (way more than in the few minutes it took during the match), and determined that in hindsight move 79 was indeed not an optimal response.
Demis tweeted that while the game was still ongoing.
Anyway, I'm pretty sure what happened; I have actually implemented MCTS myself and know that you can trivially update evaluations of old nodes in MCTS as the game continues and you get a better estimate of the line of play actually taken, although you wouldn't normally have a reason to do so. Basically, it can see in hindsight that a move it took was bad; but without additional calculation you wouldn't know whether the alternatives were any better, or whether they too were worse than estimated at the time.
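For the curious, the bookkeeping is as simple as this (my own MCTS-style node, obviously not DeepMind's code):

    class Node:
        def __init__(self):
            self.visits = 0
            self.value_sum = 0.0      # sum of simulation results through here

        def update(self, result):     # can keep firing after the move is played
            self.visits += 1
            self.value_sum += result

        def mean_value(self):
            return self.value_sum / self.visits if self.visits else 0.0

Keep calling update() on an ancestor node as later simulations pass through it, and its mean drifts toward the better estimate; that's the "hindsight" effect.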
A negativist, paranoid skeptic could say that it would be a good move for the team to intentionally make AlphaGo lose a single battle at the moment it has already won the war...
I believe that after winning 3 out of 5, the AlphaGo team started experimenting with variables, now that they can relax. Which will in turn be even more helpful for future AlphaGo versions than the previous 3 wins.
They've declared AlphaGo's code and network frozen for the purposes of this competition, in part to avoid the PR issues that Deep Blue got for doing exactly that.
I don't want to sound all Conspiracy Theory, but somehow this feels planned. It plays into DeepMind's hand not to have the machine completely trounce the human: it's less scary and keeps people engaged further into the future.
Also seems in line with the way Demis was "rooting" for the human this time - they already won, so now they focus on PR.
It doesn't seem that plausible if you've been around serious game players. The notion of throwing a game would be seen as extremely rude and condescending to one's opponent.
Once you step on to the field, you play your best. It doesn't matter whether the opponent is a master, or a six-year-old.
EDIT> I have a friend who I've never beaten at an RTS or non-random strategy game, even though we've played hundreds of times. If I thought for a moment that he'd let me win, I'd stop playing, and I hate losing.