
I don't think they were bugs in the traditional sense. I think AlphaGo picked moves to maximize its probability of winning, and at some point the only remaining way to win was for the opponent to make a suboptimal response. I remember reading somewhere that most of its training data is from amateur games. The model has no prior telling it that it is playing a professional who won't make a bad response. It probably would have resigned a lot earlier with that prior :)
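
To make the "prior" point concrete, here is a toy sketch in Python. It is not how AlphaGo actually evaluates moves; the move names, the outcome numbers, the `p_blunder` parameter and the `resign_below` threshold are all made up for illustration. It only shows how the same win-probability maximizer can prefer a hopeful trick move against an opponent it expects to blunder, and resign against one it expects to reply correctly.

```python
def win_probability(outcomes, p_blunder):
    """Expected win chance of a move, given the chance the opponent replies badly."""
    return (p_blunder * outcomes["if_blunder"]
            + (1 - p_blunder) * outcomes["if_best_reply"])


def choose(moves, p_blunder, resign_below=0.10):
    """Pick the move with the highest expected win chance, or resign if even that is too low."""
    best = max(moves, key=lambda m: win_probability(moves[m], p_blunder))
    if win_probability(moves[best], p_blunder) < resign_below:
        return "resign"
    return best


# Hypothetical lost position: "trick" only works if the opponent answers badly.
moves = {
    "solid": {"if_blunder": 0.08, "if_best_reply": 0.05},
    "trick": {"if_blunder": 0.90, "if_best_reply": 0.01},
}

print(choose(moves, p_blunder=0.30))  # amateur-like prior
print(choose(moves, p_blunder=0.01))  # professional-like prior
```

With the amateur-like prior the trick move looks best; with the professional-like prior nothing clears the threshold, so the sketch resigns.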

Another thing to keep in mind is that AlphaGo has no "memory", so every turn it looks at the board fresh. This means that if the probabilities are very close, its choice can jump around a bit due to numerical noise from floating point calculations, model errors, or just tiny differences in probability, which makes the behavior appear erratic and quick to change "strategy".
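
A quick simulation of that effect, as a rough sketch rather than anything resembling the real system: each turn the picker re-evaluates two candidate moves from scratch, with a small random error added to each estimate. The probabilities, the Gaussian noise model and the `noise_scale` value are assumptions chosen purely for illustration.

```python
import random


def pick_move(win_probs, noise_scale, rng):
    """Pick the move whose noisy win-probability estimate is highest."""
    noisy = {m: p + rng.gauss(0.0, noise_scale) for m, p in win_probs.items()}
    return max(noisy, key=noisy.get)


def count_switches(win_probs, noise_scale, turns=1000, seed=0):
    """Count how often the preferred move changes between consecutive fresh evaluations."""
    rng = random.Random(seed)
    picks = [pick_move(win_probs, noise_scale, rng) for _ in range(turns)]
    return sum(a != b for a, b in zip(picks, picks[1:]))


close = {"A": 0.3100, "B": 0.3102}  # nearly tied estimates
clear = {"A": 0.30, "B": 0.60}      # one move clearly better

print(count_switches(close, noise_scale=0.01))  # flips often
print(count_switches(clear, noise_scale=0.01))  # stays on B
```

When the gap between the moves is much smaller than the noise, the choice is close to a coin flip every turn, which from the outside looks like constantly changing plans.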


