Image credit: Dota 2
This morning, on The International 2017 main stage, Dendi played a 1v1 with perhaps the toughest opponent he has ever faced.
His lane opponent in this special Shadow Fiend only-mid mirror match was no 10k MMR player, no TI winner, no Julz De Leon. Dendi instead faced off against a bot.
From the get-go, the OpenAI bot had a perfect creep block, and we don't mean just a really good creep block -- we mean a perfect 100% creep block all the way until just before the river.
And then the bot proceeded to toy with Dendi. It got near-perfect CS and denies while also trading blows with Dendi's Shadow Fiend, using animation cancelling and positioning just as a pro player would. It got a whole level ahead of Dendi, and the moment everyone's favorite Dota 2 player made a misstep, the bot capitalized on it and won the round.
Dendi called it "terrifying" to play against.
The developers have said that the AI wasn't "taught" anything, and was instead built in a way that allows it to learn from its own mistakes, needing only minimal guidance. By playing essentially hundreds of 1v1s against itself, the OpenAI bot learned from its own mistakes and was constantly fighting against an opponent of equal skill.
After an hour it could beat the regular Dota 2 bots. After a week it beat Dendi.
People watching the 1v1 both in the arena and at home were left with mixed emotions. On the one hand, it's extremely exciting to see how far the art of programming artificial intelligence has come, and there are so many applications for a self-teaching AI even outside of Dota 2. On the other hand, does this mean that Dota 2 is solved?
Watching your favorite Dota 2 pro lose to a mere program is sort of disheartening. It can devalue the hundreds of hours of sacrifice that the pros put into perfecting their play (though to be fair, the programmers at OpenAI also put in the same amount of sweat, blood, and tears).
But there is an important lesson that can be learned from the OpenAI bot. As much as it seems to have infinite potential, that doesn't mean that we don't have infinite potential as well, even as flawed humans.
The OpenAI bot didn't just practice randomly -- it always practiced against an opponent of equal skill. It can't learn from fighting against a weaker opponent and, more importantly, doesn't need a stronger opponent than itself to improve. It just needs to find its fair match, and isn't that what the MMR system already does, on a large statistical scale?
Sure, you sometimes get matched against filthy 2k MMRs and frustrating 6k MMRs, but on average you will actually fight against players at your skill level. If you ever start to become a better player, your MMR increases and you continue to be matched against your equals. You're getting the same training regimen the bot does.
Secondly -- and this is absolutely crucial -- the OpenAI bot adjusts its performance based on its mistakes. It has to lose for it to learn. In fact, the more games it loses, the faster it learns.
You can only learn from your mistakes, not from your victories, apparently.
And so Dendi lost to a bot, but should they ever face each other again, they'll both be better players. Dota 2 is secretly a win-win game if you have the right attitude and the strength to learn from a loss.
Now the real problem is when the bot starts to learn how to be toxic.