This is a short demo of an artificial-intelligence bot that learns to play tic-tac-toe using reinforcement learning.
I originally wrote it for Node.js, but adapted it slightly so it also runs in the browser.
When you load the page, the matches are played automatically. If your browser feels sluggish for a moment, that is most likely due to all the calculations running in the background.
In a match of "player 1 vs player 2", the graphs show the percentage of games each player has won. The line with the X's marks the scores of the first player, the line with the O's marks the scores of the second player, and the line with the diamonds marks the percentage of draws.
Every round, the starting player alternates to keep things fair.
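For the curious, here is a minimal sketch of the kind of value-table learning such a bot might use. This is an illustration, not the demo's actual code: the board encoding, the learning rate ALPHA, and the exploration rate EPSILON are all assumptions. After every game, the bot nudges the value of each board state it visited toward the final outcome, so states that tend to lead to wins score higher over time.

// A minimal sketch of value-table learning for tic-tac-toe (illustrative,
// not the demo's actual code). A board is a 9-character string, '.' = free.
const values = new Map();   // board state -> estimated chance of winning
const ALPHA = 0.1;          // learning rate (assumed)
const EPSILON = 0.1;        // exploration rate (assumed)

function valueOf(state) {
  return values.has(state) ? values.get(state) : 0.5; // unknown states start neutral
}

function freeFields(board) {
  const free = [];
  for (let i = 0; i < 9; i++) if (board[i] === '.') free.push(i);
  return free;
}

// Usually pick the move leading to the highest-valued resulting state;
// occasionally pick a random one to keep exploring.
function chooseMove(board, mark) {
  const moves = freeFields(board);
  if (Math.random() < EPSILON) {
    return moves[Math.floor(Math.random() * moves.length)];
  }
  let best = moves[0], bestValue = -Infinity;
  for (const m of moves) {
    const next = board.slice(0, m) + mark + board.slice(m + 1);
    const v = valueOf(next);
    if (v > bestValue) { bestValue = v; best = m; }
  }
  return best;
}

// After a game, nudge each visited state toward the final reward
// (1 = win, 0.5 = draw, 0 = loss), working backwards through the game.
function learn(history, reward) {
  let target = reward;
  for (let i = history.length - 1; i >= 0; i--) {
    const updated = valueOf(history[i]) + ALPHA * (target - valueOf(history[i]));
    values.set(history[i], updated);
    target = updated; // earlier states learn from the updated later estimates
  }
}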
AI - AI
Battle of the brains: two AI bots pitted against each other.
We can clearly see that both bots play roughly equally well, and that they try to win rather than settle for a draw.
Rematch? How many games?
AI - random
The random bot chooses a random (free) field; the AI remains the same.
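For illustration, such a random opponent could look like this (assuming the same 9-character board encoding as in the sketch above):

// Pick a uniformly random free field; an illustrative sketch,
// not necessarily the demo's actual code.
function randomMove(board) {
  const free = [];
  for (let i = 0; i < 9; i++) if (board[i] === '.') free.push(i);
  return free[Math.floor(Math.random() * free.length)];
}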
We clearly see that even with very little training, the bot already has a slightly better chance of winning a game, and training rapidly improves this. Note that this graph uses a little more input data than the previous one, to show the bot's continuing learning curve.
Rematch? How many games?
AI - dumb
The dumb bot chooses the next free field in reading order: top left first, bottom right last.
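For illustration, that strategy fits in a few lines (same board encoding as above):

// Take the first free field in reading order; an illustrative sketch.
function dumbMove(board) {
  for (let i = 0; i < 9; i++) if (board[i] === '.') return i;
  return -1; // board is full
}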
This graph varies a lot every time you render it. The AI always wins in the long run, but the first few games strongly influence the graph. Since the dumb bot plays very predictably, the AI can learn quickly.
So what causes this variation? If the bot wins a few times at the start, it quickly learns and keeps winning most future games. If it loses a few matches or plays very suboptimally at the start, its knowledge is "polluted" and further training becomes harder.
Rematch? How many games?