from Guide to Hacking on Jan 28, 2024
Designing the TicTacToe from hell
TicTacToe is definitely a timetested puzzle, apparently having existed since ancient Egypt according to an ethnomathematician named Zaslavsky. For most of us, it's at least been an integral part of our childhood years.
However, there's one critical and famous downfall to TicTacToe: Optimal players will always end in a standstill, with the game coming to a draw as both players fill the board without a single 3inarow.
Goal: Let's make victory possible
Partly to address this issue, many TicTacToe variants surfaced over the years, with various changes to the game's objective, board size, and even the movement mechanics. However, what if we didn't need more complexity? What if we could make a tiny tweak, so that TicTacToe rewards the best players with victory, even against an optimal AI?
It turns out that we can: All we need to do, is to change the starting state, so that even against an optimal enemy, the first player could guarantee victory if they played cleverly.
"TicTacToe evolved" is not new
There are actually a large number of twists on the traditional TicTacToe concept. In short, these TicTacToe variants increase the complexity. With orders of magnitude more states and actions, there are more possibilities in the game — and ideally, more fun in the game too.
 Make the board bigger: Quixo expands the board to a 5x5, Ultimate TicTacToe expands the board to a 9x9 by composing nine tictactoe grids in a 3x3 supergrid, Gomoku expands the board to a 15x15, and Unrestricted ninarow^{1} expands the board to an infinite one.
 Change movement mechanism: In Ultimate TicTacToe, each player's moves determines which smaller board the other player can play in^{2}. In Quixo, players make moves by sliding entire rows or columns^{3}. In Notakto, both players use the same mark — e.g., both players use "X".
 Modify the objective: Quixo and Amöba require a 5inarow to win, Unrestricted ninarow requires an ninarow (as the name suggests), and ToeTacTic and Notakto inverts the objective so that players avoid making 3inarow.
The above are all creative variants of TicTacToe — they vastly change the dynamics of the game. However, the complexity of the games above are their downfall: The more complex, the harder it is to explain. I liked the simplicity of the original TicTacToe, so let's honor that and restrict our attention to TicTacToe modifications that are simple. This rules out any modifications that change movement mechanisms.
Since the alternative objectives and larger game boards are pretty simple to explain in a single sentence, we'll allow that. However, there's a different issue with these variants, and even the original TicTacToe: These games assume a suboptimal opponent to win.
 TicTacToe itself famously always ends in a draw when both players are playing optimally. In fact, Wikipedia even describes the optimal strategy for both sides in its TicTacToe article. In this way, a player with an optimal opponent can at best tie and at worst lose.
 Notakto will always result in victory for the first player on an odd number of boards and for the second player on an even number of boards. Like with TicTacToe, the Notakto Wikipedia article describes the optimal strategy. As a result, a player an optimal opponent and the ‘wrong' number of boards will definitely lose.
 Unrestricted ninarow, according to Erde et al^{4}, can be forever stalled by the second player if n ≥ 8.
In some of these scenarios, the game's setup alone determines the game's outcome, assuming your opponent is moving optimally. I'm interested in a different set of scenarios: where even if your opponent plays optimally, you could force a victory. This is where some of TicTacToe's variants present a solution.
 For Notakto, even with an optimal opponent, you could force a victory with the ‘right' number of boards. The unfortunate part is that the optimal Notakto strategy is too simple.
 For Unrestricted ninarow, you can force a victory or the first player if n ≤ 4.
These variants, where the player can achieve victory despite an optimal opponent, are the kinds that interest me. The question is: Can I give TicTacToe the same property? Without changing the board size, movement mechanisms, or objectives.
How many ways can you win?
TicTacToe is wellknown as a "solved" game. This means that it's trivial to explore all possible game states and make optimal algorithms play as either side. Fortunately, this means that I can actually enumerate all possible states.
Let's do just that, and understand the probability of winning from any possible TicTacToe state in the game. Given a tree of all possible moves, here's a quick snippet that counts up the total number of terminal states and their outcomes.
def set_win_counts(node):
"""Mark any paths where one winner is guaranteed"""
if node.winner > 1:
node.counts[node.winner] += 1
for child in node.children.values():
set_win_counts(child)
for i in range(len(node.counts)):
node.counts[i] += child.counts[i]
Using the script, we can now tally up the number of each possible outcome in a match of TicTacToe. There are 255,168 possible outcomes^{5}. Below, we use ‘O' to denote the first player:
===== Outcomes =====
* Player 'O' win: 131184
* Player 'X' win: 77904
* Both tie: 46080
These numbers are corroborated by a separate article on "Amöba"^{6} that independently computed the number of states. This gives me confidence that my game tree is correct. Using these counts, let's look at the probability of winning or losing, from certain boards. Start with this board:
['X', ' ', ' ']
['O', 'O', ' ']
['O', 'X', ' ']
By inspection, we can see that O is definitely about to win. If X stops a middlerow victory by O, O can simply place a piece in the topright — and vice versa. However, from this state, our game tree produces some unintuitive outcome counts^{7}.
===== Outcomes =====
* Player 'O' win: 12
* Player 'X' win: 2
* Both tie: 4
This seems incorrect; we know the X player is definitely about to lose. Why is there any outcome where they tie, much less where X wins? The problem is that to compute all of the possible game states, we considered all possible moves. We even considered moves where O next places a piece in the bottomright.
These aren't moves that any sane player in the right mind would take. So, let's recompute our game tree, with the two following assumptions of a sane player:
 If a 3inarow is possible, take it. Don't miss an obvious opportunity to win.
 If the opponent can win in the next move, make a move that blocks their victory. Don't blindly let the opponent win.
Now, we recompute our game tree assuming that players will always follow the above two rules, and this is what we get.
===== Outcomes =====
* Player 'O' win: 6
* Player 'X' win: 0
* Both tie: 0
This is looking much better. Now, we see that player ‘O' is guaranteed to win, just like our intuition says.
Hard TicTacToe states exist
Using this game tree, we can now look for states where one player forces a victory. There are unsurprising states where this happens — but there are also more surprising states, like this one:
['O', ' ', 'X']
[' ', ' ', ' ']
[' ', ' ', 'O']
Let's try playing this game. Very quickly, we'll see how this leads to a definite victory.
# X *must* place a value in the center to stop O
['O', ' ', 'X']
[' ', 'X', ' ']
[' ', ' ', 'O']
# O *must* place a value in the bottom left to stop X
['O', ' ', 'X']
[' ', 'X', ' ']
['O', ' ', 'O']
Now, we see the definite victory state for O. If X stops a bottomrow victory by O, O can simply claim a leftcolumn victory — and vice versa. Interestingly, it only took three moves to guarantee a definite victory for O. There's more than just one definitevictory state too.
 Of the
9*8*7 = 504
possible threemove games, 80 of them guarantee definite victory for the first player. This is 15.8% of threemove states that guarantee victory.  Of the
9*8 = 72
possible twomove games, 44 of them could lead to a definite victory state for the first player, as long as the first player picked optimally. This is 61.1% of twomove games where the first player has a chance to guarantee victory.  After two moves, there are seven remaining places to pick from. From the 44 unique twomove states, probabilities of making a move that guarantees victory range from 14.3% (only 1 of 7 moves) to 42.9% (3 of 7 moves).
Now, by picking this initial game state according to its guaranteedvictory probability we can control the difficulty of the puzzle.
The guaranteedvictory states
To share the 80 threemove states above, I'll need to share one way to represent the board state above, compactly. One approach is to first label all of the cells in the TicTacToe grid 0 through 8, like this:
[0, 1, 2]
[3, 4, 5]
[6, 7, 8]
Then, we represent moves from both players with a sequence of numbers, such as 028
. The oddnumbered numbers correspond to moves made by the first player, the ‘O'. The evennumbered numbers correspond to moves made by the second player, the ‘X'. Drawing this on a board, we get the state from above.
['O', ' ', 'X']
[' ', ' ', ' ']
[' ', ' ', 'O']
Using this, we can now represent any game state with a sequence of numbers. Using this mechanism, I can now write the 80 definitevictory states using the following notation:
023,028,051,054,056,061,068,072,073,074,107,127,132,150,160,167,182,187,205,206,231,234,238,270,274,275,281,286,305,316,320,325,365,370,385,386,416,418,432,438,450,456,470,472,502,503,518,523,563,568,572,583,602,607,613,614,618,650,654,657,682,683,701,706,721,728,738,756,761,781,814,815,816,820,827,832,834,837,860,865
These states are the shortest definitevictory states possible, using the simple agent rules above: (a) Complete 3inarow when possible and (b) when the opponent has a 3inarow opportunity, block it.
By picking this initial game state according to its definitevictory probability we can control the difficulty of the puzzle.
Without modifying the game objective, movement mechanics, or even the board size, we can now both (1) increase and control puzzle difficulty and (2) ensure victory is possible even against an optimal player. This accomplishes the simple TicTacToe modification we set out for.
Takeaways
With that said, there are only 28 possible initial game states to pick from. This is even ignoring symmetries in the game, so we really only have a small pool of possibilities to take from. Fortunately, even subject to the same constraints above — same board size, same movement mechanics, and same objective — we could still fix this.
TicTacToe has a large number of unexplored states that are "impossible" to reach, given the game's current rules^{8}. This means there's a rich pool of possible game states that we can further explore, for some interesting TicTacToe challenges.
This blog post has gotten fairly long, so we'll have to explore these unexplored states another time. For now, we now have a much more rewarding and challenging TicTacToe, even against the best of puzzleplaying AI.

Unrestricted ninarow is described in Wikipedia's entry on TicTacToe variants and in more detail, with a corresponding analysis, in "Combinatorial Games: TicTacToe Theory" by Jozsef Beck on page 59. ↩

In Ultimate TicTacToe, the player's move's relative location in the smaller board determines which of the smaller boards the opponent can play in. For example, if the first player places a mark in the bottomleft of any smaller board, the second player must now place a mark in the small board located in the bottomleft. ↩

In Quixo, players make moves by removing pieces — blanks, or marked with their symbol — from the periphery. The piece is then inserted back in at the periphery, but with the player's symbol on it, which slides either a row or column of pieces to fill the original gap. Figure 4 in a Quixo rulebook explains it nicely. ↩

Erde et al in "An ninarow type game" published by the Electronic Journal of Combinatorics specifically cites another paper for proving this "known" result. However, I couldn't find the original text for the cited text, which is
T. G. L. Zetters. 8 (or more) in a row. The American Mathematical Monthly, 87(7):575–576, 1980.
. The only table of contents I could find, which includes the American Mathematical Monthly for Volume 87, Number 7, published in 1980 doesn't include a publication from Zetters. ↩ 
To compute the number of each possible outcome in the standard TicTacToe game, use
python puzzle.py agent any
. ↩ 
Amöba is the Hungarian name given to TicTacToe. ↩

To compute the number of each possible outcome from this initial state, use
python puzzle.py agent any start 30674
. ↩ 
For example, any board with 3 O's and 1 X is impossible. Clearly, since players alternate, the number of O's and X's are at most one off from the other. ↩
Want more tips? Drop your email, and I'll keep you in the loop.