Though they are quite different from each other Chess, Checkers, Go, Othello, Connect-4
and Tic-Tac-Toe also have similarities. They are all two-player zero sum
perfect information games. The
term two-player just means that there must be two opposing sides (football is
considered two-player, for example). A zero sum game is one where an advantage
for one player is an equally large disadvantage for the other. Perfect
information basically rules out any game that has an element of chance. Yatzee? Right
out the window. Poker? Forget about it. Jenga? Not even close.
For games that have these properties it is possible to set up a game tree to aid in the
selection of the next move. For simplicity, consider the starting state of Tic-Tac-Toe to
be the root of a tree. The root has nine branches, each leading to a successor state. Each
of these has 8 branches leading to its successor states and so on. Some of the
paths through the tree will end before others (a winning state is reached before all the
slots have been filled) but some paths continue until depth 9 (or ply 9 in
game-tree terminology), when all the slots have been filled.
After having exhausted the search space of the game, it is easy to find the paths that
will lead to victory for either player. Knowing the path that X can take to the fastest
victory is generally of little use, however, because O can thwart X’s plans of a swift
victory any time it is her turn to move. Instead of traversing the path leading to the
fastest possible victory, X’s best aim is to pick a path where her worst outcome
will be victory (the best worst-case path). The Minimax game-tree search
algorithm is designed to do just this, and alpha-beta pruning improves on it. This article
tries to explain how both works.
The Minimax algorithm
Let us use Tic-Tac-Toe as an example when explaining how the Minimax algorithm works. The
player whose turn it is to move at the root is called Max, and all even plies in the
game-tree (i.e. the states where it is Max’ turn to move) are labelled accordingly. Max’
opponent is called Min. These names are adapted from their actions; Max is always trying
to maximise her score, Min is always trying to minimise Max’ score.
To determine who won, a function to evaluate a game state is needed. The evaluation
function takes a game state as its argument and should return a value indicating whether
the state is a win, loss or draw for the player whose turn it is at that state. The
function could for example return 1 if the state is a win, -1 if the state is a loss and
zero if the state is a draw. The evaluation function is applied to all the leaf states.
Leafs are game states without children, i.e. either a draw, or a win for one of the
After exhausting the tree and evaluating all leaf states, Minimax values are assigned to
the internal states (all non-leaf states) of the game-tree. The Minimax values found in
the leaf nodes are inverted and returned to their parent. Each parent picks the highest
value returned to it, negates it, then returns the result to its parent, et cetera. The
Minimax values returned to the root denote Max’ worst outcome for each corresponding move.
All that remains for Max is to pick a move where her worst outcome is a win (if such a
move exists–it does not in Tic-Tac-Toe).
The following method shows how the Minimax algorithm can be implemented in Objective-C. A
noticeable difference from the description above is that the Minimax value is negated on
return from the recursive call instead of before the return; this is how the Minimax
algorithm is normally implemented.
The next listing shows the special Minimax function applied to the root of the tree (i.e.
the position for which a move is sought). In discussions of the algorithm the notion of
any special treatment of the root is often omitted; it is included here for the sake of
Only when the search space is sufficiently small, like in our Tic-Tac-Toe example, is it
possible to exhaust it fully using the Minimax algorithm. For practical applications this
is almost never the case. Computers are not powerful enough to exhaust game-trees for
practical applications where game-tree search would be desired. For example, it has been
claimed that the game of Chess has more states than there are atoms in the known universe.
Suffice to say that waiting for a search of that magnitude to finish becomes impractical.
A simple way of ensuring that the search will terminate in a practical timespan is to set
a maximum limit on the depth of the search. The following method shows how the Minimax
algorithm presented earlier can be amended to unconditionally stop after reaching a
certain depth. Notice that the initial value for score has changed.
Since the search may be terminated before it has reached the leaf nodes, the end states of
many paths are lost. Thus the evaluation function will have to be enhanced: it must now be
able to indicate how good non-terminal states in the game-tree are, in contrast to simply
determining a win, loss or draw for an end state. Instead of returning -1, 0 or 1 the
evaluation function must now return a value in a certain range (say, -1000 to 1000)
indicating how good the state is. Performance of depth-limited Minimax algorithms greatly
depends on how well the evaluation function identifies strong states.
In the late 50s it was realised that it was not necessary to visit all the nodes in a
game-tree to correctly deduce its Minimax value. Uninteresting branches of the tree can be
pruned away. Remember that the Minimax algorithm produces the value of the best
worst-case. Alpha-Beta pruning terminates the search of a subtree as soon as it knows that
the worst-case for the subtree is worse than previously searched paths. The idea is that
if a path is worse than the current best path, time is not wasted trying to find out
how bad it is.
To accomplish the pruning mentioned above two bounds are passed to a modified Minimax
algorithm. The bounds are the highest (beta) and lowest (alpha) value that can affect the
Minimax value at that point, and are continually updated as the search progresses. Since
the Minimax value is negated at each step, the states of the bounds must also be negated
and their states switched as they are passed on to the next level. If the Minimax value
returned from a path is greater than or equal to the high bound, the path is pruned.
Here’s an example:
In a worst-ordered tree (where the paths are ordered so that no pruning occurs) the
Alpha-Beta algorithm visits the same number of leaf nodes as Minimax. On average it
performs a lot better. Given a perfectly ordered tree, where the branches are pruned as
early as possible, the Alpha-Beta algorithm can search twice as deep as the Minimax
algorithm in the same timespan.
This post has been adapted from a section of my 2003 BSc Artificial Intelligence report
on Generalised Game-Tree Search at the University of Westminster.