Using rules of thumb we created a pretty good AI in the post before. However it is still easy to beat the AI once you understood that the AI is not able to detect incoming “Trap-Situations” where the Player sets up a situation in which the AI is not able to win or deny four “X”s in a row by using only one “O”.
The following img shows such an situation.
Besides this fact, you may also already suspect, that heuristic-driven AIs perform quite poor in complex games like chess.
They are fast and easy to construct and give better results than completely random agents. So if you just want to have a light-weight enemy, heuristic-driven agents may be your choice. In the next two posts, we have a look at a more advanced approach by combining rules of thumbs (= heuristics) and “Searching”.
The first step is to think about a new strategy. Using rules of thumb we were only able to look one action in the future. That is why the AI is not able to detect “Traps” because it is not looking far enough in the future.
This looking in the future can be easily done by using an algorithm called MinMax. The basic idea of MinMax-algorithm is that we want to search through all possible future states and chose the one next action which for sure leads to a victory.
This is important to note: We want to find the next action which for sure gives us a victory, no matter how perfect the enemy plays.
Of course there are situations where it is not possible to find a way to win, no matter how good the enemy is playing. But for now we keep to this precondition.
Imagine you are “X” and you are in the following situation:
In this situation we clearly see that placing our cross in the third row, third column, will win this game. Maybe our heuristic may see this, too. But we need a strategy to see traps coming in the future and our heuristics are not able to find these traps. The MinMax approach tackles this problem, whereas the basic idea is quite intuitive.
We just go through all possible game situations, which are reachable from the current situation and find the next action, which will guarantee us a victory. The whole gametree for our situation as described above looks like this:
The next step is to extract the next best action, which will always yield in a victory. We as human know the best next action quite instantly by just looking at the cells. However if we want to determine the best action by a computer, we have to think about a systematic way. This is where the MinMax-Algorithm kicks in. It consists of basically three steps:
- Give every End-Situation a number (Win=+1, Draw=0, Loose=-1)
- Beginning from the End-Situations pick this number, which will be chosen if both players play perfectly
- Propagate these numbers through tree until you reached the current situation and chose the next action which has highest number
This may sound a bit confusing but as always, an example will make things clear. The tree after giving every End-Situation a number depending on win/draw/loose and annotating it with a few more labels (we will explain the added labels in a second) may look like this:
Now we want to find the best action. The labels at the left side of the tree show which turn it is and what number this player chooses. This means if you are “X” and you want to find the best action, you search for the maximum number (Max) and if you are “X” and its “O”s turn, O will choose the minimum number (Min) available (Remember our pre-conditon: O and X always play perfect!).
In the following the tree is processed step by step. In the next post we then finally implement MinMax-Algorithm and build an AI based on it.