In part one of this series, we created a simple TicTacToe framework and implemented a dumb AI.
Today we want to make this AI-Agent intelligent, at least a bit. We do this by using so called “heuristics”.
Heuristics can be summarized as “algorithms/methods which calculate in very short time a quite well solution”.
This may sound a bit theoretically and fuzzy, but if we compare heuristics to well-known “rules of thumb” the principal becomes clear.
Rules of thumb in general provide a simplification of a problem and/or its solution.
Let us have a look at some rules of thumb:
- Sleeping between 7-9 hours is healthy
- 1000ml water weights 1000g
At first we have a look at the first rule of thumb. Everyone knows that 7-9h of sleeping time is healthy for the majority of people, but maybe a minority only needs 6h of sleep or maybe even 10h.
These people may be rare for sure, but they exist. The same holds for 1kg water = 1l water. This equation is only true for “perfect conditions” (perfect air-pressure, temperature, dissolved gasses, …) but we still use this approximation to calculate different kind of things or to estimate the weight.
Obviously rules of thumb are approximating the real world and in most cases do well enough for our needs. However, they are still only approximations of reality!
Heuristics can be seen as rules of thumb in the field of AI. For TicTacToe some heuristics (= rules of thumb) could be:
- If the enemy needs only one X left to win, place O to deny it
- Always place in this row/col/diagonal which has the most O of you
These heuristics cover many different cases. Of course not all cases (e.g. “If the enemy needs only one X left to win, place O to deny it” –> Only if you cannot win by yourself), but enough to create a reasonably AI.
So an AI using heuristics is basically nothing more than an AI using different rules of thumb to play against you. This approach may seem very dumb, but actually it works quite well for bounded effort.
In most cases when using heuristics in the field of AI, you do not want to model them by hand using a lot of Ifs and Whiles. Instead you want to calculate for every next action a “quality value”, which often is called “fitness” (of an solution/action).
An AI-Agent calculates for every next possible action the fitness by using heuristics and choses this action, which has the highest fitness.
We will have a short look at Pseudo-Code to get a better insights of heuristics for a 4×4 TicTacToe example.
Imagine we are in the current situation:
And we use these rules of thumb:
- The more Os in one col/diag/row the better
- If you are able to get the mid, try to get it
- If there is only one X left for lose/win, always deny it
Implementing this as Peudo-Code may look like this:
FOREACH c IN cells
if(cell==midCell AND cell==empty)
// Add 10 because we really want the center cell
if(countXsInRow(cell.row)==3 OR countXsInCol(cell.col)==3 OR countXsInDiagonals(cell.diags)==3)
// Add 100 because we ALWAYS want to deny it
Now we calculate for every cell in our situation the fitness (small number in bottom right of each cell) and select the empty-cell with highest fitness.
You can try this new AI in the following 4×4 TicTacToe and you will see that these easy rule of thumbs make the AI playing already pretty well.
In next step we will make this AI even better by combining rules of thumb and searching for the perfect action.