Nim is a mathematical strategy game in which two players take turns removing items from one of several heaps. For this practice activity, we will focus on a simple variant of the game (sometimes called 21) in which a single heap starts with 21 items and the players then alternate having a choice of removing up to three items from the heap. The winner is the player to remove the last item from the heap. This game has a very simple optimal strategy that we will not reveal until you have completed your implementation.
Your task for this activity is to write a Python function evaluate_position(num_items)
that uses a Monte Carlo simulation to compute a good move for a given number of items in the heap. We suggest that you start from this template that provides a function play_game(start_items)
which plays a game of Nim using evaluate_position
to compute the move for the computer and input to allow the player to specify a move.
The function evaluate_position
should choose an initial move in range(1, MAX_REMOVE + 1)
using the following algorithm:
num_items
the specified amount),TRIALS
games of Nim using randomly generated moves,To keep the performance of your code reasonable, we suggest that you use a while
loop that iterates until the number of items in the current game reaches zero for each trial. In particular, avoid storing the random moves for each game in a list since this will reduce the number of trials that evaluate_position
can process in a reasonable amount of time.
We suggest that you test your function using play_game(start_items)
where the value of start_items
is less than \(10\). If your Monte Carlo simulation is implemented correctly, evaluate_position
should reliably return the optimal move as described below when the number of item is less than 10.
After you have played several games, the optimal strategy for Nim may be apparent to you. If it is your turn to move, you should remove exactly enough items so that the number of items remaining in the heap has remainder zero when divided by four. For example, if the heap contains \(10\) items, you should remove \(2\) items to leave \(8\) items remaining since \(8\) divided by \(4\) has remainder zero. Then, if your opponent removes n items where \(1 \leq n \leq 3\), you should remove \(4 - n\) items to restore heap to a state where the number of items remaining has remainder zero when divided by four. Repeating this process ensures that you can always remove the last item(s).
If the number of items in the heap happens to have remainder zero when it is your turn, you should remove a random number of items between one and three and hope that your opponent doesn’t know the optimal strategy for Nim. Then, if the remainder on your next turn happens not to be zero, choose your next move to make the remainder zero.
In our implementation using \(10000\) trials for each possible move, the Monte Carlo simulation does a poor job of determining the optimal move until the number of items lies in the range \(12 - 15\). At that point, simulation begins returning the optimal answer more frequently. By the time the number of items reaches \(10\), the algorithm reliably returned the optimal move.
The reason for the decay in the performance of the method lies in the behavior of evaluate_position
for small numbers of items. For \(1 \leq n \leq 3\) items remaining, the method computes that the initial move of n has a \(100 \%\) win rate. For \(4\) items remaining, the method chooses to remove \(1\) item since this choice results in the most random games won at that point. For \(5\) items remaining, the method removes \(1\) item since the random games starting with 4 items remaining tend to favor the player that goes second. More generally, the effect of small games being won or lost fades as the number of starting items increases leading to the decreased performance for larger numbers of starting items.
"""
A simple Monte Carlo solver for Nim
http://en.wikipedia.org/wiki/Nim#The_21_game
"""
import random
import codeskulptor
codeskulptor.set_timeout(20)
MAX_REMOVE = 3
TRIALS = 10000
def evaluate_position(num_items):
"""
Monte Carlo evalation method for Nim
"""
best_percentage = 0.0
best_move = 0
for first_move in range(1, MAX_REMOVE + 1):
wins = 0
loses = 0
for _ in range(TRIALS):
total = first_move
win = True
while total < num_items:
total += random.randrange(1, MAX_REMOVE + 1)
win = not win
if win:
wins += 1
current_percentage = float(wins) / TRIALS
print first_move, current_percentage
if current_percentage > best_percentage:
best_percentage = current_percentage
best_move = first_move
return best_move
def play_game(start_items):
"""
Play game of Nim against Monte Carlo bot
"""
current_items = start_items
print "Starting game with value", current_items
while True:
comp_move = evaluate_position(current_items)
current_items -= comp_move
print "Computer choose", comp_move, ", current value is", current_items
if current_items <= 0:
print "Computer wins"
break
player_move = int(input("Enter your current move"))
current_items -= player_move
print "Player choose", player_move, ", current value is", current_items
if current_items <= 0:
print "Player wins"
break
# play_game(21)
print evaluate_position(21)