We have previously seen Tic-Tac-Toe in part 1 of this class. In this assignment, we are going to revisit the game and develop an alternative strategy to play the game.
For this assignment, your task is to implement a machine player for Tic-Tac-Toe that uses a Minimax strategy to decide its next move. You will be able to use the same console-based interface and graphical user interface to play the game as you did before. Although the game is played on a \(3 \times 3\) grid, your version should be able to handle any square grid (however, the time it will take to search the tree for larger grid sizes will be prohibitively slow). We will continue to use the same grid conventions that we have used previously.
This project does not require you to write a lot of code. It does, however, bring together a lot of concepts that we have previously seen in the class. We would like you to think about how these concepts are coming together to enable you to build a relatively complex machine player with very little code. Further, you should think about the situations in which Minimax or Monte Carlo might produce better/worse machine players for games other than Tic-Tac-Toe.
We have provided a TTTBoard
class for you to use. This class keeps track of the current state of the game board. You should familiarize yourself with the interface to the TTTBoard
class in the poc_ttt_provided
module. The provided module also has a switch_player(player)
function that returns the other player (PLAYERX
or PLAYERO
). The provided module defines the constants EMPTY
, PLAYERX
, PLAYERO
, and DRAW
for you to use in your code. The provided TTTBoard
class and GUI use these same constants, so you will need to use them in your code, as well.
At the bottom of the provided template, there are example calls to the GUI and console game player. You may uncomment and modify these during the development of your machine player to actually use it in the game. Note that these are the same calls we used previously for your Monte Carlo strategy, so they take an ntrials
parameter. You can pass anything you want as ntrials
, since you will not be using it for Minimax. In order to allow us to use the same infrastructure, we have also provided a move_wrapper
function in the template that you can pass to play_game
and run_gui
. This wrapper simply translates between the inputs and outputs of your function and those that were expected if you were implementing a Monte Carlo player.
Your machine player should use a Minimax strategy to choose the next move from a given Tic-Tac-Toe position. As the objective of this assignment is to help you bring together what you have learned, we ask that you do not search for pseudo-code to implement Minimax. At this point in the class, we hope that you can use the examples in the lectures and an English language description and be able to implement Minimax.
The general idea on Minimax is to search the entire game tree alternating between minimizing and maximizing the score at each level. For this to work, you need to start at the bottom of the tree and work back towards the root. However, instead of actually building the game tree to do this, you should use recursion to search the tree in a depth-first manner. Your recursive function should call itself on each child of the current board position and then pick the move that maximizes (or minimizes, as appropriate) the score. If you do this, your recursive function will naturally explore all the way down to the bottom of the tree along each path in turn, leading to a depth first search that will implement Minimax. The following page describes the process in more detail.
As you recursively call your minimax function, you should create a copy of the board to pass to the next level. When the function returns, you no longer need that copy of the board. In this manner, you are dynamically constructing the part of the game tree that you are actively looking at, but you do not need to keep it around.
For this mini-project, you need only implement one function:
mm_move(board, player)
: This function takes a current board and which player should move next. The function should use Minimax to return a tuple which is a score for the current board and the best move for the player in the form of a (row, column)
tuple. In situations in which the game is over, you should return a valid score and the move (-1, -1)
. As (-1, -1)
is an illegal move, it should only be returned in cases where there is no move that can be made.You should start from this code that imports the Tic-Tac-Toe class and a wrapper function to enable you to play the game. You may add extra helper functions if so desired.
You do not need to write a lot of code to implement Minimax, but it can be difficult to get everything working correctly. Here are some hints that may help you out:
SCORES
dictionary is useful. You should use it to score a completed board. For example, the score of a board in which X has won should be SCORES[provided.PLAYERX]
. If the game is a draw, you should score the board as 0
.SCORES
that we have provided you with, player X is always the maximizing player and play O is always the minimizing player. You can use an if-else
statement to decide when to maximize and when to minimize. But, you can also be more clever by noticing that if you multiply the score by SCORES[player]
then you can always maximize. Why? Because this has the effect of negating player O’s scores allowing you to maximize instead of minimize for player O."""
Provided Code for Tic-Tac-Toe
"""
# Constants
EMPTY = 1
PLAYERX = 2
PLAYERO = 3
DRAW = 4
# Map player constants to letters for printing
STRMAP = {EMPTY: " ",
PLAYERX: "X",
PLAYERO: "O"}
class TTTBoard:
"""
Class to represent a Tic-Tac-Toe board.
"""
def __init__(self, dim, reverse = False, board = None):
"""
Initialize the TTTBoard object with the given dimension and
whether or not the game should be reversed.
"""
self._dim = dim
self._reverse = reverse
if board == None:
# Create empty board
self._board = [[EMPTY for dummycol in range(dim)]
for dummyrow in range(dim)]
else:
# Copy board grid
self._board = [[board[row][col] for col in range(dim)]
for row in range(dim)]
def __str__(self):
"""
Human readable representation of the board.
"""
rep = ""
for row in range(self._dim):
for col in range(self._dim):
rep += STRMAP[self._board[row][col]]
if col == self._dim - 1:
rep += "\n"
else:
rep += " | "
if row != self._dim - 1:
rep += "-" * (4 * self._dim - 3)
rep += "\n"
return rep
def get_dim(self):
"""
Return the dimension of the board.
"""
return self._dim
def square(self, row, col):
"""
Returns one of the three constants EMPTY, PLAYERX, or PLAYERO
that correspond to the contents of the board at position (row, col).
"""
return self._board[row][col]
def get_empty_squares(self):
"""
Return a list of (row, col) tuples for all empty squares
"""
empty = []
for row in range(self._dim):
for col in range(self._dim):
if self._board[row][col] == EMPTY:
empty.append((row, col))
return empty
def move(self, row, col, player):
"""
Place player on the board at position (row, col).
player should be either the constant PLAYERX or PLAYERO.
Does nothing if board square is not empty.
"""
if self._board[row][col] == EMPTY:
self._board[row][col] = player
def check_win(self):
"""
Returns a constant associated with the state of the game
If PLAYERX wins, returns PLAYERX.
If PLAYERO wins, returns PLAYERO.
If game is drawn, returns DRAW.
If game is in progress, returns None.
"""
board = self._board
dim = self._dim
dimrng = range(dim)
lines = []
# rows
lines.extend(board)
# cols
cols = [[board[rowidx][colidx] for rowidx in dimrng]
for colidx in dimrng]
lines.extend(cols)
# diags
diag1 = [board[idx][idx] for idx in dimrng]
diag2 = [board[idx][dim - idx -1]
for idx in dimrng]
lines.append(diag1)
lines.append(diag2)
# check all lines
for line in lines:
if len(set(line)) == 1 and line[0] != EMPTY:
if self._reverse:
return switch_player(line[0])
else:
return line[0]
# no winner, check for draw
if len(self.get_empty_squares()) == 0:
return DRAW
# game is still in progress
return None
def clone(self):
"""
Return a copy of the board.
"""
return TTTBoard(self._dim, self._reverse, self._board)
def switch_player(player):
"""
Convenience function to switch players.
Returns other player.
"""
if player == PLAYERX:
return PLAYERO
else:
return PLAYERX
def play_game(mc_move_function, ntrials, reverse = False):
"""
Function to play a game with two MC players.
"""
# Setup game
board = TTTBoard(3, reverse)
curplayer = PLAYERX
winner = None
# Run game
while winner == None:
# Move
row, col = mc_move_function(board, curplayer, ntrials)
board.move(row, col, curplayer)
# Update state
winner = board.check_win()
curplayer = switch_player(curplayer)
# Display board
print board
print
# Print winner
if winner == PLAYERX:
print "X wins!"
elif winner == PLAYERO:
print "O wins!"
elif winner == DRAW:
print "Tie!"
else:
print "Error: unknown winner"
"""
Mini-max Tic-Tac-Toe Player
"""
import poc_ttt_gui
import poc_ttt_provided as provided
# Set timeout, as mini-max can take a long time
import codeskulptor
codeskulptor.set_timeout(60)
# SCORING VALUES - DO NOT MODIFY
SCORES = {provided.PLAYERX: 1,
provided.DRAW: 0,
provided.PLAYERO: -1}
def mm_move(board, player):
"""
Make a move on the board.
Returns a tuple with two elements. The first element is the score
of the given board and the second element is the desired move as a
tuple, (row, col).
"""
# base case
if board.check_win() is not None:
return SCORES[board.check_win()], (-1, -1)
# recursion
res = (-10, (-1, -1))
for move in board.get_empty_squares():
working_board = board.clone()
working_board.move(move[0], move[1], player)
score, _ = mm_move(working_board, provided.switch_player(player))
if score * SCORES[player] == 1:
return score, move
elif score * SCORES[player] >= res[0]:
res = (score * SCORES[player], move)
return res[0] * SCORES[player], res[1]
def move_wrapper(board, player, trials):
"""
Wrapper to allow the use of the same infrastructure that was used
for Monte Carlo Tic-Tac-Toe.
"""
move = mm_move(board, player)
assert move[1] != (-1, -1), "returned illegal move (-1, -1)"
return move[1]
# Test game with the console or the GUI.
# Uncomment whichever you prefer.
# Both should be commented out when you submit for
# testing to save time.
# provided.play_game(move_wrapper, 1, False)
# poc_ttt_gui.run_gui(3, provided.PLAYERO, move_wrapper, 1, False)