{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Negamax" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Last time we discussed the minimax search method for searching trees\n", "in adversarial games. The alternating maximizing and minimizing steps\n", "can be replaced with a single maximizing step if we negate the values\n", "returned each time **and** the game is truly a zero-sum game, with the\n", "utility in terminal states for one player always being the negative of\n", "the utility for the other player.\n", "\n", "Here is an illustration of Negamax applied to Tic-Tac-Toe." ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/html": [ "\n", " \n", " " ], "text/plain": [ "" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from IPython.display import IFrame\n", "IFrame(\"http://www.cs.colostate.edu/~anderson/cs440/notebooks/negamax.pdf\", width=800, height=600)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here is a Python implementation." 
] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "### Assumes that the argument 'game' is an object with the following methods\n", "# game.get_moves()\n", "# game.make_move(move) changes lookahead player\n", "# game.unmake_move(move) changes lookahead player\n", "# game.change_player() changes next turn player\n", "# game.get_utility()\n", "# game.is_over()\n", "# game.__str__()\n", "\n", "inf = float('infinity')\n", "\n", "def negamax(game, depth_left):\n", "    if debug:\n", "        print(' '*(10 - depth_left), game, end='')\n", "    # If at terminal state or depth limit, return utility value and move None\n", "    if game.is_over() or depth_left == 0:\n", "        if debug:\n", "            print('terminal value', game.get_utility())\n", "        return game.get_utility(), None\n", "    if debug:\n", "        print()\n", "    # Find best move and its value from current state\n", "    bestValue = -inf\n", "    bestMove = None\n", "    for move in game.get_moves():\n", "        # Apply a move to current state\n", "        game.make_move(move)\n", "        # print('trying',game)\n", "        # Use depth-first search to find eventual utility value and back it up.\n", "        # Negate it because it will come back in context of next player\n", "        value, _ = negamax(game, depth_left-1)\n", "        value = - value\n", "        # Remove the move from current state, to prepare for trying a different move\n", "        game.unmake_move(move)\n", "        if debug:\n", "            print(' '*(10 - depth_left), game, \"move\", move,\n", "                  \"backed up value\", value)\n", "        if value > bestValue:\n", "            # Value for this move is better than moves tried so far from this state.\n", "            bestValue = value\n", "            bestMove = move\n", "            if debug:\n", "                print(\"new best\")\n", "        else:\n", "            if debug:\n", "                print()\n", "    return bestValue, bestMove" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And we can apply `negamax` to Tic-Tac-Toe using the following\n", "`game` class definition." 
] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "class TTT(object):\n", "\n", "    def __init__(self):\n", "        self.board = [' '] * 9\n", "        self.player = 'X'\n", "        if False:\n", "            self.board = ['X', 'X', ' ', 'X', 'O', 'O', ' ', ' ', ' ']\n", "            self.player = 'O'\n", "        self.player_look_ahead = self.player\n", "\n", "    def locations(self, c):\n", "        return [i for i, mark in enumerate(self.board) if mark == c]\n", "\n", "    def get_moves(self):\n", "        moves = self.locations(' ')\n", "        return moves\n", "\n", "    def get_utility(self):\n", "        where_X = self.locations('X')\n", "        where_O = self.locations('O')\n", "        wins = [[0, 1, 2], [3, 4, 5], [6, 7, 8],\n", "                [0, 3, 6], [1, 4, 7], [2, 5, 8],\n", "                [0, 4, 8], [2, 4, 6]]\n", "        X_won = any([all([wi in where_X for wi in w]) for w in wins])\n", "        O_won = any([all([wi in where_O for wi in w]) for w in wins])\n", "        if X_won:\n", "            return 1 if self.player_look_ahead == 'X' else -1\n", "        elif O_won:\n", "            return 1 if self.player_look_ahead == 'O' else -1\n", "        elif ' ' not in self.board:\n", "            return 0\n", "        else:\n", "            return None\n", "\n", "    def is_over(self):\n", "        return self.get_utility() is not None\n", "\n", "    def make_move(self, move):\n", "        self.board[move] = self.player_look_ahead\n", "        self.player_look_ahead = 'X' if self.player_look_ahead == 'O' else 'O'\n", "\n", "    def change_player(self):\n", "        self.player = 'X' if self.player == 'O' else 'O'\n", "        self.player_look_ahead = self.player\n", "\n", "    def unmake_move(self, move):\n", "        self.board[move] = ' '\n", "        self.player_look_ahead = 'X' if self.player_look_ahead == 'O' else 'O'\n", "\n", "    def __str__(self):\n", "        s = '{}|{}|{}\\n-----\\n{}|{}|{}\\n-----\\n{}|{}|{}'.format(*self.board)\n", "        return s" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, let's try an example." 
] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "def play_game_negamax(game):\n", "    print(game)\n", "    while not game.is_over():\n", "        value, move = negamax(game, 9)\n", "        if move is None:\n", "            print('move is None. Stopping')\n", "            break\n", "        game.make_move(move)\n", "        print('\\nPlayer', game.player, 'to', move, 'for value', value)\n", "        print(game)\n", "        game.change_player()" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " | | \n", "-----\n", " | | \n", "-----\n", " | | \n", "\n", "Player X to 0 for value 0\n", "X| | \n", "-----\n", " | | \n", "-----\n", " | | \n", "\n", "Player O to 4 for value 0\n", "X| | \n", "-----\n", " |O| \n", "-----\n", " | | \n", "\n", "Player X to 1 for value 0\n", "X|X| \n", "-----\n", " |O| \n", "-----\n", " | | \n", "\n", "Player O to 2 for value 0\n", "X|X|O\n", "-----\n", " |O| \n", "-----\n", " | | \n", "\n", "Player X to 6 for value 0\n", "X|X|O\n", "-----\n", " |O| \n", "-----\n", "X| | \n", "\n", "Player O to 3 for value 0\n", "X|X|O\n", "-----\n", "O|O| \n", "-----\n", "X| | \n", "\n", "Player X to 5 for value 0\n", "X|X|O\n", "-----\n", "O|O|X\n", "-----\n", "X| | \n", "\n", "Player O to 7 for value 0\n", "X|X|O\n", "-----\n", "O|O|X\n", "-----\n", "X|O| \n", "\n", "Player X to 8 for value 0\n", "X|X|O\n", "-----\n", "O|O|X\n", "-----\n", "X|O|X\n" ] } ], "source": [ "debug = False\n", "play_game_negamax(TTT())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Negamax with Alpha-Beta Pruning" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For a negamax version, the meanings of *alpha* and *beta* must be swapped as players alternate, and their values must be negated." 
] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", " \n", " " ], "text/plain": [ "" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "IFrame(\"http://www.cs.colostate.edu/~anderson/cs440/notebooks/negamax2.pdf\", width=800, height=600)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Modify Negamax to perform alpha-beta cutoffs by making these changes:\n", "\n", " * Add two new arguments, `alpha` and `beta`, whose initial values are $-\\infty$ and $\\infty$. \n", " * In the for loop for trying moves, negate and swap the values of `alpha` and `beta` when passing them to each recursive call (pass `-beta` as `alpha` and `-alpha` as `beta`), and negate the value returned from the recursive call.\n", " * Return early if `best_score` (called `bestValue` in `negamax` above) is greater than or equal to `beta`.\n", " * Update `alpha` to the maximum of `best_score` and the current `alpha`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# What if you cannot search to the end of the game?\n", "\n", "Apply an **evaluation function** to non-terminal states. It must\n", "*estimate* the expected utility of the game from the current position,\n", "because the utility function itself is defined only for terminal states. \n", "\n", "A good evaluation function\n", " * orders the terminal states in the same way as the utility function,\n", " * does not take too much execution time (it can't search the whole remaining tree!),\n", " * is strongly correlated with the actual expected utility.\n", "\n", "An evaluation function is often a simple function of **features** of\n", "the current game position. Choosing good features is key, and it requires\n", "considerable knowledge of the game and of good strategies.\n", "\n", "A strict cutoff of search at a specific depth, with the evaluation\n", "function applied there, can lead to problems. What if the advantage\n", "in the game swings quickly just after the cutoff? 
If a state can be\n", "judged this way, then additional search should be performed for that\n", "state (a non-**quiescent** state).\n", "\n", "Current methods allow computers to search about 14 plies in chess." ] } ], "metadata": { "anaconda-cloud": {}, "jupytext": { "formats": "ipynb,py:light" }, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.3" }, "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": true, "sideBar": true, "skip_h1_title": false, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": false, "toc_position": {}, "toc_section_display": true, "toc_window_display": false } }, "nbformat": 4, "nbformat_minor": 1 }