Solution to Heads-Up Limit Hold ’Em Poker

A.J. Bates
Antonio Vargas
Math 287
Boise State University
April 9, 2015
Outline
Introduction
Solving Imperfect-Information Games
Normal-Form Linear Programming
Sequence-Form Linear Programming
Counterfactual Regret Minimization
Solving Heads-Up Limit Hold ’Em
The Solution
Conclusion
Intro to Heads-Up Limit Hold ’Em Poker
Heads-up limit hold ’em (HULHE)
Imperfect-information game
Two players
Fixed bet sizes
Fixed number of raises per betting round (limit)
Solving Imperfect-Information Games
Extensive-form game
Game trees depict the possible moves
Zero-sum
Nash equilibrium: a pair of strategies in which each player’s strategy is
optimal given the opponent’s strategy
Normal-Form Linear Programming
The earliest solution method converted the extensive-form game
to normal form
The normal form is a matrix of values for every pair of possible strategies
The number of possible deterministic strategies is exponential in the
size of the game tree
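To see how quickly the normal form grows, the sketch below counts deterministic strategies as the product of the action counts over a player's information sets. The information-set sizes are made-up illustrative numbers, not figures from the talk.

```python
from math import prod

def count_pure_strategies(actions_per_infoset):
    """A deterministic strategy picks one action at every information set,
    so the count is the product of the per-information-set action counts."""
    return prod(actions_per_infoset)

# Hypothetical player with n information sets and 2 actions at each one.
for n in (6, 20, 50):
    print(n, count_pure_strategies([2] * n))
```

Each extra information set multiplies the number of matrix rows, which is why the normal-form matrix becomes unmanageable long before the game tree does.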
Sequence-Form Linear Programming
First algorithm to solve imperfect-information extensive-form games in
time polynomial in the size of the game
Represents strategies in sequence form
This technique was used to create the first competitive poker-playing program
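As a rough illustration of the sequence-form idea, the sketch below turns a behavioral strategy into realization weights, one per sequence of the player's own actions. The information-set names, actions, and probabilities are hypothetical and chosen only for the example.

```python
def realization_weight(behavior, sequence):
    """Probability that the player's own choices follow `sequence`:
    the product of the behavioral probabilities along it."""
    weight = 1.0
    for infoset, action in sequence:
        weight *= behavior[infoset][action]
    return weight

# Hypothetical player: bet or check first, then call or fold after checking.
behavior = {
    "first": {"bet": 0.25, "check": 0.75},
    "after_check": {"call": 0.4, "fold": 0.6},
}

CHECK = (("first", "check"),)
sequences = [
    (),                                   # the empty sequence always has weight 1
    (("first", "bet"),),
    CHECK,
    CHECK + (("after_check", "call"),),
    CHECK + (("after_check", "fold"),),
]
plan = {seq: realization_weight(behavior, seq) for seq in sequences}
for seq, weight in plan.items():
    print(seq, weight)
```

There is one weight per sequence (five here), and the weights of the sequences extending "check" sum to the weight of "check" itself. Constraints of this linear kind are what allow the game to be posed as a linear program whose size grows with the game tree rather than with the number of deterministic strategies.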
Counterfactual Regret Minimization
Iterative method for approximating a Nash equilibrium
Repeated self-play between two regret-minimizing algorithms
Stores and minimizes a modified regret for each information set and
each action available there
The average of each player’s strategy over all iterations converges to a
Nash equilibrium
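The self-play loop can be sketched on a one-shot matrix game. This is plain regret matching, the update that CFR applies at each information set, and not full CFR on a game tree. The payoff matrix is a hypothetical example whose equilibrium is (0.4, 0.6) for both players.

```python
def regret_matching(regrets):
    """Play each action in proportion to its positive accumulated regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    return [1.0 / len(regrets)] * len(regrets)  # no positive regret: play uniformly

def self_play(payoff, iterations):
    """payoff[i][j] is the row player's payoff; the column player gets its negative."""
    n_rows, n_cols = len(payoff), len(payoff[0])
    row_regret, col_regret = [0.0] * n_rows, [0.0] * n_cols
    row_total, col_total = [0.0] * n_rows, [0.0] * n_cols
    for _ in range(iterations):
        row = regret_matching(row_regret)
        col = regret_matching(col_regret)
        # Expected value of each pure action against the opponent's current strategy.
        row_values = [sum(payoff[i][j] * col[j] for j in range(n_cols)) for i in range(n_rows)]
        col_values = [-sum(payoff[i][j] * row[i] for i in range(n_rows)) for j in range(n_cols)]
        row_ev = sum(p * v for p, v in zip(row, row_values))
        col_ev = sum(p * v for p, v in zip(col, col_values))
        for i in range(n_rows):
            row_regret[i] += row_values[i] - row_ev  # regret for not having played i
            row_total[i] += row[i]
        for j in range(n_cols):
            col_regret[j] += col_values[j] - col_ev
            col_total[j] += col[j]
    # The averaged strategies, not the final ones, approximate the equilibrium.
    return [t / iterations for t in row_total], [t / iterations for t in col_total]

PAYOFF = [[2.0, -1.0], [-1.0, 1.0]]  # hypothetical zero-sum game
avg_row, avg_col = self_play(PAYOFF, 20000)
print(avg_row, avg_col)
```

The current strategies keep oscillating from one iteration to the next, while their running averages settle near the equilibrium mix.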
Solving Heads-Up Limit Hold ’Em
The full game of HULHE has 3.19 × 10^14 information sets.
With CFR, this requires 262 TB of storage and an impractical amount
of computation!
Solving Heads-Up Limit Hold ’Em
CFR+
Performs exhaustive iterations over the entire game tree
Actions that prove useful are chosen again immediately
Exploitability of the players’ current strategies approaches zero (no need
to average the strategies)
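One way to read "chosen again immediately" is through the regret update. The sketch below assumes the regret-matching+ rule described in the referenced paper, where accumulated regret is never allowed to go negative; the regret values fed in are invented for illustration.

```python
def vanilla_update(regret, instantaneous):
    """Plain CFR-style accumulation: regret may become arbitrarily negative."""
    return regret + instantaneous

def plus_update(regret, instantaneous):
    """Regret-matching+ style accumulation: clip at zero after every update."""
    return max(regret + instantaneous, 0.0)

# Hypothetical action that looks bad for ten iterations and then proves useful once.
history = [-1.0] * 10 + [1.0]

vanilla = plus = 0.0
for r in history:
    vanilla = vanilla_update(vanilla, r)
    plus = plus_update(plus, r)

print(vanilla)  # still negative, so regret matching would keep ignoring the action
print(plus)     # positive, so the action receives probability on the next iteration
```

With the unclipped update, the action would need many more good iterations to pay off its negative balance before being played again.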
Solving Heads-Up Limit Hold ’Em
Exploitability
Exploitability: the amount by which a strategy’s expected payoff against
its worst-case opponent falls short of the game value
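For a small matrix game the definition can be computed directly. The sketch below uses a hypothetical 2 × 2 zero-sum game whose value for the row player is 0.2, and it takes that value as a known input rather than solving for it.

```python
def worst_case_value(payoff, row_strategy):
    """Expected payoff of the row strategy against the column that hurts it most."""
    n_rows, n_cols = len(payoff), len(payoff[0])
    return min(
        sum(row_strategy[i] * payoff[i][j] for i in range(n_rows))
        for j in range(n_cols)
    )

def exploitability(payoff, row_strategy, game_value):
    """Game value minus what the strategy earns against its worst-case opponent."""
    return game_value - worst_case_value(payoff, row_strategy)

PAYOFF = [[2.0, -1.0], [-1.0, 1.0]]  # hypothetical game, payoffs to the row player
GAME_VALUE = 0.2                     # assumed known for this example

print(exploitability(PAYOFF, [0.5, 0.5], GAME_VALUE))  # about 0.2: uniform play is exploitable
print(exploitability(PAYOFF, [0.4, 0.6], GAME_VALUE))  # about 0: the equilibrium mix is not
```

In HULHE the same quantity is reported in milli-big-blinds per game, and computing it requires a best-response pass over the whole game tree instead of a minimum over two columns.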
Solving Heads-Up Limit Hold ’Em
Exploitability (figure)
Solving Heads-Up Limit Hold ’Em
“Essentially” solved
We define a game to be essentially solved if a lifetime of play is unable
to statistically differentiate it from being solved at 95 percent
confidence.
A lifetime of play is defined as someone playing 200 games of poker
an hour for 12 hours a day without missing a day for 70 years.
The threshold is an exploitability of 1 milli-big-blind per game (mbb/g).
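The threshold can be roughly reproduced from the lifetime-of-play definition. The sketch below assumes a per-game standard deviation of 5 big blinds and a one-sided 95 percent z-value of 1.64; neither number appears on the slide, so treat both as assumptions.

```python
from math import sqrt

GAMES_PER_HOUR = 200
HOURS_PER_DAY = 12
DAYS_PER_YEAR = 365
YEARS = 70

lifetime_games = GAMES_PER_HOUR * HOURS_PER_DAY * DAYS_PER_YEAR * YEARS

STD_DEV_BB_PER_GAME = 5.0  # assumption: typical per-game standard deviation
Z_95_ONE_SIDED = 1.64      # assumption: approximate one-sided 95 percent z-value

# Smallest average win rate (in big blinds per game) that a lifetime of
# play could distinguish from zero at this confidence level.
threshold_bb = Z_95_ONE_SIDED * STD_DEV_BB_PER_GAME / sqrt(lifetime_games)
threshold_mbb = 1000 * threshold_bb

print(lifetime_games)  # 61320000
print(threshold_mbb)   # roughly 1 mbb/g
```

Under these assumptions, an opponent exploiting the strategy for less than about 1 mbb/g could not demonstrate a win with statistical confidence in a lifetime of play.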
The Solution
Computation
CFR+ was executed on a cluster of 200 computation nodes, each with 24
2.1-GHz AMD cores, 32 GB of RAM, and a 1-TB local disk.
The computation ran for 1579 iterations, taking 68.5 days!
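A quick back-of-the-envelope calculation puts these figures in perspective. It uses only the numbers on this slide and assumes every core was busy for the full run.

```python
NODES = 200
CORES_PER_NODE = 24
DAYS = 68.5
ITERATIONS = 1579

total_cores = NODES * CORES_PER_NODE
core_years = total_cores * DAYS / 365             # assumes all cores busy throughout
minutes_per_iteration = DAYS * 24 * 60 / ITERATIONS

print(total_cores)            # 4800
print(core_years)             # roughly 900 core-years
print(minutes_per_iteration)  # a little over an hour per pass over the game tree
```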
The Solution
Action probabilities (figure)
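The Solution
Tips from the strategy
The dealer has the advantage
‘Limping’ (calling rather than raising as the first action) is discouraged
The dealer almost never ‘caps’ (makes the final allowed raise) in the first round
Most importantly, the non-dealer plays (does not fold) a broader range of
hands than most human players do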
Conclusion
Game theory has been used to analyze Cold War politics, and CFR has
potential applications in security and in the medical field
“It would be disingenuous of us to disguise the fact that the principal
motive which prompted the work was the sheer fun of the thing.”
- Turing
The paper referenced
Bowling, M., Burch, N., Johanson, M., & Tammelin, O. (2015, January 8).
Heads-up limit hold’em poker is solved. Science, 347(6218), 145-149.