About the Algorithm
The first computer program to outplay human professionals at
heads-up no-limit Hold'em poker
In a study completed in December 2016 and involving 44,000 hands of poker, DeepStack
defeated 11 professional poker players, all but one by a statistically significant
margin. Over all games played, DeepStack won 49 big blinds per 100 hands (bb/100), over
four standard deviations from zero; for comparison, always folding would lose only
75 bb/100. This made it the first computer program to beat professional poker players in
heads-up no-limit Texas hold'em poker.
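The figures above can be checked with a little arithmetic. The sketch below is only an illustration: it assumes the usual heads-up blind structure (a small blind of 0.5 bb and a big blind of 1 bb, with the two players alternating positions each hand), and the function names are made up for this example.

```python
# Figures quoted in the text above.
HANDS_PLAYED = 44_000
WIN_RATE_BB_PER_100 = 49.0
STD_DEVS_FROM_ZERO = 4.0  # the text says "over four", so this is a lower bound


def always_fold_bb_per_100(small_blind: float = 0.5, big_blind: float = 1.0) -> float:
    """Win rate, in big blinds per 100 hands, of a player who folds every hand.

    Assumes the player posts the small blind on half the hands and the
    big blind on the other half, and never wins a pot.
    """
    average_loss_per_hand = (small_blind + big_blind) / 2
    return -100.0 * average_loss_per_hand / big_blind


def total_big_blinds(win_rate_bb_per_100: float, hands: int) -> float:
    """Total big blinds won over `hands` hands at the given win rate."""
    return win_rate_bb_per_100 * hands / 100.0


def standard_error_bound(win_rate_bb_per_100: float, std_devs: float) -> float:
    """Largest standard error consistent with being `std_devs` away from zero."""
    return win_rate_bb_per_100 / std_devs


if __name__ == "__main__":
    print("always folding:", always_fold_bb_per_100(), "bb/100")
    print("total won:", total_big_blinds(WIN_RATE_BB_PER_100, HANDS_PLAYED), "bb")
    print("standard error below:",
          standard_error_bound(WIN_RATE_BB_PER_100, STD_DEVS_FROM_ZERO), "bb/100")
```

Under these assumptions, a player who folds every hand gives up 0.75 bb per hand on average, which is where the 75 bb/100 comparison comes from. A win rate of 49 bb/100 is therefore about two thirds of what one would win against an opponent who never contests a pot.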
Games are serious business
Don’t let the name fool you: “games” of imperfect information provide a general
mathematical model that describes how decision-makers interact. AI research has a long
history of using parlour games to study these models, but attention has focused
primarily on perfect information games, like checkers, chess or Go. Poker is the
quintessential game of imperfect information, where you and your opponent each hold
information that the other doesn't have (your cards).
Until now, competitive AI approaches in imperfect information games have typically
reasoned about the entire game, producing a complete strategy prior to play. However, to
make this approach feasible in heads-up no-limit Texas hold’em, a game with vastly more
unique situations than there are atoms in the universe, a simplified abstraction of the
game is often needed.
A fundamentally different approach
DeepStack is the first theoretically sound application of heuristic search methods
(which have been famously successful in games like checkers, chess, and Go) to imperfect
information games.
At the heart of DeepStack is continual re-solving, a sound local strategy computation
that only considers situations as they arise during play. This lets DeepStack avoid
computing a complete strategy in advance, skirting the need for explicit abstraction.
During re-solving, DeepStack doesn’t need to reason about the entire remainder of the
game, because it replaces computation beyond a certain depth with a fast approximate
estimate. This estimate is DeepStack’s "intuition": a gut feeling of the value of
holding any possible private cards in any possible poker situation.
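The idea of cutting a search off at a fixed depth and substituting an estimate can be shown on a toy example. The sketch below is not DeepStack's algorithm: it uses plain minimax on a tiny perfect-information tree, so it ignores private cards entirely, whereas DeepStack's re-solving reasons about every hand each player could hold. The `Node` class, the tree, and the `rough_estimate` table are all hypothetical, chosen only to make the depth cutoff visible.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Node:
    """A node in a toy two-player zero-sum game tree (hypothetical structure)."""
    name: str
    payoff: Optional[float] = None  # set only on terminal nodes
    children: List["Node"] = field(default_factory=list)


def lookahead_value(node: Node, depth: int,
                    estimate: Callable[[Node], float],
                    maximizing: bool = True) -> float:
    """Minimax value of `node`, searching at most `depth` levels deep.

    Below the depth limit the search stops and `estimate` supplies the
    value instead, playing the role of the "intuition" in the text.
    """
    if node.payoff is not None:   # terminal node: exact payoff is known
        return node.payoff
    if depth == 0:                # depth limit reached: use the estimate
        return estimate(node)
    values = [lookahead_value(child, depth - 1, estimate, not maximizing)
              for child in node.children]
    return max(values) if maximizing else min(values)


# A two-level toy tree with made-up payoffs.
ROOT = Node("root", children=[
    Node("a", children=[Node("a1", payoff=3.0), Node("a2", payoff=-2.0)]),
    Node("b", children=[Node("b1", payoff=1.0), Node("b2", payoff=4.0)]),
])

# Stand-in for a trained estimator: a lookup table of made-up guesses.
ROUGH_VALUES = {"a": -1.5, "b": 0.8}


def rough_estimate(node: Node) -> float:
    return ROUGH_VALUES[node.name]


if __name__ == "__main__":
    print("full search:  ", lookahead_value(ROOT, 2, rough_estimate))
    print("depth-limited:", lookahead_value(ROOT, 1, rough_estimate))
```

With a depth of 2 the search reaches every terminal node and never calls the estimator. With a depth of 1 it visits only the two children of the root and relies on the estimates, so the quality of the result depends on how good those estimates are.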
Finally, DeepStack’s intuition, much like human intuition, needs to be trained. We train
it with deep learning using examples generated from random poker situations.
DeepStack is theoretically sound, produces strategies substantially more difficult to
exploit than abstraction-based techniques, and defeats professional poker players at
heads-up no-limit poker with statistical significance.