About the Algorithm
The first computer program to outplay human professionals at heads-up no-limit Hold'em poker
In a study completed in December 2016 and involving 44,000 hands of poker, DeepStack
defeated 11 professional poker players; for all but one of them, the win was
statistically significant. Over all games played, DeepStack won 49 big blinds per 100
hands (bb/100), over four standard deviations from zero. For comparison, folding every
hand would lose only 75 bb/100. This makes DeepStack the first computer program to beat
professional poker players in heads-up no-limit Texas hold'em poker.
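The two figures above can be checked with a little arithmetic. The sketch below is illustrative only: it assumes standard heads-up blinds (0.5 bb and 1 bb, alternating each hand) and that the always-folding player forfeits whichever blind it posted. The function names are hypothetical and not part of DeepStack.

```python
# Assumption: heads-up blinds alternate every hand, so a player who always
# folds gives up the small blind half the time and the big blind the other half.
SMALL_BLIND_BB = 0.5
BIG_BLIND_BB = 1.0


def always_fold_loss_per_100() -> float:
    """Average loss in bb/100 for a player who folds every hand."""
    loss_per_hand = (SMALL_BLIND_BB + BIG_BLIND_BB) / 2
    return loss_per_hand * 100


def implied_max_standard_error(win_rate_bb100: float, n_sigma: float) -> float:
    """Largest standard error (bb/100) that still puts the win rate n_sigma from zero."""
    return win_rate_bb100 / n_sigma


print(always_fold_loss_per_100())         # 75.0
print(implied_max_standard_error(49, 4))  # 12.25
```

Read this way, "over four standard deviations from zero" implies a standard error below roughly 12.25 bb/100 on the overall win rate.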
Games are serious business
Don’t let the name fool you: “games” of imperfect information provide a general
mathematical model of how decision-makers interact. AI research has a long history of
using parlour games to study these models, but attention has focused primarily on
perfect information games such as checkers, chess, and Go. Poker is the quintessential
game of imperfect information: you and your opponent each hold information the other
doesn't have (your cards).
Until now, competitive AI approaches in imperfect information games have typically
reasoned about the entire game, producing a complete strategy prior to play. However, to
make this approach feasible in heads-up no-limit Texas hold’em (a game with vastly more
unique situations than there are atoms in the universe), a simplified abstraction of the
game is often needed.
A fundamentally different approach
DeepStack is the first theoretically sound application to imperfect information games of
heuristic search methods, which have been famously successful in games like checkers,
chess, and Go.
At the heart of DeepStack is continual re-solving, a sound local strategy computation
that only considers situations as they arise during play. This lets DeepStack avoid
computing a complete strategy in advance, skirting the need for explicit abstraction.
During re-solving, DeepStack doesn’t need to reason about the entire remainder of the
game because it replaces computation beyond a certain depth with a fast approximate
estimate. This estimate is DeepStack’s "intuition": a gut feeling of the value of holding
any possible private cards in any possible poker situation.
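The idea of cutting off a lookahead at a fixed depth and plugging in an estimate can be sketched on a toy example. The code below is not DeepStack's re-solving: it is a plain depth-limited minimax over a small perfect-information tree, with no private cards or hand ranges, and the tree, payoffs, and `estimate` function are all made up for illustration.

```python
from typing import Callable, Union

# Toy game tree: an internal node is a dict mapping action names to children;
# a leaf is a payoff for the maximizing player. All values here are invented.
Node = Union[float, dict]

TREE: Node = {
    "left": {"left": 2.0, "right": 1.0},
    "right": {"left": {"left": -1.0, "right": -0.5}, "right": 0.0},
}


def depth_limited_value(
    node: Node,
    depth: int,
    estimate: Callable[[dict], float],
    maximizing: bool = True,
) -> float:
    """Minimax value of `node`, searching at most `depth` moves ahead."""
    if not isinstance(node, dict):
        return float(node)     # real end of the game: use the true payoff
    if depth == 0:
        return estimate(node)  # depth limit reached: use the fast estimate
    values = [
        depth_limited_value(child, depth - 1, estimate, not maximizing)
        for child in node.values()
    ]
    return max(values) if maximizing else min(values)


def neutral_guess(node: dict) -> float:
    """Stand-in for a learned estimate; always guesses 0.0."""
    return 0.0


print(depth_limited_value(TREE, 3, neutral_guess))  # 1.0 (full search, no estimates used)
print(depth_limited_value(TREE, 1, neutral_guess))  # 0.0 (both subtrees replaced by the guess)
```

With a crude estimate the shallow search gets a different answer from the full search, which is why the quality of the estimate matters so much.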
Finally, DeepStack’s intuition, much like human intuition, needs to be trained. We train
it with deep learning using examples generated from random poker situations.
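As a rough illustration of what a "random poker situation" might look like as data, the sketch below samples a public board, a pot size, and a probability distribution over private hands for each player. Every detail is an assumption made for the example (the field names, the pot bounds, the uniform sampling), and it covers only the input side: how the target values for training are computed is not described here, so it is left out.

```python
import itertools
import random

RANKS = "23456789TJQKA"
SUITS = "cdhs"
DECK = [rank + suit for rank in RANKS for suit in SUITS]  # 52 cards


def random_situation(rng: random.Random) -> dict:
    """Sample one hypothetical training input (not DeepStack's actual format)."""
    n_board = rng.choice([3, 4, 5])          # assumed: flop, turn, or river
    board = rng.sample(DECK, n_board)
    remaining = [card for card in DECK if card not in board]
    hands = list(itertools.combinations(remaining, 2))  # possible private hands

    def random_range() -> list:
        # Random probability for each private hand, normalized to sum to 1.
        weights = [rng.random() for _ in hands]
        total = sum(weights)
        return [w / total for w in weights]

    return {
        "board": board,
        "pot_bb": rng.uniform(2.0, 400.0),   # placeholder bounds, in big blinds
        "hands": hands,
        "ranges": [random_range(), random_range()],  # one per player
    }


situation = random_situation(random.Random(0))
print(len(situation["board"]), len(situation["hands"]))
```

Each sampled situation pairs a public state with beliefs about both players' private cards, which is the kind of input a value estimate for "any possible private cards in any possible poker situation" would need.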
DeepStack is theoretically sound, produces strategies substantially more difficult to
exploit than abstraction-based techniques, and defeats professional poker players at
heads-up no-limit poker with statistical significance.