Saturday, 13 July 2019 14:58

AI-based poker player beats human competitors


An AI-based bot, Pluribus, has defeated top poker professionals in six-player, no-limit Texas Hold'em over 10,000 hands.

Over 12 days of play against 12 top poker players, Pluribus not only beat all of them but also earned winnings at a much greater rate than most professionals would expect.

According to a Facebook post, "Pluribus defeated pro players in both a 'five AIs + one human player' format and a 'one AI + five human players' format. If each chip was worth a dollar, Pluribus would have won an average of about $5 per hand and would have made about $1,000/hour playing against five human players. These results are considered a decisive margin of victory by poker professionals."
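As a quick sanity check on the quoted figures, the two rates together imply a pace of play. The short sketch below takes the post's numbers at face value; the function name is purely illustrative and the result is an inference, not a figure stated in the post.

```python
# Figures quoted in the Facebook post, with one chip valued at one dollar.
WIN_PER_HAND_USD = 5.0
WIN_PER_HOUR_USD = 1000.0


def implied_hands_per_hour(win_per_hour, win_per_hand):
    """Hands per hour needed for the two quoted win rates to agree."""
    return win_per_hour / win_per_hand


print(implied_hands_per_hour(WIN_PER_HOUR_USD, WIN_PER_HAND_USD))  # 200.0
```

In other words, the quoted win rates are consistent with each other at roughly 200 hands per hour.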

Most poker players learn to read their opponents, watching for the 'tells' that indicate whether they are holding a good hand or bluffing: grimaces, smirks, blinking and all manner of 'odd' behaviour. Of course, the bot has access to none of this, instead relying on its ability to learn from individual hand outcomes and to develop appropriate strategies for play.

Developed by Noam Brown, an AI researcher who recently started working full-time at Facebook, and Tuomas Sandholm, a computer science professor at Carnegie Mellon University in Pittsburgh, Pluribus is based on a "regret" algorithm that attempts to determine how much it 'regrets' not taking some specific action (different to the one it took in the game).

Initially, like many similar bots (Google's Go-playing bot, for instance), Pluribus played against itself, millions of times, in order to develop a strong sense of what a winning strategy might look like.
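To give a feel for how a "regret" approach and self-play fit together, here is a minimal sketch of plain regret matching on rock-paper-scissors. It is not Pluribus's method: the article only describes a "regret" algorithm, and the game, function names and parameters below are assumptions chosen for illustration. Each player tracks how much better every alternative action would have done than the action actually played, then favours the actions it regrets not having taken.

```python
import random

ACTIONS = ["rock", "paper", "scissors"]  # toy game, not poker


def payoff(mine, theirs):
    """Payoff (+1 win, 0 draw, -1 loss) for action index `mine` vs `theirs`."""
    return [0, 1, -1][(mine - theirs) % 3]


def strategy_from_regrets(regrets):
    """Regret matching: play in proportion to positive accumulated regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total <= 0:
        # No positive regret yet, so fall back to playing uniformly.
        return [1.0 / len(regrets)] * len(regrets)
    return [p / total for p in positive]


def self_play(iterations=20000, seed=0):
    """Two regret-matching players train against each other.

    Returns each player's average strategy over all iterations.
    """
    rng = random.Random(seed)
    regrets = [[0.0] * 3, [0.0] * 3]
    strategy_sums = [[0.0] * 3, [0.0] * 3]
    for _ in range(iterations):
        strategies = [strategy_from_regrets(r) for r in regrets]
        moves = [rng.choices(range(3), weights=s)[0] for s in strategies]
        for player in (0, 1):
            mine, theirs = moves[player], moves[1 - player]
            actual = payoff(mine, theirs)
            for action in range(3):
                # Regret: what `action` would have earned minus what we got.
                regrets[player][action] += payoff(action, theirs) - actual
                strategy_sums[player][action] += strategies[player][action]
    return [[s / iterations for s in sums] for sums in strategy_sums]


if __name__ == "__main__":
    for player, avg in enumerate(self_play()):
        print(player, dict(zip(ACTIONS, (round(p, 3) for p in avg))))
```

In this toy game the average strategies drift towards playing each action about a third of the time, which is the unexploitable way to play rock-paper-scissors. A poker bot faces a vastly larger game with hidden cards and several opponents, so this sketch illustrates only the core idea.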

Figure: Poker performance. This graph shows how Pluribus' blueprint strategy improves during training on a 64-core CPU. Performance is measured against the final snapshot of training. We do not use search in these comparisons. Typical human and top human performance are estimated based on discussions with human professionals. The graphic also notes when Pluribus stops "limping," a passive strategy that advanced players typically avoid. Source: Brown's Facebook post.

Unlike chess or Go, where all information is visible to all players, poker hides much of the game from its players: you never know what cards your opponent is holding until they are revealed (if they ever are). That is the underlying motivation for the research: maximising outcomes in situations where opponents are permitted to lie and where no entity has access to all of the information. According to Brown, "I think this is really the final major challenge in poker A.I. We don't plan to work on poker going forward. I think we're really focused on generalizing beyond."

The Facebook post adds, "Even though Pluribus was developed to play poker, the techniques used are not specific to poker and need not require any expert domain knowledge to develop. This research gives us a better fundamental understanding of how to build general AI that can cope with multi-agent environments, both with other AI agents and with humans, and allows us to benchmark progress in this field against the pinnacle of human ability."

Brown and Sandholm have also published a more formal paper, "Superhuman AI for multiplayer poker", in Science magazine.




David Heath

David Heath has had a long and varied career in the IT industry having worked as a Pre-sales Network Engineer (remember Novell NetWare?), General Manager of IT&T for the TV Shopping Network, as a Technical manager in the Biometrics industry, and as a Technical Trainer and Instructional Designer in the industrial control sector. In all aspects, security has been a driving focus. Throughout his career, David has sought to inform and educate people and has done that through his writings and in more formal educational environments.


