Caught in the Stat – Using Data to Detect Cheaters – Data Science W231

Anonymous | October 8, 2022

In the world of shoe computers and super engines, is it possible to catch cheaters in chess and other fields with only an algorithm?

Setting the Board

On September 4, 2022, Magnus Carlsen, the world’s greatest chess player (arguably of all time), lost to Hans Niemann, a 19-year-old up-and-coming American player with far less name recognition.

Carlsen then immediately withdrew from the tournament, a move that confused many onlookers. Two weeks later, in an online tournament with a prize pool of $500,000, Carlsen resigned against Niemann immediately after the game began, which many interpreted as a protest. Commentators spread rumors that Carlsen suspected his opponent of cheating, especially after he posted a suspicious-looking tweet.

Following this, many of the sport’s most famous players and streamers took to the internet to share opinions on this divisive issue. Suspected cheating is treated as a serious matter in chess because it threatens the integrity of the sport and the trust of the fans who admire it. Although it is well established that even the best players in the world cannot compete with today’s chess engines, it is strictly against the rules to receive any assistance from one at any point during a match.

https://www.gpb.org/news/2022/10/06/hans-niemann-accused-of-cheating-in-more-100-chess-games-hes-playing-today

To Catch a Cheater

In today’s world, where a chess engine running on even an outdated smartphone or laptop can crush Carlsen himself, catching such cheating can be difficult. When I played in statewide tournaments in high school, I was surprised that players’ devices were not confiscated at the entrance to the tournament area, and as one might expect, there were no metal detectors or similar scanners. At the highest level of competition, both preventive measures are sometimes used, but it’s far from a sure bet.

Moreover, individuals determined to get an edge are often willing to go to creative lengths. For instance, many articles have discussed devices that players can wear in their shoes, where a small processor relays signals as vibrations (https://boingboing.net/2022/09/08/chess-cheating-gadget-made-with-a-raspberry-pi-zero.html). In addition, many games in recent years are played entirely online, which broadens the potential for abuse by programs that fly under the radar of game proctors.

Here Comes the Data

If traditional anti-cheating monitoring isn’t likely to catch anyone in the act, how can data science step in to save the day?

As mentioned above, several chess-playing programs have become well known for their stellar win rates and versatility; two famous examples are AlphaZero and Stockfish. Popular chess websites like Chess.com embed engines of this kind in features that evaluate every move of a game to show how both players are performing as it unfolds. The same tools can be used in post-game analysis to examine which moves by either side were relatively strong or weak. Because these tools are so widely available, any bystander can statistically examine a match to see how well each player performed compared to the “perfect” computer.

AlphaZero is a machine-learning program that pairs a neural network with Monte Carlo tree search.

Armed with this tool, we can analyze history’s recorded games to understand player potential. How well do history’s greatest chess players compare to the bots?

Image credit Yosha Iglesias – link to video below.

The graphic above details the “hit-rate” of some of the most famous players in history. The hit-rate is the percentage of a player’s moves in a game that match what a top-performing engine would play in the same position. So if a chess Grandmaster makes 30 perfect moves out of 40 in a game, that game is scored at 75%. Since nearly all of a top-level player’s games are published in the public record, analysts can build a library of games and compare behavior from match to match.
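To make the definition concrete, here is a minimal sketch of the hit-rate calculation in Python. The function name `hit_rate` and the placeholder move labels are my own for illustration; how the engine's preferred moves are obtained (which engine, what search depth) is assumed to happen elsewhere and will change the numbers.

```python
def hit_rate(played_moves, engine_moves):
    """Return the fraction of played moves that equal the engine's top choice.

    Both arguments are equal-length sequences of move strings for ONE player,
    aligned position by position. Getting the engine moves is outside this
    sketch.
    """
    if len(played_moves) != len(engine_moves):
        raise ValueError("move lists must have the same length")
    if not played_moves:
        raise ValueError("at least one move is required")
    matches = sum(p == e for p, e in zip(played_moves, engine_moves))
    return matches / len(played_moves)


# Placeholder move labels, not real chess moves: 40 positions, 30 agreements.
engine_moves = [f"best{i}" for i in range(40)]
played_moves = engine_moves[:30] + [f"other{i}" for i in range(10)]

print(f"{hit_rate(played_moves, engine_moves):.0%}")  # prints 75%
```

This reproduces the 30-out-of-40 example above. It counts only exact agreement with a single top move, so a position with two equally good moves would be scored harshly.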

FM (FIDE Master) Yosha Iglesias has done just that, cataloging Niemann’s games and looking for anomalies. Using this method of analysis, she has expressed concern over a number of Niemann’s games, including those previously mentioned. Kenneth Regan, a professor of computer science at the University at Buffalo, has also published research in a similar vein. The gist of such analyses is that a cheater’s performance in some games sits so far above the benchmark set by their own other games, or by the entire body of published games, that playing so similarly to a computer without assistance would be a statistical near-impossibility.
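One simple way to picture "far above their other games" is a z-score: how many standard deviations a single game's hit-rate sits above the average of the player's remaining games. The sketch below is only an illustration of that idea, not the method used by Iglesias or Regan; the function names, the threshold of 3, and the hit-rates are all invented for the example.

```python
import math
from statistics import mean, stdev


def leave_one_out_z_scores(hit_rates):
    """Score each game against the mean and spread of all the OTHER games."""
    rates = list(hit_rates)
    if len(rates) < 3:
        raise ValueError("need at least three games")
    scores = []
    for i, rate in enumerate(rates):
        others = rates[:i] + rates[i + 1:]
        center, spread = mean(others), stdev(others)
        if spread == 0:
            # All other games identical: any difference is infinitely unusual.
            z = 0.0 if rate == center else math.copysign(math.inf, rate - center)
        else:
            z = (rate - center) / spread
        scores.append(z)
    return scores


def flag_games(hit_rates, threshold=3.0):
    """Return indexes of games whose z-score exceeds the (assumed) threshold."""
    return [i for i, z in enumerate(leave_one_out_z_scores(hit_rates)) if z > threshold]


# Invented hit-rates for illustration only; not taken from any real player.
rates = [0.55, 0.60, 0.58, 0.62, 0.57, 0.59, 0.98]
print(flag_games(rates))  # prints [6]
```

Each game is compared against the others rather than against the full set because a single extreme game inflates the overall standard deviation and can hide itself. A flag here is a reason to look closer, not proof of anything: strong players legitimately have excellent games, and short or forced games naturally score high.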

Image credit Yosha Iglesias – Niemann tournament hit-rate performance, 2019-2022

While the argument may seem conditional or abstract, some organizations are already acting on these analyses to ban Niemann from competing in titled or paid tournaments in the future. Chess.com has led the charge, and on October 5th it released a 72-page report accusing Niemann of cheating in more than 100 games, citing a cheating-detection methodology with techniques including “Looking at the statistical significance of the results (ex. ‘1 in a million chance of happening naturally’).”
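To see where a phrase like "1 in a million" can come from, here is a toy calculation using a binomial tail probability. It rests on strong assumptions that I am making purely for illustration: every move is treated as an independent trial, and the player has a fixed baseline chance of matching the engine. The 60% baseline and 40-move game are hypothetical, and this is not the model described in the Chess.com report.

```python
from math import comb


def binomial_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p).

    n: number of moves, p: assumed baseline chance of matching the engine,
    k: number of matches actually observed.
    """
    if not 0 <= k <= n:
        raise ValueError("k must be between 0 and n")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical inputs: a 40-move game and a 60% baseline match rate.
n_moves, baseline = 40, 0.60
for k in (30, 35, 40):
    prob = binomial_tail(k, n_moves, baseline)
    print(f"P(at least {k}/{n_moves} matches) = {prob:.2e}")
```

The probability shrinks very quickly as the match count climbs, which is why a handful of near-perfect games can look so damning on paper. Real chess moves are not independent coin flips, though, so a serious analysis has to account for how forced or obvious each position is.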

So what?

Cheating as an issue goes beyond chess. It threatens to undermine the hard work that many players put into their competitions, and at an organizational level, corruption in industries like banking can cause billions of dollars of damage to victims. For example, one mathematician used statistics to help establish that a speedrun of the multi-platform video game Minecraft was almost certainly not played on the unmodified game because the run was far too lucky (natural odds of 1 in 2.0 × 10²²), months before the player actually confessed. In a broader sense, statistical principles like Benford’s law can be used to flag possible fraud in the financial figures reported by companies like Enron.
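Benford's law says that in many naturally occurring sets of numbers, the leading digit d appears with probability log10(1 + 1/d), so about 30% of values start with 1 and under 5% start with 9. The sketch below compares a list of values against that expectation with a chi-square statistic. The function names and the two sample datasets are my own choices for illustration, not real financial data.

```python
from collections import Counter
from math import log10

# Expected share of each leading digit 1..9 under Benford's law.
BENFORD = {d: log10(1 + 1 / d) for d in range(1, 10)}


def leading_digit(value):
    """Return the first non-zero digit of a non-zero number."""
    value = abs(value)
    if value == 0:
        raise ValueError("zero has no leading digit")
    while value >= 10:
        value /= 10
    while value < 1:
        value *= 10
    return int(value)


def benford_chi_square(values):
    """Chi-square distance between observed leading digits and Benford's law."""
    digits = Counter(leading_digit(v) for v in values if v != 0)
    total = sum(digits.values())
    if total == 0:
        raise ValueError("no non-zero values to test")
    return sum((digits[d] - total * p) ** 2 / (total * p) for d, p in BENFORD.items())


# Illustrative data: powers of two follow Benford's law closely, while the
# integers 100..999 have evenly spread leading digits and do not.
powers_of_two = [2**i for i in range(1, 201)]
evenly_spread = list(range(100, 1000))

# With 8 degrees of freedom, a statistic above about 15.507 is unusual at the
# 5% level.
print(f"powers of two: {benford_chi_square(powers_of_two):.2f}")
print(f"100..999:      {benford_chi_square(evenly_spread):.2f}")
```

A large statistic only says the digits look unusual. Plenty of honest datasets do not follow Benford's law (for example, values confined to a narrow range), so a result like this is a prompt for an audit rather than a verdict.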

As long as human nature remains relatively the same, some individuals will likely try to take a shortcut and cheat wherever possible. And in today’s world, where it’s increasingly difficult to detect cheating through conventional means, data and statistical analysis are vital tools to keep in the toolbelt. Because those less familiar with this type of statistical determination may be skeptical of such findings, further education and math diplomacy are needed from data advocates to illustrate their efficacy.

Where to Follow Up if You’re Curious

Ken Regan – University at Buffalo

https://cse.buffalo.edu/~regan/chess/fidelity/data/Niemann/

FM Yosha Iglesias

https://docs.google.com/spreadsheets/d/127lwTsR-2Daz0JqN1TbZ8FgITX9Df9TyXT1gtjlZ5nk/edit?pli=1#gid=0