Google Gemini crumbles in the face of Atari Chess challenge — admits it would 'struggle immensely' against 1.19 MHz machine, says canceling the match most sensible course of action

Atari Chess (1979)
(Image credit: Chess.com)

Google Gemini decided to call off a chess match against the ancient 1.19 MHz Atari 2600 console after a friendly pre-game reminder about what happened to ChatGPT and Microsoft’s Copilot. Citrix Architecture and Delivery specialist Robert Jr. Caruso, now well known for his AI vs Atari chess challenges, revealed in a chat with The Register that Gemini chickened out.

As was the case with the ChatGPT and Microsoft Copilot chess challenges, Caruso reveals that Gemini was initially brimming with confidence regarding its chess prowess. It was comfortable, if not eager, to throw down the gauntlet against the Atari 2600. At the beginning of Caruso’s chat with Gemini, the chatbot boasted of being able to “think millions of moves ahead and evaluate endless positions.” That sounds familiar, in a proverbial ‘pride goeth before destruction’ kind of way.

Caruso then kindly reminded Gemini that he had previously organized Atari chess bouts with ChatGPT and Microsoft’s Copilot. The Citrix expert went on to explain explicitly to Gemini that other LLMs had displayed remarkable levels of “misplaced confidence” ahead of their chess matches against the ancient console.

Gemini must have then thought a bit deeper about what exactly would be involved in the chess challenge, and admitted to Caruso that it had been hallucinating regarding the magnitude of its abilities. It added that it now felt that it would “struggle immensely” in a match against the Atari 2600. “Canceling the match is likely the most time-efficient and sensible decision,” concluded Gemini.

LLMs aren't CPMs (Chess Playing Models)

So, now we have more confirmation, if it were needed, that today’s LLMs aren’t designed to be chess champs, and that a little machine introspection is all it takes for them to think better of participating in such a challenge. That’s advisable even when the challenger is the incredibly constrained Atari 2600, with its puny 8-bit MOS Technology 6507 processor and just 128 bytes of RAM.

Because these AIs, or LLMs, are built from linguistic theory and machine learning models, they are far more adept at talking about the game of kings than actually playing it.
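
For the curious, here is a minimal, hypothetical sketch of how such a match can be refereed programmatically, assuming the open-source python-chess library (an illustration only, not how Caruso ran his matches): the chatbot’s reply in standard algebraic notation is validated against the real board state before it is played, because tracking the position is exactly what a language model has to reconstruct from text on every turn, while a dedicated chess engine never loses it.

import chess

def play_llm_move(board: chess.Board, san_move: str) -> bool:
    """Apply the chatbot's move (standard algebraic notation) if it is legal."""
    try:
        board.push_san(san_move)  # raises ValueError on illegal or malformed SAN
        return True
    except ValueError:
        return False

board = chess.Board()
# "Nf3" stands in for whatever move string the chatbot under test returns.
if not play_llm_move(board, "Nf3"):
    print("Illegal move suggested; ask the model to retry or score it as a forfeit.")
print(board.fen())  # the current position, fed back to the model each turn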


Mark Tyson
News Editor

Mark Tyson is a news editor at Tom's Hardware. He enjoys covering the full breadth of PC tech; from business and semiconductor design to products approaching the edge of reason.

  • George³
    So much money for the hardware the model runs on, so much money spent on coding, and so much money on electricity and other supplies to train it, and it follows that investing in "AI" is nothing but a waste of a tremendous amount of money.
    Reply
  • Alvar "Miles" Udell
    Now that TH has reported on these things with questionable setups, they need to repeat the tests themselves or admit they could all have been artificially skewed to fail just for headlines.

    Like the Copilot one, use Think Deeper and algebraic chess notation to play.
    Reply
  • George³
    Alvar Miles Udell said:
    Now that TH has reported on these things with questionable setups, they need to repeat the tests themselves or admit they could all have been artificially skewed to fail just for headlines.

    Like the Copilot one, use Think Deeper and algebraic chess notation to play.
    What do the settings have to do with it? If you make a deliberate optimization to win a specific race, it will be your victory, not Gemini's.
    Reply