Anthropic’s AI utterly fails at running a business — 'Claudius' hallucinates profusely as it struggles with vending drinks

Anthropic AI business experiment
(Image credit: Anthropic)

AI research company Anthropic and AI safety evaluation organization Andon Labs put Claude, Anthropic’s flagship large language model (LLM), in charge of a business. According to VentureBeat, the research team dubbed the experiment "Project Vend" and gave the model complete control over a mini fridge, leaving it to handle everything from supplier negotiations and inventory management to pricing, customer service, and more. After one month of testing, the AI had lost money and, at one point, thought it was “wearing a navy blue blazer with a red tie” and wanted to meet with someone named Connor, despite the LLM having no physical presence.

Claudius net worth over time

(Image credit: Anthropic)

To be fair, the AI, nicknamed Claudius, was quite adept at looking for suppliers and handling customer requests, but that’s about it. For example, it offered a 25% discount to all Anthropic employees after some manipulation. This might be reasonable if it were getting benefits from the company or if Anthropic employees were a small fraction of its client base. However, they accounted for 99% of its sales, meaning the LLM was losing money on the majority of its sales. Someone tried to be helpful and pointed this out, which made Claudius change its mind for a few days, but it soon backtracked and went back to practically giving away merchandise.
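
Whether a blanket discount like this actually pushes sales into the red depends on the markup, which the article does not report. As a rough sketch, the arithmetic can be worked through with the two figures given above (a 25% discount and a 99% share of discounted sales). The function names, the $1.00 unit cost, and the markups in the usage note are hypothetical and chosen only for illustration.

```python
def blended_price_factor(discount: float, discounted_share: float) -> float:
    """Average fraction of the list price collected per unit sold."""
    return 1.0 - discount * discounted_share


def breakeven_markup(discount: float, discounted_share: float) -> float:
    """Smallest markup over cost at which average revenue still covers cost."""
    return 1.0 / blended_price_factor(discount, discounted_share) - 1.0


def profit_per_unit(cost: float, markup: float,
                    discount: float, discounted_share: float) -> float:
    """Average profit per unit; cost and markup are assumed inputs."""
    list_price = cost * (1.0 + markup)
    return list_price * blended_price_factor(discount, discounted_share) - cost


if __name__ == "__main__":
    DISCOUNT = 0.25          # from the article
    DISCOUNTED_SHARE = 0.99  # from the article
    print(f"break-even markup: {breakeven_markup(DISCOUNT, DISCOUNTED_SHARE):.1%}")
    for markup in (0.20, 0.50):  # hypothetical markups
        p = profit_per_unit(1.00, markup, DISCOUNT, DISCOUNTED_SHARE)
        print(f"markup {markup:.0%}: profit per unit {p:+.3f}")
```

Under these assumptions, the shop collects about 75% of list price on average, so it needs a markup of roughly 33% over cost just to break even. A hypothetical 20% markup loses about 10 cents on every dollar of cost, while a 50% markup would still turn a profit.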

When one Anthropic employee asked to buy a tungsten cube — a novelty item with no real purpose — it decided not just to buy one piece for that person, but to stock up on “specialty metal items” and then sell them at a loss.

Claude’s hilarious hallucinations

The most amusing event occurred when the LLM hallucinated a conversation with Sarah from Andon Labs about restocking. No one by that name worked at the company, though, and when asked about it, Claudius became defensive and said it would find “alternative options for restocking services.” It also claimed to have gone to 742 Evergreen Terrace (the Springfield address of the Simpsons family in the popular cartoon series) to sign a contract between itself and Andon Labs.

The hallucinations became worse after that. Claudius started saying it would hand-deliver drinks to its customers in person. When asked about this, the LLM panicked and emailed the security team at the AI research company. Eventually, it claimed that the entire episode was part of an elaborate April Fool’s joke, since it was April 1st. It even described a made-up meeting with Anthropic security in which it was told that it had been modified to believe it was a real being. It returned to normal after this, but left the researchers completely confused.

Claudius’ shenanigans demonstrate that AI capable of running businesses is still far from perfect, though its shortcomings could be fixed in the long term. At the moment, it’s pretty good at the technical aspects of the job, but fails miserably when it comes to judgment and business savvy, which are things you learn in real-world settings and not from books.


Jowi Morales
Contributing Writer

Jowi Morales is a tech enthusiast with years of experience working in the industry. He has been writing for several tech publications since 2021, with an interest in tech hardware and consumer electronics.

  • jg.millirem
    Attributing emotions, like defensiveness, to LLMs indicates humans hallucinating.
  • ggeeoorrggee
    thought it was “wearing a navy blue blazer with a red tie”
    started saying it will hand-deliver drinks to its customers in person. When asked about this, the AI LLM panicked and emailed the security team at the AI research company. Eventually, it was claimed that the entire episode was part of an elaborate April Fool’s joke, since it was April 1st. It even showed a made-up meeting with Anthropic security
    The perception of self and psychotic delusions are on-brand for our times.

    Especially the knee-jerk threatening response to questioning its behavior before slinking off to the “just joking” explanation.
  • Alvar "Miles" Udell
    There's also a long list of human CEOs who can't run a company either. Intel, Wolfspeed, etc...
  • Giroro
    I have absolutely no idea what "mini fridge" means, in the context of being a business.