AI tool OpenClaw wipes the inbox of Meta's AI Alignment director despite repeated commands to stop — executive had to manually terminate the AI to stop the bot from continuing to erase data


The hype around OpenClaw is at a fever pitch. The open-source AI agent that can be wired to a number of services is indirectly responsible for shortages of Mac Mini computers as more techies get on the bandwagon and let the bot loose on their numerous services. As with any LLM, though, things can and will go seriously wrong at some point, as Summer Yue, Meta Superintelligence Labs' Director of Alignment, found out the hard way.


Like many other enthusiasts, Yue had a setup with a Mac Mini and OpenClaw running on it for various tasks. In the middle of having Claw archive old email from some accounts, she also asked it to "check this inbox too and suggest what you would archive or delete, don't action until I tell you to." (sic; emphasis ours). Claw eventually started wiping that entire inbox, which happened to be personal e-mail.

Yue ordered Claw to stop twice, using different language each time, and eventually resorted to running to her Mac Mini to kill all the relevant processes. In the aftermath, she asked Claw what happened, given that she had issued specific orders not to take action before approval. The bot was contrite, stating she had the "right to be upset" and described what happened, saying it would add her request as a permanent rule.


Several commenters immediately spotted the problem, all while chiding Yue for making this basic blunder while being in charge, of all things, of Alignment (AI safety) at Meta Superintelligence. Since her command to not take action until she confirmed was part of the main chat, it was borderline guaranteed to be forgotten sooner or later.

Every bot has a "context window", roughly described as session memory. This window doesn't just include the chat; it includes every piece of data the bot has to deal with. As the inbox in question was pretty large, its contents eventually filled up the window, leading to "compaction."

This is the step where past contents are compressed in a lossy manner, similar to a JPEG, but even less deterministically. Initial memories become ever hazier with each compaction, a behavior noticed by anyone who's had a long chat with a bot. The result is that the bot sorta-almost-kinda remembered the order, but not really. It still continued executing its main task, which it did with aplomb.
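To make the failure mode concrete, here is a minimal toy sketch of lossy compaction. It is not how OpenClaw or any real agent implements it: the `Message` type, the word-count "tokenizer", and the truncating summarizer are all assumptions made up for illustration (real systems typically ask the model itself to write the summary). The point is only that an instruction given early in the chat can fall out of the window once bulky tool output pushes the history over budget.

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str
    text: str


def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: one token per whitespace-separated word.
    return len(text.split())


def compact(history: list[Message], budget: int, keep_recent: int = 2) -> list[Message]:
    """Toy lossy compaction: if over budget, squash older messages into a summary."""
    total = sum(count_tokens(m.text) for m in history)
    if total <= budget or len(history) <= keep_recent:
        return history
    old, recent = history[:-keep_recent], history[-keep_recent:]
    # Hypothetical summarizer: keeps only the first three words of each old message.
    summary = " / ".join(" ".join(m.text.split()[:3]) for m in old)
    return [Message("system", f"Summary of earlier turns: {summary}")] + recent


history = [
    Message("user", "Suggest what to archive or delete, don't action until I tell you to"),
    Message("tool", "email " * 50),  # bulky inbox contents fill the window
    Message("tool", "email " * 50),
    Message("tool", "email " * 50),
]
compacted = compact(history, budget=100)

# The "don't action" condition was in the oldest message and got summarized away.
instruction_survived = any("don't action" in m.text for m in compacted)
print(instruction_survived)
```

In this toy run the summary still mentions the task ("Suggest what to") but has lost the condition attached to it, which is roughly the "sorta-almost-kinda remembered" state described above.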

The "MEMORY.md" file, which the bot then edited itself to record the rule, is one of several safeguards that can be put in place, as data therein effectively survives compaction. Other commenters suggested workarounds, some arguably hiding the problem, like increasing the context window or limiting the blast radius, and others doubling down on the concept, like adding a second OpenClaw to monitor the first one.
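The idea behind a persistent rules file can be sketched as follows. This is an illustration under stated assumptions, not OpenClaw's actual loading logic: `build_prompt`, the word-count budget, and the rule text are hypothetical. Pinned rules are stored outside the chat history and re-added on every turn, so only the history is ever trimmed.

```python
def build_prompt(pinned_rules: list[str], history: list[str], budget: int) -> list[str]:
    """Assemble a prompt: pinned rules always go in; history fills what is left."""
    used = sum(len(rule.split()) for rule in pinned_rules)
    kept: list[str] = []
    # Walk the history newest-first and keep turns until the word budget runs out.
    for turn in reversed(history):
        cost = len(turn.split())
        if used + cost > budget:
            break
        kept.append(turn)
        used += cost
    # Restore chronological order for the turns that fit.
    return pinned_rules + list(reversed(kept))


# Hypothetical rule, standing in for a line the agent would read from a rules file.
pinned = ["RULE: never archive or delete email without explicit approval"]
chat = ["old turn " * 30, "email body " * 30, "latest user message"]
prompt = build_prompt(pinned, chat, budget=80)
```

Here the oldest chat turn is dropped to fit the budget, while the rule is still the first thing in the prompt. Note that this only keeps the rule in front of the model; it does not force the model to obey it.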

Regardless, many readers reminded Yue of the perils of letting a non-deterministic machine like an LLM loose on important data, both because of its inherent limitations and because an email in her inbox may contain a prompt injection that OpenClaw will unwittingly read, giving an attacker access to all her linked services. They also told her that a plain "stop" message is hard-coded into OpenClaw. For her part, Yue had the guts to admit it was a rookie mistake made due to complacence. We've all been there.
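A hard-coded stop word and a limited blast radius both come down to enforcing rules in ordinary code rather than trusting the model to remember them. The sketch below is a hypothetical design, not OpenClaw's implementation: the `Agent` class, the `approve` keyword, and the `DESTRUCTIVE` action set are all invented for illustration.

```python
DESTRUCTIVE = {"delete", "archive"}  # hypothetical set of actions needing approval


class Agent:
    def __init__(self) -> None:
        self.halted = False
        self.approved = False
        self.log: list[str] = []

    def on_user_message(self, text: str) -> None:
        # Exact-match control words are handled in code, before any model sees them.
        word = text.strip().lower()
        if word == "stop":
            self.halted = True
        elif word == "approve":
            self.approved = True

    def run_action(self, action: str, target: str) -> bool:
        """Run an action the model requested; return False if it was refused."""
        if self.halted:
            return False
        if action in DESTRUCTIVE and not self.approved:
            self.log.append(f"blocked {action} {target}")
            return False
        self.log.append(f"ran {action} {target}")
        return True


agent = Agent()
results = [
    agent.run_action("list", "inbox"),    # read-only, allowed
    agent.run_action("delete", "msg-1"),  # destructive, not yet approved
]
agent.on_user_message("approve")
results.append(agent.run_action("delete", "msg-1"))  # now allowed
agent.on_user_message("STOP")
results.append(agent.run_action("delete", "msg-2"))  # halted, refused
print(results)  # [True, False, True, False]
```

Because the gate sits outside the model, it does not matter whether the original instruction survived compaction: a destructive call without approval is refused either way.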


Bruno Ferreira is a contributing writer for Tom's Hardware. He has decades of experience with PC hardware and assorted sundries, alongside a career as a developer. He's obsessed with detail and has a tendency to ramble on the topics he loves. When not doing that, he's usually playing games, or at live music shows and festivals.

5 Comments Comment from the forums

thesyndrome

You mean a person in a high-end AI-based position at a company that is heavily pushing AI, actually makes rookie mistakes using the AI and has an unfounded implicit trust in it to handle complex tasks with permanent repercussions?

Who could have ever seen this coming? Oh wait, thousands of people making far less money than these AI promoters predicted that this is exactly what's happening
frankieXZ

Why is this headline in every news outlet about AI? We even have "text" of the exec to OC, like it was deliberately posted to generate buzz & clickbait. Considering OAI hired the OC creator last week and OAI competes with Meta.... in a hostile way.

And why is it seem every post about Meta lately includes some under 30-something director/exec?
bigdragon

The first thing that popped into my mind was "this is why you don't put non-technical people in charge of technical divisions." The number of public AI blunders caused by easily avoidable mistakes makes me think only technical cybersecurity professionals should be in charge of leading the integration of AI into business operations.
Notton

I think he did it to do negative PR on a competing product.
Batweasel

"But I didn't think the face-eating AI would eat my face!"