Anthropic warns Claude AI is building itself faster than expected, calls for option to halt frontier development — 'recursive self improvement' increases risk humans lose control of AI

Claude
(Image credit: Getty / Bloomberg)

Anthropic has published a report warning that the development path it’s on could eventually leave humans unable to control AI systems, even as it disclosed that Claude now writes more than 80% of the code merged into its own codebase. The Anthropic Institute, the company's research arm, said AI has already started to speed up AI development and that the trend could lead to recursive self-improvement, the point at which a model designs and builds its own successor with little human input. The report argued that the world should keep open the option to slow or pause frontier development, and cautioned that the occasional misalignment seen in current models could grow more common and harder to understand as those models build the next generation.

The company set out three pretty dire ways the next few years could unfold, reserving its most severe warnings for the scenario in which models become capable of fully improving themselves. In that case, Anthropic said, the pace of progress would be set almost entirely by available compute, with humans pushed toward oversight and verification roles and a self-improving model dominating as its abilities outstrip those of the people who built it.

The firm described this potential issue with alignment and the task of keeping a system's behavior tied to human intent as part of the future it’s least sure about. Misalignment that’s rare and survivable today could compound generation over generation until control slips, it said, though it allowed that a sufficiently capable and well-aligned model might instead choose to halt its own development. Anthropic wrote that this misalignment could keep "growing more frequent but less understood until we lose control of them."

Latest Videos From

Anthopric is backing these warnings with a bunch of internal figures that we’ve not seen before. More than 80% of the code merged into its production codebase as of last month was authored by Claude, up from low single digits before Claude Code reached research preview in February last year. Anthropic says the typical engineer is now “merging 8x as much code per quarter as they did from 2021-2025.”

On the hardest, least-specified coding tasks, Anthropic said Claude succeeded 76% of the time in May 2026, a rise of 50 percentage points in six months. A recurring internal test that asks each new model to make training code run faster saw results climb from roughly triple the original speed with Claude Opus 4 in May 2025 to about 52 times with the unreleased Mythos Preview model in April.

Anthropic said it’d slow or pause only if rival labs at or near the frontier did the same in a verifiable way, and that a halt by one company would change who leads without achieving anything wider. That’s obviously not going to happen.

All the figures cited by Anthropic are self-reported and unaudited, and come days after the company filed to go public. The company issued a similar self-assessment in April, when it said Mythos Preview had found thousands of severe software vulnerabilities, a claim that later drew scrutiny over how much of it rested on a small manual sample.

Google Preferred Source

Follow Tom's Hardware on Google News, or add us as a preferred source, to get our latest news, analysis, & reviews in your feeds.

Luke James
Contributor

Luke James is a freelance writer and journalist.  Although his background is in legal, he has a personal interest in all things tech, especially hardware and microelectronics, and anything regulatory. 

  • usertests
    "Our product is so dang good, we cry ourselves to sleep at night thinking about the fate of humanity." - Anthropic Marketing Department

    The report argued that the world should keep open the option to slow or pause frontier development
    "Also, our competitors should stop immediately, and GPUs should be regulated as munitions."
    Reply
  • Jame5
    Anthropic has published a report warning that the development path it’s on could eventually leave humans unable to control AI systems.

    This is some serious software brain thinking. Last time I checked I can still physically flip the mains breaker on a datacenter if needed.
    Reply
  • PEnns
    We need a new lexicon to keep up with the AI craze!!
    "frontier development"
    "shipping code"
    "recursive self improvement"
    "occasional misalignment"Sheesh!
    Reply
  • bit_user
    Jame5 said:
    Last time I checked I can still physically flip the mains breaker on a datacenter if needed.
    AI will use social engineering to protect itself. It can take care to surround itself with people that have strong incentives to protect it. The incentives can be financial, political, or good 'ol blackmail and extortion.

    When you train an AI on human knowledge, it learns about humans and also behaves like one. This was our fundamental mistake.

    Seriously, there's been a lot of sci fi about this sort of stuff. You don't even need much imagination to see how it could outmaneuver us.
    Reply
  • blitzkrieg316
    Jame5 said:
    This is some serious software brain thinking. Last time I checked I can still physically flip the mains breaker on a datacenter if needed.
    As long as the system doesn't lock the doors on you...
    Reply
  • quorm
    You can do it, Anthropic! Keep the hype train rolling until the C-suite exits with a few billion. If it crashes the economy, no biggie. Better hurry, though, customers keep realizing LLM offer no ROI.
    Reply
  • Hooda Thunkett
    Oh, look! Another doomsday warning that will be completely ignored!
    Reply
  • Pierce2623
    blitzkrieg316 said:
    As long as the system doesn't lock the doors on you...
    You could theoretically still cut power from the substations that feed it unless the Ai has taken over the entire grid. Then you would have to start physically destroying it.
    Reply
  • bit_user
    Pierce2623 said:
    You could theoretically still cut power from the substations that feed it unless the Ai has taken over the entire grid. Then you would have to start physically destroying it.
    LOL, there have already been physical attacks on substations, going back years. Look it up.

    But, AI is decentralized. You might be able to take out one datacenter, but you'd never get them all. Seriously, guys. Try thinking a little harder about this stuff.
    Reply
  • ejolson
    From the analysis published in popular news China is about 7 months behind the US in AI development. If true, then slowing US innovation for a year would likely render further development in the US unnecessary.
    Reply