AI disruptor DeepSeek's next-gen model delayed by Nvidia GPU export restrictions to China — short supply of AI GPUs hinders development

Nvidia Hopper HGX H200
(Image credit: Nvidia)

DeepSeek attracted a lot of attention with its R1 AI model earlier this year, but development of the next-generation R2 model appears to have stalled due to a shortage of Nvidia's H20 processors in China, reports The Information. DeepSeek itself has not commented on when its R2 model is set to be available.

DeepSeek trained its R1 model on a cluster of 50,000 Hopper GPUs (30,000 H20s, 10,000 H800s, and 10,000 H100s) obtained by its investor, High-Flyer Capital Management. It is unclear whether R2 has already been fully pre-trained. Citing two individuals familiar with the project, The Information reports that the DeepSeek team has been working intensively on the model, but CEO Liang Wenfeng is not yet satisfied with its capabilities. Work continues internally to improve performance before the model is cleared for deployment.

R1 was quickly and widely adopted by a range of users, including private startups, big companies, and government-affiliated groups. Most of these users ran the model on Nvidia's H20 processors. Now that H20 shipments are restricted, the resulting shortage is already causing problems, limiting how R1 is used today and making it harder to prepare for the launch of R2, according to The Information report.

Should DeepSeek's upcoming R2 model surpass the capabilities of currently available open alternatives, usage is expected to surge beyond what Chinese cloud platforms can handle, according to staff at those firms cited by The Information. Most organizations relying on the earlier R1 model are said to operate it using Nvidia's H20 processors, which are now in short supply.

The U.S. government restricted sales of Nvidia's H20 processors for AI training and inference in mid-April. Although the H20 is a severely cut-down version of the popular H100 GPU, Chinese AI companies rely on Nvidia's CUDA software stack, which made it quite a popular product in the People's Republic, with Nvidia selling billions of dollars' worth of H20 processors every quarter.

DeepSeek's AI software is reportedly optimized for Nvidia's hardware, which makes the company particularly vulnerable to U.S. policy decisions. Although the company claims to have developed its models using far fewer resources than U.S. companies like OpenAI, the recent export curbs highlight a critical weakness: China's top AI companies remain heavily dependent on American hardware. Meanwhile, OpenAI has unofficially accused DeepSeek of using its proprietary models during the development of R1, although the company has not addressed these claims publicly.

Anton Shilov
Contributing Writer

Anton Shilov is a contributing writer at Tom’s Hardware. Over the past couple of decades, he has covered everything from CPUs and GPUs to supercomputers and from modern process technologies and latest fab tools to high-tech industry trends.

  • ThisIsMe
    Thought they claimed they didn’t need NVidia hardware. Weird…
  • rm12
    Well, hope they don't have incentives to develop/copy nvidia's hardware.
    So rare metals and magnets will become very rare outside of china?