DeepSeek reportedly urged by Chinese authorities to train new model on Huawei hardware; after multiple failures, R2 training to switch back to Nvidia hardware while Ascend chips handle inference

DeepSeek logo on an iPhone
(Image credit: Getty / iStock Editorial)

A new report claims that after successfully training its R1 model on Nvidia hardware, DeepSeek was urged by Chinese authorities to switch to using Huawei Ascend-based hardware for its next model. However, according to the Financial Times, training for R2 was met with persistent Huawei hardware failures, delaying the release of the model. DeepSeek was reportedly forced to switch back to Nvidia chips for training while using Huawei's for inference.

Following the success of R1, Chinese authorities allegedly encouraged DeepSeek to rely on Huawei's Ascend-based platforms instead of Nvidia for training, according to three individuals with knowledge of the matter cited by the FT. DeepSeek followed that advice during R2's development, but the move quickly ran into several issues, including unstable performance, slower chip-to-chip connectivity, and limitations of Huawei's CANN software toolkit.

DeepSeek reportedly trained its R1 model on a cluster of 50,000 Hopper-series GPUs (30,000 HGX H20 units, 10,000 H800s, and 10,000 H100s) that were supplied through its investor, High-Flyer Capital Management. Naturally, R2 will require a substantially more powerful cluster for training, so DeepSeek and its backer will have to find that capacity somewhere (which may not be that hard, given the number of AI data centers in China).

There might be another issue, though. Reports indicate that DeepSeek's AI platform is tuned specifically for Nvidia hardware, which not only leaves the company vulnerable to the availability of Nvidia GPUs but also makes its clients dependent on the supply of AI accelerators like Nvidia's HGX H20. For that reason, it is crucial for DeepSeek to make R2 inference work on domestic hardware platforms, such as Huawei's Ascend.

Anton Shilov
Contributing Writer

Anton Shilov is a contributing writer at Tom’s Hardware. Over the past couple of decades, he has covered everything from CPUs and GPUs to supercomputers, and from modern process technologies and the latest fab tools to high-tech industry trends.