DeepSeek reportedly urged by Chinese authorities to train new model on Huawei hardware — after multiple failures, R2 training to switch back to Nvidia hardware while Ascend GPUs handle inference
DeepSeek reportedly could not run a successful training run on Huawei's Ascend chip

A new report claims that after successfully training its R1 model on Nvidia hardware, DeepSeek was urged by Chinese authorities to switch to using Huawei Ascend-based hardware for its next model. However, according to the Financial Times, training for R2 was met with persistent Huawei hardware failures, delaying the release of the model. DeepSeek was reportedly forced to switch back to Nvidia chips for training while using Huawei's for inference.
Following the success of R1, Chinese authorities allegedly encouraged DeepSeek to rely on Huawei's Ascend-based platforms instead of Nvidia for training, according to three individuals with knowledge of the matter cited by FT. DeepSeek followed that advice during R2's development, but the move quickly ran into a series of problems, including unstable performance, slower chip-to-chip connectivity, and limitations of Huawei's CANN software toolkit.
As a result, DeepSeek reverted to using Nvidia's AI accelerators for training the R2 model, while keeping Huawei's hardware for inference. This mixed approach was a compromise born of necessity rather than preference. Still, given the shortage of Nvidia processors in China, it makes sense to ensure that a new AI model runs on Huawei hardware, as many of DeepSeek's customers will deploy R2 on such platforms.
Huawei reportedly sent a team of engineers to DeepSeek's data centers to try to resolve the training problems. Despite their presence, the company has reportedly never managed a fully successful training run on the Ascend platform. Efforts continue to make the new model compatible with Ascend for inference purposes.
The inability to complete training on Ascend was a primary factor behind delaying the R2 launch from its planned May date, a source familiar with the project told FT. However, the shortage of high-performance Nvidia GPUs in China also affected the R2 schedule, according to a previous report. It is still unknown whether R2 has already been fully pre-trained.
DeepSeek reportedly trained its R1 model on a cluster of 50,000 Hopper-series GPUs — made up of 30,000 HGX H20 units, 10,000 H800s, and 10,000 H100s — that was supplied through its investor, High-Flyer Capital Management. Naturally, R2 will require a substantially more powerful cluster for training, so DeepSeek and its backer will have to secure one somewhere (which may not be that hard, given the number of AI data centers in China).
There might be another issue, though. Reports indicate that DeepSeek's AI platform is tuned specifically for Nvidia hardware, which not only leaves the company vulnerable to the availability of Nvidia GPUs but also makes its clients dependent on the supply of AI accelerators like Nvidia's HGX H20. For this reason, it is crucial for DeepSeek to make R2 inference work on domestic hardware platforms, such as Huawei's Ascend.

Anton Shilov is a contributing writer at Tom's Hardware. Over the past couple of decades, he has covered everything from CPUs and GPUs to supercomputers, and from modern process technologies and the latest fab tools to high-tech industry trends.