DeepSeek’s new AI model debuts with support for China-native chips and CANN, a replacement for Nvidia's CUDA — Chinese chipmakers Huawei, Cambricon, and Hygon get first-class support

DeepSeek logo on an iPhone
(Image credit: Getty / iStock Editorial)

Chinese AI firm DeepSeek has released its latest large language model, DeepSeek-V3.2-Exp, with day-one optimizations for Huawei’s Ascend hardware and CANN software stack. The launch signals a shift in priorities: making sure leading-edge models run on domestic accelerators rather than relying on Nvidia’s CUDA ecosystem.

DeepSeek announced the model on September 29, posting code and checkpoints to Hugging Face alongside a technical report. The company describes V3.2-Exp as an “intermediate step toward our next-generation architecture,” designed to cut costs on long-context inference. It features a sparse attention mechanism that trims memory and compute requirements while maintaining output quality.
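To give a rough sense of why sparse attention cuts costs, here is a minimal Python sketch of top-k sparse attention for a single query. It is an illustration only: top-k selection is just one common form of sparse attention, and nothing here is taken from DeepSeek's code or technical report. The function name `topk_sparse_attention` and its parameters are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def topk_sparse_attention(query, keys, values, k):
    """Attend only to the k keys that score highest against the query.

    Hypothetical helper for illustration; not DeepSeek's mechanism.
    Returns the output vector and the sorted indices that were kept.
    """
    scale = 1.0 / math.sqrt(len(query))
    scores = [dot(query, key) * scale for key in keys]

    # Keep the k best-scoring positions and ignore the rest.
    keep = sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)[:k]

    # Softmax and the weighted sum run over k entries, not all of them.
    weights = softmax([scores[i] for i in keep])
    out = [0.0] * len(values[0])
    for weight, i in zip(weights, keep):
        for d, component in enumerate(values[i]):
            out[d] += weight * component
    return out, sorted(keep)


if __name__ == "__main__":
    keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [2.0, 0.0]]
    values = [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0], [0.0, 2.0]]
    output, kept = topk_sparse_attention([1.0, 0.0], keys, values, k=2)
    print("kept positions:", kept)
    print("output:", output)
```

This toy version still scores every key before choosing which to keep, so it only saves work in the softmax and the value aggregation. A production system would need a cheaper selection step to see real gains on long contexts.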

Huawei’s Ascend team and the wider vLLM-Ascend community moved swiftly to integrate DeepSeek-V3.2-Exp. In the vLLM-Ascend repo, a new issue outlines custom operator installation steps and kernel packaging for Ascend NPUs to support V3.2-Exp. The CANN team also published an inference recipe, positioning the model for immediate deployment across Huawei hardware.

Meanwhile, SGLang confirmed V3.2-Exp support across multiple back ends, including Ascend, while DeepSeek’s GitHub notes suggest parity with vLLM at launch. DeepSeek itself references both TileLang and CUDA kernels in its announcements, urging researchers to use TileLang for prototyping. In practice, that means the same model artifact can be deployed across Nvidia and Chinese accelerators with only minimal graph changes.

The speed of adoption illustrates how China’s AI ecosystem is preparing for a future in which access to Nvidia hardware cannot be taken for granted. Nvidia’s CUDA remains dominant for both training and inference, but DeepSeek’s latest release appears to be one of the first from a major Chinese company to arrive optimized for non-CUDA stacks on day one.

The coordinated effort across Ascend, Cambricon, and Hygon is the clearest sign to date that Chinese firms are taking Beijing’s demands for AI sovereignty seriously. They are not just making their hardware compatible after the fact, but positioning domestic platforms as first-class targets.

Luke James
Contributor

Luke James is a freelance writer and journalist. Although his background is in law, he has a personal interest in all things tech, especially hardware and microelectronics, and anything regulatory.

  • redgarl
    That is going to be a catastrophe... let's talk about this in 6 to 12 months. This is nothing more than a gamble from the CCP in a fake-it-until-you-make-it kind of situation.

    The only good news I see is that CUDA will soon not be a consideration anymore for training.
  • nookoool
    redgarl said:
    That is going to be a catastrophe... let's talk about this in 6 to 12 months. This is nothing more than a gamble from the CCP in a fake-it-until-you-make-it kind of situation.

    The only good news I see is that CUDA will soon not be a consideration anymore for training.

    Literally every tech company does "fake it till you make it". If China maintains the Nvidia ban, companies have no choice but to eventually move on.
  • zsydeepsky
    Not only DeepSeek, though.

    Recently, quite a few Chinese models have been released, and almost all of them implement "first-class citizen" support for Chinese-made GPUs/NPUs and software stacks. Qwen3-Max, GLM-4.6... just to name a few.

    So... the ecosystem migration isn't just starting; it's been going on for quite a while now.

    Like nookoool mentioned above, they don't really have a choice; that's the only way they can go.