Moore Threads GPUs allegedly show 'excellent' inference performance with DeepSeek models
But no performance numbers published.

One of the breakthroughs of DeepSeek's open source AI models is that they can be run locally using relatively inexpensive hardware, like the Raspberry Pi.
As it turns out, the DeepSeek V3 and R1 models can even be run on Moore Threads GPUs developed in China, reports ITHome. If true, this is a major achievement for DeepSeek, the hardware designer, and China as this potentially opens new doors for Moore Threads and reduces reliance of DeepSeek and China on Nvidia hardware.
Moore Threads reportedly says it has successfully deployed the DeepSeek-R1-Distill-Qwen-7B distilled model on its MTT S80 client graphics card and its MTT S4000 datacenter-grade graphics card. The company used Ollama, a lightweight framework that lets users run large language models directly on their macOS, Linux, and Windows machines, together with an optimized inference engine to achieve 'high' performance.
Although the report claims 'excellent' and 'high' performance when describing how the MTT S80 and MTT S4000 perform with the DeepSeek-R1-Distill-Qwen-7B distilled model, it does not specify actual performance numbers or make comparisons to other hardware. As a result, it is impossible to evaluate the claims. Furthermore, because the MTT S80 is barely available outside of China, it is impossible to verify them independently.
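For context on what a published number could look like, here is a minimal sketch of how decode throughput is commonly derived from an Ollama generation response. It assumes the response carries `eval_count` (tokens generated) and `eval_duration` (generation time in nanoseconds), as Ollama's generate API reports them. The helper name `tokens_per_second` and the sample values are hypothetical and are not measurements of any Moore Threads, Nvidia, AMD, or Apple hardware.

```python
def tokens_per_second(response: dict) -> float:
    """Return decode throughput for an Ollama-style generate response.

    Assumes `eval_count` is the number of generated tokens and
    `eval_duration` is the time spent generating them, in nanoseconds.
    """
    eval_count = response["eval_count"]
    eval_duration_ns = response["eval_duration"]
    if eval_duration_ns <= 0:
        raise ValueError("eval_duration must be positive")
    # Convert nanoseconds to seconds before dividing.
    return eval_count / (eval_duration_ns / 1e9)


# Made-up illustrative values, not a benchmark result.
sample = {"eval_count": 200, "eval_duration": 4_000_000_000}
print(f"{tokens_per_second(sample):.1f} tokens/s")
```

A figure like this, reported alongside the model, quantization, and prompt length, is the kind of data that would make the claims comparable across GPUs.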
Ollama supports models like Llama 3.3, DeepSeek-R1, Phi-4, Mistral, and Gemma 2, enabling their efficient local execution without relying on cloud-based services. Ollama is developed primarily for macOS and uses Metal for Apple GPU acceleration, CUDA for Nvidia GPU acceleration, and ROCm for AMD GPU acceleration.
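As a rough illustration of local execution, the sketch below builds a request for Ollama's local HTTP API using only the Python standard library. It assumes an Ollama server is listening on its default address (`http://localhost:11434`) and that a model has been pulled under the tag `deepseek-r1:7b`. That tag and the helper names `build_request` and `generate` are assumptions for illustration, and nothing here reflects how Moore Threads configured its own deployment.

```python
import json
import urllib.request

# Default local endpoint of an Ollama server (assumed to be running).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, model: str = "deepseek-r1:7b") -> urllib.request.Request:
    """Build a non-streaming generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = "deepseek-r1:7b") -> str:
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, model)) as resp:
        return json.loads(resp.read())["response"]


# Example call (requires a running server and a pulled model):
# print(generate("Summarize what a distilled model is in one sentence."))
```

Because the request goes to localhost, no cloud service is involved; which GPU backend handles the work depends on how the local Ollama build was compiled.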
Officially, Ollama does not support Moore Threads's GPUs, but the company claims that its graphics processors can execute code compiled for CUDA GPUs. According to the report, the results confirmed that Moore Threads's GPUs are indeed compatible with CUDA and suitable for AI workloads, particularly in Chinese-language applications.
To further enhance performance, Moore Threads employed a proprietary inference engine featuring custom computational optimizations and improved memory management. This software-hardware integration significantly boosts computing performance and resource efficiency, ensures a smooth deployment process, and supports future AI models, according to the report. Of course, we are talking about a distilled model, so for now we cannot really compare the performance of Moore Threads GPUs with that of solutions from AMD, Apple, or Nvidia.
Anton Shilov is a contributing writer at Tom’s Hardware. Over the past couple of decades, he has covered everything from CPUs and GPUs to supercomputers and from modern process technologies and latest fab tools to high-tech industry trends.
hotaru251
that Moore Threads's GPUs are indeed compatible with CUDA
Nvidia about to have a panic attack as they really hate CUDA being used on anything but their GPU's
YSCCC
Yea right, suddenly all Chinese made hardware excels in Deepseek and is catching up Nvidia, very legit, how about stop scalping foreign 5090s and sell to China?
Gururu
Now it’s clear why Jensen has been intent on fighting against the sanctions prohibiting sales to China. It was only a matter of time until nVidia hardware was assimilated.
AkroZ
On the paper the MTT S80 is a good hardware for only $250, like Intel the issue is on the drivers, the game support is not good if you are not only playing the major competitves games like LoL. This have improved, netherless their main focus is on datacenters and AI, they made many partnerships and improvements to provide a full software suite and hardware compatibilities like with ARM cpu and various cloud solutions.
regs01
AkroZ said: On the paper the MTT S80 is a good hardware for only $250, like Intel the issue is on the drivers, the game support is not good if you are not only playing the major competitves games like LoL. This have improved, netherless their main focus is on datacenters and AI, they made many partnerships and improvements to provide a full software suite and hardware compatibilities like with ARM cpu and various cloud solutions.
You don't need gaming drivers for computations.