Date: 2024-02-12

One of the mysteries surrounding China's Tianhe-3 (Xingyi) supercomputer is what hardware is used to build it. A recent article by The Next Platform has shed light on the MT-3000 processor designed by the National University of Defense Technology (NUDT). As it turns out, the MT-3000 features a unique heterogeneous architecture that includes general-purpose CPU cores, control cores, and matrix accelerator cores.

NUDT's MT-3000 processor features a multi-zone structure that packs 16 general-purpose CPU cores with 96 control cores and 1,536 accelerator cores, according to The Next Platform. The MT-3000 processor reportedly achieves 11.6 FP64 TFLOPS of peak performance and demonstrates a power efficiency of 45.4 GigaFLOPS/Watt at an operational frequency of 1.20 GHz.
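The quoted figures allow a quick back-of-the-envelope check. The sketch below derives the power draw implied by the peak throughput and the efficiency figure, along with an average FP64 rate per accelerator core per clock. It assumes that both reported numbers refer to the same operating point and that all of the peak throughput comes from the 1,536 accelerator cores. Neither assumption is confirmed by the report, so the results are rough estimates rather than specifications.

```python
# Figures as quoted in the article.
PEAK_FLOPS = 11.6e12              # 11.6 FP64 TFLOPS peak
EFFICIENCY_FLOPS_PER_W = 45.4e9   # 45.4 GigaFLOPS/Watt
CLOCK_HZ = 1.20e9                 # 1.20 GHz
ACCEL_CORES = 1536                # accelerator cores


def implied_power_watts(peak_flops: float, flops_per_watt: float) -> float:
    """Power draw implied by a peak rate and an efficiency figure."""
    return peak_flops / flops_per_watt


def flops_per_core_per_cycle(peak_flops: float, cores: int, clock_hz: float) -> float:
    """Average FLOPs per core per clock, assuming (hypothetically) that
    the accelerator cores alone deliver the whole peak figure."""
    return peak_flops / (cores * clock_hz)


power = implied_power_watts(PEAK_FLOPS, EFFICIENCY_FLOPS_PER_W)
per_core = flops_per_core_per_cycle(PEAK_FLOPS, ACCEL_CORES, CLOCK_HZ)

print(f"Implied power draw: {power:.1f} W")
print(f"Average FP64 FLOPs per accelerator core per cycle: {per_core:.2f}")
```

Under these assumptions the chip would draw roughly 255 W at peak, which is in the range of a typical datacenter accelerator.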

The key aspect of the MT-3000 architecture is that it packs both general-purpose and matrix acceleration cores into the same piece of silicon. To some degree, this integration mirrors the design philosophy behind AMD's Instinct MI300A CPU-GPU hybrid, suggesting a shift away from conventional discrete CPU-GPU systems towards more cohesive and efficient designs. Meanwhile, unlike AMD's multi-chiplet Instinct MI300A, the MT-3000 looks to be a monolithic design.

Being a heavily packed CPU, the MT-3000 has to be made on an advanced process technology, which could range from 14nm to 10nm and potentially down to 7nm, The Next Platform suggests. The chip is likely made by Chinese foundry SMIC, which also produces the rather advanced HiSilicon Kirin 9000S processor for Huawei using its second-gen 7nm process technology. However, it is unclear whether SMIC has enough 7nm production capacity to make chips for both Huawei and NUDT. As a result, it is possible that the MT-3000 uses a different production node.

With the MT-3000 at its core, the Tianhe-3 is believed to reach as much as 1.57 ExaFLOPS on the LINPACK benchmark. If that projection holds, it would highlight the processor's central role in advancing China's supercomputing capabilities and mark a significant step up for the country. For comparison, Frontier, the fastest supercomputer in the US, reaches 1.102 ExaFLOPS.
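To put those numbers in perspective, the sketch below computes the ratio between the projected Tianhe-3 result and Frontier's, plus a lower bound on how many MT-3000 chips such a system would need. The chip count assumes, purely for illustration, that LINPACK runs at 100% of the 11.6 TFLOPS peak, which real systems do not achieve. The actual number of processors is not stated in the report and would be higher.

```python
import math

# Figures as quoted in the article.
TIANHE3_LINPACK_FLOPS = 1.57e18    # projected 1.57 ExaFLOPS
FRONTIER_LINPACK_FLOPS = 1.102e18  # 1.102 ExaFLOPS
MT3000_PEAK_FLOPS = 11.6e12        # 11.6 FP64 TFLOPS per chip

# How much faster the projected Tianhe-3 result is than Frontier's.
ratio = TIANHE3_LINPACK_FLOPS / FRONTIER_LINPACK_FLOPS

# Lower bound on chip count: assumes every chip sustains its full peak
# rate on LINPACK (an idealized, hypothetical 100% efficiency).
min_chips = math.ceil(TIANHE3_LINPACK_FLOPS / MT3000_PEAK_FLOPS)

print(f"Tianhe-3 / Frontier: {ratio:.2f}x")
print(f"Minimum MT-3000 chips at 100% efficiency: {min_chips:,}")
```

Even under this idealized assumption, the system would need well over 100,000 processors, and the projected result would be about 1.4 times Frontier's.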

The MT-3000 processor represents a leap forward in high-performance computing (HPC) technology for China. With its hybrid architecture, high power efficiency, and possibly a very sophisticated production node, the MT-3000 looks to be a competitive chip, one that could position China at the forefront of global HPC development.