China's secretive Tianhe 3 supercomputer uses homegrown hybrid CPU — rivals US systems with 1.57 Exaflops of performance: Report
Tianhe 3 could achieve peak performance of 1.57 ExaFLOPS.
One of the mysteries surrounding China's Tianhe 3 (Xingyi) supercomputer is what hardware is used to build it. A recent article by The Next Platform has shed light on the MT-3000 processor designed by the National University of Defense Technology (NUDT). As it turns out, the MT-3000 features a unique heterogeneous architecture that includes general-purpose CPU cores, control cores, and matrix accelerator cores.
NUDT's MT-3000 processor features a multi-zone structure that packs 16 general-purpose CPU cores with 96 control cores and 1,536 accelerator cores, according to The Next Platform. The MT-3000 processor reportedly achieves 11.6 FP64 TFLOPS of peak performance and demonstrates a power efficiency of 45.4 GigaFLOPS/Watt at an operational frequency of 1.20 GHz.
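As a rough sanity check on the figures above, the quoted peak throughput, power efficiency, and clock speed imply a per-chip power draw and a number of FP64 operations per clock cycle. The sketch below only divides the reported numbers; the function names are illustrative, and it assumes the efficiency figure applies at peak throughput, which the source does not confirm.

```python
# Figures as quoted in the article (attributed to The Next Platform).
PEAK_FP64_FLOPS = 11.6e12           # 11.6 TFLOPS peak per MT-3000
EFFICIENCY_FLOPS_PER_WATT = 45.4e9  # 45.4 GFLOPS/W
CLOCK_HZ = 1.20e9                   # 1.20 GHz


def implied_chip_power_watts(peak_flops: float, flops_per_watt: float) -> float:
    """Power implied by dividing peak throughput by power efficiency.

    Assumption: the efficiency figure was measured at peak throughput.
    """
    return peak_flops / flops_per_watt


def implied_flops_per_cycle(peak_flops: float, clock_hz: float) -> float:
    """FP64 operations the whole chip must complete per clock to hit peak."""
    return peak_flops / clock_hz


chip_power_w = implied_chip_power_watts(PEAK_FP64_FLOPS, EFFICIENCY_FLOPS_PER_WATT)
flops_per_cycle = implied_flops_per_cycle(PEAK_FP64_FLOPS, CLOCK_HZ)

print(f"Implied power per chip: {chip_power_w:.1f} W")
print(f"Implied FP64 operations per clock (whole chip): {flops_per_cycle:.0f}")
```

Under these assumptions, the numbers work out to roughly 255 W per chip and about 9,667 FP64 operations per clock across the whole processor.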
The key aspect of the MT-3000 architecture is that it packs both general-purpose and matrix acceleration cores into the same piece of silicon. To some degree, this integration mirrors the design philosophy behind AMD's Instinct MI300A CPU-GPU hybrid, suggesting a shift away from conventional discrete CPU-GPU systems towards more cohesive and efficient designs. Meanwhile, unlike AMD's multi-chiplet Instinct MI300A, the MT-3000 looks to be a monolithic design.
Being a heavily packed CPU, the MT-3000 has to be made on an advanced process technology, which could range from 14nm through 10nm and potentially down to 7nm, The Next Platform suggests. The chip is likely made by Chinese foundry SMIC, which also produces the rather advanced HiSilicon Kirin 9000S processor for Huawei using its second-gen 7nm process technology. However, it is unclear whether SMIC has enough 7nm production capacity to make chips for both Huawei and NUDT. As a result, it is possible that the MT-3000 uses a different production node.
With the MT-3000 at its core, the Tianhe-3 is believed to achieve unprecedented computational performance, potentially reaching 1.57 ExaFLOPS on LINPACK benchmarks. This projection not only highlights the processor's central role in advancing China's supercomputing capabilities but also signals a significant improvement for the country. In comparison, Frontier, the fastest supercomputer in the US, reaches 1.102 ExaFLOPS of performance.
The MT-3000 processor represents a leap forward in high-performance computing (HPC) technology for China. With its hybrid architecture, high power efficiency, and potentially a very sophisticated production node, the MT-3000 looks to be a competitive chip, potentially positioning China at the forefront of global HPC development.
Anton Shilov is a contributing writer at Tom’s Hardware. Over the past couple of decades, he has covered everything from CPUs and GPUs to supercomputers and from modern process technologies and latest fab tools to high-tech industry trends.
-
bit_user
"The key aspect of the MT-3000 architecture is that it packs both general-purpose and matrix acceleration cores into the same piece of silicon. To some degree, this integration mirrors the design philosophy behind AMD's Instinct MI300A CPU-GPU hybrid, suggesting a shift away from conventional discrete CPU-GPU systems towards more cohesive and efficient designs."
I'd argue there's no imitation or shift, here.
I'm immediately reminded of the Sunway SW26010, powering their TaihuLight supercomputer:
https://www.nextplatform.com/2021/02/10/a-sneak-peek-at-chinas-sunway-exascale-supercomputer/
I had even guessed the Tianhe-3 is using a derivative or descendant of these Sunway processors, although it looks like the MT-3000 is of a different lineage, in spite of its apparent similarities.
I would also point out that Japan's Fugaku supercomputer opted for a pure CPU approach, utilizing Fujitsu's entirely-custom A64FX ARM processors, with 512-bit SVE pipelines, instead of GPUs.
https://www.anandtech.com/show/13258/hot-chips-2018-fujitsu-afx64-arm-core-live-blog -
Evildead_666
gg83 said: Those must be very expensive and energy hungry chips.
45 GFLOPS/watt and 1.5 ExaFLOPS total.
I can't do this on my phone right now, but it's doable to work out the approximate total wattage of the system.
Then you just have to know how many of these MT-3000s there are in the system. -
P1nky
Evildead_666 said: 45 GFLOPS/watt and 1.5 ExaFLOPS total. I can't do this on my phone right now
You can use your phone to ask any AI LLM assistant to calculate the answer for you. The supercomputer uses 34.6 MW. -
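The back-of-the-envelope estimate discussed in the two comments above can be sketched in a few lines. This is only an illustration using the figures quoted in the article: it assumes the chip-level efficiency of 45.4 GFLOPS/W holds for the whole machine (ignoring interconnect, memory, storage, and cooling) and that every chip runs at its 11.6 TFLOPS peak, so the chip count is a lower bound. The function names are hypothetical.

```python
import math

# Figures as quoted in the article.
SYSTEM_FLOPS = 1.57e18              # 1.57 ExaFLOPS for Tianhe-3
EFFICIENCY_FLOPS_PER_WATT = 45.4e9  # 45.4 GFLOPS/W (chip-level figure)
CHIP_PEAK_FLOPS = 11.6e12           # 11.6 TFLOPS peak per MT-3000


def system_power_megawatts(system_flops: float, flops_per_watt: float) -> float:
    """Power estimate, assuming chip-level efficiency applies system-wide."""
    return system_flops / flops_per_watt / 1e6


def minimum_chip_count(system_flops: float, chip_peak_flops: float) -> int:
    """Lower bound on chip count, assuming every chip sustains its peak."""
    return math.ceil(system_flops / chip_peak_flops)


power_mw = system_power_megawatts(SYSTEM_FLOPS, EFFICIENCY_FLOPS_PER_WATT)
chips = minimum_chip_count(SYSTEM_FLOPS, CHIP_PEAK_FLOPS)

print(f"Estimated processor power: {power_mw:.1f} MW")
print(f"Minimum number of MT-3000 chips: {chips:,}")
```

With these inputs the estimate comes to about 34.6 MW, matching the figure given above, and at least 135,345 processors. A real system would need more chips and more power, since sustained performance falls below peak and the estimate covers processors only.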
Gururu
Do any countries other than the U.S. and China have their own homemade supercomputers, or does everyone else, like Russia, just use U.S. hardware? -
Pierce2623
I see the new Chinese “7nm” called an advanced node and it’s pretty funny. Its performance/efficiency falls in between TSMC 10nm and 14nm. It’s also MUCH more expensive than even TSMC 7nm. Then the “5nm” is going to be 50% more expensive than TSMC 5nm, yet it will perform like TSMC 10nm if they’re lucky. -
The Historical Fidelity
I'm gonna wait until actual verified figures come out for this supercomputer. As of right now, this article's filled with hearsay numbers and napkin math. -
bit_user
Gururu said: Do any other countries other than U.S. and China have their own homemade supercomputers or does everyone else like Russia just use U.S. hardware?
Well, I did just link to the Fujitsu A64FX, powering Japan's Fugaku supercomputer. After the USA and maybe China, I think Japan has probably the most self-sufficient HPC industry, and it's not just Fujitsu. There's also:
https://www.preferred.jp/en/projects/supercomputers/
https://www.pezy.co.jp/en/products/pezy-sc3/
However, PEZY - the most interesting, with an exotic RF-coupled alternative to HBM - got hit with some massive corruption case that seems to have derailed their business, at least for quite a while. As for Preferred Networks, I'm not sure if their processors are for sale outside Japan.
I think Japan has really struggled to build a homegrown computer industry. Mainframes and HPC seem to have been the areas of greatest success, perhaps largely because they tend to be heavily-subsidized and the purchasing is more nationalistic.
Europe is attempting to achieve its own degree of self-sufficiency, here:
https://www.european-processor-initiative.eu/general-purpose-processor/
Specs-wise, I think it's fairly underwhelming. That doesn't mean it's pointless, however.
The desire to "buy local" might also be one of the main things keeping Tachyum alive, which claims to be building an HPC-capable cloud processor:
https://www.tomshardware.com/pc-components/cpus/tachyum-prodigy-production-starts-in-2024 -
bit_user
The Historical Fidelity said: Im gonna wait until actual verified figures come out for this supercomputer. As of right now, this articles filled with hearsay numbers and napkin math.
China has stopped submitting entries to the Top500 list. So, their developments are going to remain more in the unofficial domain. -
aldaia
Gururu said: Do any other countries other than U.S. and China have their own homemade supercomputers or does everyone else like Russia just use U.S. hardware?
Strictly speaking, no country in the world has a homemade supercomputer. China is probably the one closest to that idea, although I doubt that DRAM and NAND are produced locally.
Let's take Frontier, the #1 Top500 supercomputer. It's based on AMD EPYC CPUs and AMD Instinct GPUs. AMD may be a US company, but the actual chips are fabbed by TSMC in Taiwan. Not to mention the huge amount of DRAM and NAND chips required, which are most probably produced in South Korea.
If we take #2, things get a bit better. The Intel Xeon CPU is probably fabbed in the US; however, as far as I know, the Intel GPUs that power it are fabbed by TSMC, not Intel. And again, don't forget the DRAM and NAND chips.
The CPU/GPU may be the glamorous part of a supercomputer, but it's only the tip of the iceberg. DRAM (for both CPU and GPU) and NAND for storage represent the majority of the silicon.