Chinese chipmaker tapes out 16-core DragonChain-powered CPU, 64-core coming — Loongson LS3C6000 server processor will rival Zen 3 CPUs

(Image credit: Loongson)

Loongson has taped out its 16-core LS3C6000 processor, designed primarily for servers, reports MyDrivers. The new CPU is based on the latest iteration of the LoongArch instruction set architecture, and Loongson believes that its latest LA664 core can offer performance comparable to AMD's Zen 3 cores.

The Loongson LS3C6000 processor packs 16 LA664 cores that support simultaneous multithreading (SMT); their operating frequency has not been disclosed. The cores are interconnected using the company's proprietary DragonChain technology, which is meant to resolve the bottlenecks associated with scaling up the number of processor cores and to enable the company to build 32-core and 64-core processors in the future.

Loongson expects that its upcoming CPUs based on the LoongArch 6000 architecture will match the instructions per clock (IPC) performance of AMD's Zen 3 cores. That would be a significant milestone for Loongson, as its current CPUs trail those from AMD and Intel, and it could position these processors as viable competitors to AMD's 3rd Generation EPYC processors and Intel's Xeon CPUs.

(Image credit: EET-China)

However, IPC parity alone does not ensure that the 3A6000/3C6000/3D6000 processors will be competitive against AMD's Ryzen 5000-series and 3rd Gen EPYC parts, even with similar core counts. Clock speed and platform features such as the memory subsystem will be crucial in determining overall performance.
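To see why equal IPC does not settle the comparison, consider a rough first-order model in which peak instruction throughput is simply IPC multiplied by clock speed and core count. The sketch below is only an illustration under that assumption: the function name and every number in it are hypothetical, not measured or announced figures for any Loongson or AMD part, and the model deliberately ignores memory bandwidth, caches, and multi-core scaling losses.

```python
def peak_throughput(ipc: float, clock_ghz: float, cores: int) -> float:
    """Idealized peak throughput in billions of instructions per second.

    First-order model only: ignores the memory subsystem, cache behaviour,
    and any loss of efficiency as core counts grow.
    """
    return ipc * clock_ghz * cores


# Hypothetical figures for illustration only (not real product specifications).
chip_a = peak_throughput(ipc=1.0, clock_ghz=2.5, cores=16)
chip_b = peak_throughput(ipc=1.0, clock_ghz=3.5, cores=16)

# Same IPC and core count, but the lower clock caps chip A well below chip B.
ratio = chip_a / chip_b
print(f"Chip A reaches {ratio:.0%} of chip B's peak throughput at equal IPC")
```

In this simplified model, two chips with identical IPC and core counts still differ in throughput in direct proportion to their clock speeds, which is why the undisclosed frequency of the LS3C6000 matters so much.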

Meanwhile, Loongson is doing a lot to make its LA664 cores in particular, and LoongArch-based CPUs in general, more competitive. In addition to enabling SMT on its latest cores, Loongson last year also introduced support for 128-bit vector processing extension instructions (LSX) and 256-bit advanced vector processing extension instructions (LASX) for its new CPUs. While LSX and LASX are part of the LoongArch architecture found in the current 3A5000-series processors, it is unclear whether they were previously activated and, if so, what impact they had on performance.
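The practical meaning of the 128-bit and 256-bit widths is the number of data elements a single vector instruction can process at once. The short sketch below simply divides the register width by the element size; the dictionary and function names are made up for this illustration, and it says nothing about how Loongson's hardware actually executes these instructions.

```python
# Register widths in bits, as given in the article for each extension.
VECTOR_BITS = {"LSX": 128, "LASX": 256}


def lanes(extension: str, element_bits: int) -> int:
    """Number of elements of the given size that fit in one vector register."""
    return VECTOR_BITS[extension] // element_bits


for extension in VECTOR_BITS:
    for element_bits in (8, 16, 32, 64):
        count = lanes(extension, element_bits)
        print(f"{extension}: {count} x {element_bits}-bit elements per register")
```

For example, a 128-bit LSX register holds four 32-bit values, while a 256-bit LASX register holds eight, so code that vectorizes well can in principle do twice the work per instruction with LASX.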

Taping out the 16-core LS3C6000 processor is a big deal for Loongson, as the company needs higher-performing CPUs to compete against offerings from AMD and Intel in the Chinese market. It remains to be seen whether Loongson's 6000-series processors with 16, 32, and 64 cores can offer decent performance in the high-performance computing (HPC) space.

Anton Shilov
Freelance News Writer

Anton Shilov is a Freelance News Writer at Tom’s Hardware US. Over the past couple of decades, he has covered everything from CPUs and GPUs to supercomputers and from modern process technologies and latest fab tools to high-tech industry trends.

  • Shirley Marquez
    Another question is whether compilers for Loongson's instruction set (which from the reports I have seen is a fork of MIPS) are mature yet. Even if their CPUs are theoretically equal in performance to x86 and ARM, they could be held back by inadequate compilers that do not optimize well.

    Nonetheless, it's a significant achievement for Loongson.
  • bit_user
    Shirley Marquez said:
    Another question is whether compilers for Loongson's instruction set (which from the reports I have seen is a fork of MIPS) are mature yet.
    They've been upstreaming GCC patches, for a couple years at least. Decent GCC support is needed to build the Linux kernel (though, it also compiles with LLVM, on at least some architectures). I don't know where LLVM support for Loongson stands, but LLVM doesn't have a major performance lead over GCC, overall.

    Shirley Marquez said:
    Even if their CPUs are theoretically equal in performance to x86 and ARM, they could be held back by inadequate compilers that do not optimize well.
    I think their ISA is similar enough to what has come before that GCC should be pretty good at optimizing for it.