Russian Company Tapes Out 16-Core Elbrus CPU: 2.0 GHz, 16 TB of RAM in 4-Way System

(Image credit: MCST)

MCST, a microprocessor developer from Russia, has demonstrated the first engineering sample of its 16-core Elbrus-16C CPU. The processor is an evolution of MCST's proprietary VLIW architecture that adds features like virtualization. The Elbrus-16C is designed primarily for desktops and servers that have to comply with Russia's governmental requirements for security and reliability. 

The MCST Elbrus-16C is based on the company's 6th-gen VLIW microarchitecture that supports hardware virtualization, but it apparently doesn't feature any instructions-per-cycle (IPC) enhancements over the 5th-gen Elbrus microarchitecture. The chip consists of 12 billion transistors and will be fabbed on a 16nm process technology.

The system-on-chip packs 16 cores running at 2 GHz, has an eight-channel DDR4 memory controller, supports 32 PCIe Gen 3 lanes and four SATA 3.0 ports, and integrates 2.5 GbE as well as 10 GbE interfaces. The CPU can address up to 4 TB of DDR4 memory, the same capacity as AMD's EPYC 7002-series processors, but the developer does not disclose which memory modules (RDIMMs or LRDIMMs) it uses. The CPU has a 110W TDP.

As far as performance is concerned, the manufacturer says that its 16-core processor can offer 1.5 FP32 TFLOPS as well as 0.75 FP64 TFLOPS. This is considerably faster than MCST's previous-gen Elbrus-8CB, which reached 576 FP32 GFLOPS and 288 FP64 GFLOPS. However, it is significantly slower than today's leading-edge CPUs, which top out at 2.3 FP64 TFLOPS, or GPUs that can hit 9.7 FP64 TFLOPS.
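
To put those figures side by side, here is a minimal Python sketch that uses only the numbers quoted above. It reads the Elbrus-8CB figures as GFLOPS, the only unit under which the 16-core part comes out faster; that reading, and all variable and function names, are assumptions for illustration rather than anything MCST published.

```python
# Peak figures as quoted in the article, converted to FLOPS.
# Assumption: the Elbrus-8CB numbers are GFLOPS, not TFLOPS.
ELBRUS_16C = {"cores": 16, "clock_hz": 2.0e9, "fp32": 1.5e12, "fp64": 0.75e12}
ELBRUS_8CB = {"fp32": 576e9, "fp64": 288e9}
LEADING_CPU_FP64 = 2.3e12  # "today's leading-edge CPUs"
LEADING_GPU_FP64 = 9.7e12  # "GPUs that can hit 9.7 FP64 TFLOPS"


def flops_per_core_per_cycle(peak_flops: float, cores: int, clock_hz: float) -> float:
    """Throughput each core must sustain per clock to reach the quoted peak."""
    return peak_flops / (cores * clock_hz)


# Generation-over-generation gain (16C versus 8CB).
gain_fp32 = ELBRUS_16C["fp32"] / ELBRUS_8CB["fp32"]
gain_fp64 = ELBRUS_16C["fp64"] / ELBRUS_8CB["fp64"]

# Per-core, per-cycle throughput implied by the quoted 16C peaks.
per_cycle_fp32 = flops_per_core_per_cycle(
    ELBRUS_16C["fp32"], ELBRUS_16C["cores"], ELBRUS_16C["clock_hz"]
)
per_cycle_fp64 = flops_per_core_per_cycle(
    ELBRUS_16C["fp64"], ELBRUS_16C["cores"], ELBRUS_16C["clock_hz"]
)

# How far ahead the quoted leading CPU and GPU are in FP64.
cpu_lead_fp64 = LEADING_CPU_FP64 / ELBRUS_16C["fp64"]
gpu_lead_fp64 = LEADING_GPU_FP64 / ELBRUS_16C["fp64"]

print(f"16C vs 8CB: {gain_fp32:.2f}x FP32, {gain_fp64:.2f}x FP64")
print(f"Implied per core per cycle: {per_cycle_fp32:.1f} FP32, {per_cycle_fp64:.1f} FP64")
print(f"FP64 lead of leading CPU: {cpu_lead_fp64:.1f}x, leading GPU: {gpu_lead_fp64:.1f}x")
```

Under those assumptions, the generational gain works out to roughly 2.6x, the quoted FP32 peak implies about 47 operations per core per clock, and the leading CPUs and GPUs cited stay about 3x and 13x ahead in FP64.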

One interesting feature of the MCST Elbrus-16C processor is its support for 4-way symmetric multiprocessor configurations, a first for the company. Since every CPU in the system supports up to 4 TB of DDR4 ECC memory, a 4-way Elbrus-16C server can carry up to 16 TB of DRAM in total, something that modern AMD EPYC platforms cannot match (as they don't support 4-socket configurations). Considering the price of DRAM, it is unlikely that Elbrus-16C-powered machines with $232,000 worth of memory ($3,224*18*4) will ever be built in volume. Still, it will at least be possible if someone in a Russian government agency needs to run an application with a huge dataset on a CPU designed in Moscow.
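
The memory arithmetic in that paragraph can be checked with a short Python sketch. The per-module price and the module count per CPU are taken from the article's own formula; the constant names are made up for illustration, and the implied per-module capacity is a derived figure, not something the source states (it assumes 1 TB = 1,024 GB).

```python
# Values taken from the article's text and its formula ($3,224*18*4).
CPUS_PER_SYSTEM = 4        # 4-way SMP configuration
MEMORY_PER_CPU_TB = 4      # each CPU addresses up to 4 TB
MODULE_PRICE_USD = 3224    # price per memory module, per the formula
MODULES_PER_CPU = 18       # module count per CPU, per the formula

# Total addressable memory in a fully populated 4-way system.
total_memory_tb = CPUS_PER_SYSTEM * MEMORY_PER_CPU_TB

# Total module count and memory bill for the whole system.
total_modules = CPUS_PER_SYSTEM * MODULES_PER_CPU
total_cost_usd = MODULE_PRICE_USD * total_modules

# Derived, not from the source: capacity each module would need
# for 18 of them to add up to 4 TB (assuming 1 TB = 1024 GB).
implied_module_gb = MEMORY_PER_CPU_TB * 1024 / MODULES_PER_CPU

print(f"Total memory: {total_memory_tb} TB")
print(f"Modules needed: {total_modules}")
print(f"Memory cost: ${total_cost_usd:,}")
print(f"Implied capacity per module: {implied_module_gb:.0f} GB")
```

The total comes to $232,128, which the article rounds to $232,000. Note that 18 modules per CPU implies roughly 228 GB per module, which is not a standard module size; the source does not say what configuration the price assumes.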

Virtualization and 4-way SMP support will allow Russian server makers to build cloud servers based on the MCST Elbrus-16C parts over time when there are appropriate operating systems available.  

So far, MCST has managed to run its Elbrus Linux operating system on a prototype Elbrus-16C-based server. The company will evaluate samples of the CPU in the coming quarters and expects the chip to be ready for mass production by late 2021.

Anton Shilov
Contributing Writer

Anton Shilov is a contributing writer at Tom’s Hardware. Over the past couple of decades, he has covered everything from CPUs and GPUs to supercomputers and from modern process technologies and latest fab tools to high-tech industry trends.

  • hotaru.hino
    The choice of VLIW for this long makes me curious about its inner workings and how Elbrus managed to succeed where others (namely Intel and Transmeta) failed.

    But alas, I don't think we'll ever know unless someone at Elbrus is willing to defy the Russian government.
  • mlee 2500
    Who is gonna fab it?
  • AlB80
    hotaru.hino said:
    The choice of VLIW for this long makes me curious about its inner workings and how Elbrus managed to succeed where others (namely Intel and Transmeta) failed.
    TL;DR or read chapter 10.1
    But alas, I don't think we'll ever know unless someone at Elbrus is willing to defy the Russian government.
    Yep. This project is supported by the government. But how can a Russian rebel say anything new about the failure of Intel and Transmeta?

    mlee 2500 said:
    Who is gonna fab it?
    Officially TSMC.
  • Chung Leong
    hotaru.hino said:
    The choice of VLIW for this long makes me curious about its inner workings and how Elbrus managed to succeed where others (namely Intel and Transmeta) failed.

    There's little evidence Elbrus succeeded. Fundamental problems with VLIW remain: inability to find parallelism in everyday code, poor code density, low clock compared to OoO design.
  • hotaru.hino
    Chung Leong said:
    There's little evidence Elbrus succeeded. Fundamental problems with VLIW remain: inability to find parallelism in everyday code, poor code density, low clock compared to OoO design.
    I'm seeing modern processor designs not all that much different from VLIW designs anyway, it's just they have a more complicated front-end to satisfy a wide execution unit. The problem with Intel + HP's EPIC was they demanded that software be re-written. Transmeta was just too small to bring their product to maturity. If anything, NVIDIA's Denver and Carmel carried on Transmeta's legacy.

    And by "succeeded", I'm alluding more to the fact they've kept up with that design for so long. If it wasn't successful, then I don't see why they'd bother for six generations. Russia doesn't want Elbrus to export their technology, so I have no delusions of it succeeding in the commercial market in the same vein Intel or ARM has.
  • yaapelsinko
    Elbrus is not about "commercial success", it's a matter of technological independence.
    Thus, Elbrus doesn't exist in the same "dimension" as either ARM or x86. Imagine if Apple, having all its money and access to IT technologies, which are mostly possessed by the West, develops a CPU of its own. Will it become just a third x86 CPU we'll be able to stick in our PCs just like Intel's or AMD's? Uh-huh, fat chance. Chances are Apple's proprietary CPU would remain exclusively inside Apple's ecosystem.
    We don't actually see many companies trying to compete in any CPU market. There are no actual markets, but rather there are niches. Any new competitor for Intel/AMD? Any competition for ARM? Even Intel and NVIDIA failed miserably trying to get into the mobile market.
    General-purpose CPUs do not exist to please fanboys so that they could have it in their "PC-build" or to supply office workers with new electronic typewriters. They did some time ago, but not today. Now there is processing power everywhere: military equipment, industrial machinery, infrastructure management equipment, medicine, data-centers, data-centers, data-centers, consumer devices, oh, and did I mention data-centers?
    Having said all that, what are the actual success criteria for Elbrus?

    The development is funded by the government. The government presents the specs and requirements each CPU generation should meet. If Elbrus meets them => success. Because then it can be used to supply specified equipment with a specified amount of GFLOPs. As the architecture develops and the CPU becomes more and more capable, its possible applications widen.

    To my view, Elbrus was quite successfully developed in that fashion step by step. Elbrus models 8C and 8CB are already good enough to build a workstation or a server for data storage and running web-services. Now, it may not be as fast, or as good, but it doesn't have to. It just needs to be exactly good enough. As for the price, it doesn't even matter. Elbrus CPU being a Russian property is bought with rubles, not USDs. You buy Elbrus - you put the money into the Russian economy, not someone else's.

    As the article says, Elbrus-16C is gonna be suitable for cloud servers which, again, widens up the application range for Elbrus. Now Russia can have Elbrus-based "grown-up" data-centers. And there are models for workstations (12C) and devices (2C3) in development too. All looks good, and again I will repeat: it doesn't have to compete with any CPUs in any benchmarks, it only has to be good enough. Good enough means you don't have to put an entire server rack where one other server would suffice. But if sometimes you have to put 2 CPUs instead of one into a server, then it's not a big deal.

    The next step is for the government to tighten up the restrictions for its own structures when purchasing IT-equipment. An American wouldn't want the FBI to run its servers and PCs on some Chinese CPU, right? The same here. Then the same for commercial companies in which the government has a major share. It may not be as profitable from a single entity's point of view but it benefits the economy as a whole in the end.

    As production quantities grow, the development cost will be spread across more units. Then it even could become affordable for a consumer to buy. But there is no universe in which it ever competes with the Wintel ecosystem, even if it were a miracle of a CPU. Well, maybe if Microsoft ports Windows to Elbrus and promotes it as a first-class platform so that any software and games get optimized for the architecture, then maybe, he-he.
  • mshigorin
    hotaru.hino said:
    The choice of VLIW for this long makes me curious about its inner workings and how Elbrus managed to succeed where others (namely Intel and Transmeta) failed.
    Others tried borrowing (the Transmeta guy previously worked at Sun where he learned about MCST -- which was founded as Moscow Center for SPARC Technology back in the 1990s) or brute stealing (Intel hijacked Babayan and a bunch of others from MCST). It's not enough, and fortunately the real compiler guys stayed with MCST.

    Chung Leong said:
    There's little evidence Elbrus succeeded. Fundamental problems with VLIW remain: inability to find parallelism in everyday code, poor code density, low clock compared to OoO design.
    It has succeeded for me at least: I moved to e2k from a then-recent Intel i5 system back in 2018. Yes, crappy javascript will be slow -- but a multi-meg framework for each of the bells and whistles I didn't need in the first place is definitely not Elbrus' fault. Works well enough for me on any of my daily tasks, including occasional maps.yandex.ru.

    PS: no need to defy our government or our partners, they don't do anything wrong for that matter so it would be a traitor's act, not a whistleblower's one; and MCST has recently published a hefty official developer's guide that's quite useful to anyone optimizing their software, including for SSOoO targets as well: mcst.ru/elbrus_prog
  • mshigorin
    yaapelsinko said:
    It just needs to be exactly good enough.
    Exactly. And it already is: sdelanounas.ru/blog/shigorin :-)

    yaapelsinko said:
    But if sometimes you have to put 2 CPUs instead of one into a server, then it's not a big deal.
    In fact, MCST tends to ship single and quad systems -- 4C, 8C and 16C are all Quad Capable in Xeon speak -- even if I've tested their 802 model and it worked too.

    yaapelsinko said:
    Then it even could become affordable for a consumer to buy.
    These are good enough for some of us already: 125 thousand roubles for E8C-mITX mobo is about $1500 and I can afford that any month, for example.

    Thank you for a decent overview, saved as en.altlinux.org/elbrus/faq if you don't mind.
  • mshigorin
    Some article proofreading in case Anton sees this:
    One interesting feature of the MCST Elbrus-16C processor is its support for 4-way symmetric multiprocessor configurations, a first for the company.
    Definitely not the first: we've got 4.4 and 804 servers that employ quad 4C and 8C, accordingly; don't remember what was up with 2C, IIRC there were dual-CPU configurations possible.

    Virtualization and 4-way SMP support will allow Russian server makers to build cloud servers based on the MCST Elbrus-16C parts over time when there are appropriate operating systems available.
    These are available already, and we can port ALT Virtualization Server to e2kv6 just as we already did for aarch64 and ppc64le: getalt.org/en/alt-server-v

    Thanks for the article, these slips notwithstanding. Greetings from Moscow ;-)
  • hotaru.hino
    yaapelsinko said:
    All looks good, and again I will repeat: it doesn't have to compete with any CPUs in any benchmarks, it only has to be good enough. Good enough means you don't have to put an entire server rack where one other server would suffice. But if sometimes you have to put 2 CPUs instead of one into a server, then it's not a big deal.
    I wish this point were driven harder in tech circles. Often times it's not about how many bogomarks you can get. It's whether or not you get enough.