Amazon solidifies push into custom chips that replace Intel and AMD processors, unveils Graviton4 and Trainium2 for cloud servers
Amazon's new Graviton4 and Trainium2 mark a new era in custom silicon development.
Amazon on Tuesday introduced two new processors that will be used exclusively by its Amazon Web Services (AWS) division: the Graviton4 system-in-package for general-purpose workloads and the Trainium2 system-in-package for artificial intelligence training. The new devices will power AWS instances that will provide supercomputer-class performance to Amazon's clients.
Graviton4
Amazon's Graviton4, the company's fourth-generation general-purpose processor that competes against AMD's EPYC and Intel's Xeon CPUs, continues to use Arm's 64-bit architecture and packs 96 cores. The processor is said to deliver up to 30% higher compute performance and 75% more memory bandwidth than its predecessor, Graviton3. AWS did not disclose whether its Graviton4 relies on Arm's Neoverse V2 cores, but this is a likely scenario.
Like the AWS Graviton3, the AWS Graviton4 relies on a multi-chiplet design composed of seven tiles. Keeping in mind that AWS has worked with Intel Foundry Services over packaging, this is expected. It is not completely clear which foundry produces silicon for the Graviton4.
Graviton4 will be included in memory-optimized EC2 R8g instances that are aimed at demanding databases, in-memory caches, and large-scale data analytics. These new R8g instances will provide up to three times more virtual CPUs (vCPUs) and triple the memory of the existing R7g instances. As a result, AWS clients will be able to handle more data, scale their workloads more efficiently, and shorten processing times. Currently, Graviton4-equipped R8g instances are available for testing in preview, and they are expected to be commercially available in the coming months.
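For those who want to experiment with the preview, the sketch below shows one way to request a Graviton4-backed R8g instance with the boto3 SDK. The instance size (r8g.4xlarge) and AMI ID are placeholders rather than confirmed offerings, since AWS had not published the full R8g lineup at the time of writing.

# Minimal sketch: requesting a Graviton4-backed R8g instance with boto3.
# The instance size and AMI ID below are placeholders, not confirmed offerings;
# check the EC2 console for the sizes and regions AWS actually exposes.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # replace with an arm64 (aarch64) AMI
    InstanceType="r8g.4xlarge",       # assumed R8g size; Graviton4, memory-optimized
    MinCount=1,
    MaxCount=1,
)

print(response["Instances"][0]["InstanceId"])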
"As part of the migration process of SAP HANA Cloud to AWS Graviton-based Amazon EC2 instances, we have already seen up to 35% better price performance for analytical workloads," said Juergen Mueller, CTO and member of the Executive Board of SAP SE. "In the coming months, we look forward to validating Graviton4, and the benefits it can bring to our joint customers."
Trainium2
In addition to the AWS Graviton4 processor for general-purpose workloads, Amazon also introduced its new Trainium2 system-in-package for AI training, which will compete against Nvidia's H100, H200, and B100 AI GPUs. Trainium2 is designed to deliver up to four times the training performance and three times the memory capacity of the original Trainium chips, which is a significant achievement. Additionally, Trainium2 has improved energy efficiency, achieving up to twice the performance per watt. Machines based on Trainium2 will be connected using the AWS Elastic Fabric Adapter (EFA), which offers petabit-scale networking.
What is particularly impressive about Trainium2 is that it will be available in EC2 Trn2 instances that can scale to up to 100,000 Trainium2 chips, delivering up to 65 AI ExaFLOPS of compute and giving users on-demand access to supercomputer-class performance. Such scaling means that training a large language model with 300 billion parameters, which previously took months, can now be accomplished in just weeks.
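As a rough sanity check of those headline numbers (not an official AWS figure), dividing the claimed aggregate throughput by the chip count gives the implied per-chip rate; note that the precision behind the "AI ExaFLOPS" figure is not specified in the announcement.

# Back-of-the-envelope check of the headline figures (not official AWS numbers).
# The precision/sparsity behind "65 AI ExaFLOPS" is not stated in the announcement.
cluster_flops = 65e18   # 65 ExaFLOPS across the full cluster
chip_count = 100_000    # maximum Trainium2 chips in a scaled-out Trn2 deployment

per_chip_tflops = cluster_flops / chip_count / 1e12
print(f"Implied per-chip throughput: {per_chip_tflops:.0f} TFLOPS")  # ~650 TFLOPS per chip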
From a design point of view, the AWS Trainium2 SiP also uses a multi-tile design featuring two compute chiplets, four HBM memory stacks, and two unknown chiplets (which likely enable high-speed connectivity).
"We are working closely with AWS to develop our future foundation models using Trainium chips," said Tom Brown, co-founder of Anthropic. "Trainium2 will help us build and train models at a very large scale, and we expect it to be at least 4x faster than first generation Trainium chips for some of our key workloads. Our collaboration with AWS will help organizations of all sizes unlock new possibilities, as they use Anthropic's state-of-the-art AI systems together with AWS’s secure, reliable cloud technology."
Anton Shilov is a contributing writer at Tom’s Hardware. Over the past couple of decades, he has covered everything from CPUs and GPUs to supercomputers and from modern process technologies and latest fab tools to high-tech industry trends.
-
bit_user, quoting the article:
"AWS did not disclose whether its Graviton4 relies on Arm's Neoverse V2 cores, but this is a likely scenario."
Yes, if we're to take their claim of 30% better performance as applying to the entire CPU, rather than per-core, in spite of 50% more cores, then that strongly points to a shift from V1 (i.e. in Graviton 3) to N2 cores.
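(For illustration, a quick back-of-the-envelope version of that per-core reasoning, assuming Graviton 3's 64 cores versus Graviton 4's 96:)

# Rough arithmetic behind the V-series vs. N-series inference above.
# Assumes the 30% uplift is a whole-chip figure and 64 cores (Graviton 3) vs. 96 (Graviton 4).
g3_cores, g4_cores = 64, 96
chip_uplift = 1.30                                  # claimed whole-chip performance gain
per_core_ratio = chip_uplift * g3_cores / g4_cores
print(f"Implied per-core change: {per_core_ratio:.2f}x")  # ~0.87x, i.e. slightly lower per core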
To be honest, I was one of the many surprised by Graviton 3's adoption of the V-series cores. A user like Amazon seems like they'd be very interested in balancing efficiency with performance, which is what the N-series is all about. The V-series were performance-oriented and perhaps attractive to AI users for their incorporation of dual 256-bit SVE vector engines. Timing was also likely a factor, as the N2 cores might not have been ready at the point when Amazon wanted to build the Graviton 3.
Quoting the article again:
"Keeping in mind that AWS has worked with Intel Foundry Services over packaging, this is expected. It is not completely clear which foundry produces silicon for the Graviton4."
Oh, it's definitely not Intel. It would be too soon for that. ARM needs to port its designs to each manufacturing node, and I think they haven't been working with IFS long enough for the Graviton 4 to be fabbed by Intel.
90% likelihood it's TSMC, though it'd sure be a coup for Samsung, if they got the contract.