Ampere announces 256-core 3nm CPU, unveils partnership with Qualcomm

Ampere Computing today introduced its roadmap for the coming years, including new CPUs and collaborations with third parties. In particular, the company said it would launch its all-new 256-core AmpereOne processor next year, made on TSMC's N3 process technology. Ampere is also teaming up with Qualcomm to build AI inference servers with Qualcomm's accelerators. Apparently, Ampere is also looking at integrating third-party UCIe-compatible chiplets into its own platforms.

256-core CPUs incoming

Ampere has begun shipping 192-core AmpereOne processors with an eight-channel DDR5 memory subsystem it introduced a year ago. Later this year, the company plans to introduce 192-core AmpereOne CPUs with a 12-channel DDR5 memory subsystem, requiring a brand-new platform. 

Next year, the company will use this platform for its 256-core AmpereOne CPU, which will be made using one of TSMC's N3 fabrication processes. The company does not disclose whether the new processor will also feature a new microarchitecture, though it looks like it will continue to feature 2 MB of L2 cache per core. 
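
For a rough sense of scale, the figures above can be turned into simple arithmetic. The sketch below is only illustrative: it assumes the 256-core part does keep 2 MB of L2 per core (which the company has not confirmed) and that it uses all 12 DDR5 channels of the new platform. The function and constant names are made up for this example.

```python
# Figures quoted in the article; the 2 MB L2 per core for the 256-core
# part is an assumption, since Ampere has not confirmed it.
L2_PER_CORE_MB = 2


def total_l2_mb(cores: int, l2_per_core_mb: int = L2_PER_CORE_MB) -> int:
    """Aggregate L2 capacity across all cores, in megabytes."""
    return cores * l2_per_core_mb


def cores_per_channel(cores: int, channels: int) -> float:
    """How many cores share each DDR5 memory channel on average."""
    return cores / channels


# (cores, DDR5 channels) for each configuration mentioned in the article
configs = {
    "192-core, 8-channel (shipping)": (192, 8),
    "192-core, 12-channel (later this year)": (192, 12),
    "256-core, 12-channel (next year)": (256, 12),
}

for name, (cores, channels) in configs.items():
    print(
        f"{name}: {total_l2_mb(cores)} MB total L2, "
        f"{cores_per_channel(cores, channels):.1f} cores per memory channel"
    )
```

Under those assumptions, the 256-core chip would carry 512 MB of L2 in total, and each memory channel would serve fewer cores than on today's eight-channel 192-core part.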

"We are extending our product family to include a new 256-core product that delivers 40% more performance than any other CPU in the market," said Renee James, chief executive of Ampere. "It is not just about cores. It is about what you can do with the platform. We have several new features that enable efficient performance, memory, caching, and AI compute." 

The company says that its 256-core CPU will use the same cooling system as its existing offerings, which implies that its thermal design power will remain in the 350-watt ballpark. 

Teaming up with Qualcomm for AI servers

While Ampere can certainly address many general-purpose cloud instances, its capabilities for AI are fairly limited. The company itself says that its 128-core AmpereOne CPU with its two 128-bit vector units per core (and supporting INT8, INT16, FP16, and BFloat16 formats) can offer performance comparable to Nvidia's A10 GPU, albeit at lower power. Ampere certainly needs something better to compete against Nvidia's A100, H100, or B100/B200. 

So, it teamed up with Qualcomm, and the two companies plan to build platforms for LLM inferencing based on Ampere's CPUs and Qualcomm's Cloud AI 100 Ultra accelerators. There is no word on when the platform will be ready, but it demonstrates that Ampere's ambitions do not end with general-purpose computing.

Chiplet plans

Last but not least, Ampere announced the formation of a UCIe working group within the AI Platform Alliance. The company intends to leverage the flexibility of its CPUs with the UCIe open interface technology and incorporate customer-developed IPs into future CPUs, which essentially enables it to build custom silicon for its clients.

Anton Shilov
Contributing Writer

Anton Shilov is a contributing writer at Tom’s Hardware. Over the past couple of decades, he has covered everything from CPUs and GPUs to supercomputers and from modern process technologies and latest fab tools to high-tech industry trends.

  • bit_user
    That's indeed a lot of cores, but I'd point out that AMD Bergamo already has 256 threads (using 128 Zen 4C cores).

    So far, what little performance data I've seen on AmpereOne performance suggests they don't perform markedly better than the ARM Neoverse V-series cores used by Amazon, Nvidia, and others.

    The article said:
    So, it teamed up with Qualcomm, and the two companies plan to build platforms for LLM inferencing based on Ampere's CPUs and Qualcomm's Cloud AI 100 Ultra accelerators.
    Yeah, but you could just put those Qualcomm PCIe cards in a server with any other kind of CPU for similar AI performance and efficiency. When scaling AI inference, it's really the AI accelerators that primarily drive the efficiency. What CPU sits at the heart of the system doesn't matter as much.
  • thestryker
    bit_user said:
    So far, what little performance data I've seen on AmpereOne performance suggests they don't perform markedly better than the ARM Neoverse V-series cores used by Amazon, Nvidia, and others.
    They've been seemingly less open about these than their prior Neoverse based SoCs I wonder if that's why. Of course they'd also said they were targeting specific workload performance increases over general IPC so I'd love to see that in action.
    bit_user said:
    That's indeed a lot of cores, but I'd point out that AMD Bergamo already has 256 threads (using 128 Zen 4C cores).
    Not to mention GNR/Zen 5 will both be likely bringing that core count to high performance this year and also SRF/Zen 5c so the competition in this space ought to be massive.
  • Pierce2623
    bit_user said:
    That's indeed a lot of cores, but I'd point out that AMD Bergamo already has 256 threads (using 128 Zen 4C cores).

    So far, what little performance data I've seen on AmpereOne performance suggests they don't perform markedly better than the ARM Neoverse V-series cores used by Amazon, Nvidia, and others.


    Yeah, but you could just put those Qualcomm PCIe cards in a server with any other kind of CPU for similar AI performance and efficiency. When scaling AI inference, it's really the AI accelerators that primarily drive the efficiency. What CPU sits at the heart of the system doesn't matter as much.
    Isn’t AmpereOne just a neoverse core like Ampere Altra was, but just a newer version?
  • thestryker
    Pierce2623 said:
    Isn’t AmpereOne just a neoverse core like Ampere Altra was, but just a newer version?
    Nope it's supposed to be semi/full custom, but they haven't released much of any information about it still.

    Ian Cutress lamented the lack of concrete information in his post about this announcement: https://morethanmoore.substack.com/p/ampere-2024-one-more-step-forward
  • bit_user
    Pierce2623 said:
    Isn’t AmpereOne just a neoverse core like Ampere Altra was, but just a newer version?
    What @thestryker said, but it's really not surprising if we consider that Ampere Computing released CPUs with in-house designed cores before Altra. From what I heard, Altra was something of a stopgap measure, due to AmpereOne running behind schedule.

    Before Altra, there was eMAG. However, that was a derivative of X-Gene 3, which was designed by AppliedMicro and acquired by Ampere either at its founding or shortly thereafter. As the name implies, X-Gene 3 wasn't their first rodeo.
    https://www.anandtech.com/show/5098/applied-micros-xgene-the-first-armv8-soc