Yes, you can have too many CPU cores - Ampere's 192-core chips break ARM64 Linux kernel in two-socket systems, company requests higher core count support

(Image credit: Ampere)

Ampere's new AmpereOne data center CPUs come bursting at the seams with up to 192 cores, but that plentiful helping has caused difficulties with Linux. According to Phoronix, the new CPUs have so many cores that Linux doesn't support servers with two of Ampere's 192-core chips (384 cores total) installed. For now, the ARM64 Linux kernel only supports systems with 256 cores or fewer. To fix the issue, Ampere has submitted a patch proposing that the kernel's core limit be raised to 512 by enabling an existing kernel option called "CPUMASK_OFFSTACK."

This option lets Linux exceed the default 256-core limit by allocating the bitmaps that back CPU masks dynamically from memory rather than storing them in fixed-size structures. That means the core limit can be raised without increasing the kernel image's memory footprint, which would otherwise grow by 8KB for each additional supported core.

Ampere's new CPUs feature the highest core count we've seen in a single CPU to date. Even AMD's latest Zen 4c EPYC CPUs don't quite get there, with the highest-core-count chip, the EPYC 9754, topping out at exactly 128 cores; two of those would hit the 256-core limit but not extend beyond it. That explains why Ampere is the first CPU manufacturer to run into serious problems with ARM64 Linux's 256-core limitation. Thankfully, the limit won't affect systems sporting a single 192-core AmpereOne chip, but it's a serious problem for data center servers running two of these chips in a dual-socket configuration. (Counting SMT logical cores, aka threads, various systems are already well past the 256 figure, however.)

AmpereOne is a new CPU lineup from Ampere, sporting blisteringly high core counts with models in 136-, 144-, 160-, 176-, and 192-core flavors. The chips are built on the ARMv8.6+ instruction set and TSMC's 5nm node, featuring dual 128-bit vector units, 2MB of L2 cache per core, a 3 GHz clock speed, an eight-channel DDR5 memory controller, 128 PCIe Gen 5 lanes, and a 200-350W TDP. They are designed specifically for high-performance data center workloads that can utilize the hefty core counts.

According to Phoronix, it could take a while before the core count limit gets bumped to 512. Back in 2021, a patch proposing that the ARM64 CPU core limit be raised to 512 was rejected by Linux maintainers because no CPU hardware with more than 256 cores existed at the time. At the earliest, 512-core support will arrive with Linux kernel 6.8, due out in 2024.

That timeline assumes 512-core support is added the conventional way, however, without the CPU-mask off-stack method. Technically, the current Linux kernel already supports raising the core count limit via CPUMASK_OFFSTACK, so it is simply up to Linux maintainers to enable the feature by default on ARM64.
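For reference, on a self-built kernel the change comes down to two existing Kconfig symbols. The excerpt below is hypothetical: CONFIG_NR_CPUS and CONFIG_CPUMASK_OFFSTACK are real kernel options, but defaults and availability vary by architecture and kernel version.

```
# Hypothetical .config excerpt: raise the compile-time CPU ceiling
# and allocate CPU masks dynamically instead of in fixed structures
CONFIG_NR_CPUS=512
CONFIG_CPUMASK_OFFSTACK=y
```

Ampere's patch effectively asks for this to become the default ARM64 configuration rather than something each builder opts into.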

Aaron Klotz
Contributing Writer

Aaron Klotz is a contributing writer for Tom’s Hardware, covering news related to computer hardware such as CPUs and graphics cards.

  • kanewolf
    Admin said:
    Data center CPU manufacturer Ampere has requested to boost the default Linux CPU core count from 256 to 512 to support its latest AmpereOne CPUs with core counts of up to 384 cores in dual-socket configurations.

    This is not surprising. The HPC world has had to have tailored linux for many years.
    Reply
  • gfg
    It's wrong:
    "Zen 4c EPYC CPUs don't come close, with its highest core count chip at just 96 cores"
    It's 128 cores: AMD EPYC™ 9754 (Zen 4c).
    Reply
  • Order 66
    I am kind of surprised considering what @kanewolf said. Also, @kanewolf why do you think this is surprising? my surprise is since HPC has been on linux for such a long time, I am surprised that nobody had thought to increase this limit before.
    Reply
  • kanewolf
    Order 66 said:
    I am kind of surprised considering what @kanewolf said. Also, @kanewolf why do you think this is surprising? my surprise is since HPC has been on linux for such a long time, I am surprised that nobody had thought to increase this limit before.
    Until recently, 256 cores was not an issue for most hosts. 1U or 2U server chassis that make up 99.999% of Linux 2-socket hosts couldn't come close to 256 cores. Only specialized hosts, generally used in HPC, pushed more than 256 cores. Those specialized hosts could have vendor-specific kernels maintained. SGI supported 2048 sockets 10 years ago. But they had custom Linux to support those extreme configs.
    I am not sure if the proposed kernel changes will be approved. This is still a very small market for such a major change.
    Reply
  • coromonadalix
    was there some other talk years ago about 128bit systems too, to deal with high count cpus cores ??
    Reply
  • Order 66
    coromonadalix said:
    was there some other talk years ago about 128bit systems too, to deal with high count cpus cores ??
    this is the first I've heard of it, but that's not saying much.
    Reply
  • vern72
    256 cores ought to be enough for everyone. :LOL:
    Reply
  • bit_user
    kanewolf said:
    Until recently, 256 cores was not an issue for most hosts. 1U or 2U server chassis that make up 99.999% of Linux 2 socket hosts couldn't come close to 256 cores.
    Intel typically offers up to 8-socket scalability on select SKUs, at the top end of their Xeon product line. At least, that's what they support for cache-coherent configurations without additional glue logic. With 8x of the top-spec Sapphire Rapids CPUs, you can reach 480 cores.
    https://ark.intel.com/content/www/us/en/ark/products/231747/intel-xeon-platinum-8490h-processor-112-5m-cache-1-90-ghz.html
    Note where it says: "Scalability: S8S"
    Reply
  • kanewolf
    bit_user said:
    Intel typically offers up to 8-socket scalability on select SKUs, at the top end of their Xeon product line. At least, that's what they support for cache-coherent configurations without additional glue logic. With 8x of the top-spec Sapphire Rapids CPUs, you can reach 480 cores.
    https://ark.intel.com/content/www/us/en/ark/products/231747/intel-xeon-platinum-8490h-processor-112-5m-cache-1-90-ghz.html
    Note where it says: "Scalability: S8S"
    Yes, but that is not a 2S config, and not a 1U or 2U host. The vast majority of datacenter hosts are 2S blades or 2S 1U or 2U hosts. Any company selling greater than 2S hosts has a tailored OS to support their hardware.
    Reply
  • brandonjclark
    You guys are missing the point. HPC has a very different set of requirements than traditional micro-serviced applications which also run on containers.

    An always on web app running via K8s is likely well-defined and is made up of many containers, volumes, a secret manager, etc.

    An HPC container tends to be a big 'ole fatty, containing everything needed.

    These large compute (and memory) dependent HPC containers are just not very "distributable", like in regular K8s apps over large numbers of nodes.

    So, a company produces a chip like these to PUSH the boundaries so they can be the best at offering super high core counts for even better HPC support.

    The kernel needs this patch, no biggie.

    It's not behind the times or ahead of it (linux). It's adapting to the modern needs.


    Now, if only HPC architects could figure out how to properly 'Kube some workloads!

    ;)
    Reply