HPE's unnamed 1,152-core system pushes Turbostat to support 8,192 cores in Linux 6.15

Intel Xeon Emerald Rapids (Image credit: Intel)

Linux 6.15 will bring support for up to 8,192 cores in the Turbostat CPU monitoring utility, if you happen to have such a system (via Phoronix). The change was driven by an HPE (Hewlett Packard Enterprise) engineer who ran into a problem on an unnamed 1,152-core system, because Turbostat wasn't designed to handle more than 1,024 cores/threads. We aren't currently aware of any server CPU configuration that exceeds this limit in terms of physical cores, so this may be a custom or next-generation solution from Intel or AMD. The utility currently supports only x86 processors, which seemingly rules out an Arm system as the cause.

Turbostat is a Linux command-line utility provided by the kernel-tools package and is baked into most distributions. It's a monitoring utility that reports clock speeds, idle power-state statistics, temperature, etc., on x86-based processors. That x86-only scope matters here, as it lets us infer that the 1,152-core system is likely an Intel or AMD solution. In a similar case a while back, Ampere's 384-core servers exposed a maximum core count limitation in the ARM64 Linux kernel, which supported only up to 256 cores.

Turbostat had a hardcoded limit (CPU_SUBSET_MAXCPUS), set to 1,024, that defines the maximum number of CPUs (cores) it can handle. Yesterday, just before the merge window for Linux 6.15-rc1 closed, the limit was increased to 8,192, alongside the addition of a CPU idle debug telemetry tool and several bug fixes.
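To illustrate why a hardcoded ceiling like this bites on a 1,152-CPU machine, here is a minimal Python sketch. It is not Turbostat's actual code (the utility is written in C); the functions `parse_cpu_list` and `fits_within_limit` are hypothetical names, and the two constants simply stand in for the old and new values of CPU_SUBSET_MAXCPUS reported above.

```python
OLD_LIMIT = 1024  # value the article reports for CPU_SUBSET_MAXCPUS before the change
NEW_LIMIT = 8192  # value the article reports for Linux 6.15


def parse_cpu_list(spec, max_cpus):
    """Parse a CPU list such as "0-3,8" into a set of CPU numbers.

    Hypothetical helper for illustration only. Any CPU number that does
    not fit below max_cpus is rejected, mimicking a fixed-size CPU set.
    """
    cpus = set()
    for part in spec.split(","):
        start_text, _, end_text = part.partition("-")
        start = int(start_text)
        end = int(end_text) if end_text else start
        if start > end:
            raise ValueError(f"descending range: {part!r}")
        if end >= max_cpus:
            raise ValueError(f"CPU {end} does not fit in a set of {max_cpus} CPUs")
        cpus.update(range(start, end + 1))
    return cpus


def fits_within_limit(spec, max_cpus):
    """Return True if every CPU in spec fits below max_cpus."""
    try:
        parse_cpu_list(spec, max_cpus)
        return True
    except ValueError:
        return False


# A 1,152-CPU system numbers its CPUs 0 through 1151.
print(fits_within_limit("0-1151", OLD_LIMIT))  # does the list fit under the old limit?
print(fits_within_limit("0-1151", NEW_LIMIT))  # and under the new one?
```

Under these assumptions, the list "0-1151" is rejected with a limit of 1,024 and accepted with a limit of 8,192, which is the kind of failure a fixed-size CPU set would produce on the HPE system.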

Linux 8,192 core support (Image credit: Lore.kernel)

The HPE engineer didn't specify the hardware powering their system. On the Intel side, it makes sense to look at the latest Xeon 6 offerings: the Xeon 6788P 'Granite Rapids' (86 cores) reaches 688 cores or 1,376 threads in an 8S configuration, while the Xeon 6900E 'Sierra Forest' (288 cores) tops out at 576 cores in a 2S setup. Similarly, AMD's EPYC 9005 'Turin Dense' can achieve 384 cores in a dual-socket configuration with the EPYC 9965.

Since none of these adds up to 1,152 cores, it's plausible that HPE is using a custom solution with a higher socket count. Alternatively, the 1,152 figure may refer to logical cores (threads) rather than physical cores, which would put it well within reach of existing solutions. As for future products like Diamond Rapids and Venice, we're still in the dark regarding key specs like core counts.
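The core-count arithmetic above can be checked with a short Python sketch. The per-socket figures come from the article (the EPYC 9965's 192 cores per socket is derived from the 384-core dual-socket total); the helper names are hypothetical, and the socket counts and two-threads-per-core factor explored at the end are assumptions for illustration, not known details of the HPE system.

```python
TARGET_CPUS = 1152  # figure reported for the HPE system

# name -> (cores per socket, sockets), as listed in the article
ARTICLE_CONFIGS = {
    "Xeon 6788P, 8S": (86, 8),
    "Xeon 6900E, 2S": (288, 2),
    "EPYC 9965, 2S": (192, 2),  # 384 cores total / 2 sockets
}


def total_cpus(cores_per_socket, sockets, threads_per_core=1):
    """Total CPUs for a configuration (logical CPUs if threads_per_core > 1)."""
    return cores_per_socket * sockets * threads_per_core


def cores_per_socket_needed(target, sockets, threads_per_core):
    """Cores per socket required to hit target exactly, or None if it does not divide evenly."""
    per_socket, remainder = divmod(target, sockets * threads_per_core)
    return per_socket if remainder == 0 else None


# Physical core totals for the configurations named in the article.
for name, (cores, sockets) in ARTICLE_CONFIGS.items():
    print(f"{name}: {total_cpus(cores, sockets)} cores")

# Hypothetical layouts that would land exactly on the target.
for sockets in (2, 4, 8, 16):
    for threads_per_core in (1, 2):
        needed = cores_per_socket_needed(TARGET_CPUS, sockets, threads_per_core)
        if needed is not None:
            print(f"{sockets} sockets x {needed} cores x {threads_per_core} thread(s) = {TARGET_CPUS}")
```

None of the listed configurations reaches 1,152 physical cores, but several hypothetical layouts do divide evenly, for example eight sockets of 72 cores with two threads each. Which layout, if any, matches the real system is not stated in the source.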

Hassam Nasir
Contributing Writer


  • A Stoner
    8192 is 13 bits, an odd place to stop, but I guess it does not make sense to go to 16 bits and 65,536 support at this point in time.
  • Jame5
    2x 288 core CPUs is 576 cores.

    w/ HT (though I'm pretty sure the 288 core CPUs don't have it), that would be exactly 1152.

    Is there a 4-socket solution that can take the 288 core sierra forest CPUs?
  • qxp
    Jame5 said:
    2x 288 core CPUs is 576 cores.

    w/ HT (though I'm pretty sure the 288 core CPUs don't have it), that would be exactly 1152.

    Is there a 4-socket solution that can take the 288 core sierra forest CPUs?
    Turbostat reports both per core and per hyperthread info. For example, the core temperature is reported once per core, but IRQ count is per execution thread. From the description in the article, I would assume they did need more cores.
  • bit_user
    A Stoner said:
    8192 is 13 bits, an odd place to stop, but I guess it does not make sense to go to 16 bits and 65,536 support at this point in time.
    The issue is not the number of bits, but rather the size of the static data structures needed to support all of those cores. Any tables used to track stuff on a per-core basis need to be enlarged to support the max number of cores the kernel is compiled with.
  • bit_user
    Jame5 said:
    2x 288 core CPUs is 576 cores.

    w/ HT (though I'm pretty sure the 288 core CPUs don't have it), that would be exactly 1152.

    Is there a 4-socket solution that can take the 288 core sierra forest CPUs?
    I think this is probably for an 8-socket solution, which the top tier Intel CPUs traditionally support. My guess is that the number was enlarged to support hyperthreads, in which case it would require 8x 72-core CPUs to reach. Given Intel's Xeon 6 models that support hyperthreading have up to 128 physical cores, it's easy to believe.

    AMD's EPYCs haven't supported more than 2S scalability and I don't see that changing. Higher socket-count is a niche market that's probably shrinking in at least relative size, as server CPU core counts continue to balloon. I think it's too early for HPE to be testing the next gen AMD EPYCs, but it's not so hard to fathom a 288-core Zen 6C EPYC.
  • A Stoner
    bit_user said:
    The issue is not the number of bits, but rather the size of the static data structures needed to support all of those cores. Any tables used to track stuff on a per-core basis need to be enlarged to support the max number of cores the kernel is compiled with.
    I started wondering if there was static data that needed to be held and reserved after I made the comment. So, you are likely right.