Skylake Power Optimizations
Power consumption is of critical importance in the data center. It's an ongoing expense that must be factored into total cost of ownership and managed accordingly. It also correlates with waste heat, which necessitates cooling, which in turn consumes even more power.
One of the best ways to reduce costs is getting more useful work done per watt of power. Intel's Scalable Processor family improves efficiency over prior generations by serving up higher performance while using less power per core.
Much like mainstream Skylake-based chips, the new Xeons employ Speed Shift technology, which cedes control of power states to the processor rather than relying on constant (and slow-to-arrive) hints from the operating system. The OS defines preferences, such as minimum and maximum performance levels, and the processor handles fine-grained adjustments. An expanded set of P-states allows the processor to control frequency and voltage on a more granular level, thereby saving power and shortening response time. Speed Shift also eliminates the latency associated with P-state commands from the operating system.
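To make the division of labor concrete, here is a minimal sketch of how the OS-side preferences could be inspected on a Linux host. It assumes the standard cpufreq sysfs directory for CPU 0. Whether each file exists depends on the kernel and the active scaling driver, so missing files are reported as None rather than treated as errors. The function name `read_os_preferences` is ours, chosen for illustration.

```python
from pathlib import Path

# Standard Linux cpufreq sysfs location for CPU 0. Which files are present
# depends on the kernel and the active frequency-scaling driver (assumption).
DEFAULT_POLICY_DIR = Path("/sys/devices/system/cpu/cpu0/cpufreq")


def read_os_preferences(policy_dir=DEFAULT_POLICY_DIR):
    """Return the OS-side limits and hint; None for any file that is missing."""
    names = (
        "scaling_min_freq",               # lower bound requested by the OS, in kHz
        "scaling_max_freq",               # upper bound requested by the OS, in kHz
        "energy_performance_preference",  # performance-vs-power hint, if exposed
    )
    prefs = {}
    for name in names:
        try:
            prefs[name] = (Path(policy_dir) / name).read_text().strip()
        except OSError:
            prefs[name] = None  # not exposed on this system
    return prefs


if __name__ == "__main__":
    for key, value in read_os_preferences().items():
        print(f"{key}: {value}")
```

The values shown are only the bounds and hint the OS hands over. The moment-to-moment frequency and voltage choices within those bounds are made by the processor itself.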
This is a step forward from the Hardware Controlled Power Management (HWPM) feature in Broadwell-EP. Among other optimizations, Intel developed independent per-core voltage and frequency domains that allow the processor to dynamically manage key uncore components like the mesh topology and shared L3 cache. The larger L2 also reduces the number of requests sent to the LLC. Those requests require a trip across the mesh, and because all data movement consumes power, fewer requests mean less power consumption.
Linux-Bench Power Consumption
We logged platform power consumption during our Linux-Bench run, so these measurements also include the effects of DRAM and power supply efficiency.
Factoring in the amount of work performed per watt provides the best view of overall power efficiency, we think. Generalizing remains a challenging endeavor, though. Because the Scalable Processor family covers such a wide range of target markets and relevant workloads, it's hard to identify a handful of tests applicable to everyone. As such, consider these results a basic indicator of overall power consumption.
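As a rough illustration of the work-per-watt idea, the sketch below divides completed work by the energy consumed during a run. The system names and figures are hypothetical placeholders, not measurements from our lab, and `work_per_watt_hour` is simply a name we chose.

```python
def work_per_watt_hour(work_done, energy_wh):
    """Return units of completed work per watt-hour of platform energy."""
    if energy_wh <= 0:
        raise ValueError("energy_wh must be positive")
    return work_done / energy_wh


# Hypothetical figures for illustration only; not results from this review.
runs = {
    "system_a": {"work_done": 1200.0, "energy_wh": 100.0},
    "system_b": {"work_done": 1000.0, "energy_wh": 80.0},
}

for name, run in runs.items():
    print(name, work_per_watt_hour(**run))
```

In this made-up case, system_a finishes more work but system_b is the more efficient of the two, which is why raw power figures alone can mislead.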
The 8176's lofty power numbers are the result of its 10 additional cores compared to the most similar previous-gen CPU. Those cores do get more done per watt, though, particularly in threaded benchmarks. For applications that aren't as aggressively parallelized, it's best to buy a processor with fewer cores able to operate at a higher clock rate. Intel's "M" processors offer the best mix of core counts and per-core performance, but they're premium products with matching price tags.
The Platinum wields 55% more cores than Intel's E5-2699 v3, but it only consumes 14 more watt-hours during the Linux-Bench script. That's pretty impressive, especially if you factor in higher performance, too.
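The core-count comparison is simple arithmetic, sketched below. The per-socket core counts (28 for the Platinum 8176, 18 for the E5-2699 v3) come from Intel's published specifications rather than from the text above, and the helper name `percent_more` is ours.

```python
def percent_more(new, old):
    """Return how much larger `new` is than `old`, as a percentage of `old`."""
    if old <= 0:
        raise ValueError("old must be positive")
    return (new - old) / old * 100.0


# Per-socket core counts from Intel's public specifications (assumption:
# not stated explicitly in the surrounding text).
PLATINUM_8176_CORES = 28
E5_2699_V3_CORES = 18

increase = percent_more(PLATINUM_8176_CORES, E5_2699_V3_CORES)
print(f"{increase:.1f}% more cores")
```

The result is a little over 55.5%, in line with the 55% figure quoted above.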
Power At Maximum Load And Idle
Power draw is recorded every second in our enterprise lab. During the Linux-Bench tests, we captured multiple high-draw bursts that only appeared for one second, cresting at 711W. The same granularity is used during our Linpack tests, but because of the 8176's lower AVX frequencies, we only recorded a 670W peak.
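The sketch below shows one way a once-per-second power log could be reduced to a peak, an average, and a watt-hour total. It is not our lab's actual tooling. The sample values are hypothetical, apart from a single 711W reading that mirrors the burst described above, and `summarize_power_log` is a name chosen for illustration.

```python
def summarize_power_log(samples_w, interval_s=1.0):
    """Return (peak watts, average watts, energy in watt-hours) for a power log.

    samples_w:  power readings in watts, taken at a fixed interval
    interval_s: seconds between readings (1.0 matches once-per-second logging)
    """
    if not samples_w:
        raise ValueError("samples_w must not be empty")
    peak_w = max(samples_w)
    average_w = sum(samples_w) / len(samples_w)
    # Each sample stands for interval_s seconds of draw; 3600 s make one hour.
    energy_wh = sum(samples_w) * interval_s / 3600.0
    return peak_w, average_w, energy_wh


# Hypothetical log: four steady readings around one single-second burst.
log = [400.0, 420.0, 711.0, 430.0, 410.0]
peak, average, energy = summarize_power_log(log)
print(f"peak={peak:.0f}W average={average:.1f}W energy={energy:.4f}Wh")
```

A burst that lasts only one sample dominates the peak but barely moves the average or the energy total, which is why we report both kinds of figures.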
The 8176's extra 10 cores lead to higher overall power draw at idle and under full load. As you can see in the second chart, though, which calculates per-core consumption by dividing the total by the core count, Intel's 8176 uses far less power per core than the company's previous-gen CPUs. This paints a nice picture of improved efficiency.
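The per-core calculation is a straight division of total draw by core count, as the sketch below shows. The wattages and core counts in the example are hypothetical and chosen only to make the arithmetic easy to follow. They are not the values plotted in our charts.

```python
def power_per_core(total_w, core_count):
    """Return average watts per core given total draw and number of cores."""
    if core_count <= 0:
        raise ValueError("core_count must be positive")
    return total_w / core_count


# Hypothetical systems for illustration only; not data from this review.
systems = {
    "more_cores": {"total_w": 560.0, "core_count": 28},
    "fewer_cores": {"total_w": 450.0, "core_count": 18},
}

for name, system in systems.items():
    print(name, power_per_core(**system), "W per core")
```

Here the system with more cores draws more in total yet less per core, which is the same pattern the second chart shows.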