Power Consumption of AI Workloads Approaches That of Small Country: Report
Demand for AI is immense these days. French firm Schneider Electric estimates that power consumption of AI workloads will total around 4.3 GW in 2023, slightly below the 4.7 GW that the nation of Cyprus consumed in 2021. The company anticipates that power consumption of AI workloads will grow at a compound annual growth rate (CAGR) of 26% to 36%, which suggests that by 2028 AI workloads will consume from 13.5 GW to 20 GW, more than Iceland consumed in 2021.
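The 2028 range follows from compounding the 2023 figure over five years. A minimal sketch of that arithmetic, assuming growth compounds once per year from 2023 to 2028 (the function and variable names are illustrative, not taken from the report):

```python
def project_power(base_gw: float, cagr: float, years: int) -> float:
    """Compound a base power figure at a fixed annual growth rate."""
    return base_gw * (1 + cagr) ** years

BASE_2023_GW = 4.3   # Schneider Electric's 2023 estimate for AI workloads
YEARS = 5            # 2023 -> 2028, assumed to be five compounding periods

low_2028_gw = project_power(BASE_2023_GW, 0.26, YEARS)   # 26% CAGR
high_2028_gw = project_power(BASE_2023_GW, 0.36, YEARS)  # 36% CAGR

print(f"2028 projection: {low_2028_gw:.1f} GW to {high_2028_gw:.1f} GW")
```

Under these assumptions the result lands close to the quoted 13.5 GW to 20 GW range; the small gap at the low end is presumably down to rounding in the published growth rates.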
Massive Power Requirements
In 2023, the total power consumption of all datacenters is estimated to be 54 GW, with AI workloads accounting for 4.3 GW of this demand, according to Schneider Electric. This means that AI workloads will be responsible for approximately 8% of the total power consumption of datacenters this year. Within these AI workloads, training consumes 20% of the power and inference the remaining 80%.
Looking ahead to 2028, Schneider projects that the total power consumption of datacenters will rise to 90 GW, with AI workloads consuming between 13.5 GW and 20 GW of this total. By 2028, then, AI could be responsible for around 15% to 20% of total datacenter power usage, a significant increase in its share over the five-year period. The distribution between training and inference is expected to shift slightly, with training consuming 15% of the power and inference accounting for 85%, according to estimates by Schneider Electric.
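The shares quoted above can be checked directly from the gigawatt figures. The sketch below is only a back-of-the-envelope calculation using the numbers in this article; the helper name is illustrative:

```python
def share_percent(part_gw: float, total_gw: float) -> float:
    """Return part as a percentage of total."""
    return 100 * part_gw / total_gw

# 2023: 4.3 GW of AI workloads out of 54 GW for all datacenters
ai_share_2023 = share_percent(4.3, 54)

# 2028: 13.5 GW to 20 GW of AI workloads out of a projected 90 GW
ai_share_2028_low = share_percent(13.5, 90)
ai_share_2028_high = share_percent(20, 90)

# Training vs. inference split, in GW
training_2023_gw = 4.3 * 0.20        # 20% training in 2023
inference_2023_gw = 4.3 * 0.80       # 80% inference in 2023
training_2028_low_gw = 13.5 * 0.15   # 15% training in 2028, low case
training_2028_high_gw = 20 * 0.15    # 15% training in 2028, high case

print(f"2023 AI share: {ai_share_2023:.1f}%")
print(f"2028 AI share: {ai_share_2028_low:.1f}% to {ai_share_2028_high:.1f}%")
```

The upper bound for 2028 works out a little above the rounded 20% figure, which suggests the published percentages are approximate.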
AI GPUs Get Hungrier
The escalating power consumption in AI datacenters is primarily attributed to the intensification of AI workloads, advancements in AI GPUs and AI processors, and the increasing requirements of other datacenter hardware. For example, while Nvidia's A100 from 2020 consumed up to 400W, the H100 from 2022 consumes up to 700W. In addition to GPUs, AI servers also run power-hungry CPUs and network cards.
AI workloads, especially those associated with training, require substantial computational resources, including specialized servers equipped with AI GPUs, specialized ASICs, or CPUs. The size of AI clusters, which depends on the complexity and size of the AI models, is a major determinant of power consumption. Larger AI models need more GPUs, which increases the overall energy requirements. For instance, a cluster with 22,000 H100 GPUs utilizes about 700 racks. An H100-based rack, when populated with eight HPE Cray XD670 GPU-accelerated servers, reaches a total rack density of 80 kW. The whole cluster demands approximately 31 MW of power, excluding the energy required for additional infrastructure needs like cooling, Schneider Electric notes.
These clusters and GPUs often run at nearly full capacity throughout training, so average energy usage is nearly equal to peak power consumption. The report specifies that rack densities in large AI clusters vary between 30 kW and 100 kW, depending on the number and model of GPUs.
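The cluster figures above can be cross-checked with simple arithmetic. The sketch below uses only numbers quoted in this article (22,000 GPUs, about 700 racks, roughly 31 MW, 80 kW for a fully populated rack, and up to 700W per H100); the variable names are illustrative:

```python
CLUSTER_POWER_MW = 31      # approximate cluster demand, excluding cooling
RACK_COUNT = 700           # approximate number of racks
GPU_COUNT = 22_000         # H100 GPUs in the cluster
H100_MAX_W = 700           # peak power per H100
FULL_RACK_KW = 80          # rack with eight HPE Cray XD670 servers

# Average power per rack implied by the cluster-level figures
avg_rack_kw = CLUSTER_POWER_MW * 1000 / RACK_COUNT

# Average number of GPUs per rack implied by the cluster-level figures
avg_gpus_per_rack = GPU_COUNT / RACK_COUNT

# Power drawn by the GPUs alone at peak, in MW
gpu_only_mw = GPU_COUNT * H100_MAX_W / 1_000_000

# What 700 racks would draw if every rack ran at the 80 kW density
all_full_racks_mw = RACK_COUNT * FULL_RACK_KW / 1000

print(f"Implied average rack density: {avg_rack_kw:.1f} kW")
print(f"Implied GPUs per rack: {avg_gpus_per_rack:.1f}")
print(f"GPUs alone at peak: {gpu_only_mw:.1f} MW")
print(f"700 racks at 80 kW each: {all_full_racks_mw:.0f} MW")
```

The implied average of roughly 44 kW per rack sits inside the 30 kW to 100 kW range cited in the report, while 700 racks at 80 kW each would exceed the 31 MW total. That suggests the 80 kW figure describes a densely populated rack rather than the cluster-wide average.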
Network latency also plays a crucial role in the power consumption of AI datacenters. A sophisticated network infrastructure is essential to support the high-speed data communication that powerful GPUs require during distributed training. The need for high-speed network cables and infrastructure, such as those capable of supporting speeds up to 800 Gb/s, further increases overall energy consumption.
Because AI workloads require power-hungry ASICs, GPUs, CPUs, network cards, and SSDs, cooling poses a major challenge. With high rack densities and the immense heat generated during computation, effective cooling is imperative to maintain optimal performance and prevent hardware malfunctions or failures. Meanwhile, air and liquid cooling methods are themselves 'expensive' in terms of power, which is why they contribute heavily to the power consumption of datacenters used for AI workloads.
Some Recommendations
Schneider Electric does not expect power consumption of AI hardware to fall anytime soon, and the company fully expects power consumption of an AI rack to reach 100 kW or higher. As such, Schneider Electric has some recommendations for datacenters specializing in AI workloads.
In particular, Schneider Electric recommends transitioning from the conventional 120/208V distribution to 240/415V to better accommodate the high power densities of AI workloads. For cooling, a shift from air cooling to liquid cooling is advised to enhance processor reliability and energy efficiency, though immersion cooling might produce even better results. Racks should be more capacious, with specifications such as being at least 750 mm wide and having a static weight capacity greater than 1,800 kg.
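One way to see why a higher distribution voltage helps at these densities is to compare the line current a rack would draw at each voltage. This is a rough sketch, not a calculation from the report: it assumes a balanced three-phase load, a power factor of 1.0, and a 100 kW rack, and uses the standard relation I = P / (√3 × V_line-to-line × PF).

```python
import math

def line_current_amps(power_w: float, line_to_line_v: float,
                      power_factor: float = 1.0) -> float:
    """Line current for a balanced three-phase load.

    Assumes a balanced load; power_factor defaults to 1.0 (an assumption).
    """
    return power_w / (math.sqrt(3) * line_to_line_v * power_factor)

RACK_POWER_W = 100_000  # the 100 kW rack Schneider Electric anticipates

# 120/208V and 240/415V are line-to-neutral / line-to-line voltages
current_208v = line_current_amps(RACK_POWER_W, 208)
current_415v = line_current_amps(RACK_POWER_W, 415)

print(f"100 kW at 208V: {current_208v:.0f} A per line")
print(f"100 kW at 415V: {current_415v:.0f} A per line")
```

Under these assumptions, the higher voltage roughly halves the current for the same power, which eases conductor and breaker sizing for dense racks.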

Anton Shilov is a contributing writer at Tom’s Hardware. Over the past couple of decades, he has covered everything from CPUs and GPUs to supercomputers and from modern process technologies and latest fab tools to high-tech industry trends.