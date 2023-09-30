As people debate whether Moore’s Law is slowing, still holds, or is already dead in the 2020s, Nvidia scientists are heralding the momentum behind Huang’s Law. Over the last decade, Nvidia claims, the AI-processing performance of its GPUs has grown 1000-fold. Huang’s Law holds that the speedups seen in “single chip inference performance” are not about to peter out but will keep coming.

Nvidia published a blog post about Huang’s Law on Friday, outlining the belief and the work practices behind it. What Nvidia Chief Scientist Bill Dally describes as a “tectonic shift in how computer performance gets delivered in a post-Moore’s law era” rests, interestingly, primarily on human ingenuity. That seems a somewhat unpredictable foundation for a law, but Dally believes the impressive chart below marks just the beginning of Huang’s Law.

(Image credit: Nvidia)

According to Dally's recent Hot Chips 2023 conference talk, the chart above shows a 1000-fold increase in GPU AI inference performance over the last ten years. Interestingly, and in contrast to Moore's Law, process shrinks have had little impact on the progress of Huang's Law, said the Nvidia Chief Scientist.

(Image credit: Nvidia)

Dally recalls how a 16x gain was achieved by changing how Nvidia GPUs handle numbers. Another big boost arrived with the Nvidia Hopper architecture and its Transformer Engine. Hopper uses a dynamic mix of eight- and 16-bit floating point and integer math to deliver a 12.5x performance leap while also saving energy, it is claimed. Before that, Nvidia Ampere introduced structural sparsity for a 2x performance increase, said the scientist. Advances like NVLink and Nvidia networking technology have further bolstered these impressive gains.

One of Dally's most eyebrow-raising claims was how little of the 1000x compounded gain in AI inference performance can be attributed to process improvements. Over the last decade, as Nvidia GPUs shifted from 28nm to 5nm processes, semiconductor process improvements have "only accounted for 2.5x of the total gains," asserted Dally at Hot Chips.
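
To see how these figures fit together, here is a rough back-of-the-envelope sketch in Python. It assumes that the four factors quoted in this article (16x, 12.5x, 2x and 2.5x) simply multiply together and account for the whole gain, which is our reading of the numbers rather than something the article spells out. The variable names are illustrative only.

```python
import math

# Speedup factors as quoted in this article (attributed to Dally's talk).
# Assumption: the factors are independent and compound multiplicatively.
factors = {
    "number handling": 16.0,
    "Hopper Transformer Engine": 12.5,
    "Ampere structural sparsity": 2.0,
    "process shrink (28nm to 5nm)": 2.5,
}

# Compounded gain across all four factors.
total = math.prod(factors.values())

# Gain left over once the process shrink is factored out.
non_process = total / factors["process shrink (28nm to 5nm)"]

# Equivalent steady year-over-year multiplier across ten years.
annual = total ** (1 / 10)

print(f"compounded gain: {total:g}x")
print(f"gain without process shrink: {non_process:g}x")
print(f"average yearly multiplier: {annual:.2f}x")
```

Under that assumption the factors compound to 1000x, of which 400x comes from sources other than process shrinks, and the ten-year average works out to roughly a doubling every year.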

With something as hard to schedule as "ingenuity and effort inventing and validating fresh ingredients" behind it, can Huang's Law continue apace? Dally indicates that he and his team still see "several opportunities" for accelerating AI inference processing. Avenues to explore include "further simplifying how numbers are represented, creating more sparsity in AI models and designing better memory and communications circuits."