Cerebras Systems, maker of the world's largest single processor, which weighs in with a whopping 1.2 trillion transistors and 400,000 AI cores, announced today that it has entered into a partnership with the Department of Energy (DOE), long the leader in the supercomputing space. The DOE will use the company's new wafer-scale chips to bring super-scale AI to basic and applied science and medicine.
The Cerebras Wafer Scale Engine (WSE) sidesteps the reticle limitations of modern chip manufacturing, which cap the size of a single monolithic processor die, to create a wafer-sized processor. The company accomplishes this feat by stitching together the dies on the wafer, allowing them to work as one large cohesive unit.
That creates a massive processor measuring 46,225 square millimeters, the largest in the world, that packs 1.2 trillion transistors fabbed on TSMC's 16nm process. That's 56.7 times larger than the world's largest GPU (815 mm² with 21.1 billion transistors). The massive chip also comes packing a whopping 400,000 AI-processing cores paired with 18GB of on-chip memory, which pushes out up to 9 PBps, yes, petabytes per second, of memory bandwidth. We recently had the chance to see the massive chip up close at the Hot Chips conference, and as you can see, it is larger than our laptop's footprint.
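For a sense of scale, the 56.7x figure follows directly from the two die areas:

$$\frac{46{,}225\ \text{mm}^2}{815\ \text{mm}^2} \approx 56.7$$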
The Cerebras WSEs will find a home at the Argonne and Lawrence Livermore National Laboratories, where they will be used in conjunction with existing supercomputers to speed AI-specific workloads.
The DOE's buy-in on the project is incredibly important for Cerebras, as it signifies that the chips are ready for use in production systems. Also, as we've seen time and again, trends in the supercomputing space often filter down to the mainstream, meaning further development could land Cerebras' WSE in more typical server implementations in the future.
The DOE also has a history of investing heavily in the critical software ecosystems needed for mass adoption, as we've seen with its investment in AMD's ROCm software suite for the exascale-class Frontier supercomputer, the work the agency is doing with Intel's oneAPI for the Aurora supercomputer, and the partnership with Cray for El Capitan.
AI models are exploding in size, with model sizes doubling roughly every five months. That doesn't currently appear to be a problem for the WSE's 18GB of on-chip SRAM, but because SRAM capacity is fixed at manufacturing time and can't be expanded after the fact, larger models could eventually outstrip the chip's native memory capacity. Cerebras tells us that it can simply use multiple chips in tandem to tackle larger workloads: unlike paired GPUs (think SLI), which mirror the full model in each unit's memory (data parallel), the WSE runs in model parallel mode, partitioning the model across chips, so a pair can utilize twice the memory capacity, scaling linearly. The company says that scaling continues with each additional wafer-size chip employed for AI workloads.
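The difference is easiest to see in a toy capacity model. The sketch below is purely illustrative; the sizes, function names, and print logic are our own assumptions, not Cerebras' software or API:

```python
# Toy capacity model contrasting data-parallel and model-parallel scaling.
# All names and sizes here are illustrative assumptions, not Cerebras' API.

DEVICE_MEMORY_GB = 18  # per-chip on-chip SRAM, matching one WSE

def data_parallel_capacity(num_devices: int) -> int:
    # Each device mirrors the full model (GPU/SLI style), so the largest
    # model that fits never grows beyond one device's memory.
    return DEVICE_MEMORY_GB

def model_parallel_capacity(num_devices: int) -> int:
    # The model is partitioned across devices, so usable capacity
    # grows linearly with every chip added.
    return DEVICE_MEMORY_GB * num_devices

big_model_gb = 30  # a hypothetical model too large for any single 18GB chip

for n in (1, 2, 4):
    dp = data_parallel_capacity(n) >= big_model_gb
    mp = model_parallel_capacity(n) >= big_model_gb
    print(f"{n} chip(s): 30GB model fits data-parallel: {dp}, "
          f"model-parallel: {mp}")
```

In this toy model, a pair of chips already holds the 30GB model once it is partitioned, while no number of mirrored data-parallel devices ever does, which is the linear capacity scaling Cerebras is claiming.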
We're told that today's announcement just covers the basics of the partnership, but that more details, specifically with regard to co-development, will be shared at the Supercomputing trade show in November.