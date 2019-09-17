DOE Enters Partnership to Use World's Largest Chips With 1.2 Trillion Transistors and 400,000 Cores

Cerebras Systems, maker of the world's largest single processor, which weighs in at a whopping 1.2 trillion transistors and 400,000 AI cores, announced today that it has entered into a partnership with the Department of Energy (DOE), long the leader in the supercomputing space, to use its new wafer-scale chips for basic and applied science and medicine with super-scale AI.

The Cerebras Wafer Scale Engine (WSE) sidesteps the reticle limitations of modern chip manufacturing, which cap the size of a single monolithic processor die, to create a wafer-sized processor. The company accomplishes this feat by stitching together the dies on the wafer so that they work as one large cohesive unit.

That creates a massive processor that measures 46,225 square millimeters, the largest in the world, and packs 1.2 trillion transistors fabbed on TSMC's 16nm process. That's 56.7 times larger than the world's largest GPU (815 mm² with 21.1 billion transistors). The chip also comes packing a whopping 400,000 AI-processing cores paired with 18GB of on-chip memory, which pushes out up to 9 PBps (yes, petabytes per second) of memory bandwidth. We recently had the chance to see the chip up close at the Hot Chips conference, and it is larger than our laptop's footprint.
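As a quick sanity check on those ratios, here is a minimal sketch that recomputes them from the quoted figures. It takes the WSE area as 46,225 mm² (Cerebras' published figure, and the one consistent with the 56.7x ratio) and assumes decimal gigabytes for the 18GB of SRAM. The per-core memory number is a simple average, not a statement about how the memory is actually partitioned.

```python
# Figures quoted in the article; the GPU values describe the
# "world's largest GPU" it uses as the comparison point.
WSE_AREA_MM2 = 46_225
GPU_AREA_MM2 = 815
WSE_TRANSISTORS = 1.2e12
GPU_TRANSISTORS = 21.1e9
WSE_CORES = 400_000
WSE_SRAM_BYTES = 18e9  # assumption: 18GB means decimal gigabytes

# How many times larger the WSE is than the GPU, by area and by transistor count.
area_ratio = WSE_AREA_MM2 / GPU_AREA_MM2
transistor_ratio = WSE_TRANSISTORS / GPU_TRANSISTORS

# Average on-chip memory per core, in kilobytes (simple division only).
sram_per_core_kb = WSE_SRAM_BYTES / WSE_CORES / 1e3

print(f"area ratio:        {area_ratio:.1f}x")
print(f"transistor ratio:  {transistor_ratio:.1f}x")
print(f"SRAM per core:     {sram_per_core_kb:.1f} KB")
```

The area and transistor ratios land within a fraction of a percent of each other, which is what you would expect when both chips are compared at similar transistor density.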

The Cerebras WSEs will find a home at Argonne National Laboratory and Lawrence Livermore National Laboratory, where they will be used in conjunction with existing supercomputers to speed AI-specific workloads.

The DOE's buy-in on the project is incredibly important for Cerebras, as it signifies that the chips are ready for actual use in production systems. Also, as we've seen time and again, trends in the supercomputing space often filter down to more mainstream uses, meaning further development could find Cerebras' WSE in more typical server implementations in the future.

The DOE also has a history of investing heavily in the critical software ecosystem needed for mass adoption, as we've seen with its investment in AMD's ROCm software suite for the exascale-class Frontier supercomputer, the work the agency is doing with Intel's oneAPI for the Aurora supercomputer, and the partnership with Cray for El Capitan.

AI models are exploding in size, doubling every five months. That doesn't currently appear to be a problem for the WSE's 18GB of SRAM, but because SRAM can't be added after manufacturing, larger models could soon outstrip the chip's native memory capacity. Cerebras tells us that it can simply use multiple chips in tandem to tackle larger workloads. Unlike GPUs, which the company says simply mirror the memory across units (data parallel) when used in pairs (think SLI), the WSE runs in model-parallel mode, which means it can use twice the memory capacity when deployed in pairs, thus scaling linearly. The company also says that scaling will continue with each additional wafer-size chip employed for AI workloads.
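To make the capacity argument concrete, here is a rough sketch of the two ideas in that paragraph: how quickly a model that doubles every five months outgrows a fixed memory budget, and how usable memory differs between mirroring a model (data parallel) and splitting it (model parallel). The function names and the 4.5GB starting size are hypothetical, chosen only for illustration, and the model is idealized: it ignores activations, communication overhead, and anything else that would keep real scaling from being perfectly linear.

```python
import math


def months_until_exceeds(model_gb, capacity_gb, doubling_months=5.0):
    """Months until a model that doubles every `doubling_months` outgrows capacity."""
    if model_gb >= capacity_gb:
        return 0.0
    return doubling_months * math.log2(capacity_gb / model_gb)


def usable_memory_gb(per_device_gb, devices, mode):
    """Idealized memory available to a single model across `devices` chips."""
    if mode == "data_parallel":
        # Every device holds a full replica, so capacity does not grow.
        return per_device_gb
    if mode == "model_parallel":
        # The model is split across devices, so capacity adds up.
        return per_device_gb * devices
    raise ValueError(f"unknown mode: {mode!r}")


# Hypothetical 4.5GB model against the WSE's 18GB of on-chip SRAM.
print(months_until_exceeds(4.5, 18.0))

# Two chips: mirrored versus split.
print(usable_memory_gb(18.0, 2, "data_parallel"))
print(usable_memory_gb(18.0, 2, "model_parallel"))
```

The mode labels describe a partitioning strategy rather than a particular kind of device, so the same arithmetic applies to any accelerator that splits a model across units.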

We're told that today's announcement covers just the basics of the partnership, and that more details, specifically regarding co-development, will be shared at the Supercomputing trade show in November.

About the author
Paul Alcorn

Paul Alcorn is a Senior Editor for Tom's Hardware US. He writes news and reviews on CPUs, storage and enterprise hardware.

Comment from the forums
  • InvalidError
    Now imagine a wafer-scale GPU!
  • bit_user
    Quote:
    The DOE's buy-in on the project is incredibly important for Cerebras, as it signifies that the chips are ready for actual use in production systems.

    Um, I think the DoE invests in a lot of experimental tech. I wouldn't assume it necessarily means the tech is yet ready for end users.

    Quote:
    Also, as we've seen time and again, trends in the supercomputing space often filter down to more mainstream usages, meaning further development could find Cerebras' WSE in more typical server implementations in the future.

    Also, we've seen plenty of supercomputing tech that didn't filter down, like clustering, InfiniBand, silicon-germanium semiconductors, and other stuff that I honestly don't know much about, because it hasn't filtered down. In fact, the story of the past few decades has been largely about the way that so much tech has filtered up from desktop PCs into HPC.

    That's not to say nothing filtered down - it's gone both ways. But the supercomputing industry used to be exclusively built from exotic, custom tech and has been transformed by the use of PCs, GPUs, and a lot of other commodity technology (SSDs, PCIe, etc.). Interestingly, it seems to be headed back in the direction of specialization, as it reaches scales and levels of workload-customization (such as AI) that make no sense for desktop PCs. I'd say this accelerator is a good example of that trend.

    In particular, the problem with wafer-scale is that it will always be extremely expensive, because die space costs a certain amount per area. The better your fault-tolerance is, the less sensitive you are to yield, but it's still the case that die area costs a lot of money, as does their exotic packaging.

    Quote:
    Cerebras tells us that it can simply use multiple chips in tandem to tackle larger workloads because, unlike GPUs, which simply mirror the memory across units (data parallel) when used in pairs (think SLI), the WSE runs in model parallel mode, which means it can utilize twice the memory capacity when deployed in pairs, thus scaling linearly.

    This is silly. Of course you can scale models on GPUs in exactly the same way they're talking about.

    Cool tech - and fun to read about, no doubt - but, this is exactly the sort of exotic tech that will remain the preserve of extreme high-end, high-budget computing installations.
  • bit_user
    Quote:
    Now imagine a wafer-scale GPU!

    Apart from the cost issues I mentioned above, graphics has different data access patterns than AI. That's a big part of their pitch.

    Graphics needs fast random-access and is somewhat difficult to partition (unless you simply replicate the data, which makes the architecture less efficient in terms of power, performance, and cost).