Xeon Phi: Intel's Larrabee-Derived Card In TACC's Supercomputer

Back To Larrabee: Starting The Many-Core Revolution

Larrabee is the code name for a now-infamous project whereby Intel planned to build a graphics card based on a many-core processor and go toe-to-toe with AMD and Nvidia. Why not use x86 for everything, the company asked, and make some GPU-specific changes to the hardware, along with software-based optimizations? The fact that Intel has a huge investment in the x86 ISA explains its interest in leveraging existing technology to solve the future's performance issues. 

The idea of Larrabee was intriguing. We even published our own analysis back in 2009 (Larrabee: Intel's New GPU). Unfortunately, later that same year, Intel announced that Larrabee would not be a retail part. Then, in 2010, we received word that not only was the project shelved, but that Intel was taking a derivative of Larrabee into the HPC space.

Fast forward to now. Not only is there a shipping product based on the last eight years of work, but it's also part of a 10 petaFLOPS-class supercomputer called Stampede, which we mentioned on the prior page. Both Intel and TACC are quick to point out that the hardware composing Stampede is pre-production, although it's purportedly fairly similar to the Xeon Phi 5110P and 3100 series coprocessors.

The competition is also very active in this space. Nvidia has a longer history of GPU-based computing than Intel, and it recently disclosed that the Titan supercomputer, developed by Cray for the Oak Ridge National Laboratory, employs Kepler-based Tesla K20 cards to help push performance as high as 20 petaFLOPS. 

AMD is similarly working to drum up excitement about its FirePro cards, particularly in light of the exceptional compute performance enabled by the Graphics Core Next architecture. In the meantime, we also see the company enjoying success with its Opteron processors. The same Titan supercomputer populated with Nvidia GPUs also leverages 18,688 Opteron 6274 CPUs, each with eight Bulldozer modules.

Bottom line: although Intel is a long-time proponent of using multiple cores in parallel, its approach up until now has largely involved general-purpose x86 CPUs operating in concert. Meanwhile, companies like AMD and Nvidia compete with graphics-oriented architectures that just so happen to handle floating-point math deftly. By jumping on board now, Intel is late to the game. But it's banking on the ubiquity of x86 to make work easier for software developers, many of whom are still trying to get their heads around programming for CUDA or OpenCL.