AMD's re-entrance into the data center begins with its EPYC lineup, which pits the company against the dominant Intel. Not enough pressure? Part of AMD's strategy also includes attacking the burgeoning AI segment. That lines the company up against Nvidia, which by all accounts holds a dominant position in the machine learning segment.
However, AMD is the only company with both GPUs and CPUs under the same roof, which it feels hands it an advantage when it comes to complementary designs.
That's where EPYC comes in. It's no coincidence that EPYC, AMD's new line of data center processors, has a copious allotment of 128 PCIe lanes. AMD believes that makes the EPYC platform a great fit for incredibly dense AI platforms in single-socket servers. We broke down AMD's latest EPYC processors in our AMD Unveils EPYC Server Processor Models And Pricing Guidelines article, but the single-socket server is one of the most important aspects of AMD's two-pronged AI strategy.
The Single Sockets
Roughly 25% of today's server platforms, which are almost entirely powered by Intel, ship with only one socket populated. That means they have only one processor, so there are a number of redundant components on the motherboard and in the chassis that aren't needed. Eliminating these redundancies reduces costs on multiple axes, possibly making a dedicated single-socket server a data center architect's best friend.
The data center is shifting en masse to AI-centric architectures, but these designs require beefy data throughput to feed the hungry GPUs. AMD's single-socket server leverages the platform's 2TB of memory capacity and 128 PCIe lanes to provide copious connectivity. As seen above in Inventec's system, cramming six GPUs into a single-socket chassis is easy if you have enough connectivity and cores to push the workload. EPYC's 64 threads should suit those purposes nicely. The end result? Up to 100 TFLOPS in a single chassis. That's performance density at its finest.
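To see why 128 lanes matter for a six-GPU box, here is a rough back-of-the-envelope lane budget in Python. The 128-lane total comes from AMD's figures above; the x16 link per GPU is our assumption (it is the usual full-width slot, but AMD did not specify how Inventec wires its system), and the function name is purely illustrative.

```python
# Rough PCIe lane budget for a single-socket EPYC server.
TOTAL_LANES = 128    # per AMD's EPYC figures cited in the article
LANES_PER_GPU = 16   # ASSUMPTION: each GPU gets a full x16 link


def lane_budget(num_gpus, lanes_per_gpu=LANES_PER_GPU, total_lanes=TOTAL_LANES):
    """Return (lanes_used, lanes_spare) for a given GPU count."""
    used = num_gpus * lanes_per_gpu
    if used > total_lanes:
        raise ValueError(
            f"{num_gpus} GPUs need {used} lanes, but only {total_lanes} are available"
        )
    return used, total_lanes - used


if __name__ == "__main__":
    used, spare = lane_budget(6)
    print(f"6 GPUs use {used} lanes, leaving {spare} for storage and networking")
```

Under that assumption, six GPUs consume 96 lanes and leave 32 free for NVMe storage and network adapters, with no PCIe switches required.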
AMD Trusts Its Instincts
AMD's got quite a bit of graphics IP lying around, so it employs a range of its architectures, including Vega, Polaris, and Fiji, to target various segments of the AI space. The Vega-powered MI25 slots in as the workhorse for the heavy compute-intensive training workloads, while the Polaris-based MI6 is more of an all-rounder that can handle a variety of training and inference workloads. The Fiji-based MI8 handles the low-power tasks, such as lightweight inference, at a lower price point.
Our resident GPU expert, Chris Angelini, did the heavy lifting when AMD announced these cards in December 2016. Head over to his article for more coverage of the high-level view. AMD also has the Vega Frontier Edition in the hopper and recently teased us with performance details.
AMD has its new Instinct solutions headed to market in Q3 and accordingly released more details. Let's take a quick trip through AMD's stable of Instinct cards.
Vega - Radeon Instinct MI25
Vega powers onto the scene with GlobalFoundries' 14nm FinFETs in tow. Well, arguably, it's Samsung's process. In either case, the MI25 bears down with a peak 24.6 TFLOPS of FP16 and 12.3 TFLOPS of FP32 delivered by its 64 compute units, which equates to 4,096 stream processors. A complementary 16GB of ECC HBM2 provides up to 484 GB/s of memory bandwidth. The MI25 will eventually square up with Nvidia's beastly Voltas.
Fiji - Radeon Instinct MI8
AMD geared the Fiji-powered card to address HPC and inference workloads in a small form factor. It wields a peak 8.2 TFLOPS of FP16/FP32 and sucks 175W. It also features 4GB of HBM. Power efficiency is the goal here; the MI8 provides 47 GFLOPS per watt.
Polaris - Radeon Instinct MI6
The MI6 leverages the Polaris architecture and sips a mere 150W to provide a peak 5.7 TFLOPS of FP16/FP32. It steps down a notch from HBM to 16GB of GDDR5.
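For a quick sanity check on the efficiency claims, the short Python sketch below divides the peak throughput by the power figure for the two cards where AMD supplied both numbers. The inputs are the peak TFLOPS and wattage quoted above; the dictionary layout and function name are ours, and peak numbers say nothing about sustained efficiency in real workloads.

```python
# Peak efficiency from the figures quoted in this article.
# Values are (peak FP16/FP32 TFLOPS, board power in watts).
CARDS = {
    "MI8": (8.2, 175),
    "MI6": (5.7, 150),
}


def gflops_per_watt(tflops, watts):
    """Convert peak TFLOPS and board power into GFLOPS per watt."""
    return tflops * 1000 / watts


if __name__ == "__main__":
    for name, (tflops, watts) in CARDS.items():
        print(f"{name}: {gflops_per_watt(tflops, watts):.1f} GFLOPS per watt")
```

That works out to roughly 47 GFLOPS per watt for the MI8, matching AMD's claim, and 38 GFLOPS per watt for the MI6.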
ROCm Sock 'Em
The entire lineup is worthless without tools, so AMD has designed a set of open source software tools. Here's the breakdown, courtesy of AMD:
Planned for June 29th rollout, the ROCm 1.6 software platform with performance improvements and now support for MIOpen 1.0 is scalable and fully open source providing a flexible, powerful heterogeneous compute solution for a new class of hybrid Hyperscale and HPC-class system workloads. Comprised of an open-source Linux® driver optimized for scalable multi-GPU computing, the ROCm software platform provides multiple programming models, the HIP CUDA conversion tool, and support for GPU acceleration using the Heterogeneous Computing Compiler (HCC). The open-source MIOpen GPU-accelerated library is now available with the ROCm platform and supports machine intelligence frameworks including planned support for Caffe, TensorFlow and Torch.
Open The Things
The open source component is key. The industry is weary of proprietary solutions and vendor lock-in. AMD is investing heavily in developing open tools for both the EPYC and Instinct lineups, which is encouraging. AMD even has budding initiatives in more far-reaching climes, such as the Gen-Z, OpenCAPI, and CCIX standards.
The industry awaits a wave of competitive x86 alternatives based on open architectures. AMD's EPYC has already landed, and the Instinct lineup isn't far behind. It ships in Q3 to AMD's partners, which include Boxx, Colfax, Exxact Corporation, Gigabyte, Inventec and Supermicro, among others.