Intel adds support for new Xe-HPCVG and Ponte Vecchio VG variants with disabled AI features
This one's a real enigma.
The latest update for Intel's LLVM compiler adds support for Xe-HPCVG graphics devices and Ponte Vecchio VG, a variant of Ponte Vecchio (via Coelacanth's Dream). The new VG variant of both the Xe-HPC architecture and Ponte Vecchio comes with disabled DPAS instructions (Dot Product Accumulate Systolic), crucial for AI workloads and normally accelerated by Xe Matrix Extension (XMX) units.
In many ways, Xe-HPCVG is unchanged from Xe-HPC: It still has 128KB of shared local memory, double-precision FP64 computing, and BF16 support. However, the key difference is the explicit lack of DPAS instructions, which perform accelerated matrix math on low-precision data types like BF16, FP16, and INT8. These instructions are critical for good AI and machine learning performance.
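To make the "dot product accumulate" part of the DPAS name concrete, here is a minimal Python sketch of the arithmetic pattern: multiply pairs of low-precision inputs, sum them, and add the result to a wider accumulator. It is only an illustration. The function names and the 2x2 shapes are hypothetical, and it does not model Intel's actual instruction semantics, operand layouts, or the systolic hardware that makes the real thing fast.

```python
# Illustrative only: a scalar model of a dot-product-accumulate step.
# This is NOT Intel's DPAS semantics; names and shapes are hypothetical.

def dot_product_accumulate(acc, a_row, b_col):
    """Return acc + sum(a_row[i] * b_col[i]), keeping the sum in the accumulator."""
    if len(a_row) != len(b_col):
        raise ValueError("operands must have the same length")
    total = acc
    for a, b in zip(a_row, b_col):
        total += a * b
    return total

def matmul_accumulate(c, a, b):
    """Compute C + A @ B one dot product at a time (plain lists of lists)."""
    rows, inner, cols = len(a), len(b), len(b[0])
    return [
        [
            dot_product_accumulate(c[i][j], a[i], [b[k][j] for k in range(inner)])
            for j in range(cols)
        ]
        for i in range(rows)
    ]

a = [[1, 2], [3, 4]]  # small integers standing in for INT8 inputs
b = [[5, 6], [7, 8]]
c = [[0, 0], [0, 0]]  # stands in for a wider accumulator such as INT32
result = matmul_accumulate(c, a, b)
print(result)
```

Matrix-heavy AI workloads are built almost entirely out of this operation, which is why losing the hardware path for it matters so much.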
That a version of Ponte Vecchio comes without DPAS instructions is definitely strange, as the chip is largely intended for AI workloads. The lack of DPAS instructions implies that there are no functioning XMX units on this VG variant of Ponte Vecchio. It's hard to imagine Intel would make a new version of Ponte Vecchio specifically to remove one of its primary features, so Intel may just be repurposing Ponte Vecchio chips with defective XMX units for use in an upcoming GPU.
It's also unclear what 'VG' means, though it could be a designation along the lines of the 'SDV' in Ponte Vecchio SDV (which stands for Software Development Vehicle). Unfortunately, the update gives us very little to go on, since it mostly just removes support for DPAS instructions and not much else.
If Intel is preparing to launch an XMX-less version of Ponte Vecchio, we're not sure about its intended market. XMX units, like Nvidia's Tensor cores, have proven to be very important for AI workloads. Without those, Ponte Vecchio VG is left with FP64 and platform-related features. 52 TFLOPS of FP64 performance is still impressive and beats out Nvidia's H100 at 26 TFLOPS, but it's well behind AMD's MI300X at 81.7 TFLOPS.
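For a rough sense of scale, the FP64 figures quoted above can be put on a common footing with a few lines of Python. This is just arithmetic on the numbers in this article, not a benchmark, and the `relative_to` helper is a hypothetical name used for illustration.

```python
# Peak FP64 figures as quoted in this article, in TFLOPS.
fp64_tflops = {
    "Ponte Vecchio": 52.0,
    "Nvidia H100": 26.0,
    "AMD MI300X": 81.7,
}

def relative_to(baseline, figures):
    """Express every figure as a multiple of the baseline entry."""
    base = figures[baseline]
    return {name: value / base for name, value in figures.items()}

ratios = relative_to("Ponte Vecchio", fp64_tflops)
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x Ponte Vecchio")
```

By these numbers, the H100 figure is half of Ponte Vecchio's, while the MI300X figure is a little over one and a half times as high.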
Ponte Vecchio represents a massive investment in chip stacking and EMIB for Intel. The full solution consists of 47 different 'tiles' (chips) manufactured on various process nodes from Intel and TSMC. The XMX units are an integral part of the compute tiles, so it seems unlikely that defects would affect only the XMX portions of the chips. More likely, disabling the feature is an intentional decision on Intel's part, but to what end remains unknown. We'll learn more in the future, when or if these VG parts come to market.
Matthew Connatser is a freelancing writer for Tom's Hardware US. He writes articles about CPUs, GPUs, SSDs, and computers in general.