The Intel Arc A380 has to be one of the worst graphics card launches in history — not the hardware itself, necessarily, but the retail launch of the hardware. By all indications, Intel knew the drivers were broken when the hardware was ready for release earlier this year. Rather than taking sufficient time to fix the drivers before the retail launch, and with the clock ticking as new AMD and Nvidia GPUs are on the horizon, Intel decided to ship its Arc GPUs first in China — likely not the sort of approach a company would take if the product were worthy of making our list of the best graphics cards.
Several months later, after plenty of negative publicity courtesy of GPUs that made their way to other shores, and with numerous driver updates come and gone, Arc A380 has officially launched in the US with a starting price of $139. The single offering on Newegg sold out and is currently backordered, but that likely has more to do with limited supplies than high demand. Still, the A380's not all bad, and we're happy to see Team Blue rejoin the dedicated GPU market for the first time in over 24 years. (And no, I don't really count the Intel DG1 from last year, since it only worked on specific motherboards.)
How does the Arc A380 stack up to competing AMD and Nvidia GPUs, and what's all the hype about AV1 hardware encoding acceleration? You can see where it lands in our GPU benchmarks hierarchy, which if you want a spoiler is… not good. But let's get to the details.
Arc Alchemist Architecture Recap
We've provided extensive coverage of Intel's Arc Alchemist architecture, dating back about a year. At the time we first wrote that piece, we were anticipating a late 2021 or early 2022 launch. That morphed into a planned March 2022 launch, then eventually a mid-2022 release, and it's not even a full release, at least not yet. Arc A380 is merely the first salvo, at the very bottom of the price and performance ladder. We've seen plenty of hints of the faster Arc A750, which appears to be close to RTX 3060 performance based on Intel's own benchmarks, and that should launch within the next month or so. What about the faster-still Arc A770 or mid-tier Arc A580 and other products? Only time will tell.
Arc Alchemist represents a divergence from Intel's previous graphics designs. There's probably plenty of overlap in certain elements, but Intel has changed names for some of the core building blocks. Gone are the "Execution Units (EUs)," which are now called Vector Engines (VEs). Each VE can compute eight FP32 operations per cycle, which gets loosely translated into "GPU cores" or GPU shaders and is roughly equivalent to the AMD and Nvidia shaders.
Intel groups 16 VEs into a single Xe-Core, which also includes other functionality. Each Xe-Core thus has 128 shader cores and roughly translates as equivalent to an AMD Compute Unit (CU) or Nvidia Streaming Multiprocessor (SM). They're basically all SIMD (single instruction multiple data) designs, and like the competition, Arc Alchemist has enhanced the shaders to meet the full DirectX 12 Ultimate feature set.
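The shader arithmetic is easy to check. The short Python sketch below simply multiplies out the figures quoted above (eight FP32 operations per VE, 16 VEs per Xe-Core); the eight Xe-Core count used in the example comes from the A380 spec table later in this review, and the function and constant names are illustrative only, not Intel terminology.

```python
# Figures quoted in the text; names are illustrative only.
FP32_LANES_PER_VE = 8   # FP32 operations per Vector Engine per cycle
VES_PER_XE_CORE = 16    # Vector Engines per Xe-Core


def shaders_per_xe_core() -> int:
    """FP32 lanes ("shader cores") in a single Xe-Core."""
    return FP32_LANES_PER_VE * VES_PER_XE_CORE


def total_shaders(xe_cores: int) -> int:
    """Total FP32 lanes for a GPU with the given number of Xe-Cores."""
    return xe_cores * shaders_per_xe_core()


if __name__ == "__main__":
    print(shaders_per_xe_core())  # 128
    print(total_shaders(8))       # 1024 (the A380 has eight Xe-Cores)
```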
That naturally means having ray tracing hardware incorporated into the design, and Intel has one Ray Tracing Unit (RTU) per Xe-Core. The exact details of the ray tracing hardware aren't entirely clear yet, though based on testing, each Intel RTU might match up decently against an Nvidia Ampere RT core.
Intel didn't stop there. Alongside the VEs and RTUs and other typical graphics hardware, Intel also added Matrix Engines, which it calls XMX Engines (Xe Matrix eXtensions). These are similar in principle to Nvidia's Tensor cores and are designed to crunch through lots of less precise data for machine learning and other uses. An XMX Engine is 1024 bits wide and can process either 64 FP16 operations or 128 INT8 operations per cycle, giving Arc GPUs a relatively large amount of compute power.
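Those two per-cycle figures fall straight out of the engine width. Here's a minimal sketch that treats the XMX Engine as a plain row of equal-width lanes; that's a simplification for illustration, not a description of Intel's actual datapath, and the function name is made up for this example.

```python
XMX_WIDTH_BITS = 1024  # engine width quoted in the text


def ops_per_cycle(element_bits: int, width_bits: int = XMX_WIDTH_BITS) -> int:
    """Number of elements of the given size that fit across the engine."""
    if element_bits <= 0 or width_bits % element_bits != 0:
        raise ValueError("element size must evenly divide the engine width")
    return width_bits // element_bits


if __name__ == "__main__":
    print(ops_per_cycle(16))  # 64 FP16 operations per cycle
    print(ops_per_cycle(8))   # 128 INT8 operations per cycle
```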
Intel Arc A380 Specifications
With that brief overview of the architecture out of the way, here are the specifications for the Arc A380, compared to a couple of competing AMD and Nvidia GPUs. While we provide theoretical performance here, remember that not all teraflops and teraops are created equal. We need real-world testing to see what sort of actual performance the architecture can deliver.
| Graphics Card | Arc A380 | RX 6500 XT | RX 6400 | GTX 1650 Super | GTX 1650 |
|---|---|---|---|---|---|
| Architecture | ACM-G11 | Navi 24 | Navi 24 | TU116 | TU117 |
| Process Technology | TSMC N6 | TSMC N6 | TSMC N6 | TSMC 12FFN | TSMC 12FFN |
| Die size (mm^2) | 157 | 107 | 107 | 284 | 200 |
| SMs / CUs / Xe-Cores | 8 | 16 | 12 | 20 | 14 |
| GPU Cores (Shaders) | 1024 | 1024 | 768 | 1280 | 896 |
| Ray Tracing 'Cores' | 8 | 16 | 12 | — | — |
| Base Clock (MHz) | 2000 | 2310 | 1923 | 1530 | 1485 |
| Boost Clock (MHz) | 2450 | 2815 | 2321 | 1725 | 1665 |
| VRAM Speed (Gbps) | 15.5 | 18 | 16 | 12 | 8 |
| VRAM Bus Width (bits) | 96 | 64 | 64 | 128 | 128 |
| TFLOPS FP32 (Boost) | 5 | 5.8 | 3.6 | 4.4 | 3 |
| TFLOPS FP16 (XMX/Tensor if Available) | 40 | 11.6 | 7.2 | 8.8 | 6 |
| Video Encoding | H.264, H.265, AV1, VP9 | — | — | H.264, H.265 (Turing) | H.264, H.265 (Volta) |
| Launch Date | Jun 2022 | Jan 2022 | Jan 2022 | Nov 2019 | Apr 2019 |
On paper, Intel's Arc A380 basically competes against AMD's RX 6500 XT and RX 6400, or Nvidia's GTX 1650 Super and GTX 1650. It's priced slightly lower than the competition, especially looking at current online prices for new cards, with roughly similar features. There are some important qualifications to note, however.
Nvidia doesn't have ray tracing hardware below the RTX 3050 (or RTX 2060). Similarly, none of the AMD or Nvidia GPUs in this segment support tensor hardware either, giving Intel a potential advantage in deep learning and AI applications — we've included FP16 throughput for the GPU cores on the AMD and Nvidia cards by way of reference, though that's not entirely apples-to-apples.
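For the curious, the FP16 row in the table can be roughly reproduced with the sketch below. For the AMD and Nvidia cards, the table's FP16 figures are simply double the FP32 figures. For the A380, the sketch assumes 16 XMX Engines per Xe-Core (a figure not stated in this review) and counts each of the 64 per-cycle FP16 operations as a fused multiply-add worth two floating-point operations. Both are assumptions, picked because together they land on the table's 40 TFLOPS.

```python
def xmx_fp16_tflops(
    xe_cores: int,
    boost_mhz: float,
    xmx_per_xe_core: int = 16,     # ASSUMPTION: not stated in the review
    fp16_ops_per_cycle: int = 64,  # per XMX Engine, from the text
    flops_per_op: int = 2,         # ASSUMPTION: each op is a multiply-add
) -> float:
    """Theoretical FP16 matrix throughput in TFLOPS."""
    per_cycle = xe_cores * xmx_per_xe_core * fp16_ops_per_cycle * flops_per_op
    return per_cycle * boost_mhz * 1e6 / 1e12


def shader_fp16_tflops(fp32_tflops: float) -> float:
    """FP16 on the GPU shaders, taken as double the FP32 rate (as in the table)."""
    return 2 * fp32_tflops


if __name__ == "__main__":
    print(f"{xmx_fp16_tflops(8, 2450):.1f}")   # Arc A380 via XMX
    print(f"{shader_fp16_tflops(5.8):.1f}")    # RX 6500 XT via shaders
```

As noted above, the XMX and shader numbers aren't an apples-to-apples comparison, so treat the output as a sanity check on the table rather than a performance prediction.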
Intel is the only GPU company that currently has AV1 and VP9 hardware-accelerated video encoding. We're expecting AMD and Nvidia to add AV1 support to their upcoming RDNA 3 and Ada architectures, and possibly VP9 as well, but we don't have official confirmation on how that will play out. We'll look at encoding performance and quality later in this review as well, though note that the GTX 1650 uses Nvidia's older NVENC hardware that delivers a lower quality output than the newer Turing (and Ampere) version.
The Arc A380 has theoretical compute performance of 5.0 teraflops, which puts it slightly behind the RX 6500 XT but ahead of everything else. It's also the only GPU in this price class to ship with 6GB of GDDR6 memory, with a 96-bit memory interface. That works out to 186 GB/s, which is more memory bandwidth than either AMD card (though without AMD's Infinity Cache to help), more than the GTX 1650's 128 GB/s, and slightly less than the GTX 1650 Super's 192 GB/s. Power use targets 75W, though overclocked cards can exceed that, just like with AMD and Nvidia GPUs.
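Both the teraflops and bandwidth figures can be derived from the spec table. The sketch below uses the usual back-of-the-envelope formulas: two FP32 operations per shader per clock (one multiply-add) for compute, and per-pin data rate times bus width divided by eight for bandwidth. The values are copied from the table; the function and dictionary names are just for illustration.

```python
def fp32_tflops(shaders: int, boost_mhz: float) -> float:
    """Theoretical FP32 TFLOPS: 2 ops (one multiply-add) per shader per clock."""
    return shaders * 2 * boost_mhz * 1e6 / 1e12


def bandwidth_gbs(gbps_per_pin: float, bus_width_bits: int) -> float:
    """Theoretical memory bandwidth in GB/s."""
    return gbps_per_pin * bus_width_bits / 8


# (shaders, boost clock in MHz, VRAM speed in Gbps, bus width in bits)
CARDS = {
    "Arc A380": (1024, 2450, 15.5, 96),
    "RX 6500 XT": (1024, 2815, 18, 64),
    "RX 6400": (768, 2321, 16, 64),
    "GTX 1650 Super": (1280, 1725, 12, 128),
    "GTX 1650": (896, 1665, 8, 128),
}

if __name__ == "__main__":
    for name, (shaders, boost, gbps, width) in CARDS.items():
        tflops = fp32_tflops(shaders, boost)
        bw = bandwidth_gbs(gbps, width)
        print(f"{name:15s} {tflops:5.2f} TFLOPS  {bw:6.1f} GB/s")
```

Remember these are theoretical peaks. They say nothing about caches, drivers, or how well a game actually keeps the shaders busy.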
The ray tracing capabilities are harder to pin down. To quickly recap, Nvidia's Turing architecture on the RTX 20-series GPUs had full hardware ray tracing capabilities, and each RT core can do one ray/triangle intersection calculation per cycle, plus there's hardware support for BVH (bounding volume hierarchy) traversal. It's not clear how many ray/box BVH intersections per cycle the RT cores manage, as Nvidia to my knowledge hasn't provided any specific number.
Nvidia's Ampere architecture added a second ray/triangle intersection unit to the RT cores, potentially doubling the throughput. (It seems the Turing BVH hardware was "faster" than the ray/triangle hardware in most cases, so Ampere focused on improving the triangle rate.) In practice, Nvidia says Ampere's RT cores are typically 75% faster than Turing's RT cores, as Ampere can't always fill all the ray/triangle execution slots.
AMD's RDNA 2 architecture handles things a bit differently. It can do one ray/triangle intersection calculation per cycle on each Ray Accelerator, basically like Turing. However, it uses GPU shaders (technically texture units) for BVH traversal, at a rate of four ray/box intersections per cycle. That rate isn't too bad in theory, considering the number of texture units in RDNA 2, but the BVH work ends up conflicting with other shader and texture work. Ultimately, it makes AMD's current Ray Accelerators slower and less efficient than Nvidia's RT cores, and perhaps more memory intensive as well (judging by real-world performance).
Intel's RTUs are similar to Nvidia's RT cores in that they can do both ray/box intersections and ray/triangle intersections in hardware. The above video explains things in more detail, but the raw throughput is up to 12 ray/box BVH intersections per cycle and one ray/triangle intersection per cycle, per RTU. Intel also has a dedicated BVH cache to improve hit rates and performance, and a Thread Sorting Unit that optimizes the output from the RTUs to better match shading workloads for the Xe-Cores.
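To put the per-cycle rates in context, the sketch below multiplies them by unit count and boost clock to get a peak intersection rate. The per-unit rates for Intel (12 ray/box, one ray/triangle) and AMD (four ray/box, one ray/triangle) come from the text, and the unit counts and boost clocks come from the spec table. It's a theoretical ceiling only: it ignores the BVH cache, memory behavior, and the contention with texture work on AMD's side, so real-world results will differ. Nvidia is left out because no ray/box rate is available.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RayTracingConfig:
    name: str
    units: int          # RTUs or Ray Accelerators
    boost_mhz: int      # boost clock from the spec table
    box_per_cycle: int  # ray/box intersections per unit per cycle
    tri_per_cycle: int  # ray/triangle intersections per unit per cycle

    def peak_box_gps(self) -> float:
        """Peak ray/box intersections, in billions per second."""
        return self.units * self.box_per_cycle * self.boost_mhz / 1000

    def peak_tri_gps(self) -> float:
        """Peak ray/triangle intersections, in billions per second."""
        return self.units * self.tri_per_cycle * self.boost_mhz / 1000


A380 = RayTracingConfig("Arc A380", 8, 2450, 12, 1)
RX6500XT = RayTracingConfig("RX 6500 XT", 16, 2815, 4, 1)
RX6400 = RayTracingConfig("RX 6400", 12, 2321, 4, 1)

if __name__ == "__main__":
    for gpu in (A380, RX6500XT, RX6400):
        print(f"{gpu.name:11s} box: {gpu.peak_box_gps():6.1f} G/s  "
              f"tri: {gpu.peak_tri_gps():5.1f} G/s")
```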
Intel makes the claim, more or less, that its RTUs are actually more capable than Nvidia's Ampere RT cores, which would also mean the RTUs are better than Turing and RDNA 2 as well. To prove this point, sort of, Intel showed ray tracing performance in 17 games, pitting the Arc A770 against the RTX 3060.
Overall, Intel shows ray tracing performance on the A770 that's about 12% faster than the RTX 3060. Of course, that's not the A380, and there are plenty of other factors that go into gaming performance, as we're not doing pure ray tracing yet. The A770 also has 32 RTUs compared to the desktop RTX 3060's 28 RT cores. Still, Intel's RTUs sound pretty decent on paper.
The thing is, with only eight RTUs, the A380 definitely won't be a ray tracing powerhouse. Nvidia, for example, has 20 or more RT cores in its RTX lineup, or 16 if you include the mobile RTX 3050, Nvidia's slowest RTX chip. AMD, on the other hand, has as few as 12 Ray Accelerators in its RX 6000-series parts, and integrated RDNA 2 implementations like the Steam Deck can have as few as eight RAs, though RT performance understandably suffers quite a lot. Not that you need ray tracing, even four years after hardware first supported the functionality.