Let's take a look at the layout of the GB10 SoC before diving into its performance. Here's the logical topology of the chip, visualized with the lstopo utility:
Despite its branding, the Grace CPU on GB10 isn’t exactly like its data center counterparts. The MediaTek-designed CPU complex has 10 high-performance Arm Cortex-X925 cores and another 10 area-efficient Cortex-A725 cores. These are all off-the-shelf Arm designs, not Neoverse cores like you’ll find in an actual GB200 server.
Those 20 cores are spread across two clusters of 10, further subdivided into groups of five A725s and five X925s. Each X925 core has 2MB of L2 cache, and each A725 has 512KB. One cluster has 16MB of shared L3 cache, while the other has 8MB.
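To make the cache arithmetic concrete, here is a minimal Python sketch that tallies the figures quoted above. The dictionary layout and the names `CLUSTERS`, `L2_MB`, and `cache_totals` are illustrative, not anything exposed by the hardware or by lstopo, and the order of the two clusters is arbitrary.

```python
# Core counts and cache sizes (in MB) as quoted in the text.
# The data layout and all names here are illustrative assumptions.
CLUSTERS = [
    {"x925": 5, "a725": 5, "l3_mb": 16},
    {"x925": 5, "a725": 5, "l3_mb": 8},
]
L2_MB = {"x925": 2.0, "a725": 0.5}  # per-core L2 by core type


def cache_totals(clusters=CLUSTERS, l2_mb=L2_MB):
    """Sum core count, private L2, and shared L3 across all clusters."""
    cores = sum(c["x925"] + c["a725"] for c in clusters)
    l2 = sum(c[kind] * l2_mb[kind] for c in clusters for kind in l2_mb)
    l3 = sum(c["l3_mb"] for c in clusters)
    return {"cores": cores, "l2_mb": l2, "l3_mb": l3}


print(cache_totals())
```

By this count the CPU complex carries 25MB of private L2 and 24MB of shared L3 across its 20 cores.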
Nvidia DGX Spark | |
CPU | 20-core (10x Arm Cortex-X925 performance cores, 10x Cortex-A725 efficient cores) |
GPU | Nvidia Blackwell architecture, 6144 CUDA cores, 5th Generation Tensor Cores, 4th Generation RT Cores |
Memory | 128 GB LPDDR5x unified system memory, 256-bit interface, 4266 MHz, 273 GB/s bandwidth |
Networking | 1x RJ-45 (10 GbE), ConnectX-7 Smart NIC, Wi-Fi 7, Bluetooth 5.4 |
Storage | 4 TB NVMe M.2 with self-encryption |
Peripheral connectivity | 4x USB Type-C, 1x HDMI 2.1a, HDMI multichannel audio |
Power | 240W external power supply; GB10 SoC TDP: 140W; 100W available for other system components (ConnectX-7, Wi-Fi, SSD, USB-C, etc.) |
Meanwhile, the GB10 GPU has 6144 Blackwell CUDA cores, so it looks like the desktop RTX 5070 if you squint. But the similarities pretty much end there. Unlike the 672 GB/s of exclusive bandwidth afforded by the RTX 5070’s 12GB of GDDR7, the DGX Spark’s 128GB of LPDDR5X offers just 273 GB/s of raw memory bandwidth to the entire SoC.
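The 273 GB/s figure can be reproduced from the spec table with a quick back-of-the-envelope calculation. This sketch assumes the listed 4266 MHz is the memory clock and that LPDDR5X moves two transfers per clock (double data rate). The function name `peak_bandwidth_gb_s` is ours, and the result is a theoretical peak, not a measured number.

```python
def peak_bandwidth_gb_s(bus_bits, clock_mhz, transfers_per_clock=2):
    """Theoretical peak bandwidth in GB/s.

    Assumes a double-data-rate interface (two transfers per clock),
    which is our reading of the 4266 MHz figure in the spec table.
    """
    bytes_per_transfer = bus_bits / 8
    transfers_per_second = clock_mhz * 1e6 * transfers_per_clock
    return bytes_per_transfer * transfers_per_second / 1e9


spark_gb_s = peak_bandwidth_gb_s(bus_bits=256, clock_mhz=4266)
rtx_5070_gb_s = 672.0  # quoted in the text, not derived here

print(f"DGX Spark: {spark_gb_s:.0f} GB/s")
print(f"RTX 5070 advantage: {rtx_5070_gb_s / spark_gb_s:.2f}x")
```

Under those assumptions, the RTX 5070 has roughly two and a half times the Spark's raw bandwidth, and it doesn't have to share that bandwidth with a CPU.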
The GB10 GPU also has much less L2 cache than the desktop RTX 5070, at 24 MB vs 48 MB for the desktop card, but that reduction is potentially balanced out by a 16MB L4 or side cache on GB10 that is meant to smooth out memory accesses across the chip.
Thanks to that relatively modest memory bandwidth figure, this system isn’t going to set any records for tokens generated per second on an LLM, but the world of AI applications thus far isn’t just about straight-line chatbot throughput.
At the frontiers of local AI, just being able to fit a model or group of models into VRAM is the difference between exploring their capabilities and getting nowhere at all. And content creation workflows like image and video generation are much more compute-intensive than they are bandwidth-hungry.
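A rough sketch shows how bandwidth and capacity pull in different directions. It rests on a common rule of thumb rather than anything measured in this review: at batch size 1, a dense model has to read all of its weights once per generated token, so bandwidth divided by model size gives a ceiling on tokens per second. The model sizes below are hypothetical examples, and the estimate ignores KV cache traffic, compute limits, and the memory the OS itself needs.

```python
def weights_gb(params_billion, bytes_per_param):
    """Approximate weight footprint in GB (ignores KV cache and activations)."""
    return params_billion * bytes_per_param


def decode_ceiling_tokens_per_s(bandwidth_gb_s, model_gb):
    """Upper bound on decode rate if every token reads all weights once."""
    return bandwidth_gb_s / model_gb


def fits(model_gb, capacity_gb):
    """Capacity check only; real headroom is smaller on a shared-memory system."""
    return model_gb <= capacity_gb


# (bandwidth GB/s, memory capacity GB), both taken from the article text.
SYSTEMS = {"DGX Spark": (273.0, 128.0), "RTX 5070": (672.0, 12.0)}
# Hypothetical model sizes, chosen only for illustration.
MODELS = {"8B @ 16-bit": weights_gb(8, 2.0), "70B @ 4-bit": weights_gb(70, 0.5)}

for system, (bandwidth, capacity) in SYSTEMS.items():
    for model, size_gb in MODELS.items():
        if fits(size_gb, capacity):
            rate = decode_ceiling_tokens_per_s(bandwidth, size_gb)
            print(f"{system}, {model}: at most {rate:.1f} tokens/s")
        else:
            print(f"{system}, {model}: does not fit in {capacity:.0f} GB")
```

The faster card wins on paper only for models it can hold. Neither example model fits in 12GB, while both fit on the Spark with room to spare.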
You’ll often hear Nvidia tout a 1 PFLOP raw compute figure for GB10, but as explained by the Blackwell architectural whitepaper, that claim only holds with NVFP4 workloads that can exploit sparsity, which is a pretty narrow use case. Take sparsity out of the picture, and peak NVFP4 compute is cut in half, to about 500 TFLOPS. As floating-point and tensor accumulate precision goes up from there, theoretical performance only goes down. Suffice it to say that the 1 PFLOP figure is mostly marketing.
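To put those peak numbers next to the 273 GB/s memory figure, here is a small sketch using the standard roofline model. That's our framing, not a calculation from Nvidia's whitepaper. The "ridge point" is the number of floating-point operations a kernel must perform per byte fetched from memory before compute, not bandwidth, becomes the limit. The function names are illustrative.

```python
SPARSE_NVFP4_TFLOPS = 1000.0                    # the "1 PFLOP" marketing figure
DENSE_NVFP4_TFLOPS = SPARSE_NVFP4_TFLOPS / 2    # sparsity removed, per the text
MEMORY_BANDWIDTH_GB_S = 273.0                   # raw LPDDR5X bandwidth


def ridge_point(peak_tflops, bandwidth_gb_s):
    """FLOPs per byte needed before a kernel becomes compute-bound."""
    return (peak_tflops * 1e12) / (bandwidth_gb_s * 1e9)


def attainable_tflops(flops_per_byte, peak_tflops, bandwidth_gb_s):
    """Roofline ceiling: the lesser of peak compute and bandwidth-fed compute."""
    bandwidth_limited = flops_per_byte * bandwidth_gb_s * 1e9 / 1e12
    return min(peak_tflops, bandwidth_limited)


dense_ridge = ridge_point(DENSE_NVFP4_TFLOPS, MEMORY_BANDWIDTH_GB_S)
print(f"Dense NVFP4 peak: {DENSE_NVFP4_TFLOPS:.0f} TFLOPS")
print(f"Ridge point: about {dense_ridge:.0f} FLOPs per byte")
```

A workload that does only a handful of operations per byte fetched will sit far below either peak figure, which is why the bandwidth number matters more than the PFLOP headline for many tasks.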
All told, if we look at compute, memory bandwidth, and memory capacity as a triangle, the DGX Spark compromises on memory bandwidth while delivering enough computing power and memory capacity to serve as a jack-of-all-trades AI development sandbox.
All this goes to emphasize that this SoC is targeted at AI workloads first and foremost. You can game on it with the right Linux tools, but that’s not its primary objective. Let's see how it performs for some common local AI workflows.
As the Senior Analyst, Graphics at Tom's Hardware, Jeff Kampman covers everything to do with GPUs, gaming performance, and more. From integrated graphics processors to discrete graphics cards to the hyperscale installations powering our AI future, if it's got a GPU in it, Jeff is on it.
Gururu UGLY. Should have wrapped it in snakeskin leather. I'm just going to say what everyone else is saying. Can you open it up and put a picture of the innards in the review?
Pierce2623 I noticed the headline mentions beating Strix Halo, but I’m not sure it’s much of an accomplishment for a $3000 mini pc to beat a $1500 mini pc. If it doesn’t nuke Strix Halo out of existence, then it’s pretty horrible value. Since it’s currently desktop only, it should be getting compared against ITX PCs of equivalent price. Personally, I’d be comparing it against a 9950x3d/5080 ITX system, since that’s equivalent pricing. Lastly it’s a $3000 portable with no USB4/Thunderbolt? That’s next level greedy.
bit_user I'm pleasantly surprised by the analysis on Page 2. I expected to have some notes, but I think that analysis hit all of the main points. Memory bandwidth is indeed its Achilles heel. It's awesome for what was rumored to have been primarily a laptop chip, but it's got nothing on its true datacenter cousins.
The article said: ... the onboard ConnectX 7 NIC running at up to 200 Gbps. That exotic NIC ...
Yeah, it's where a big chunk of the cost comes from. I've seen street prices for that card running around $1500.
The article said: The preinstalled DGX OS is a lightly Nvidia-flavored version of Ubuntu 24.04 LTS.
Based on Nvidia's prior Jetson platforms, you're really stuck with this as the OS, whether you like it or not. I haven't heard how long Nvidia plans to support it, either. It's a pretty safe bet they'll move to 26.04, but who knows if they'll release anything beyond that for it?
The article said: The Spark idles at about 35 W as a headless system
Yes, and I think the ConnectX 7 NIC is a big part of that. Sad to see you didn't compare against the Ryzen AI Max 395+, here. Elsewhere, I've seen idle power of the Framework Desktop w/ Ryzen AI Max 395+ measured at a mere 12.5W.
It really is just a happy accident that AMD created Strix Halo (Ryzen AI Max), when they did. It wasn't designed to do local LLM workloads, but rather an answer to Apple's M-series Pro. The fact that it can hang so close to GB10 is mostly a testament to just how memory-bottlenecked both are, since the GB10 has way more AI compute horsepower.
kealii123
Pierce2623 said: I noticed the headline mentions beating Strix Halo, but I’m not sure it’s much of an accomplishment for a $3000 mini pc to beat a $1500 mini pc. If it doesn’t nuke Strix Halo out of existence, then it’s pretty horrible value. Since it’s currently desktop only, it should be getting compared against ITX PCs of equivalent price. Personally, I’d be comparing it against a 9950x3d/5080 ITX system, since that’s equivalent pricing. Lastly it’s a $3000 portable with no USB4/Thunderbolt? That’s next level greedy.
I'd recommend reading the entire article/review.
kealii123 Now I'm super curious how this chip is going to perform in the rumored consumer-oriented laptop that's supposedly shipping soon.
kealii123 I wish the review included the mentioned comparable Mac Studio in the benchmarks section.