AMD swoops in to help as John Carmack slams Nvidia's $4,000 DGX Spark, says it doesn't hit performance claims, overheats, and maxes out at 100W power draw — developer forums inundated with crashing and shutdown reports

A DGX Spark developer workstation
(Image credit: NVIDIA)

Nvidia’s DGX Spark, the company’s new $4,000 mini PC platform powered by the Grace Blackwell GB10 superchip, is under fire after John Carmack, the former CTO of Oculus VR, began raising questions about its real-world performance and power draw. His comments were enough to draw offers of support from Framework and even AMD, including a Strix Halo-powered alternative for him to try.

In a post on X, Carmack said that the DGX Spark appears to max out at around 100 watts of power draw, less than half of its 240-watt rating. While Nvidia advertises one petaflop of sparse FP4 compute, Carmack estimates the dense equivalent should be closer to 125 teraflops, and says he’s getting far less than that in practice. He also flagged “spontaneous rebooting on a long run,” asking whether the system had been “de-rated before launch.”

Similarly, independent testing by ServeTheHome found that a retail Spark unit pulled just under 200 watts under combined CPU+GPU load, and couldn’t hit the full 240W ceiling in any workload they ran.

Drawn in by the claims, Framework dropped by Carmack's thread to offer an AMD Strix Halo-powered box for him to try instead, and AMD's Anush Elangovan, the company's Vice President of AI Software and the public face of its CUDA-challenging ROCm software, even joined the pile-on, adding "Will be on standby for anything to support your exploration on Strix Halo."

Carmack’s post has kicked off a broader re-examination of what Nvidia actually promised. The petaflop figure is listed across Nvidia’s product pages as FP4 “with sparsity,” which implies 2:4 structured sparsity, a technique that can double effective throughput but only applies to certain matrix operations. When evaluated in denser formats like FP8 or BF16, the theoretical ceiling drops sharply. Nvidia’s specs list 273GB/s of memory bandwidth and 128GB of unified LPDDR5X shared between the 20-core Arm-based Grace CPU and the Blackwell GPU, making Spark a capacity-focused system with nowhere near the bandwidth of an HBM-equipped GPU.
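
To put numbers on that drop, here’s a back-of-envelope sketch (illustrative arithmetic, not official Nvidia figures), assuming 2:4 sparsity doubles the headline rate and each step up in precision from FP4 to FP8 to BF16 roughly halves peak throughput on the same tensor hardware:

```python
# Back-of-envelope: how the "1 petaflop" headline shrinks in denser formats.
# Assumptions (not official Nvidia figures): 2:4 structured sparsity doubles the
# quoted rate, and each wider format (FP4 -> FP8 -> BF16) roughly halves peak
# throughput on the same tensor hardware.

SPARSE_FP4_TFLOPS = 1000            # Nvidia's advertised figure: FP4 "with sparsity"

dense_fp4 = SPARSE_FP4_TFLOPS / 2   # remove the 2x sparsity multiplier
dense_fp8 = dense_fp4 / 2           # FP8 is twice as wide as FP4
dense_bf16 = dense_fp8 / 2          # BF16 is twice as wide as FP8

print(f"Dense FP4:  ~{dense_fp4:.0f} TFLOPS")
print(f"Dense FP8:  ~{dense_fp8:.0f} TFLOPS")
print(f"Dense BF16: ~{dense_bf16:.0f} TFLOPS")  # in the ballpark of the ~125 TFLOPS Carmack cites
```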

Spark is meant to host large models in memory rather than race through tokens per second. Nvidia’s marketing even suggests it can run 200-billion-parameter models locally, a feat few discrete setups can manage, thanks largely to that 128GB of unified memory. But the growing number of users citing reboot issues and apparent power ceilings suggests Nvidia’s tight thermal and power envelope within a 150mm chassis may be starting to bite, especially since many buyers would happily have accepted a larger footprint in exchange for better performance and adequate cooling.
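
As a rough sanity check on that capacity claim (again illustrative arithmetic, not an Nvidia figure), a 200-billion-parameter model quantized to 4-bit weights needs on the order of 100GB for the weights alone, which fits within Spark’s 128GB of unified memory with room left for the KV cache and runtime:

```python
# Rough sanity check (illustrative arithmetic, not an Nvidia figure): weight
# footprint of a 200B-parameter model at 4-bit precision vs. Spark's 128GB
# of unified memory.

params = 200e9          # 200-billion-parameter model
bytes_per_param = 0.5   # 4-bit (FP4/INT4) weights = half a byte per parameter

weights_gb = params * bytes_per_param / 1e9
headroom_gb = 128 - weights_gb

print(f"Weights: ~{weights_gb:.0f} GB of 128 GB unified memory")
print(f"Leaves:  ~{headroom_gb:.0f} GB for KV cache, activations, and the OS")
```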

What’s causing the shortfall, whether a firmware-level power cap or thermal throttling, isn’t yet clear, and Nvidia hasn’t commented publicly on Carmack’s post or the user-reported instability. Meanwhile, several threads on Nvidia’s developer forums now include reports of GPU crashes and unexpected shutdowns under sustained load.

It’s still very early days for DGX Spark, but with expectations for GB10 sky-high among users, Nvidia will need to explain why its flagship developer kit might be leaving so much performance potential on the table.

Luke James
Contributor

Luke James is a freelance writer and journalist. Although his background is in law, he has a personal interest in all things tech, especially hardware and microelectronics, and anything regulatory.

  • SonoraTechnical
    Hmmm, you say it's good for 240W Jensen?

    https://cdn.mos.cms.futurecdn.net/VX7xAX8fYVnX2X5GqPDFdW-650-80.jpg.webp
  • S58_is_the_goat
    I'm sure when they said 240w there was an asterisk and at the bottom it said 240w peak and 100w sustained.
  • Notton
    One could say it's disgraced
    I'll see myself out
  • vanadiel007
    Let's see AI come up with a solution to this problem.
  • usertests
    vanadiel007 said:
    Let's see AI come up with a solution to this problem.
    The more you buy, the more TOPS you have.
  • shady28
    TBH from what I have seen the Ryzen 9 AI+ 395, when paired with a healthy dose of high performance memory, matches the DGX at roughly 1/2 the cost. The main problem for the Ryzen is software / API compatibility. While that problem is significant, it illustrates how overpriced the DGX is.
  • SkyBill40
    S58_is_the_goat said:
    I'm sure when they said 240w there was an asterisk and at the bottom it said 240w peak and 100w sustained.
    Yeah... the fine, fine print.
  • bit_user
    S58_is_the_goat said:
    I'm sure when they said 240w there was an asterisk and at the bottom it said 240w peak and 100w sustained.
    The power figure is something people are latching onto as a potential cause, but the real issues are whether it can achieve the sustained compute rates that Nvidia claimed and:
    The article said:
    several threads on Nvidia’s developer forums now include reports of GPU crashes and unexpected shutdowns under sustained load.

    vanadiel007 said:
    Let's see AI come up with a solution to this problem.
    AI would probably design a custom waterblock for the Spark, to be used with an external radiator + pump.
  • bit_user
    shady28 said:
    TBH from what I have seen the Ryzen 9 AI+ 395, when paired with a healthy dose of high performance memory,
    Nvidia said Spark was good for 1000 TOPS, which is a lot more than 126 TOPS that AMD quoted for combined performance of Strix Halo's CPU + GPU + NPU cores. TBH, I'm not sure if AMD's figure includes sparsity, but I think not. So, it's probably more like a quarter of Nvidia's figure than an 8th.

    Of course, that's on paper. What's achievable in the real world, on real models is another matter. And if one of the machines isn't even stable, then it doesn't really matter how fast it is.
  • shady28
    bit_user said:
    Nvidia said Spark was good for 1000 TOPS, which is a lot more than 126 TOPS that AMD quoted for combined performance of Strix Halo's CPU + GPU + NPU cores. TBH, I'm not sure if AMD's figure includes sparsity, but I think not. So, it's probably more like a quarter of Nvidia's figure than an 8th.

    Of course, that's on paper. What's achievable in the real world, on real models is another matter. And if one of the machines isn't even stable, then it doesn't really matter how fast it is.

    Perhaps in some areas that is worth something, but for what I have been looking into those metrics are pretty worthless.

    Specifically, this is for using local LLM AI for development. This can be somewhat of a big deal, since such AI costs $$

    Apple is actually the best, but the cost of entry is also quite expensive, and the software support isn't there either for a serious developer.

    Another big advantage of the Spark is the ability to daisy chain them together via its built-in 200 Gbit interface.

    But just the hardware, yeah, it's not really there. Except for that daisy chaining. For the cost of a decked out Mac Studio, you could buy two Sparks. At that point, the performance tables would probably turn.

    https://imgur.com/a/SOHozLp

    https://www.youtube.com/watch?v=82SyOtc9flA&t=5s