The Nvidia GeForce RTX 4090 hype train has been building for most of 2022. After more than a year of extreme GPU prices and shortages, CEO Jensen Huang revealed key details at GTC 2022, with a price sure to make many cry out in despair. $1,599 for the top offering from Nvidia's Ada Lovelace architecture? Actually, that's only $100 more than the RTX 3090 at launch, and if the card can come anywhere near Nvidia's claims of 2x–4x the performance of an RTX 3090 Ti, there will undoubtedly be people willing to pay it. The RTX 4090 now sits atop the GPU benchmarks hierarchy throne, at least at 1440p and 4K. For anyone who's after the fastest possible GPU, never mind the price, it now ranks among the best graphics cards.
That's not to say the RTX 4090 represents a good value, though value can get a bit subjective. Looking purely at the FPS delivered per dollar spent, it ranks dead last out of 68 GPUs from the past decade. However, our standard ranking uses 1080p ultra performance, and the 4090 is decidedly not a card designed to excel at 1080p. In fact, it's so fast that CPU bottlenecks are still a concern even when gaming at 1440p ultra. Look at 4K performance and factor in ray tracing, and you could argue it's possibly one of the best values. See what we mean about value being subjective?
Again, you'll pay dearly for the privilege of owning an RTX 4090 card, as the base model RTX 4090 Founders Edition costs $1,599 and partner cards can push the price up to $1,999. But for those who want the best, or anyone with deep enough pockets that $2,000 isn't a huge deal, this is the card you'll want to get right now, and we'd be surprised to see anything surpass it in this generation, short of a future RTX 4090 Ti.
|Graphics Card|RTX 4090|RTX 3090 Ti|RTX 3090|RTX 3080 Ti|RX 6950 XT|Arc A770 16GB|
|---|---|---|---|---|---|---|
|Process Technology|TSMC 4N|Samsung 8N|Samsung 8N|Samsung 8N|TSMC N7|TSMC N6|
|Die size (mm^2)|608.4|628.4|628.4|628.4|519|406|
|SMs / CUs / Xe-Cores|128|84|82|80|80|32|
|Ray Tracing "Cores"|128|84|82|80|80|32|
|Boost Clock (MHz)|2520|1860|1695|1665|2310|2100|
|VRAM Speed (Gbps)|21|21|19.5|19|18|17.5|
|VRAM Bus Width (bits)|384|384|384|384|256|256|
|L2 / Infinity Cache (MB)|72|6|6|6|128|16|
|TFLOPS FP16 (FP8/INT8)|661 (1321)|160 (320)|142 (285)|136 (273)|47.4|138 (275)|
|Launch Date|Oct 2022|Mar 2022|Sep 2020|Jun 2021|May 2022|Oct 2022|
Here's a look at the who's who of the extreme performance graphics card world, with the fastest cards from Nvidia, AMD, and now Intel. Obviously, Intel's Arc A770 competes on a completely different playing field, but it's still interesting to show how it stacks up on paper.
For a full rundown of the new technologies and changes in the RTX 40-series, see our Nvidia Ada Lovelace architectural deep dive. The specs table above tells you a lot of what you need to know. Transistor counts have nearly tripled compared to Ampere; core counts on the RTX 4090 are 52% higher than on the RTX 3090 Ti; and GPU clock speeds are 35% higher. The GDDR6X memory? It's mostly unchanged, except there's now 12 times as much L2 cache (72MB versus 6MB) to keep the GPU from having to request data from memory as often.
On paper, that gives the RTX 4090 just over double the compute performance of the RTX 3090 Ti, and there are definitely workloads where you'll see exactly those sorts of gains. But under the hood, there are other changes that can further widen the gap.
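For anyone who wants to check the math, here's a minimal Python sketch of where "just over double" comes from. It uses only the SM counts, boost clocks, and L2 cache sizes from the specs table, and it assumes that theoretical shader throughput scales with SM count times boost clock (in other words, that each SM does the same amount of work per clock on both architectures). The variable names are ours, and real-world results will depend on the workload.

```python
# Figures from the specs table: RTX 4090 vs. RTX 3090 Ti.
SMS_4090, SMS_3090_TI = 128, 84           # streaming multiprocessors
BOOST_4090, BOOST_3090_TI = 2520, 1860    # boost clock, MHz
L2_4090, L2_3090_TI = 72, 6               # L2 cache, MB

# Simple generational ratios.
core_ratio = SMS_4090 / SMS_3090_TI
clock_ratio = BOOST_4090 / BOOST_3090_TI
l2_ratio = L2_4090 / L2_3090_TI

# Assumption: paper compute scales as (SM count) x (boost clock),
# with identical per-SM, per-clock throughput on both GPUs.
compute_ratio = core_ratio * clock_ratio

print(f"Cores:   {core_ratio:.2f}x")
print(f"Clocks:  {clock_ratio:.2f}x")
print(f"Compute: {compute_ratio:.2f}x")
print(f"L2:      {l2_ratio:.0f}x")
```

The core and clock ratios work out to about 1.52x and 1.35x, which multiply to roughly 2.06x on paper.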
Ray tracing once again gets a big emphasis, and three new technologies, Shader Execution Reordering (SER), Opacity Micro-Maps (OMM), and Displaced Micro-Meshes (DMM), all offer potential improvements. However, developers have to explicitly implement them, which means existing games and engines won't benefit.
Deep learning and AI workloads also stand to see massive generational improvements. Ada includes the FP8 Transformer Engine from Hopper H100, along with FP8 number format support. That means double the compute per Tensor core, for algorithms that can use FP8 instead of FP16, and up to four times the number-crunching prowess of the 3090 Ti.
One algorithm that can utilize the new Tensor cores — along with an improved Optical Flow Accelerator (OFA) — is DLSS 3. In fact, DLSS 3 requires an RTX 40-series graphics card, so earlier RTX cards won't benefit. What does DLSS 3 do? It takes the current and previously rendered frames and generates an extra in-between frame to fill the gap. In some cases, it can nearly double the performance of DLSS 2. We'll take a closer look at DLSS 3 later in this review.
From a professional perspective, particularly for anyone interested in deep learning, you can easily justify the cost of the RTX 4090: time is money, and doubling or quadrupling throughput will definitely save time. Content creators will find a lot to like too, and it's a quick and easy upgrade from a 3090 or 3090 Ti to the 4090. We'll look at professional visualization (ProViz) performance as well.
But what about gamers? Unlike with the RTX 3090 and 3090 Ti, Nvidia isn't going on about how the RTX 4090 is designed for professionals. Yes, it will work great for such people, but it's also part of the GeForce family, and Nvidia isn't holding back on its gaming performance claims and comparisons. Maybe the past two years of cryptocurrency mining are to blame, though GPU mining is now unprofitable, so at least gamers won't have to fight miners for cards this round.