Testing the RTX 5090 in games that support DLSS 4 — either natively or via the Nvidia App override method — certainly muddies the waters. Especially when you factor in Multi Frame Generation. It's a topic that deserves additional attention, as this has been a key element in Nvidia's performance claims for the entire Blackwell family of GPUs. First, let's provide some background information via Nvidia — a bit heavy on the marketing speak, as expected.
Bryan Catanzaro is Nvidia's VP of Applied Deep Learning Research, the group that has been continually working to improve features like DLSS, Broadcast, and more. The original DLSS 1.x used spatial upscaling, and the quality and performance were, in a word, lacking. It was restrictive and there was a clear loss in image quality. But that was 2018, and things have changed a lot in the past six years.
DLSS 2.x brought temporal upscaling into the algorithm, yielding far more flexibility, better performance, and higher image quality. As long as a game isn't CPU limited, DLSS upscaling will allow it to render fewer pixels at a lower resolution and then upscale the result to a higher resolution. There are typically four modes: Quality (2.25X upscaling), Balanced (3X upscaling), Performance (4X), and Ultra Performance (9X) — with the latter originally billed as a way to make 8K gaming viable. DLSS 2 and later have been incorporated into over 500 games now, so it's easily the most widely used upscaling algorithm. Not coincidentally, it's also the best looking in our testing.
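To make those upscaling factors concrete, here's a rough sketch of how a factor maps to an internal render resolution at 4K. It assumes the listed factors describe total pixel count, so each axis scales by the square root of the factor. The `render_resolution` helper and its rounding are our own illustration, and the actual DLSS presets may use slightly different per-axis ratios (Balanced in particular is only approximately 3X).

```python
import math

# Pixel-count upscaling factors as listed above (Balanced is approximate).
DLSS_MODES = {
    "Quality": 2.25,
    "Balanced": 3.0,
    "Performance": 4.0,
    "Ultra Performance": 9.0,
}

def render_resolution(out_w, out_h, factor):
    """Estimate the internal render resolution for a pixel-count upscaling factor."""
    axis_scale = math.sqrt(factor)  # the factor covers both axes, so take the root
    return round(out_w / axis_scale), round(out_h / axis_scale)

for mode, factor in DLSS_MODES.items():
    w, h = render_resolution(3840, 2160, factor)
    print(f"{mode}: renders at roughly {w}x{h} for 4K output")
```

Under these assumptions, Quality mode at 4K renders at 2560x1440, Performance at 1920x1080, and Ultra Performance at 1280x720.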
But what do you do when a game is CPU limited? Flight Simulator, for example, sees effectively zero performance benefit from upscaling on a card like the RTX 4090 at resolutions of 1440p and lower. And with the 5090, that extends to 4K ultra as well. The solution, with the RTX 40-series GPUs and DLSS 3, was frame generation: a new Optical Flow Accelerator helps calculate the optical flow between two frames, which can then be used to interpolate a high quality intermediate frame.
The catch is that the game now has to delay showing the user the latest rendered frame while it's busy with frame generation. DLSS 3 framegen could add up to two whole frames' worth of latency, which is why Nvidia requires all DLSS 3 games to implement Reflex. Reflex optimizes the rendering workflow to sample user input later in the pipeline, resulting in lower latency. The result is that framegen with Reflex generally can get pretty close to matching the latency of non-framegen, non-Reflex rendering, but non-framegen with Reflex is still more responsive.
DLSS 3.5 was a separate branch, which dealt with ray reconstruction. In short, it's an AI-trained denoising algorithm that often does a better job than hand-built and custom denoising algorithms. It's not perfect, and it's only useful in ray tracing games, but it runs on any RTX card.
Now, with Blackwell, Nvidia has introduced DLSS 4. There are two major updates, one of which applies to all RTX GPUs, while the other requires (as far as we know) an RTX 50-series card.
DLSS Transformers is a retraining of the upscaling algorithm using a transformer network rather than a CNN (convolutional neural network). It requires 4X more compute, but seems to run nearly as fast as the old DLSS while providing higher image fidelity. (Note that running DLSS Transformers on older RTX 30- and 20-series GPUs may show a far larger hit to performance.) Relatedly, there's a transformer model for ray reconstruction as well.
The other new feature with DLSS 4 is Multi Frame Generation (MFG). As the name implies, it generates multiple frames rather than interpolating just one. It can run in three modes: 2X (generate a single frame), 3X (generate two frames), and 4X (generate three frames). Otherwise, in principle it's a lot like the original framegen in that it takes two input images plus motion vectors and depth buffers and works to generate high quality intermediate frames.
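As a quick illustration of what the three modes mean numerically, here's a minimal sketch. It's idealized: the `mfg_breakdown` helper is our own, and it assumes frame generation costs nothing, whereas in practice the rendered framerate typically drops somewhat when MFG is enabled.

```python
from fractions import Fraction

def mfg_breakdown(rendered_fps, mode):
    """Idealized Multi Frame Generation math for the 2X, 3X, or 4X mode.

    Assumes generating frames is free, which is optimistic.
    """
    if mode not in (2, 3, 4):
        raise ValueError("mode must be 2, 3, or 4")
    generated_per_rendered = mode - 1            # generated frames per rendered frame
    displayed_fps = rendered_fps * mode          # each rendered frame yields `mode` displayed frames
    generated_share = Fraction(mode - 1, mode)   # fraction of displayed frames that are generated
    return generated_per_rendered, displayed_fps, generated_share

for mode in (2, 3, 4):
    gen, shown, share = mfg_breakdown(30, mode)
    print(f"{mode}X: {gen} generated per rendered frame, "
          f"{shown} FPS displayed, {share} of frames generated")
```

With a hypothetical 30 FPS base, 4X mode would display 120 FPS, but three out of every four of those frames are generated rather than rendered.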
In practice, there are some big changes compared to DLSS 3 on the RTX 40-series. MFG runs off the Blackwell tensor cores. We don't know if that means it can leverage the FP4 number format, but that would make sense. The OFA in the 40-series was a fixed function unit, and Nvidia eventually reached the point where it could do better framegen via a software algorithm running a new AI model than with the OFA. The new model also runs faster than the OFA variant, though that might simply be a case of having more compute available.
Could Nvidia also make MFG work on non-Blackwell GPUs? Logic would suggest yes, though the full answer is a lot more nuanced. Whether it can make it run acceptably fast on something like an RTX 2070 is a different matter. We don't have any official statement yet on whether MFG will ever be enabled for the RTX 40-series and earlier, so until it's announced we'd classify this as more of a "maybe" than something that we'll actually see.
There are over 75 games and applications that are currently DLSS 4 enabled, either with native support or via Nvidia App overrides. You can also use the new DLSS Transformers model in place of the previous DLSS CNN (convolutional neural network) models, trading a slight performance hit for higher image quality. How much higher? That's more difficult to assess and something we haven't had time to really dig into.
But MFG is a bit easier to deal with. It's natively supported in the latest Cyberpunk 2077 patch (2.21), and we ran some tests. We'll start by reporting on the RTX 5090 experience, but the results should apply in various ways to other RTX 50-series GPUs. We focused on 4K using RT-Overdrive (path tracing) for these tests.
First, here's the performance chart. Note that we've used the new Transformer DLSS/DLAA models in all cases, except for the "DLSS Q CNN" result that uses the old CNN model. At least with the RTX 5090, the difference in performance between CNN and transformers ends up being negligible, but that's probably because we're running a test with full ray tracing. In rasterization games, or with less demanding RT settings, we expect the transformers model to run a few percent slower (again, at least on the 5090).
We haven't done the testing ourselves, but it sounds like the performance hit of the transformers model is relatively minor on the RTX 40-series and becomes much more significant on the 30-series and 20-series cards. Still, even DLSS Ultra Performance mode at 4K looked far better than we'd expect from 9X upscaling.
If you only look at the given performance numbers, the RTX 5090 with MFG looks quite impressive. You can run DLAA at 4K, with ray reconstruction, and still get 110 FPS! Except, that's only part of the story.
On the labels for each line, we've listed the input latency, as reported by FrameView. Note that here, the MFG4X result has 21ms more latency than the standard DLAA result. What's more, compared to any of the DLSS results, the latency is particularly bad — about double what you get with Quality mode upscaling.
Now, before going further, please note that we tested with DLAA for a reason. We specifically wanted a lower base framerate, because this is what people looking at the mid-tier cards like an RTX 5070 are likely to get. If you start with DLSS Quality mode, MFG doesn't feel nearly so sluggish — and you can end up with about 200 FPS.
While the framerates are what you'll see on your monitor (with an appropriately high refresh rate display), what you'll feel is a different matter. All the non-MFG results are basically what we're used to feeling. With MFG, the added latency makes things feel mushy.
Subjectively, in this one particular game, I'd say that all the MFG results feel more like a game running at 35~45 FPS. It's playable, in other words, but also a bit sluggish. Note that user input gets sampled at 1/2, 1/3, or 1/4 the listed framerates for MFG, so the base DLAA result samples input every 32.3 ms. Compare that with sampling every 35.0 ms with MFG2X, 35.5 ms with MFG3X, and 36.3 ms with MFG4X.
In my experience with this testing, Cyberpunk would look like it was running at 57, 85, or 110 FPS — and admittedly, anything above 100 really starts to be less noticeable — but it basically felt like a game running at closer to 40 FPS. Which is better than the actual 28.6, 28.2, and 27.6 FPS before MFG kicks in, but not as good as what you'd get if these were actually rendered frames with new user input.
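The input sampling arithmetic is easy to check. Here's a minimal sketch, assuming input is sampled once per rendered frame; the `input_interval_ms` helper is our own, and because it works from the rounded framerates quoted in the text, the results can differ from the stated intervals by a tenth of a millisecond or so.

```python
def input_interval_ms(displayed_fps, mfg_factor=1):
    """Time between input samples, assuming one sample per rendered frame."""
    rendered_fps = displayed_fps / mfg_factor
    return 1000.0 / rendered_fps

# Rendered (pre-MFG) framerates quoted in the text.
rendered = {"MFG2X": 28.6, "MFG3X": 28.2, "MFG4X": 27.6}
for label, fps in rendered.items():
    print(f"{label}: input sampled about every {input_interval_ms(fps):.1f} ms")

# The same idea starting from a displayed framerate and the MFG multiplier.
print(f"110 FPS at 4X: about every {input_interval_ms(110, 4):.1f} ms")
```

The takeaway matches the subjective impression: no matter how high the displayed framerate climbs, the game only reacts to input roughly every 35 ms in these tests.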
Contrast the experience of MFG with the higher DLSS upscaling factors and it's a different story. Quality mode feels noticeably more responsive, Balanced and Performance modes feel better still, and Ultra Performance was great.
Again, you can absolutely run Cyberpunk 2077 with DLSS Quality mode, or even DLSS Performance mode, and then enable MFG4X to get a completely different experience. But there's a limit to how fast we need games to go, at least for something like Cyberpunk 2077. Competitive shooters might benefit from native 200 FPS performance, and MFG could then push that up to more than any currently existing monitor can display. That's not really a meaningful comparison in my book.
It's the experience of taking modest performance (30~40 FPS) and then using framegen or MFG to get a better result that matters. Because rest assured, we'll see RTX 5060 numbers from Nvidia at some point showing MFG4X delivering 120 FPS, give or take. And at that level of performance? It's fine, better than native 30 FPS perhaps... but definitely not the same as a non-framegen 120 FPS result.
Jarred Walton is a senior editor at Tom's Hardware focusing on everything GPU. He has been working as a tech journalist since 2004, writing for AnandTech, Maximum PC, and PC Gamer. From the first S3 Virge '3D decelerators' to today's GPUs, Jarred keeps up with all the latest graphics trends and is the one to ask about game performance.
Crazyy8
Quick look, raster performance seems a bit underwhelming. Weird that the RTX 5090 can be slower than the 4090(in niche cases). Wasn't going to buy the 5090 anyway, too expensive for a plebe like me. Looking forward to DLSS 4 and how amazing(or not)it'll be.

Gururu
Amazing but expected as far as I am concerned. I don't think there is a lot of need to compare against anything including AMDs new cards. It was brilliant to pair with the 13900 with crazy interesting results. Will read a few more times to glean more details. (y)
Admin said: We also tested the RTX 5090 on our old 13900K test bed, with some at times interesting results. Some games still seem to run faster on Raptor Lake, though overall the 9800X3D delivers higher performance. The margins are of course quite small at 4K ultra.
For me, as a 13900K owner, that's a consolation :cool:

Gaidax
Okay, that IS a sick cooler that actually manages to do the job.
I bet aftermarket 4 slot monstrocities will do better, but for 2 slots 600w that's insane.

m3city
Products like this should receive 3stars max. Great performance but at what cost? Is it the right direction that power draw increases at each iteration? Is it worth to chase max perf each time? For me it would be perfect if 5000 series stayed at same TDP as previous ones - meaning better design, better gpu - with understandably lower increase of perf compared to 4000. And then, 6000 series to have even reversed direction: higher perf with drop of TDP.
And secondary, how come 500W gpu can be air cooled, but nerds on forums will claim you absolutely NEED water cooling for 125W ryzen, cause "highend"?. Yeah, i know 125W means more actual draw.

redgarl
Okay Jarred... you are shilling at this point.
4.5 / 5 for a 2000$ GPU that barely get 27% more performances?
While consuming 100W more than a 4090?
And offering the same cost per frame value as a 4090 from 2 years ago?
Flagship or not, this is horrible.
Not to mention the worst uplift from an Nvidia GPU ever achieved... 27%...
https://i.ibb.co/4fks6Gt/reality.jpg

redgarl
Did you bench into an open bench or a PC case? I am asking because there is some major concerns of overheating because the CPU coolers is choked by the 575W of heat dissipation inside a closed PC Case. If you have an HSF for your CPU, then you are screwed.
https://uploads.disquscdn.com/images/7a5bf4d586b20ffe0aa6281c57d419012a32cbdabd43b3e8050d2aa9a00d6cc1.png

oofdragon
20% better at 4K and 15% better at 2K, all that having 30% more cores and etc............... got it. Oh boy the 5080 and 5070 are sure going to disappoint a lot of people.
The good news is the RX 9070 will bring 4070 Ti Super performance to the table around $500 including the VRAM, ray tracing and dlss image quality. AMD will prolly counter the multi frame gen nonsense with something like the LSFG 3.0 is doing and smart buyers will finally have a good GPU to replace their 3080 or 6700 XT.

vanadiel007
They should have given it code name sasquatch, because that's the chance you will be seeing these sell for $2,000 in the coming months.
More like $3,000 and a lot of luck needed to find one in stock.
I pass on it.

YSCCC
a real space heater inside the case, and extremely expensive with not great raw performance increase... sounds like built for those with more money than brain or logic..