GeForce RTX 4090 Leaves Plenty of Room for a Future RTX 4090 Ti Flagship
AD102 has more cores, L2 cache, and ROPs to offer
Nvidia's GeForce RTX 4090 might look incredibly strong, and will certainly rank as the fastest option on our list of the best graphics cards when it debuts (at least until AMD's RDNA 3 GPUs arrive), but the shaved down AD102 die in the RTX 4090 isn't close to showing off the full potential of AD102 with all of its cores and cache enabled. This combined with additional enhancements could hint at a future RTX 4090 Ti that will be much faster — and perhaps even more expensive.
We've covered the specs for the Nvidia RTX 40-series and Ada Lovelace GPUs, but those only show the announced and rumored cards. Nvidia's full AD102 die comes equipped with 144 SMs, 18,432 CUDA cores, 96MB of L2 cache, and 192 ROPs. This translates to 12.5% more CUDA cores and a whopping 33% more L2 cache capacity compared to the RTX 4090 we have today. The fully enabled AD102 die also packs 9% more ROPs and 12.5% more Texture Mapping Units, thanks to the additional SMs.
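For readers who want to check the math, here is a minimal sketch that derives those percentages from the unit counts quoted in this article. The dictionary and function names are our own, chosen for illustration only.

```python
# Unit counts quoted in the article: full AD102 vs. the cut-down AD102 in the RTX 4090.
FULL_AD102 = {"SMs": 144, "CUDA cores": 18432, "L2 cache (MB)": 96, "ROPs": 192, "TMUs": 576}
RTX_4090 = {"SMs": 128, "CUDA cores": 16384, "L2 cache (MB)": 72, "ROPs": 176, "TMUs": 512}


def percent_uplift(full: float, cut: float) -> float:
    """Return the percentage by which `full` exceeds `cut`."""
    return (full / cut - 1.0) * 100.0


# Uplift of the fully enabled die over the RTX 4090, per resource.
uplifts = {name: percent_uplift(FULL_AD102[name], RTX_4090[name]) for name in FULL_AD102}

for name, pct in uplifts.items():
    print(f"{name}: +{pct:.1f}%")
```

The SM, CUDA core, and TMU uplifts are identical because cores and texture units scale with the SM count.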
But that's not all that could be done for the future 4090 Ti. Micron has new 24Gbps GDDR6X memory modules in the works, another 14% boost over the RTX 4090's 21Gbps modules, and still faster than the RTX 4080 16GB's 22.4Gbps modules that Nvidia claims are the fastest in the world right now. That would push the hypothetical (but very likely) RTX 4090 Ti up to 1152 GB/s of bandwidth.
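The bandwidth figure follows from the usual peak-bandwidth estimate: per-pin data rate multiplied by bus width, divided by eight bits per byte. A rough sketch, assuming the 384-bit bus listed in the table below (the function name is ours, purely illustrative):

```python
def memory_bandwidth_gbs(data_rate_gbps: float, bus_width_bits: int) -> float:
    """Peak bandwidth in GB/s: per-pin rate (Gbps) x bus width (bits) / 8 bits per byte."""
    return data_rate_gbps * bus_width_bits / 8


BUS_WIDTH_BITS = 384  # same bus width on the RTX 4090 and the hypothetical RTX 4090 Ti

rtx_4090_bw = memory_bandwidth_gbs(21, BUS_WIDTH_BITS)         # 21Gbps GDDR6X
hypothetical_ti_bw = memory_bandwidth_gbs(24, BUS_WIDTH_BITS)  # 24Gbps GDDR6X (speculative)
bandwidth_uplift_pct = (hypothetical_ti_bw / rtx_4090_bw - 1) * 100

print(f"RTX 4090: {rtx_4090_bw:.0f} GB/s")
print(f"RTX 4090 Ti (hypothetical): {hypothetical_ti_bw:.0f} GB/s (+{bandwidth_uplift_pct:.1f}%)")
```

This is a theoretical peak; real-world throughput depends on the workload and on how much traffic the larger L2 cache absorbs.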
But faster memory would come with higher power consumption, and we suspect that Nvidia is seriously holding back AD102's full clock speed and power potential as well. All those rumors of 600W RTX 40-series graphics cards? We know Nvidia has successfully overclocked the RTX 4090 to more than 3.0GHz, and that would definitely push up power use.
It looks like the Ada architecture and TSMC's 4N process have plenty of headroom remaining beyond the RTX 4090's 2520 MHz boost frequency. Once the process matures a bit more, and if Nvidia is willing to increase the power limits, we wouldn't be surprised to see an RTX 4090 Ti clock at closer to 2800 MHz.
The theoretical performance of AD102 with all these bells and whistles enabled could reach a whopping 103 teraflops in FP32 workloads, 826 teraflops in FP16 workloads with the Tensor cores, and 1652 teraflops with the Tensor cores in FP8 mode. That would be a huge 25% jump in theoretical throughput compared to the RTX 4090.
These gains would only be realized in GPU limited scenarios, of course, so probably not 1080p or 1440p gaming. Heavy compute applications would also likely benefit. The combination of more L2 cache capacity, additional GDDR6X bandwidth, more cores, and higher clocks could result in tangible improvements.
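As a rough check, peak FP32 throughput is conventionally estimated as CUDA cores times two operations per clock (one fused multiply-add) times the boost clock. The sketch below applies that to the core counts and clocks quoted in this article. The 2800 MHz figure is our speculation, the helper name is illustrative, and the Tensor core figures are not modeled here.

```python
def fp32_tflops(cuda_cores: int, boost_mhz: float) -> float:
    """Peak FP32 TFLOPS, counting one fused multiply-add (2 FLOPs) per core per clock."""
    flops_per_second = cuda_cores * 2 * boost_mhz * 1e6
    return flops_per_second / 1e12


rtx_4090_tflops = fp32_tflops(16384, 2520)     # shipping RTX 4090 boost clock
full_ad102_tflops = fp32_tflops(18432, 2800)   # full die at a speculative 2800 MHz
fp32_uplift_pct = (full_ad102_tflops / rtx_4090_tflops - 1) * 100

print(f"RTX 4090: {rtx_4090_tflops:.1f} TFLOPS")
print(f"Full AD102 @ 2800 MHz: {full_ad102_tflops:.1f} TFLOPS (+{fp32_uplift_pct:.0f}%)")
```

The 25% figure is simply the 12.5% core increase compounded with the roughly 11% clock increase, so it represents an upper bound rather than an expected gaming uplift.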
Specification | RTX 4090 Ti (Full AD102) | RTX 4090 | RTX 3090 Ti |
Process | TSMC 4N | TSMC 4N | Samsung 8N |
Transistors | 76.3B | 76.3B | 28.3B |
SMs | 144 | 128 | 84 |
GPU Cores | 18432 | 16384 | 10752 |
Tensor Cores | 576 | 512 | 336 |
Ray Tracing Cores | 144 | 128 | 84 |
Boost Clock | 2800MHz? | 2520MHz | 1860MHz |
VRAM Speed | 24Gbps? | 21Gbps | 21Gbps |
VRAM | 24GB | 24GB | 24GB |
Bus Width | 384-bit | 384-bit | 384-bit |
Memory Bandwidth | 1152GB/s | 1008GB/s | 1008GB/s |
L2 Cache Capacity | 96MB | 72MB | 6MB |
ROPs | 192 | 176 | 112 |
TMU | 576 | 512 | 336 |
TFLOPS FP32 | 103.2 | 82.6 | 40 |
TFLOPS FP16 | 826 | 661 | N/A |
TDP | 600W? | 450W | 450W |
When Will We See an RTX 4090 Ti?
It appears Nvidia has a lot of performance headroom remaining with its AD102 die, with the potential to create an RTX 4090 Ti that could theoretically smoke the RTX 4090. It would certainly cost a lot more money, and consume way more power than an RTX 4090, but it can be done.
All of this will depend on how hard Nvidia wants to push its AD102 die, and that will almost certainly depend on how close AMD can come to matching Nvidia's performance with the upcoming RDNA 3 chips. Yields on fully functional AD102 GPUs would also play a role, though it's doubtful these would be high volume parts.
Nvidia could add some or all of these enhancements to an RTX 4090 Ti any time it feels the need. We didn't get the RTX 3090 Ti until 18 months after the RTX 3090 debut, but there were a lot of complicating factors in play. More likely, we'll see a 2023 refresh of the RTX 40-series sometime around nine to 12 months after the initial salvo.
There's also a slim chance Nvidia could skip the RTX 4090 Ti completely in favor of a new Titan variant, but we doubt that will be the case. Titan cards tend to cut into the lucrative RTX A-series professional card profits too much.
Aaron Klotz is a contributing writer for Tom’s Hardware, covering news related to computer hardware such as CPUs and graphics cards.
-
"When Will We See an RTX 3090 Ti?
It appears Nvidia has a lot of performance headroom remaining with its GA102 die, with the potential to create a RTX 4090 Ti that could theoretically smoke the RTX 4090. It would certainly cost a lot more money, and consume way more power than a RTX 4090, but it can be done.
All of this will depend on how hard Nvidia wants to push its GA102 die.."
Slight typo in the article, it seems, Aaron? You obviously meant to say the RTX 4090 Ti and the AD102 die featured in the flagship Ada GPU, instead of the GA102 die and the RTX 3090 Ti?
pgde Isn't the full AD102 called the RTX 6000 with 48GB of VRAM? See https://www.nvidia.com/en-us/design-visualization/rtx-6000/
TJ Hooker
pgde said: Isn't the full AD102 called the RTX 6000 with 48GB of VRAM? See https://www.nvidia.com/en-us/design-visualization/rtx-6000/
That's a pro/workstation card, whereas this article is talking about a potential gaming (GeForce) card. Last generation had both, the RTX A6000 and the GeForce RTX 3090 Ti.
Also, even the Ada RTX 6000 doesn't quite have a fully enabled AD102 die; it has 2 SMs disabled.
nostriluu So for all the applications that are memory and bus constrained, there's not much difference compared to a two-year-old 3090.
JDJJ The bigger question is what is happening with DisplayPort 2.0? Intel & AMD are expected to feature it this generation. Crazy that NVIDIA blew it off for 4090 & 4080. If the other two follow through as expected, DP 2.0 monitors and VR goggles can’t be far behind. (Spec has been ratified for over 3 years now). With the demands of high resolutions, refresh rates, & HDR, that’ll put NVIDIA in a tough spot, or 40-series card owners in a tough spot of owning powerful, expensive cards that can’t drive modern displays to their fullest. Compression over 1.4a can only do so much.
hannibal JDJJ said: The bigger question is what is happening with DisplayPort 2.0? Intel & AMD are expected to feature it this generation. Crazy that NVIDIA blew it off for 4090 & 4080. If the other two follow through as expected, DP 2.0 monitors and VR goggles can’t be far behind. (Spec has been ratified for over 3 years now). With the demands of high resolutions, refresh rates, & HDR, that’ll put NVIDIA in a tough spot, or 40-series card owners in a tough spot of owning powerful, expensive cards that can’t drive modern displays to their fullest. Compression over 1.4a can only do so much.
Nvidia will release 4000 Super cards next year that may have DP 2.0 so that people upgrade to that GPU from their 4090!
If you buy a 4090 now, buy a 4090 Super next year, and at the end of next year upgrade it to a 4090 Ti ;)
blacknemesist Plenty of room, huh? So what is that talk about BoM when we are not even getting offered the full potential of the chip?
NVidia has just gone mental since 2XXX because of their RT advantage, hope they don't get away with stuff like this.