Nvidia's next-generation Ada GPU architecture will reportedly receive a massive boost in L2 cache sizes, as shared by @harukaze5719 and @xinoassassin1 on Twitter. The new leak alleges L2 sizes have increased on the new architecture by 16x compared to Nvidia's current GeForce RTX 30 series GPUs, and are on par with AMD's Radeon RX 6000 series GPUs with Infinity Cache technology.
The flagship Ada die, AD102, reportedly gets the largest cache of the lineup at 96MB (16MB per 64-bit memory controller), while the AD103 and AD104 dies could each receive 64MB. AD106 is said to have 48MB, and AD107 the least at 32MB. Even so, AD107's cache dwarfs that of GA102, the die behind Nvidia's most powerful GPUs today, which has just 6MB (512KB per 32-bit memory controller).
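The per-controller figures above can be cross-checked with a little arithmetic. The sketch below is only an illustration: the helper name `total_l2_mb` is hypothetical, and the 384-bit bus for AD102 is an assumption based on the leak's claim that bus widths carry over from Ampere.

```python
def total_l2_mb(bus_width_bits: int, controller_width_bits: int, slice_mb: float) -> float:
    """Total L2 in MB, assuming every memory controller gets an equal L2 slice."""
    controllers = bus_width_bits // controller_width_bits
    return controllers * slice_mb


# Rumored AD102: six 64-bit controllers at 16MB each (384-bit bus is assumed).
ad102_l2 = total_l2_mb(384, 64, 16.0)
# GA102: twelve 32-bit controllers at 512KB (0.5MB) each.
ga102_l2 = total_l2_mb(384, 32, 0.5)

print(f"AD102 (rumored): {ad102_l2:.0f}MB")
print(f"GA102: {ga102_l2:.0f}MB")
print(f"Increase: {ad102_l2 / ga102_l2:.0f}x")
```

Under those assumptions the totals come out to 96MB and 6MB, which matches the 16x increase the leak describes.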
So then, let's summarize about Lovelace and Hopper... @kopite7kimi @xinoassassin1 pic.twitter.com/hioRcvn8fb (March 2, 2022)
Ada will have huge L2 @VideoCardz pic.twitter.com/sjMGXttX0Y (March 2, 2022)
It's not hard to understand why Nvidia is making this unorthodox move with its Ada architecture. Previous reports indicated that Ada could have up to 71% more cores than Ampere. More cores need more memory bandwidth to keep each one constantly fed with data. Without a wider memory interface, a larger cache is a good way to get there, since every request served from cache is one that never has to cross the memory bus.
AMD was the first company to demonstrate the full potential of large internal caches on GPUs with its RDNA2-based Radeon RX 6000 series. With RDNA2, AMD introduced Infinity Cache, which adds a massive L3 cache to the GPU alongside the existing L0, L1, and L2 caches.
Compared to previous generations of Nvidia and AMD GPUs, these caches are much larger, with the top RDNA2 die possessing 128MB of Infinity Cache. Testing with GPUs such as the RX 6800 XT and RX 6900 XT found that a cache this large could largely offset the bottleneck of their relatively narrow 256-bit memory bus, which pales in comparison to the 320-bit bus on Nvidia's RTX 3080 and the 384-bit bus on the RTX 3090. Nvidia's RTX 3080 and 3090 also run faster GDDR6X memory, while AMD has stuck to regular GDDR6 ICs.
So despite the seemingly drastically underpowered memory configuration of the RX 6800 XT and RX 6900 XT, these GPUs were able to go toe-to-toe with Nvidia's counterparts thanks to the large pool of Infinity Cache.
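To put the gap in raw numbers, peak memory bandwidth is simply bus width times per-pin data rate, divided by eight to convert bits to bytes. The sketch below is illustrative only: the function name is hypothetical, and the data rates (16Gbps GDDR6, 19Gbps and 19.5Gbps GDDR6X) are taken from the cards' public specifications rather than from the leak.

```python
def peak_bandwidth_gbs(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Theoretical peak memory bandwidth in GB/s (bits per transfer * Gbps per pin / 8)."""
    return bus_width_bits * data_rate_gbps / 8


# (bus width in bits, per-pin data rate in Gbps); rates are assumptions from public specs.
cards = {
    "RX 6800 XT / RX 6900 XT (GDDR6)": (256, 16.0),
    "RTX 3080 (GDDR6X)": (320, 19.0),
    "RTX 3090 (GDDR6X)": (384, 19.5),
}

for name, (bus, rate) in cards.items():
    print(f"{name}: {peak_bandwidth_gbs(bus, rate):.0f} GB/s")
```

On paper that is 512 GB/s for the AMD cards against 760 GB/s and 936 GB/s for Nvidia's, which is the deficit Infinity Cache has to cover.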
Ada would mark the first time Nvidia has really taken a page out of AMD's playbook, meeting the demands of its higher core count with bigger caches instead of purely adding raw bandwidth through faster ICs and wider buses. Not only is Nvidia allegedly using large L2 caches, but it's also apparently keeping its bus widths the same as current-generation Ampere GPUs, according to the leak. This means Nvidia will rely more on internal caches to keep effective memory bandwidth high, just like AMD.
We recently saw the introduction of 21Gbps GDDR6X modules and new Samsung roadmaps for GDDR6+ and GDDR7. Nvidia may leverage those higher-clocked ICs to gain even more memory bandwidth. AMD, on the other hand, chose to stick with standard GDDR6 modules for the entirety of its RDNA2 lineup; even the 6900 XT LC uses 18.5Gbps GDDR6.
Recent reports indicate that RDNA3 dies could have as much as 256MB to 512MB of Infinity Cache, which would be roughly 2.7x and 5.3x as large as the 96MB of L2 currently projected for Ada's top die. However, as with most caching architectures, there's a point of diminishing returns, and even a 512MB L3 cache may not radically improve performance over a 128MB cache. Regardless, things are shaping up to be quite interesting in the GPU space later this year.
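The size comparison is easy to reproduce. This short sketch assumes the rumored 96MB AD102 figure as the baseline; all of the numbers are leaks, not confirmed specifications.

```python
ADA_L2_MB = 96  # rumored AD102 L2 size from the leak


def cache_ratio(infinity_cache_mb: int, baseline_mb: int = ADA_L2_MB) -> float:
    """How many times larger a rumored Infinity Cache is than the baseline L2."""
    return infinity_cache_mb / baseline_mb


for size_mb in (256, 512):  # rumored RDNA3 Infinity Cache sizes
    print(f"{size_mb}MB / {ADA_L2_MB}MB = {cache_ratio(size_mb):.2f}x")
```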