Nvidia's new tech reduces VRAM usage by up to 96% in beta demo — RTX Neural Texture Compression looks impressive

Compusemble First Look At RTX Neural Texture Compression (BETA)
(Image credit: Compusemble - YouTube)

Nvidia's RTX Neural Texture Compression (NTC) has finally been benchmarked, demonstrating the technology's capabilities in an actual 3D workload. Compusemble benchmarked Nvidia's new texture compression tech on an RTX 4090 at 1440p and 4K, revealing a whopping 96% reduction in texture memory size with NTC compared to conventional texture compression techniques. You can see the results in the video below.

First Look At RTX Neural Texture Compression (BETA) On RTX 4090 - YouTube

Compusemble tested NTC in two modes: "NTC transcoded to BCn" and "Inference on Sample." The former transcodes textures to BCn on load, while the latter only decompresses the individual texels needed to render a specific view, further reducing texture memory size.
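
As a rough mental model, the difference between the two modes comes down to where decompression happens. The sketch below is purely illustrative Python, not the actual NTC SDK API; the function names and signatures are hypothetical.

    # Illustrative sketch only: these names are hypothetical, not Nvidia's NTC SDK API.
    def load_transcoded_to_bcn(ntc_blob, transcode_to_bcn):
        # "NTC transcoded to BCn": the whole texture is decompressed to BCn at
        # load time, so VRAM holds a full BCn copy for the renderer to sample.
        return transcode_to_bcn(ntc_blob)          # VRAM cost: full BCn texture

    def sample_with_inference(ntc_blob, infer_texel, u, v):
        # "Inference on sample": only the compressed NTC data stays resident,
        # and a small neural decode runs per texel lookup at render time.
        return infer_texel(ntc_blob, u, v)         # VRAM cost: compressed data only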

At 1440p with DLSS upscaling enabled, the "NTC transcoded to BCn" mode reduced the test application's texture memory footprint by 64%, from 272MB to 98MB. The "NTC inference on sample" mode cut the textures down much further, to just 11.37MB. That represents a 95.8% reduction in memory utilization compared to non-neural compression, and an 88% reduction compared to the transcode-to-BCn mode.
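
Those percentages follow directly from the reported sizes; a quick sanity check in Python using the figures above:

    # Texture memory sizes reported at 1440p in Compusemble's run
    baseline_mb  = 272      # conventional (non-neural) texture compression
    transcode_mb = 98       # NTC transcoded to BCn
    inference_mb = 11.37    # NTC inference on sample

    print(f"Transcode vs. baseline:  {1 - transcode_mb / baseline_mb:.1%}")   # ~64.0%
    print(f"Inference vs. baseline:  {1 - inference_mb / baseline_mb:.1%}")   # ~95.8%
    print(f"Inference vs. transcode: {1 - inference_mb / transcode_mb:.1%}")  # ~88.4%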

Compusemble's benchmarks revealed that performance takes a minor hit when RTX Neural Texture Compression is enabled. However, the benchmarker ran this beta software on the prior-gen RTX 4090, not the current-gen RTX 5090, so these performance losses could shrink on the newer architecture.

The "NTC transcoded to BCn" mode showed a negligible reduction in average FPS compared to NTC off, though its 1% lows were noticeably better than with NTC disabled. The "NTC inference on sample" mode took the biggest hit, with the average dropping from the mid-1,600 FPS range to the mid-1,500 FPS range, while 1% lows fell significantly, into the 840 FPS range.

The texture memory reduction is the same at 1440p with TAA anti-aliasing instead of DLSS upscaling, but the GPU's performance behavior differs. All three configurations ran significantly faster than with DLSS enabled, at almost 2,000 FPS, and 1% lows in the "NTC inference on sample" mode landed in the mid-1,300 FPS range, a big leap from the 840 FPS seen with DLSS.

Unsurprisingly, upping the resolution to 4K drops performance significantly. With DLSS upscaling enabled, the average sits in the 1,100 FPS range in "NTC transcoded to BCn" mode and just under 1,000 FPS in "NTC inference on sample" mode, with 1% lows for both modes in the 500 FPS range. Switching to native resolution with TAA anti-aliasing lifts the average into the 1,700 FPS range for "NTC transcoded to BCn" and the 1,500 FPS range for "NTC inference on sample." 1% lows were just under 1,100 FPS for the former and just under 800 FPS for the latter.

Finally, Compusemble tested cooperative vectors with the "NTC inference on sample" mode at 4K resolution with TAA. With cooperative vectors enabled, the average frame rate landed in the 1,500 FPS range; with them disabled, it plummeted to just under 650 FPS. 1% lows followed the same pattern, sitting just under 750 FPS with cooperative vectors on versus just above 400 FPS with them off.

Takeaways

Compusemble's RTX NTC benchmarks reveal that Nvidia's neural compression technology can dramatically reduce a 3D application's texture memory footprint, but at the cost of some performance, especially in the "inference on sample" mode.

The DLSS versus native resolution results are the most interesting aspect. The significant frame rate increase at native resolution suggests the tensor cores processing RTX NTC are being taxed very hard, probably to the point where DLSS upscaling is hindered enough to potentially bottleneck the shader cores. If that weren't the case, the DLSS runs should post higher frame rates than the native 4K TAA benchmarks.
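
Converting the reported 4K averages into frame times makes the anomaly explicit. Using rough mid-range figures from the runs above (only ranges are given, so these are approximations), DLSS upscaling adds roughly a third of a millisecond per frame relative to native TAA in this test, the opposite of what upscaling normally does:

    # Approximate 4K averages reported above (ranges, treated as ballpark values)
    def frame_ms(fps):
        return 1000 / fps

    transcode_dlss, transcode_native = 1100, 1700   # "NTC transcoded to BCn"
    inference_dlss, inference_native = 1000, 1500   # "NTC inference on sample"

    print(f"Transcode mode: DLSS costs ~{frame_ms(transcode_dlss) - frame_ms(transcode_native):.2f} ms/frame vs. native")
    print(f"Inference mode: DLSS costs ~{frame_ms(inference_dlss) - frame_ms(inference_native):.2f} ms/frame vs. native")

That extra time per frame is consistent with the tensor cores being contended between NTC decompression and DLSS.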

RTX Neural Texture Compression has been in development for at least a few years. The technology uses the tensor cores in modern Nvidia GPUs to compress textures for 3D applications and video games, rather than relying on traditional block compression. RTX NTC represents the first massive upgrade in texture compression technology since the 1990s, allowing for up to four times higher-resolution textures than GPUs are capable of running today.

The technology is in beta, and there's no release date. Interestingly, the minimum requirements for NTC appear to be surprisingly low. Nvidia's GitHub page for RTX NTC confirms that the minimum GPU requirement is an RTX 20-series card, but the tech has also been validated to work on GTX 10-series GPUs, AMD Radeon RX 6000-series GPUs, and Intel Arc A-series GPUs, suggesting we could see the technology go mainstream on non-RTX GPUs and even consoles.

Aaron Klotz
Contributing Writer

Aaron Klotz is a contributing writer for Tom’s Hardware, covering news related to computer hardware such as CPUs and graphics cards.

  • -Fran-
    Hm... This feels like we've gone full circle.

    Originally, if memory serves me right, the whole point of textures was to lessen the load of the processing power via adding these "wrappers" to the wireframes, because you can always make a mesh and add the colouring information in it (via shader code or whatever) and then let the GPU "paint" it according to what you want it to; you can do that today even, but hardly something anyone wants to do. Textures were a shortcut from all that extra overhead on the processing, but it looks like nVidia wants to go back to the original concept in graphics via some weird "acceleration"?

    Oh well, let's see where this all goes to.

    Regards.
  • oofdragon
    Hey I had just now this awesome great idea that they didn't realize... what about just giving players more friggin VRAM in the GPUs???????
  • oofdragon
    Nvidia be like: mp3 music, lossless flac music, it's music just the same! 🤓
  • salgado18
    So Nvidia is creating a tech that lets them build an RTX 5090 with 4GB of VRAM.

    They also created a tech that makes DLSS useless.

    And I'm really curious to see how it would run on an RTX 4060, or even lower. And on non-RTX cards.

    That drop in 1% lows performance is a bit troubling.

    All that said, I really like the tech. A 96% reduction in memory usage with a 10-20% performance hit is actually very good. Meshes, objects and effects will be traded for extremely high resolution textures, or even painted textures instead of tiled images.

    Too many questions remain, but I like this.
  • Conor Stewart
    -Fran- said:
    Hm... This feels like we've gone full circle.

    Originally, if memory serves me right, the whole point of textures was to lessen the load of the processing power via adding these "wrappers" to the wireframes, because you can always make a mesh and add the colouring information in it (via shader code or whatever) and then let the GPU "paint" it according to what you want it to; you can do that today even, but hardly something anyone wants to do. Textures were a shortcut from all that extra overhead on the processing, but it looks like nVidia wants to go back to the original concept in graphics via some weird "acceleration"?

    Oh well, let's see where this all goes to.

    Regards.
    Not really, it is just compressing the textures in a different and more efficient way, which does mean they will need to be decompressed when they are used.
  • ingtar33
    these textures will be just as "lossless" as mp3s were "lossless" for sound?

    somehow i doubt it. since mp3s weren't lossless either.
  • rluker5
    I like how the performance impact was measured with a ridiculous framerate.
    Everything a GPU does takes time. Apparently neural decompressing takes less time than the frametime you would have at 800 something fps. That isn't very much time and I imagine my low vram 3080 could decompress a lot of textures without having a noticeable impact at 60 fps.
  • Mcnoobler
    No matter what Nvidia does, it is always the same. Frame generation? Fake frames. Lossless scaling? AFMF? Soo amazing and game changing.

    I think it's become apparent that Nvidia owners spend most of their time gaming, and AMD owners spend more of their time all about sharper text and comment fluidity for their social media negativity. No, a new AMD GPU will not load the comment section faster.
  • bit_user
    I'm a little unclear on whether there's any sort of overhead not accounted for, in their stats. Like whether there's any sort of model that's not directly represented as the compressed texture, itself. In their paper, they say they don't use a "pre-trained global encoder", which I take to mean that each compressed texture includes the model you use to decompress it.

    However, what I can say is fishy...

    oofdragon said:
    Nvidia be like: mp3 music, lossless flac music, it's music just the same! 🤓
    Yeah, the benchmark is weird in that it seems to compare against uncompressed textures. AFAIK, that's not what games actually do.

    Games typically use texture compression with something like 3 to 5 bits per pixel. Compared to a baseline 24-bit uncompressed, that's a ratio of 5x to 8x.
    https://developer.nvidia.com/astc-texture-compression-for-game-assets
    So, this benchmark is really over-hyping the technology. Then again, this is Nvidia and they love to tout big numbers. What else is new?

    Edit: I noticed what Nvidia is telling developers about it:
    "Use AI to compress textures with up to 8x VRAM improvement at similar visual fidelity to traditional block compression at runtime."

    Source: https://developer.nvidia.com/blog/get-started-with-neural-rendering-using-nvidia-rtx-kit/
  • bit_user
    rluker5 said:
    I like how the performance impact was measured with a ridiculous framerate.
    Everything a GPU does takes time. Apparently neural decompressing takes less time than the frametime you would have at 800 something fps. That isn't very much time
    Did you happen to notice how small the object is, on screen? The cost of all texture lookups will be proportional to the amount of textured pixels drawn per frame. A better test would've been a full frame scene with lots of objects, some transparency, and edge AA.