AMD's FidelityFX Super Resolution Is Just 7% Slower in FP32 Mode vs FP16
FSR was designed to run on the widest range of GPUs possible
If you don't own one of the latest and greatest graphics cards, don't fret: AMD's FidelityFX Super Resolution (FSR) can still improve your frame rates and run nearly as efficiently as it does on modern hardware. According to CapFrameX on Twitter, running FidelityFX Super Resolution in FP32 "backward compatibility mode" costs just 7% in performance compared to FP16.
CapFrameX tested the two modes using SciFiHelmet, an RX 6800 XT, 4K resolution, and the FSR Ultra Quality preset. The RX 6800 XT supports both FP16 and FP32, and FSR's code can be manually changed to run in either mode, which allows a direct comparison on the same card.
On the RX 6800 XT specifically, FP32 mode is just 7% slower than FP16. That is a small gap and great news for users of older graphics cards that don't support faster FP16 code. Keep in mind, though, that this test covers only the RX 6800 XT and isn't necessarily representative of other GPU architectures, so results could vary with other graphics cards.
FP32 is known as single-precision floating-point, while FP16 is half-precision floating-point. FP32 has been the standard format for GPU operations for many years, but certain operations don't benefit from the added precision and can run faster in FP16 mode, assuming the GPU supports fast FP16 modes. Basically, FP32 offers a wider numeric range and more precision than FP16, which is useful for more complicated workloads. However, each FP32 value takes twice the storage of an FP16 value, and therefore twice the memory bandwidth, and the extra precision isn't necessary in some workloads.
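To make the range and storage difference concrete, here is a minimal sketch that round-trips a few numbers through both formats using Python's standard `struct` module. The helper names (`round_trip`, `to_fp16`, `to_fp32`) and the sample values are illustrative assumptions; none of this is taken from FSR's code or from CapFrameX's test.

```python
import struct


def round_trip(value: float, fmt: str) -> float:
    """Store a value in the given struct format, then read it back."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


def to_fp16(value: float) -> float:
    # "<e" is IEEE 754 half precision (2 bytes per value)
    return round_trip(value, "<e")


def to_fp32(value: float) -> float:
    # "<f" is IEEE 754 single precision (4 bytes per value)
    return round_trip(value, "<f")


if __name__ == "__main__":
    print("bytes per value:", struct.calcsize("<e"), "vs", struct.calcsize("<f"))

    # Illustrative sample values: a fraction, an integer just past the point
    # where FP16 can no longer represent every integer, and the largest
    # finite FP16 value.
    for x in (0.1, 2049.0, 65504.0):
        print(x, "-> FP16:", to_fp16(x), "| FP32:", to_fp32(x))

    # Values beyond the FP16 range cannot be stored at all.
    try:
        to_fp16(70000.0)
    except OverflowError:
        print("70000.0 does not fit in FP16")
```

An FP16 value occupies two bytes against four for FP32, which is where the halved memory traffic comes from. The price is that FP16 tops out at 65504 and loses exact integers well before that, so it only suits workloads that tolerate the coarser precision.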
This is where 'fast' FP16 comes into play. Many modern GPUs, including AMD's Vega, RDNA, and RDNA2 architectures, can perform twice as many FP16 calculations as FP32 calculations on standard GPU cores. Nvidia's Turing architecture also supports fast FP16 operations, but interestingly, the Ampere GPUs only run FP16 code at the same rate as FP32 code. Intel's Gen11 and Xe architectures also support double-speed FP16 operations.
FP16 adoption is still relatively new, which is why you don't see native FP16 capabilities in older graphics hardware. On such hardware, FP16 performance might match FP32 performance at best, and sometimes it runs at a fraction of the FP32 rate, much as most consumer GPUs have relatively limited double-precision FP64 support.
FSR natively supports both FP32 and FP16 so that the upscaling and image enhancement tech works on the widest possible range of graphics cards. Newer architectures may perform a bit better in FP16 mode, but the 7% performance boost relative to FP32 pales in comparison to the 50% or more boost in frame rates that FSR can provide by upscaling from lower resolutions.
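The comparison between the two percentages can be sketched as a quick back-of-the-envelope calculation. The 60 fps native baseline is a hypothetical number chosen for illustration, and the sketch assumes the 7% figure applies to the overall frame rate, which the test description does not spell out. Only the 7% and 50% figures come from the article.

```python
def fps_after_gain(base_fps: float, gain: float) -> float:
    """Frame rate after a fractional speedup (0.50 means 50% faster)."""
    return base_fps * (1.0 + gain)


def fps_after_cost(fps: float, cost: float) -> float:
    """Frame rate after a fractional slowdown (0.07 means 7% slower)."""
    return fps * (1.0 - cost)


BASE_FPS = 60.0        # hypothetical native-resolution frame rate
UPSCALING_GAIN = 0.50  # "50% or more" boost from FSR upscaling
FP32_COST = 0.07       # FP32 mode measured 7% slower than FP16

fsr_fp16 = fps_after_gain(BASE_FPS, UPSCALING_GAIN)
fsr_fp32 = fps_after_cost(fsr_fp16, FP32_COST)
net_gain_fp32 = fsr_fp32 / BASE_FPS - 1.0

if __name__ == "__main__":
    print(f"native:       {BASE_FPS:.1f} fps")
    print(f"FSR in FP16:  {fsr_fp16:.1f} fps")
    print(f"FSR in FP32:  {fsr_fp32:.1f} fps")
    print(f"net gain over native in FP32 mode: {net_gain_fp32:.1%}")
```

Under these assumptions, FSR in FP32 mode still lands at about 83.7 fps against 90 fps in FP16 mode, a net gain of roughly 39.5% over the 60 fps baseline. The fallback path gives up only a small slice of the upscaling benefit.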
Aaron Klotz is a contributing writer for Tom’s Hardware, covering news related to computer hardware such as CPUs and graphics cards.