AMD Radeon RX 9060 XT 16GB review: plenty of performance with 16GB

Be wary of the 8GB models, which are a completely different ballgame.

AMD Radeon RX 9060 XT 16GB
Editor's Choice


Modern GPUs aren't just for gaming. They're used to offload tasks like video encoding from the CPU, for accelerating professional CAD/CAM and scientific applications, and they're particularly useful for AI. We've revamped our professional and AI test suite to give a more detailed look at the various GPUs. We'll start with the AI benchmarks, as those tend to be more important for a wider range of users.

Procyon has multiple AI tests, and we've run the AI Vision benchmark along with two different Stable Diffusion image generation tests. The tests have several variants available that are all determined to be roughly equivalent (in output) by UL: OpenVINO (Intel), TensorRT (Nvidia), and DirectML (mostly for AMD). There are also options for FP32, FP16, and INT8 data types on some of the tests, which can give different results. We tested the available options and used the best result for each GPU.
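The "best result per GPU" selection described above can be sketched in a few lines. This is only an illustration of the idea, not the actual test harness: the backend names come from the text, but the scores and the `best_result` helper are hypothetical.

```python
# Hypothetical Procyon scores for one GPU, keyed by (backend, data type).
# None of these numbers come from the review; None marks a run that failed.
results = {
    ("DirectML", "FP32"): 410.0,
    ("DirectML", "FP16"): 655.0,
    ("DirectML", "INT8"): None,
}


def best_result(results):
    """Return ((backend, dtype), score) for the highest score that completed."""
    valid = {key: score for key, score in results.items() if score is not None}
    if not valid:
        raise ValueError("no completed runs for this GPU")
    best_key = max(valid, key=valid.get)
    return best_key, valid[best_key]


if __name__ == "__main__":
    (backend, dtype), score = best_result(results)
    print(f"Best configuration: {backend} {dtype} with a score of {score}")
```

Failed or unsupported configurations are skipped rather than counted as zero, so a GPU is judged on its fastest working path.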

Nvidia usually clobbers AMD in the Procyon AI tests, but the RTX 5060 Ti 8GB has some issues at times, as does the RTX 5060. Both fail to run the Stable Diffusion XL workload at the expected rates, which is odd because the RTX 4060 and RTX 4060 Ti 8GB cards didn’t have problems. We’ve mentioned in the past that Nvidia still seems to have some driver kinks to work out with Blackwell RTX 50-series GPUs, and this is a great example of that.

In the AI Vision test, the fastest AMD cards still can’t even match the RTX 3060, never mind the RTX 5060 and 5060 Ti. Intel’s Arc B580 also does much better than anything from AMD, but AI Vision doesn’t work in integer mode via DirectML, which seriously hinders AMD’s performance. Stable Diffusion 1.5 represents a more typical generative AI workload, but one that doesn’t need lots of VRAM. Here the 9060 XT basically ties the RX 7800 XT but still falls behind the Arc B580 and the RTX 5060.

And then we get to Stable Diffusion XL. A lot of the cards show similar relative performance in SDXL as in SD1.5, but the RTX 5060 actually outperformed the RTX 5060 Ti 8GB in our testing, and we tried multiple times. Normally, the test only takes a few minutes to run, but on the 5060 Ti 8GB it took hours to complete. There’s some driver or application issue where it’s simply not working well on Nvidia’s RTX 50-series GPUs that have 8GB of VRAM. It’s like trying to run Stalker 2 at 4K ultra settings right now: The 5060 Ti 8GB collapses to basically nothing (the actual score is 44, but the second white “4” isn’t visible), while the 9060 XT chugs along happily with a score of 1069.

ML Commons' MLPerf Client 0.5 test suite does AI text generation in response to a variety of inputs. There are four different tests, all using the LLaMa 2 7B model, and the benchmark measures the time to first token (how fast a response starts appearing) and the tokens per second after the first token. These are combined using a geometric mean for the overall scores, which we report here.

While AMD, Intel, and Nvidia are all ML Commons partners and were involved with creating and validating the benchmark, it doesn't seem to be quite as vendor-agnostic as we would like. AMD and Nvidia GPUs only have a DirectML execution path, while Intel has both DirectML and OpenVINO as options. Intel's Arc GPUs score quite a bit higher with OpenVINO than with DirectML.

The 5060 Ti 8GB delivers a solid 21% uplift in tokens/sec compared to the RX 9060 XT 16GB, and the 8GB and 16GB cards show relatively similar performance this time. The time to first token is much better on Nvidia as well, requiring just 0.22 seconds compared to 0.57 seconds for the 9060 XT. That’s important for real-time conversations, but ultimately the average tokens per second rate is the more meaningful metric in our opinion.
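To put those figures in perspective, here is a small sketch of the arithmetic. The two time-to-first-token values are the ones quoted above; the tokens-per-second pair is hypothetical and chosen only to reproduce a 21% uplift, and the helper name is our own.

```python
def percent_uplift(new: float, baseline: float) -> float:
    """Percentage by which `new` exceeds `baseline` (for higher-is-better metrics)."""
    return (new / baseline - 1.0) * 100.0


# Time to first token in seconds, as quoted in the text (lower is better).
TTFT_5060_TI = 0.22
TTFT_9060_XT = 0.57

# How many times longer the 9060 XT takes to start responding.
ttft_ratio = TTFT_9060_XT / TTFT_5060_TI

# The same gap expressed as a percentage reduction in waiting time.
ttft_reduction = (1.0 - TTFT_5060_TI / TTFT_9060_XT) * 100.0

# Hypothetical throughput values, picked only to illustrate a 21% uplift.
amd_tokens_per_sec = 100.0
nvidia_tokens_per_sec = 121.0

if __name__ == "__main__":
    print(f"Time to first token: {ttft_ratio:.2f}x longer on the 9060 XT")
    print(f"Time to first token: {ttft_reduction:.1f}% shorter on the 5060 Ti")
    print(f"Throughput uplift: {percent_uplift(nvidia_tokens_per_sec, amd_tokens_per_sec):.0f}%")
```

The first-token gap (roughly 2.6x) is much larger than the throughput gap, but it is only paid once per response, which is one reason to weight tokens per second more heavily.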

AMD Radeon RX 9060 XT 16GB Content Creation, Professional Apps, and AI


We have additional SPECworkstation 4.0 results below, but the suite also includes an AI inference test composed of ResNet50 and SuperResolution workloads that runs on GPUs (and potentially NPUs). We calculate the geometric mean of the four results given in inferences per second, which isn't an official SPEC score, but it's more useful for our purposes.
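As a rough sketch of that calculation, the geometric mean of four inferences-per-second results can be computed with Python's standard library. The result labels and values below are hypothetical placeholders, not figures from our testing.

```python
from statistics import geometric_mean

# Hypothetical inferences-per-second results; the labels and values are
# placeholders and do not come from SPECworkstation or from this review.
inference_results = {
    "resnet50_run_a": 100.0,
    "resnet50_run_b": 400.0,
    "superres_run_a": 25.0,
    "superres_run_b": 100.0,
}


def overall_score(results):
    """Unofficial overall score: the geometric mean of the individual results."""
    return geometric_mean(list(results.values()))


if __name__ == "__main__":
    print(f"Overall (unofficial) score: {overall_score(inference_results):.1f}")
```

A geometric mean keeps a single very high or very low subtest from dominating the combined number, which matters when the workloads run at very different rates.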

The 5060 Ti 8GB delivers 25% higher performance than the 9060 XT 16GB in the SPEC WS4.0 GPU inference test, and this is another test that doesn't need 16GB (or 12GB). Both of the 5060 Ti cards have identical performance.

For our professional application tests, we'll start with Blender Benchmark 4.3.0, which has support for Nvidia OptiX, Intel oneAPI, and AMD HIP libraries. Those aren't necessarily equivalent in terms of the level of optimizations, but each represents the fastest way to run Blender on a particular GPU at present. We had to use a special preview build of Blender 4.4.0 on the RDNA 4 GPUs, however, as Blender Benchmark currently fails to run on them.

Blender really tends to like Nvidia GPUs, and that’s true with the 5060 Ti and 9060 XT. While gaming performance was often close, in Blender the RTX 5060 Ti ends up delivering about double the performance. This is another test that didn’t need more than 8GB of VRAM, incidentally. Having native OptiX support in Blender definitely helps Nvidia out.

SPECworkstation 4.0 has two other test suites that are of interest in terms of GPU performance. The first is the video transcoding test using HandBrake, a measure of the video engines on the different GPUs and something that can be useful for content creation work. We use the average of the 4K to 4K and 4K to 1080p scores. Note that this only evaluates speed of encoding, not image fidelity.

AMD and Nvidia both worked to improve their respective video engines with the latest architectures, but ultimately AMD’s 9060 XT delivers superior throughput. It’s 39% faster than Nvidia’s 5060 Ti, though we’d need to do a deeper dive on encoding quality before declaring a true winner here.

Our final professional app tests consist of SPECworkstation 4.0's viewport graphics suite. This is basically the same suite of tests as SPECviewperf 2020, only updated to the latest versions. (Also, Siemens NX isn't part of the suite.) There are seven individual application tests, and we've combined the scores from each into an unofficial overall score using a geometric mean.

AMD's drivers for its consumer cards tend to be more friendly toward these professional applications, and this gives the 9060 XT a big win over the 5060 Ti. Nvidia did deliver slightly higher performance in Maya and Solidworks, and the two GPUs are tied in 3ds Max, but other tests heavily favor AMD — the Medical test in particular gives the 9060 XT a 400% advantage. Overall, AMD’s GPU ends up 48% faster.
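A short sketch shows why one lopsided subtest doesn't dominate a geometric-mean overall score. Taken literally, a 400% advantage means 5x the performance, and a tie is 1.0x. Every other ratio below is a hypothetical placeholder, and the three application names not mentioned in the text (Catia, Creo, Energy) are assumed from SPECviewperf 2020. The output is not meant to reproduce the 48% figure.

```python
from statistics import fmean, geometric_mean


def advantage_to_ratio(percent_advantage: float) -> float:
    """Convert an 'X% faster' figure into a performance ratio (400% -> 5.0)."""
    return 1.0 + percent_advantage / 100.0


# 9060 XT performance relative to the 5060 Ti (1.0 means a tie).
# Only the 3ds Max tie and the Medical advantage follow the text;
# all other values are hypothetical placeholders.
relative_perf = {
    "3dsmax": 1.0,
    "catia": 1.3,       # hypothetical
    "creo": 1.3,        # hypothetical
    "energy": 1.5,      # hypothetical
    "maya": 0.95,       # hypothetical "slightly slower"
    "medical": advantage_to_ratio(400.0),
    "solidworks": 0.95, # hypothetical "slightly slower"
}

geo = geometric_mean(list(relative_perf.values()))
arith = fmean(relative_perf.values())

if __name__ == "__main__":
    print(f"Geometric mean ratio:  {geo:.2f}x")
    print(f"Arithmetic mean ratio: {arith:.2f}x")
```

With these placeholder inputs, the geometric mean lands well below the arithmetic mean because the single 5x result is damped rather than averaged in at full weight.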

These AI and professional tests are ultimately just one aspect of GPU performance, and if you only care about gaming they shouldn't exert much influence on your choice of GPU. That's especially true of the professional tests. AI could become something useful even for gaming, but higher Blender performance will only matter if you're actually using Blender for 3D modeling.

  • thestryker
    While I still feel like there should have only been a single 9060 XT the 16GB is definitely what passes for as a good deal price v perf despite the upsell pricing. Hopefully over the lifetime of the card MSRP will be hit.
  • JamesJones44
    Feels like if one is going to step up to a 16 GB model, the 5060 Ti looks like a better choice for $40 more. Otherwise one is just looking to save $90 by sticking with the 8 GB model.
  • Alvar "Miles" Udell
    AMD showing again why they don't care about gaining market share: they have a product that can compete with Nvidia, but they don't price it anywhere near what it would take to get people to buy it if they're already Nvidia users.
  • palladin9479
    I was hoping to see a 9060 XT 16 vs 8 GB charts the same as the 5060 Ti has a way to see where the cutoff is instead of the misinformation that gets spread. It's also entirely what the market cost is gonna be at.
  • 3ogdy
    AMD Radeon RX 9060 XT 16GB : plenty of money to pay for an x60 card at $400
  • tvargek
    when will TH add last gen xx60 class cards to their GPU ranking charts??
  • virgult
    Alvar Miles Udell said:
    AMD showing again why they don't care about gaining market share: they have a product that can compete with Nvidia, but they don't price it anywhere near what it would take to get people to buy it if they're already Nvidia users.
    That's because it cannot compete. It's a bit worse, for a bit more power, if you're a gamer. Non-gaming workloads run abysmally bad compared to Nvidia, due to AMD's neglect of HIP, ROCm, and any effort to make pro workloads run well.
    This is not a competitive product, that's why it should be priced way lower.
  • tvargek
    but don't forget 5060ti has lower performance on older MB's cause of narrow lanes and all those hoping to upgrade their older system with 5060 series should also buy new MB+CPU+MEM to gain full advantage of 5060ti
  • palladin9479
    tvargek said:
    but don't forget 5060ti has lower performance on older MB's cause of narrow lanes and all those hoping to upgrade their older system with 5060 series should also buy new MB+CPU+MEM to gain full advantage of 5060ti

    Ehh that really depends. PCI-E bandwidth, which is what you are talking about, is only involved when data gets transmitted from system RAM to GPU VRAM. When you have plenty of VRAM then you really don't need to worry about that; it only matters if you are in a VRAM constrained situation which requires graphics resources to be swapped in and out of system RAM across the PCIe bus. PCIe 4 x16 slot is 32GB/s one way, PCIe 5.0 x16 is 64GB/s one way. System memory is much faster and therefore not the bottleneck. Honestly if someone is in such a situation that they are swapping texture data across the PCIe bus, they are already having a bad experience and need to either turn down texture resolution or upgrade to a newer card.
  • GravtheGeek
    I had no problem getting the XFX model of the 16 gig for MSRP ($350) via newegg. Lots of $350 models out there. If you live near a Microcenter it's even better selection.

    One major thing to note about the powercolor reaper: it's only 200mm x 39mm for a 16 gb version. That makes it one of the best cards for some smaller SFF builds out there for the money.