Nvidia teams up with Microsoft to put neural shading into DirectX, giving devs access to AI tensor cores

Nvidia and Microsoft announced on Thursday that they would be adding neural shading support to the Microsoft DirectX preview this April. Neural shading will use cooperative vectors and Nvidia's Tensor cores (matrix operations units) to speed up graphics rendering in games that support the technology. It will also let developers combine traditional rendering techniques with AI enhancements in a generic way through HLSL (High-Level Shader Language).
While real-time computer graphics and graphics processing units (GPUs) have come a long way, the graphics rendering pipeline itself has evolved more slowly than the hardware. In particular, while Nvidia's GPUs have featured Tensor cores (primarily aimed at AI compute) for over seven years now, they have so far been used only for things like upscaling (Nvidia's DLSS), ray reconstruction (DLSS 3.5) and denoising, and frame generation (at least for DLSS 4).
This is going to change with so-called neural rendering, a broad term that describes a real-time graphics rendering pipeline enhanced with new methods and capabilities enabled by AI.
A specific subset of neural rendering focused on enhancing the shading process in graphics is called neural shading. Its main purpose is to improve the appearance of materials, lighting, shadows, and textures by integrating AI into the shading stage of the graphics pipeline. The addition of cooperative vectors — which let small neural networks run in different shader stages, like within a pixel shader, without monopolizing the GPU — is a key enabler for neural shading.
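To make "a small neural network inside a pixel shader" concrete, here is a minimal CPU-side sketch in Python. It is not HLSL and does not use the DirectX cooperative vector API; the function names (mat_vec, tiny_mlp, shade_pixel) and the weights in LAYERS are hypothetical, chosen only for illustration. It shows that evaluating such a network per pixel comes down to a short chain of matrix-vector multiplications, which is the kind of work matrix hardware is designed to accelerate.

```python
def mat_vec(weights, vec):
    """Multiply a row-major matrix (a list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def relu(vec):
    """Clamp negative activations to zero."""
    return [max(0.0, x) for x in vec]


def tiny_mlp(inputs, layers):
    """Evaluate a small fully connected network.

    Each layer is a (weights, biases) pair. ReLU is applied after every
    layer except the last one.
    """
    activations = inputs
    for index, (weights, biases) in enumerate(layers):
        product = mat_vec(weights, activations)
        activations = [p + b for p, b in zip(product, biases)]
        if index < len(layers) - 1:
            activations = relu(activations)
    return activations


# Hypothetical, hand-picked weights: 2 inputs (u, v) -> 3 hidden -> 1 output.
# A real neural shader would load trained weights instead.
LAYERS = [
    ([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [0.0, 0.0, -1.0]),
    ([[0.5, 0.5, -1.0]], [0.1]),
]


def shade_pixel(u, v):
    """Stand-in for a per-pixel shader invocation: one scalar output."""
    return tiny_mlp([u, v], LAYERS)[0]


if __name__ == "__main__":
    # Evaluate the network for every pixel of a tiny 4x4 image.
    size = 4
    for y in range(size):
        row = [shade_pixel(x / (size - 1), y / (size - 1)) for x in range(size)]
        print(" ".join(f"{value:.2f}" for value in row))
```

On a GPU, every pixel would run this same chain independently and in parallel, so the cost is dominated by the matrix-vector products.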
Cooperative vectors rely on matrix-vector multiplication, so they need specialized hardware, such as Nvidia's Tensor cores, to operate. They can also run on Intel's XMX hardware. Intel released a statement saying cooperative vector support will be provided on Arc A- and B-series dedicated GPUs as well as the built-in Arc GPUs found in Core Ultra Processors (Series 2); in other words, every Intel GPU that includes XMX support.
It seems as though cooperative vectors may also work on AMD's RDNA 4 AI accelerators, though RDNA 3 seems more doubtful (as it lacks AI compute throughput compared to the competition). Still, Microsoft is working with AMD, Intel, Nvidia, and Qualcomm to ensure cross-vendor support for cooperative vectors over time.
"Microsoft is adding cooperative vector support to DirectX and HLSL, starting with a preview this April," said Shawn Hargreaves, Direct3D development manager at Microsoft. "This will advance the future of graphics programming by enabling neural rendering across the gaming industry. Unlocking Tensor Cores on Nvidia RTX GPUs will allow developers to fully leverage RTX Neural Shaders for richer, more immersive experiences on Windows."
Nvidia's phrasing makes it sound as though the upcoming DirectX preview with cooperative vectors is an Nvidia exclusive. However, Intel will co-present a session on cooperative vectors with Microsoft, so the only unknown right now appears to be AMD GPU support. If driver support is available from AMD, it should work on its GPUs as well.
Ultimately, we'll need to wait to find out not only whether it works but how well it works, in terms of both image fidelity and performance. Differing levels of compute among GPUs will likely affect the end user experience.

Anton Shilov is a contributing writer at Tom’s Hardware. Over the past couple of decades, he has covered everything from CPUs and GPUs to supercomputers and from modern process technologies and latest fab tools to high-tech industry trends.
-
DS426 What about Vulkan? I'd like to see how quickly the open source community can advance this. -
rluker5 It sounds like it could lead to some really impressive performance gains vs standard rasterization in a select few games. And lead to some otherwise well performing cards being incapable of acceptable performance in these titles. Will probably be a great selling point for the latest generation of Nvidia cards.
But it could be a great performance and efficiency boost to any GPU with a decent amount of that AI type processing power, if implemented altruistically, which it won't be, at least at first. -
George³
wr3zzz said: Still waiting for DirectStorage in the real world.
I've even forgotten what it means. I fear we may never see a truly modern Windows, including one that allows high speed random reading of small files. -
AkroZ It seems the partnership between Microsoft and nVidia is due to last. Previously DirectX has been built with all actors; now it's only defined by nVidia, and AMD and Intel can only follow. -
bit_user
DS426 said: What about Vulkan? I'd like to see how quickly the open source community can advance this.
Vulkan already had cooperative vector support for a while, which is how people have been achieving good AI inferencing performance with it. I don't know how much more is involved in MS' new developments, but I think that's the main thing.
https://www.phoronix.com/news/Vulkan-1.3.300-Released
Interestingly, that article also mentions GLSL_NV_cooperative_matrix2, which implies you should even be able to do it from OpenGL. -
bit_user
-Fran- said: What could possibly go wrong for the Industry, right?
I'll admit that I was a little dismissive of neural texture compression as blue skies research, when news first broke about it, a couple years ago. However, when Nvidia started playing up the technology, 2 months ago (during CES), I took a deeper look at the technology and it sounds pretty solid. It is a significant compute vs. memory tradeoff, though. I'm not really sure where it makes the most sense to apply.
As for some of the other applications mentioned, they seem very plausible to me. Nvidia has demonstrated very effective AI denoising that enables realtime global illumination, which doesn't seem so different to me from how you might use AI to interpolate shadows from coarse shadow maps, for instance.
I don't really know how fully neural textures might compare to procedural ones, but we all know textures aren't uniformly great, to begin with. In some cases, they might be higher fidelity, while lower in others. -
bit_user
The article said: Cooperative vectors rely on matrix-vector multiplication, so they need specialized hardware, such as Nvidia's Tensor cores, to operate.
To understand why they need a special feature, you have to understand that the modern GPU programming model is to treat each SIMD lane as a scalar "thread". Because these matrix operations are intrinsically multi-lane, they break that model wide open and effectively force "cooperation" between those "threads". Underneath, there's nothing special going on. If GPUs' SIMD were exposed like the SSE/AVX-style vector instructions used to implement it, then this wouldn't be a big deal. However, that would make GPU shaders harder to program, which is why the SIMD is hidden from view.
The article said: To that end, they can potentially work on Intel's XMX hardware as long as they meet Microsoft's requirements. They may also work on AMD's RDNA 4 AI accelerators, though RDNA 3 seems more doubtful (as it lacks AI compute throughput compared to the competition).
RDNA 4 should be in good shape for this. RDNA 3 will probably do fine, in a similar sense of how it did with ray tracing. In some games that use these techniques, they might disproportionately impact RTX 2000 and RX 7000 GPUs, so you'd want to keep an eye on those 1-percentile FPS and turn it off, if they're dropping too low.