Microsoft to Bring DirectStorage API to Windows in 2021: Speeding Up Gaming With NVMe SSDs

(Image credit: Microsoft)

Microsoft this week said that it would bring a preview of its DirectStorage application programming interface (API), which powers the company’s Xbox Velocity Architecture, to Windows 10 developers in 2021. The API is designed to speed up game loading times and improve performance of games by eliminating storage API-related bottlenecks and reducing CPU involvement, but on a client PC it can do much more than that. Nvidia has also adopted the technology, branded Nvidia RTX IO, for its Ampere graphics cards.

Expensive I/O Requests 

Modern PC games use tens of gigabytes of storage, and to load them quickly one needs an SSD that supports the NVMe protocol and boasts a high sequential read speed. To further optimize performance by ensuring that all the necessary data like textures and sounds fits into memory (both system RAM and GPU RAM), contemporary game engines break the assets into blocks and load only those that are needed for the scene being rendered. These blocks may be rather small, but they are still larger than the 4 KB blocks used to rate the random input/output (I/O) performance of SSDs.

(Image credit: Microsoft)

According to Microsoft, the custom SSD used in the upcoming Xbox Series X console must service well over 35,000 64 KB I/O requests per second to hit its peak sequential read speed of 2.4 GB/s. The NVMe protocol and modern SSDs can handle multiple queues simultaneously, and each of them can hold many outstanding requests (the number of which is called queue depth). But raw performance of the drive is only a part of the equation.
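
As a rough check on that figure, the request rate follows from dividing the throughput by the request size. The sketch below assumes decimal gigabytes (10^9 bytes) and 64 KB requests of 65,536 bytes; Microsoft does not state which unit definitions it used, and the function name is purely illustrative.

```python
def requests_per_second(throughput_bytes_per_s: float, request_bytes: int) -> float:
    """Number of fixed-size I/O requests needed each second to sustain a throughput."""
    return throughput_bytes_per_s / request_bytes


# Assumed units: 2.4 GB/s = 2.4e9 bytes/s, 64 KB = 64 * 1024 bytes.
rate = requests_per_second(2.4e9, 64 * 1024)
print(f"{rate:,.0f} requests per second")
```

Under these assumptions the result is roughly 36,600 requests per second; with 64,000-byte requests it would be 37,500. Either way, the number is consistent with "well over 35,000."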

Existing storage APIs require the application to manage its I/O requests sequentially: submit the request, wait for it to complete, handle its completion, move on to another request. Older games that generated hundreds of requests (as they were designed primarily with hard drives in mind) did not produce significant overhead and therefore did not use too much CPU time. But with upcoming titles that generate tens of thousands of requests, that overhead becomes so substantial that it might prevent modern systems from taking full advantage of modern SSDs and/or leave no CPU horsepower for other tasks.
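
The one-request-at-a-time pattern described above can be sketched with ordinary file reads. This is only an illustration of the programming model, not of any Windows storage API: the temporary file stands in for a hypothetical game asset file, and all names are made up for the example.

```python
import os
import tempfile

BLOCK_SIZE = 64 * 1024  # 64 KB, the request size cited above


def read_blocks_sequentially(path: str, offsets: list[int]) -> tuple[list[bytes], int]:
    """Submit one request, wait for it, handle it, then move on to the next."""
    blocks = []
    completions_handled = 0
    with open(path, "rb") as f:
        for offset in offsets:
            f.seek(offset)             # submit the request
            data = f.read(BLOCK_SIZE)  # wait (block) until it completes
            blocks.append(data)        # handle its completion
            completions_handled += 1   # one round of bookkeeping per request
    return blocks, completions_handled


# Hypothetical asset file: 16 blocks of dummy data.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(os.urandom(16 * BLOCK_SIZE))
    asset_path = tmp.name

offsets = [i * BLOCK_SIZE for i in range(16)]
blocks, handled = read_blocks_sequentially(asset_path, offsets)
os.remove(asset_path)
print(f"{len(blocks)} blocks read, {handled} completions handled one at a time")
```

The per-request bookkeeping in the loop is trivial for 16 requests, but it is the part that scales linearly when a title issues tens of thousands of requests per second.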

In addition, current storage APIs introduce some ‘extra steps’ (such as data transformations) between an I/O request made by an application and its actual execution by the storage device, which further increases the overhead of the whole pipeline.

Finally, many game assets (e.g., textures) are stored compressed, and decompressing them adds further complications to the pipeline between the storage and the CPU/GPU.

DirectStorage Here to Help

Microsoft’s Xbox Velocity Architecture uses the DirectStorage API, which was designed specifically to keep higher-end NVMe SSDs busy without using too many CPU cycles and to reduce the number of ‘extra steps’ incurred today, speeding up the entire pipeline.

(Image credit: Samsung)

According to Microsoft, the DirectStorage API can do the following:

  • Reduce per-request NVMe overhead;
  • Submit large batches of I/O requests in parallel with little intervention by the OS to save CPU time;
  • Give applications finer-grained control over when they get notified of I/O request completion instead of having to react to every I/O completion.
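
The batching and single-notification ideas in the list above can be illustrated with a thread pool. This is a loose analogy under stated assumptions, not the DirectStorage API: Microsoft has not published its interface, a Python thread pool does not reduce OS or NVMe overhead, and every name below is hypothetical. The sketch shows only the shape of the model: queue many requests up front, then wait once for the whole batch.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor, wait

BLOCK_SIZE = 64 * 1024


def read_block(path: str, offset: int) -> bytes:
    """Read one block; each call opens its own handle so seeks do not collide."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(BLOCK_SIZE)


def read_batch(path: str, offsets: list[int], workers: int = 8) -> list[bytes]:
    """Queue every request up front, then wait once for the whole batch."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(read_block, path, off) for off in offsets]
        wait(futures)  # a single completion point for the batch
        return [f.result() for f in futures]  # results in submission order


# Hypothetical asset file: 16 blocks of dummy data.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    payload = os.urandom(16 * BLOCK_SIZE)
    tmp.write(payload)
    asset_path = tmp.name

offsets = [i * BLOCK_SIZE for i in range(16)]
blocks = read_batch(asset_path, offsets)
os.remove(asset_path)
print(len(blocks), b"".join(blocks) == payload)
```

The caller reacts to one event (the batch finishing) rather than to 16 individual completions, which is the kind of control over notifications that the last bullet describes.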

The DirectStorage API does not replace the NVMe protocol. What it is meant to do is to reduce CPU and protocol overheads, allow developers to specify their I/O procedures, and skip unnecessary extra steps for I/O requests. 

In fact, earlier this year Microsoft said that on Xbox Series X its DirectStorage API could reduce CPU overhead for tens of thousands of I/O operations ‘to a small fraction of a single core.’ What Microsoft did not say was how it achieved such an impressive result and whether this is the best-case scenario with huge I/O requests or something to expect typically.  

Previously, AMD (the developer of the Xbox Series X SoC) experimented with peer-to-peer messaging between its GPU and two NVMe SSDs in its Radeon Pro SSG graphics card, but it is unclear whether DirectStorage has anything to do with p2p messaging.

DirectStorage on PC: Gaming First

So far, Microsoft has only disclosed plans to bring its DirectStorage API to its Windows operating system as well as to provide it to game developers in 2021. This will essentially enable future PC games to take advantage of the same technologies as upcoming Xbox games will.

(Image credit: Corsair Components)

Premium first-person shooter titles designed to run at high resolutions and high framerates, whose performance is currently limited by a host of bottlenecks at different system levels, will benefit from Microsoft’s DirectStorage. Perhaps, by making data transfers from the SSD to the graphics card cheaper in performance terms and more controllable, the new API could reduce requirements for onboard VRAM (or at least slow down their growth).

For Microsoft, it is natural to enable DirectStorage on PC for games first because it has already perfected the technology on a gaming console. But it is important to note that there are non-gaming applications on PC that could take advantage of faster and more manageable storage performance. 

Hardware Support 

Adding DirectStorage to Windows means that Microsoft will have to ensure that there is hardware that supports the new API. The API itself is meant for NVMe SSDs and there are plenty of NVMe-compliant drives around. Meanwhile, the software giant does not say that all of them will support DirectStorage, but claims that it will be supported by ‘certain systems with NVMe drives’ that are ‘properly configured.’ 

(Image credit: Patriot)

Since Microsoft has yet to disclose all the peculiarities of its new API, it is unclear whether its support will mandate a particular subset of NVMe instructions (and therefore require particular drives with particular firmware) or whether certain things beyond the SSD are needed.

Nvidia has adopted the API for its Ampere graphics cards. Nvidia says its RTX IO feature can speed up I/O performance by up to 100X over standard hard drives and storage APIs. 

Summary

Microsoft’s DirectStorage API is said to lower CPU overhead for the tens of thousands of I/O operations that modern gaming systems perform, eliminate unnecessary data transformation steps, and give game developers finer control of storage. All of this is meant to shrink load times, allow GPUs to consume data from SSDs faster, and make virtual worlds richer.

(Image credit: Adata)

Bringing data closer to the processor is an industry trend, and Microsoft’s DirectStorage follows it. Servers can use hierarchical storage to maximize their performance and capabilities, combining NVDIMMs, high-end PCIe SSDs for frequently used data, and slower devices for cold storage. Game consoles and client PCs usually cannot accommodate such tiers, so making their default storage devices faster and more efficient is required to improve their capabilities.

Microsoft has yet to disclose all the peculiarities of its DirectStorage API and how it achieves its goals, but game developers will get the new interface in 2021, and that is when the secret sauce behind the Xbox Velocity Architecture will be revealed. What will be particularly interesting to see is whether DirectStorage can improve performance of applications beyond gaming.

Microsoft says that its DirectStorage API uses NVMe SSDs, but that does not mean that every Windows PC with an NVMe SSD will automatically support DirectStorage. The company is working with its hardware partners to finish designing the API and the components that will support it.

Since DirectStorage is already used for games aimed at the Xbox Series X, it is reasonable to assume that at least some cross-platform titles will also take advantage of the API on the PC starting in late 2021 (or, more likely, in 2022).

Source: Microsoft DirectX Developer Blog

Anton Shilov
Contributing Writer

Anton Shilov is a contributing writer at Tom’s Hardware. Over the past couple of decades, he has covered everything from CPUs and GPUs to supercomputers and from modern process technologies and latest fab tools to high-tech industry trends.

  • gggplaya
    And just like that PCIe 4.0 is a must have when building any new PC today.

    No more arguing about whether or not a GPU saturates the bus, or storage loading times are any faster.
  • sizzling
    gggplaya said:
    And just like that PCIe 4.0 is a must have when building any new PC today.

    No more arguing about whether or not a GPU saturates the bus, or storage loading times are any faster.
    That’s a big assumption
  • InvalidError
    "Existing storage APIs require the application to manage its I/O requests sequentially"

    Windows has had an async IO API for a long time, just too clunky to bother with unless you absolutely need to. I'm guessing most of what DirectStorage does is wrap it in a more convenient package, perhaps with added zero-copy capabilities.
  • spongiemaster
    gggplaya said:
    And just like that PCIe 4.0 is a must have when building any new PC today.

    No more arguing about whether or not a GPU saturates the bus, or storage loading times are any faster.
    Never use version 1.0 of anything from Microsoft. We're a few years away from this being a have to have feature.
  • AnimeMania
    This sounds like an Anti-Virus Security nightmare.
  • TerryLaze
    InvalidError said:
    "Existing storage APIs require the application to manage its I/O requests sequentially"

    Windows has had an async IO API for a long time, just too clunky to bother with unless you absolutely need to. I'm guessing most of what DirectStorage does is wrap it in a more convenient package, perhaps with added zero-copy capabilities.
    That's still sequential it's just on a second thread and doesn't have to wait to be called by the main thread.
    It still has to do all this :" submit the request, wait for it to complete, handle its completion, move on to another request. "
    The new API will probably just dump a whole bunch of memory addresses to the nvme controller and that will just copy them there with the firmware.
  • gggplaya
    I think that developers will take advantage of this API, since it's nearly the same or could be the same as what will be used on the Xbox Series X.

    We'll see larger open worlds and levels with more objects and detail. Textures and Objects can load in as you explore the level from room to room. Some games will take advantage of it, but not all. So for me, it's a must have, which I do have because I have an X570 and Ryzen 3900x. I just need to upgrade my NVMe drive from 3.0 to 4.0.
  • colson79
    It's about time. The next generation of consoles finally moving to SSD's is finally going to do something to help PC gamers. We have had fast SSD's in PC's for a long time but we were still stuck with crappy load times because most games are ports from consoles targeting poor spindle drives. Glad to see the change happening.
  • nofanneeded
    I dont get it ... I understand that such way of storing and retrieving data was being used internally by the software devs themselves to lower load times . that is using blocks that contain many smaller sized files and load them to memory then inside the memory unpack them instead of getting the small files directly from SSD or even HDD ...

    This is old programming method ... well known already. how is MS making anything different here ?
  • TerryLaze
    nofanneeded said:
    This is old programming method ... well known already. how is intel making anything different here ?
    Microsoft not intel.
    And if you could do this with one block until now you can do it with a lot of blocks now at the same time and with very little cpu power ,at least that's how it sounds.