Microsoft said this week that it would bring a preview of its DirectStorage application programming interface, which powers the company’s Xbox Velocity Architecture, to Windows 10 developers in 2021. The API is designed to speed up game loading times and improve game performance by eliminating storage API-related bottlenecks and reducing CPU involvement, but on a client PC it could do much more than that. Nvidia has also adopted the technology, branded Nvidia RTX IO, for its Ampere graphics cards.
Expensive I/O Requests
Modern PC games use tens of gigabytes of storage, and to load them quickly one needs an SSD that supports the NVMe protocol and boasts a high sequential read speed. To further optimize performance by ensuring that all the necessary data, such as textures and sounds, fits into memory (both system RAM and GPU RAM), contemporary game engines break the assets into blocks and load only those that are needed for the scene being rendered. These blocks may be rather small, but they are still larger than the 4 KB blocks used to rate the random input/output (I/O) performance of SSDs.
According to Microsoft, the custom SSD used in the upcoming Xbox Series X console has to service well over 35,000 64 KB I/O requests per second to hit its peak sequential read speed of 2.4 GB/s. The NVMe protocol and modern SSDs can handle multiple queues simultaneously, and each of them can contain many outstanding requests (the number of requests waiting in a queue is called queue depth). But the raw performance of the drive is only a part of the equation.
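The request-rate figure can be checked with simple arithmetic. The short Python sketch below divides the quoted 2.4 GB/s by the 64 KB request size. The function name is illustrative, and since it is not stated whether a kilobyte here means 1,000 or 1,024 bytes, both interpretations are computed.

```python
def requests_per_second(throughput_bytes_per_s: float, request_bytes: int) -> float:
    """Requests per second needed to sustain a throughput with fixed-size requests."""
    return throughput_bytes_per_s / request_bytes


PEAK_READ = 2.4e9  # 2.4 GB/s, the peak sequential read speed quoted in the text

# Assumption: the text does not say which kilobyte definition is used, so try both.
decimal_kb = requests_per_second(PEAK_READ, 64 * 1000)  # 64 KB = 64,000 bytes
binary_kb = requests_per_second(PEAK_READ, 64 * 1024)   # 64 KB = 65,536 bytes

print(f"{decimal_kb:,.0f} requests/s with decimal kilobytes")
print(f"{binary_kb:,.0f} requests/s with binary kilobytes")
```

Under either interpretation the result is above 36,000 requests per second, which is consistent with the 'well over 35,000' figure.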
Existing storage APIs require the application to manage its I/O requests one at a time: submit the request, wait for it to complete, handle its completion, then move on to another request. Older games that generated hundreds of requests (as they were designed primarily with hard drives in mind) did not produce significant overhead and therefore did not use too much CPU time. But with upcoming titles that generate tens of thousands of requests, that overhead becomes so substantial that it might prevent modern systems from taking full advantage of modern SSDs and/or leave no CPU horsepower for other tasks.
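A rough cost model shows why the number of requests matters so much. In the Python sketch below, the 20 microseconds of CPU time per request is a purely illustrative assumption, not a figure from Microsoft, and the function name is hypothetical. The only point is that per-request overhead scales linearly with the request rate.

```python
def cpu_cores_consumed(requests_per_second: float, cpu_us_per_request: float) -> float:
    """Fraction of one CPU core spent on per-request I/O bookkeeping."""
    return requests_per_second * cpu_us_per_request / 1_000_000


# Hypothetical per-request CPU cost, chosen only to illustrate the scaling.
ASSUMED_CPU_US_PER_REQUEST = 20.0

hdd_era = cpu_cores_consumed(300, ASSUMED_CPU_US_PER_REQUEST)      # hundreds of requests/s
nvme_era = cpu_cores_consumed(35_000, ASSUMED_CPU_US_PER_REQUEST)  # tens of thousands/s

print(f"Hundreds of requests/s:          {hdd_era:.1%} of one core")
print(f"Tens of thousands of requests/s: {nvme_era:.1%} of one core")
```

With the same assumed cost per request, a workload that was negligible at a few hundred requests per second occupies most of a core at 35,000 requests per second.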
In addition, current storage APIs impose some ‘extra steps’ (such as data transformations) between an I/O request made by an application and its actual execution by the storage device, which further increases the overhead of the whole pipeline.
Furthermore, many game assets (e.g., textures) are stored compressed, and their decompression adds further complications to the pipeline between the storage and the CPU/GPU.
DirectStorage Here to Help
Microsoft’s Xbox Velocity Architecture uses the DirectStorage API, which was designed specifically to keep higher-end NVMe SSDs busy without using too many CPU cycles and to reduce the number of ‘extra steps’ incurred today, speeding up the entire pipeline.
According to Microsoft, the DirectStorage API can do the following:
- Reduce per-request NVMe overhead;
- Submit large batches of I/O requests in parallel with little intervention by the OS to save CPU time;
- Give applications finer-grained control over when they get notified of I/O request completion instead of having to react to every I/O completion.
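The batching and notification ideas in the list above can be illustrated with an analogy in Python. This is not the DirectStorage API and does not reduce OS overhead in any way: it uses ordinary file reads and a thread pool, and every name in it (`read_block`, `submit_batch`, `demo`) is hypothetical. It only shows the shape of the pattern, in which an application hands over many requests at once and is notified once per batch rather than once per request.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

BLOCK_SIZE = 64 * 1024  # 64 KB, the request size discussed in the article


def read_block(path, offset, size):
    """Read one block; each call stands in for a single I/O request."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(size)


def submit_batch(pool, path, requests, on_batch_complete):
    """Issue all requests in parallel, then notify the caller a single time."""
    futures = [pool.submit(read_block, path, off, size) for off, size in requests]
    blocks = [f.result() for f in futures]  # wait for the whole batch
    on_batch_complete(blocks)               # one notification, not one per request
    return blocks


def demo():
    """Read eight blocks from a temporary file as one batch."""
    notifications = []
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(BLOCK_SIZE * 8))
        path = tmp.name
    try:
        requests = [(i * BLOCK_SIZE, BLOCK_SIZE) for i in range(8)]
        with ThreadPoolExecutor(max_workers=4) as pool:
            blocks = submit_batch(pool, path, requests, notifications.append)
    finally:
        os.remove(path)
    return blocks, notifications


if __name__ == "__main__":
    blocks, notifications = demo()
    print(len(blocks), "blocks read,", len(notifications), "notification(s)")
```

The design choice worth noting is that the completion callback receives the whole batch, so the application's own bookkeeping runs once for eight reads instead of eight times.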
The DirectStorage API does not replace the NVMe protocol. What it is meant to do is to reduce CPU and protocol overheads, allow developers to specify their I/O procedures, and skip unnecessary extra steps for I/O requests.
In fact, earlier this year Microsoft said that on Xbox Series X its DirectStorage API could reduce CPU overhead for tens of thousands of I/O operations ‘to a small fraction of a single core.’ What Microsoft did not say was how it achieved such an impressive result and whether this is the best-case scenario with huge I/O requests or something to expect typically.
Previously, AMD (the developer of the Xbox Series X SoC) experimented with peer-to-peer messaging between its GPU and two NVMe SSDs in its Radeon Pro SSG graphics card, but it is unclear whether DirectStorage has anything to do with p2p messaging.
DirectStorage on PC: Gaming First
So far, Microsoft has only disclosed plans to bring its DirectStorage API to its Windows operating system as well as to provide it to game developers in 2021. This will essentially enable future PC games to take advantage of the same technologies as upcoming Xbox games will.
Premium first-person shooter titles designed to run at high resolutions and high framerates, whose performance is currently limited by a host of bottlenecks at different system levels, will benefit from Microsoft’s DirectStorage. Perhaps, by making data transfers from the SSD to the graphics card cheaper in performance terms and more controllable, the new API could reduce requirements for onboard VRAM (or at least slow down their growth).
For Microsoft, it is natural to enable DirectStorage on PC for games first because it has already perfected the technology on a gaming console. But it is important to note that there are non-gaming applications on PC that could take advantage of faster and more manageable storage performance.
Adding DirectStorage to Windows means that Microsoft will have to ensure that there is hardware that supports the new API. The API itself is meant for NVMe SSDs and there are plenty of NVMe-compliant drives around. Meanwhile, the software giant does not say that all of them will support DirectStorage, but claims that it will be supported by ‘certain systems with NVMe drives’ that are ‘properly configured.’
Since Microsoft has yet to disclose all the peculiarities of its new API, it is unclear whether supporting it will require a particular subset of NVMe commands (and therefore particular drives with particular firmware) or whether certain things beyond the SSD are needed.
Nvidia has adopted the API for its Ampere graphics cards. Nvidia says its RTX IO feature can speed up I/O performance by up to 100X over standard hard drives and storage APIs.
Microsoft’s DirectStorage API is said to lower CPU overhead for the tens of thousands of I/O operations that modern gaming systems perform, eliminate unnecessary data transformation steps, and give game developers finer control over storage. All of this is in a bid to shrink load times, allow GPUs to consume data from SSDs faster, and make virtual worlds richer.
Bringing data closer to the processor is an industry trend and Microsoft’s DirectStorage follows it. Servers can use hierarchical storage to maximize their performance and capabilities, but game consoles and PCs usually do not accommodate NVDIMMs, high-end PCIe SSDs for frequently used data, and slower devices for cold storage. Because client devices lack such tiers, making their default storage devices faster and more efficient is required to improve the capabilities of PCs.
Microsoft has yet to disclose all the peculiarities of its DirectStorage API and how it achieves its goals, but game developers will get the new interface in 2021, and this is when the secret sauce behind the Xbox Velocity Architecture will be revealed. What will be particularly interesting to see is whether DirectStorage can improve the performance of applications beyond gaming.
Microsoft says that its DirectStorage API is designed for NVMe SSDs, but that does not mean that all Windows PCs with an NVMe SSD will automatically support DirectStorage. The company is working with its hardware partners to finish designing the API and the components that will support it.
Since DirectStorage is already used for games aimed at the Xbox Series X, it is reasonable to assume that at least some cross-platform titles will also take advantage of the API on the PC starting from late 2021 (or rather from 2022) and onwards.
Source: Microsoft DirectX Developer Blog