Phison’s new game-optimized I/O+ firmware could be coming soon to an SSD near you, bringing DirectStorage-class performance to the masses. Enthusiasts are excited that the DirectStorage API will let supported games load in a few seconds. However, the new API will also enable broader improvements in all manner of storage-bound tasks. The first DirectStorage-enabled game, Forspoken, won't arrive until January 2023. Still, Phison gave us access to an early version of its new firmware that will transform some existing Phison-powered SSDs into DirectStorage-capable devices.
Phison will initially offer the free I/O+ firmware to its OEMs for drives using its high-end PCIe 4.0 E18 controller, focusing on models that have Micron’s fast 176-Layer TLC flash (B47R). Manufacturers can choose how to offer this firmware, but the firmware is designed to meet DirectStorage's high-quality criteria for next-gen gaming performance. Future Phison SSDs, like those powered by the E26 controller, will come with a version of this firmware by default.
First, let's look at how this firmware works to enable DirectStorage-class performance, and then we'll put it to the test in our benchmark suite.
How DirectStorage Optimized SSDs Work
Microsoft's DirectStorage API for Windows is designed to improve storage performance specifically by taking advantage of fast NVMe SSDs, but tailoring the drives to this workload will extract the fastest speeds. The API will work best on Windows 11, but it will also work on Windows 10.
The general idea is that storage stack overhead can be reduced by eliminating inefficiencies via BypassIO, an optimized pathway to the StorNVMe driver and storage device that greatly reduces the number of steps required to access data. For example, BypassIO reduces the path from 11 steps to three, thus reducing CPU overhead and latency. The NVMe SSD can then be used as a sort of cache for streaming assets and data, reducing the GPU’s VRAM load. This will improve further with GPU-accelerated decompression in the future.
All of this poses challenges for current consumer SSDs because they are designed for bursty rather than sustained workloads. Phison’s tailored tests imply a fuller drive that must sustain a tremendous amount of data read activity over multiple hours — 2.5 GBps is a minimum for low quality, but 5 GBps+ is desirable. For example, Forspoken's first public demo ran at medium detail and required a steady 4 GBps stream from the SSD.
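To put those sustained rates in perspective, here is a minimal back-of-the-envelope sketch of how much data a drive would read per hour at each figure quoted above. It assumes 1 GBps means 10^9 bytes per second and that the rate holds constant; the function name and labels are ours, not Phison's.

```python
# Assumption: decimal units, 1 GBps = 10**9 bytes per second.
BYTES_PER_GB = 10**9
BYTES_PER_TB = 10**12


def terabytes_read(rate_gbps: float, hours: float) -> float:
    """Total data read, in TB, at a constant rate for the given duration."""
    seconds = hours * 3600
    return rate_gbps * BYTES_PER_GB * seconds / BYTES_PER_TB


# Rates quoted in the article; the labels are shorthand for this sketch only.
rates_gbps = {
    "minimum (low quality)": 2.5,
    "Forspoken demo (medium detail)": 4.0,
    "desirable": 5.0,
}

for label, rate in rates_gbps.items():
    print(f"{label}: {terabytes_read(rate, 1):.1f} TB read per hour")
```

Under these assumptions, even the 2.5 GBps floor works out to about 9 TB of reads per hour, and the 4 GBps demo figure to roughly 14.4 TB per hour, which is why sustained rather than bursty behavior matters here.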
Traditionally, “real world” consumer performance metrics have focused on 4KB accesses at low queue depths ranging from 1 to 4, but DirectStorage will use large random read accesses at very high queue depths. So here we’re dealing with large 32KB+ block sizes and a 512+ queue depth instead, which is representative of a potential DirectStorage workload. In fact, we should anticipate I/O up to 1MB in size, with 64KB being a typical target for consistency.
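The relationship between block size, queue depth, and throughput can be sketched with simple arithmetic. The snippet below is an illustration rather than a benchmark: it assumes decimal gigabytes (10^9 bytes), binary kilobytes (1,024 bytes), and a steady state in which Little's law applies (requests in flight = completion rate x average latency). The function names are hypothetical.

```python
def required_iops(throughput_gbps: float, block_kb: int) -> float:
    """I/O operations per second needed to hit a throughput at a block size.

    Assumes 1 GBps = 10**9 bytes/s and 1 KB = 1024 bytes.
    """
    return throughput_gbps * 10**9 / (block_kb * 1024)


def mean_latency_ms(queue_depth: int, iops: float) -> float:
    """Average time each request spends in flight, via Little's law.

    Assumes the queue stays full at `queue_depth` outstanding requests.
    """
    return queue_depth / iops * 1000


# Block sizes mentioned in the article, at the 4 GBps demo rate and QD 512.
for block_kb in (32, 64, 1024):
    iops = required_iops(4.0, block_kb)
    latency = mean_latency_ms(512, iops)
    print(f"{block_kb:>4} KB blocks: {iops:,.0f} IOPS, "
          f"{latency:.2f} ms average per request at QD 512")
```

With these assumptions, a 4 GBps stream of 64KB reads needs roughly 61,000 IOPS, and a queue depth of 512 implies each request can spend about 8.4 ms in flight on average. That is a very different profile from the 4KB, low-queue-depth accesses that consumer benchmarks usually emphasize.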
This type of workload also challenges a drive’s endurance due to 'Block Read Disturb,' a process that creates wear on frequently-read blocks, thus reducing endurance. Managing this condition is exceptionally important with DirectStorage SSDs — each block of game data can experience up to 20,000 page reads per hour over a 60 to 100GB span of the drive.
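As a rough illustration of why this matters, the sketch below estimates how quickly reads accumulate on a block at the rate quoted above, and how often the firmware would need to refresh that block for a given read-disturb limit. The limit used here is a purely hypothetical placeholder, not a figure from Phison or Micron, and the function names are ours.

```python
# Figure from the article: up to 20,000 page reads per block per hour.
READS_PER_BLOCK_PER_HOUR = 20_000

# HYPOTHETICAL placeholder for illustration only. Real read-disturb limits
# depend on the flash and are not given in the article.
ASSUMED_READ_LIMIT = 100_000


def reads_in_session(hours: float,
                     reads_per_hour: int = READS_PER_BLOCK_PER_HOUR) -> float:
    """Reads accumulated on one block over a play session."""
    return hours * reads_per_hour


def hours_until_refresh(read_limit: int,
                        reads_per_hour: int = READS_PER_BLOCK_PER_HOUR) -> float:
    """Hours of play before a block reaches the assumed read limit."""
    return read_limit / reads_per_hour


print(f"Reads per block in a 3-hour session: {reads_in_session(3):,.0f}")
print(f"Hours until refresh at the assumed limit: "
      f"{hours_until_refresh(ASSUMED_READ_LIMIT):.1f}")
```

Whatever the real limit is, the point stands: every block in the 60 to 100GB game footprint accumulates reads at this pace, so the maintenance work recurs across the whole span while the host is still demanding data.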
Block Read Disturb is a negligible condition with standard drives. However, the new firmware needs to maintain the flash due to the intense nature of DirectStorage workloads, all while still prioritizing host I/O requests. Hammering the flash with reads introduces bit errors over time which can temporarily impact performance, but drive access remains in high demand. Phison has developed smart scheduling for maintenance with adaptive wear algorithms that seamlessly work in the background so that performance remains consistent with minimal additive wear.
Being able to harness your NVMe SSD’s native parallelization and horsepower fully is certainly a good thing, even if game developers will take a while to catch up. The technology can help with applications, too, particularly with content creation and design. This would include editing and rendering video for the former, and deep learning or bioinformatics for the latter. Computer-aided design and manufacture (CAD/CAM) is another area that may see early improvement. Phison also says that source code compilation sees notable speed-ups.
Today we'll use some of Phison’s recommended synthetic DirectStorage tests to show what type of performance the drive can maintain under future workload conditions. We'll also run this firmware through our typical test suite to see what impact it might have in everyday applications. Phison expects either normal performance or a bit of a gain. In the future, we will have retail products, benchmarks, and real-world applications aimed at testing the new API. Similar technology is also used in current-gen consoles, so we may have to explore impacts there, as well. For now, we’re just giving you a taste.
A Closer Look
Phison recommends that any drive with this firmware should use a heatsink, as the associated workloads are prolonged and taxing. Specifically, this firmware is designed for sustained random reads with larger block sizes, but Phison has made sure that non-optimized performance areas won't see performance degradation. The firmware might also improve write performance from this specific flash because Micron's 176-Layer TLC has untapped potential.
From the outside, the sample drive doesn't appear special. It's similar to the original 2TB preview drives sent out for the Phison E18 controller with Micron’s 176-layer flash. We see a controller with DRAM in the middle and a total of eight NAND packages, with another DRAM package on the back.
The preview drives came with a heatsink installed, but this sample did not. That makes sense for appearances, but these workloads require cooling. Luckily, Phison also sent a heatsink, the same one used on the preview drives, which helped a lot.
Here we see a Phison E18 controller manufactured in the middle of last year. It’s certainly possible for manufacturers to offer this firmware for their existing drives. However, there may be valid reasons not to, especially considering how tough these workloads are on the SSD.
Two 1GB DDR4 modules from SK hynix help this drive stay in running shape. One aspect we want to test in the future is the impact of DRAM on DirectStorage workloads. Normally DRAM caching is more useful for writes than reads, particularly as the “hottest” data gets priority. Larger I/O also generally requires less memory for addressing.
Local controller SRAM and 64MB of external system memory via the host memory buffer (HMB) function may very well be sufficient, but we suspect the maintenance shuffling needed for optimal operation will benefit from DRAM, particularly with capacious drives.
The drive comes with 176-layer TLC flash from Micron. The I/O+ firmware is designed to work with this flash, but it could be extended to work with older NAND. Micron’s B47R has performed very well on various drives with multiple controllers. It’s quite fast, so it makes for a solid testbed here.