The DirectStorage Advantage: Phison I/O+ SSD Firmware Preview

Phison I/O+ SSD Firmware
(Image credit: Tom's Hardware)

Phison’s new game-optimized I/O+ firmware could be coming soon to an SSD near you, bringing DirectStorage-class performance to the masses. Enthusiasts are excited that the DirectStorage API will let supporting games load in a few seconds, but the new API will also enable broader improvements in all manner of storage-bound tasks. The first DirectStorage-enabled game, Forspoken, won't arrive until January 2023. Still, Phison gave us access to an early version of its new firmware that will transform some existing Phison-powered SSDs into DirectStorage-capable devices.

Phison will initially offer the free I/O+ firmware to its OEMs for drives using its high-end PCIe 4.0 E18 controller, focusing on models that have Micron’s fast 176-Layer TLC flash (B47R). Manufacturers can choose how to offer this firmware, which is designed to meet DirectStorage's high-quality criteria for next-gen gaming performance. Future Phison SSDs, like those powered by the E26 controller, will come with a version of this firmware by default.

First, let's look at how this firmware works to enable DirectStorage-class performance, and then we'll put it to the test in our benchmark suite. 

How DirectStorage Optimized SSDs Work

Microsoft's DirectStorage API for Windows is designed to improve storage performance specifically by taking advantage of fast NVMe SSDs, but drives tailored for the workload will extract the fastest speeds. The API will work best on Windows 11, but it will also work on Windows 10.

The general idea is to reduce storage stack overhead via BypassIO, an optimized pathway to the StorNVMe driver and storage device that greatly reduces the number of steps required to access data. For example, BypassIO reduces the path from 11 steps to three, thus reducing CPU overhead and latency. The NVMe SSD can then be used as a sort of cache for streaming assets and data, reducing the GPU’s VRAM load. This will improve further with GPU-accelerated decompression in the future.

All of this poses challenges for current consumer SSDs because they are designed for bursty rather than sustained workloads. Phison’s tailored tests assume a fuller drive that must sustain a tremendous amount of data read activity over multiple hours: 2.5 GBps is the minimum for low quality, but 5 GBps or more is desirable. For example, Forspoken's first public demo ran at medium detail and required a steady 4 GBps stream from the SSD.
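To put those sustained rates in perspective, here is a minimal arithmetic sketch of how much data a drive would read per hour at each figure. It assumes GBps means decimal gigabytes per second and that the rate holds steady for the whole hour; the function name is ours, purely for illustration.

```python
def sustained_read_tb(rate_gbps: float, hours: float) -> float:
    """Total data read, in decimal terabytes, at a steady rate in GB/s."""
    seconds = hours * 3600
    return rate_gbps * seconds / 1000  # GB -> TB


# Rates quoted in the article: low-quality minimum, Forspoken demo, desirable target.
for rate in (2.5, 4.0, 5.0):
    print(f"{rate} GBps for 1 hour = {sustained_read_tb(rate, 1):.1f} TB read")
```

At the Forspoken demo's 4 GBps, that works out to 14.4TB of reads every hour, which is why sustained behavior matters more here than burst speed.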

Traditionally, “real world” consumer performance metrics have focused on 4KB accesses at low queue depths ranging from 1 to 4, but DirectStorage will use large random read accesses at very high queue depths. So here we’re dealing with large 32KB+ block sizes and a 512+ queue depth instead, which is representative of a potential DirectStorage workload. In fact, we should anticipate I/O up to 1MB in size, with 64KB being a typical target for consistency.
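The relationship between block size, queue depth, and throughput can be sketched with simple arithmetic. The example below is a rough model, not a benchmark: it assumes decimal gigabytes, binary kilobytes for the block size, and a queue that stays completely full, so Little's law (outstanding requests = request rate x mean latency) applies. The function names are hypothetical.

```python
def required_iops(rate_gbps: float, block_kib: int) -> float:
    """Requests per second needed to sustain rate_gbps with block_kib-sized reads."""
    return rate_gbps * 1e9 / (block_kib * 1024)


def mean_latency_us(queue_depth: int, iops: float) -> float:
    """Mean per-request latency implied by Little's law for a full queue."""
    return queue_depth / iops * 1e6


iops = required_iops(4.0, 64)      # 4 GBps stream of 64KB reads
lat = mean_latency_us(512, iops)   # queue depth of 512
print(f"{iops:,.0f} IOPS, {lat:,.0f} microseconds mean latency at QD512")
```

Under these assumptions, a 4 GBps stream of 64KB reads needs only about 61,000 requests per second, far fewer than the same bandwidth would need at 4KB, but each request can wait several milliseconds when 512 are in flight.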

This type of workload also challenges a drive’s endurance due to 'Block Read Disturb,' a process that creates wear on frequently-read blocks, thus reducing endurance. Managing this condition is exceptionally important with DirectStorage SSDs — each block of game data can experience up to 20,000 page reads per hour over a 60 to 100GB span of the drive. 
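To get a feel for how quickly reads pile up, here is a small sketch that counts accumulated page reads per block over a play session and estimates how long it would take to reach a refresh threshold. The 20,000 reads-per-hour figure comes from Phison's description above; the threshold value is a purely hypothetical placeholder, since real read-disturb limits depend on the flash and are not given here.

```python
READS_PER_HOUR = 20_000  # worst-case page reads per block per hour, per the article


def accumulated_reads(hours: float, reads_per_hour: int = READS_PER_HOUR) -> float:
    """Page reads a hot block has absorbed after the given number of hours."""
    return hours * reads_per_hour


def hours_until_refresh(reads_per_hour: int, threshold_reads: int) -> float:
    """Hours before a block reaches threshold_reads (an assumed, illustrative limit)."""
    return threshold_reads / reads_per_hour


HYPOTHETICAL_THRESHOLD = 1_000_000  # assumption for illustration only, not a vendor spec
print(accumulated_reads(4))                                         # a four-hour session
print(hours_until_refresh(READS_PER_HOUR, HYPOTHETICAL_THRESHOLD))  # hours to the assumed limit
```

Whatever the real threshold is, the point is that the count grows linearly with play time, so the firmware has to schedule refreshes regularly rather than treat them as a rare event.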

Block Read Disturb is a negligible condition with standard drives. However, the new firmware needs to maintain the flash due to the intense nature of DirectStorage workloads, all while still prioritizing host I/O requests. Hammering the flash with reads introduces bit errors over time which can temporarily impact performance, but drive access remains in high demand. Phison has developed smart scheduling for maintenance with adaptive wear algorithms that seamlessly work in the background so that performance remains consistent with minimal additive wear.
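Phison has not published how its scheduling works, so the following is only a toy model of the general idea of prioritizing host I/O over background maintenance. It assumes one operation per time slot and lets maintenance run only when no host request is waiting; every name in it is hypothetical.

```python
from collections import deque


def run_schedule(host_arrivals, maintenance_jobs, slots):
    """Toy time-slot scheduler: host reads always go first, and maintenance
    runs only in slots where no host request is waiting."""
    host = deque()
    maint = deque(maintenance_jobs)
    timeline = []
    for t in range(slots):
        host.extend(host_arrivals.get(t, []))  # requests arriving in this slot
        if host:
            timeline.append(("host", host.popleft()))
        elif maint:
            timeline.append(("maint", maint.popleft()))
        else:
            timeline.append(("idle", None))
    return timeline


# Two reads arrive at slot 0, one more at slot 3; two block refreshes are pending.
arrivals = {0: ["r0", "r1"], 3: ["r2"]}
for slot, entry in enumerate(run_schedule(arrivals, ["refresh-A", "refresh-B"], slots=6)):
    print(slot, entry)
```

In this run the refreshes land in slots 2 and 4, the gaps between host reads. A real drive under a saturated DirectStorage stream has few such gaps, which is presumably where the adaptive part of Phison's approach comes in.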

Being able to harness your NVMe SSD’s native parallelization and horsepower fully is certainly a good thing, even if game developers will take a while to catch up. The technology can help with applications, too, particularly with content creation and design. This would include editing and rendering video for the former, and deep learning or bioinformatics for the latter. Computer-aided design and manufacture (CAD/CAM) is another area that may see early improvement. Phison also says that source code compilation sees notable speed-ups.

Today we'll use some of Phison’s recommended synthetic DirectStorage tests to show what type of performance the drive can maintain under future workload conditions. We'll also run this firmware through our typical test suite to see what impact it might have in everyday applications. Phison expects either normal performance or a bit of a gain. In the future, we will have retail products, benchmarks, and real-world applications aimed at testing the new API. Similar technology is also used in current-gen consoles, so we may have to explore impacts there, as well. For now, we’re just giving you a taste.

A Closer Look

Phison recommends that any drive with this firmware should use a heatsink, as the associated workloads are prolonged and taxing. Specifically, this firmware is designed for sustained random reads with larger block sizes, but Phison has made sure that non-optimized performance areas won't see performance degradation. The firmware might also improve write performance from this specific flash because Micron's 176-Layer TLC has untapped potential.

From the outside, the sample drive doesn't appear special. It's similar to the original 2TB preview drives sent out for the Phison E18 controller with Micron’s 176-layer flash. We see a controller with DRAM in the middle and a total of eight NAND packages, with another DRAM package on the back.

The preview drives came with a heatsink installed, but this one did not. That makes sense for appearances, but these workloads require cooling. Luckily, Phison also sent a heatsink (the same one used on the preview drives), which helped a lot.

Here we see a Phison E18 controller manufactured in the middle of last year. It’s certainly possible for manufacturers to offer this firmware for their existing drives. However, there may be valid reasons not to, especially considering how tough these workloads are on the SSD.

Two 1GB DDR4 modules from SK hynix help this drive stay in running shape. One aspect we want to test in the future is the impact of DRAM on DirectStorage workloads. Normally DRAM caching is more useful for writes than reads, particularly as the “hottest” data gets priority. Larger I/O also generally requires less memory for addressing.

Local controller SRAM and 64MB of external system memory via the host memory buffer (HMB) function may very well be sufficient, but we suspect the maintenance shuffling needed for optimal operation will benefit from DRAM, particularly with capacious drives.
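A back-of-the-envelope calculation shows why mapping memory matters. The sketch below assumes a flat logical-to-physical table with 4-byte entries and a 4KB mapping unit, a common rule of thumb rather than anything Phison has confirmed for this drive, and the function names are ours.

```python
def l2p_table_bytes(capacity_bytes: int, map_unit: int = 4096, entry_bytes: int = 4) -> int:
    """Size of a flat logical-to-physical table covering the whole drive."""
    return capacity_bytes // map_unit * entry_bytes


def span_covered_bytes(cache_bytes: int, map_unit: int = 4096, entry_bytes: int = 4) -> int:
    """How much of the drive's address space a mapping cache of cache_bytes can cover."""
    return cache_bytes // entry_bytes * map_unit


full_table = l2p_table_bytes(2 * 10**12)      # 2TB drive
hmb_span = span_covered_bytes(64 * 2**20)     # 64MB host memory buffer
print(f"Full table: {full_table / 1e9:.2f} GB")
print(f"64MB HMB covers: {hmb_span / 2**30:.0f} GiB of address space")
```

Under those assumptions a 2TB drive needs just under 2GB for a full table, in line with the two 1GB DRAM packages here, while a 64MB HMB could map roughly 64GiB at a time. Passing a larger map_unit shrinks the table proportionally, which is the sense in which larger I/O needs less memory for addressing.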


The drive comes with 176-layer TLC flash from Micron. The I/O+ firmware is designed to work with this flash, but it could be extended to work with older NAND. Micron’s B47R has performed very well on various drives with multiple controllers. It’s quite fast, so it makes for a solid testbed here.

Shane Downing
Freelance Reviewer

Shane Downing is a Freelance Reviewer for Tom’s Hardware US, covering consumer storage hardware.

  • -Fran-
    Just out of curiosity... How would a SAS or sATA HDD behave in these tests? Even in RAID 0 would be interesting. More than anything, just to know how far these two are from each other.

    EDIT: "these tests" as in the QD32+ and 64KB+ blocks.

    Regards.
  • alceryes
    I think DirectStorage will only make a mediumish splash with gaming, and only in the mid to low tier space.
    PCIe 4+ is lightning quick. Put together a PCIe 4+ performance system and you're loading up NVMe-optimized games in 7 seconds or less anyway. Yes, DS could potentially take that down to 3 seconds but, meh.

    The unexpected gains of DS will be in the mid-performance gaming systems. Not only will games load much quicker on middling-performance storage mediums but, if you were sometimes hitting a CPU bottleneck due to a mid-performance CPU, DS may be exactly what you need to relieve 3-4% of the CPU workload by moving the asset decompression stage from the CPU to the GPU.

    ...and, it's free so, yeah, good stuff all around.
  • salgado18
    alceryes said:
    I think DirectStorage will only make a mediumish splash with gaming, and only in the mid to low tier space.
    PCIe 4+ is lightning quick. Put together a PCIe 4+ performance system and you're loading up NVMe-optimized games in 7 seconds or less anyway. Yes, DS could potentially take that down to 3 seconds but, meh.

    The unexpected gains of DS will be in the mid-performance gaming systems. Not only will games load much quicker on middling-performance storage mediums but, if you were sometimes hitting a CPU bottleneck due to a mid-performance CPU, DS may be exactly what you need to relieve 3-4% of the CPU workload by moving the asset decompression stage from the CPU to the GPU.

    ...and, it's free so, yeah, good stuff all around.
    The big deal is not full level loading, but constant incremental loading of assets during games. Something like what Rage tried to do before SSDs. As an example, don't load all textures at once, load only the ones you need at the current scene, and when the player moves you load what you need. That would be very hard on the CPU, but with DS it would be a lot more efficient.
  • elforeign
    With the SK Hynix P41, are there any firmware improvements in the pipeline to access the benefits of Directstorage or will it require a new drive with a new controller fit for purpose? I recently bought one for a new build and am using it as my primary SSD, but I lack insight into this technology.
  • itsmedatguy
    salgado18 said:
    The big deal is not full level loading, but constant incremental loading of assets during games. Something like what Rage tried to do before SSDs. As an example, don't load all textures at once, load only the ones you need at the current scene, and when the player moves you load what you need. That would be very hard on the CPU, but with DS it would be a lot more efficient.

    It's interesting I believe that Unreal 5 is doing something like this using an atlas to lookup assets, which seems to have lowered the overhead for streaming in what's needed, because Unreal 5 seems capable of doing this kind of thing off of a standard 2.5" SSD
  • salgado18
    itsmedatguy said:
    It's interesting I believe that Unreal 5 is doing something like this using an atlas to lookup assets, which seems to have lowered the overhead for streaming in what's needed, because Unreal 5 seems capable of doing this kind of thing off of a standard 2.5" SSD
    A stupid example to represent the idea, in old GTA's you only got a few of the cars on the streets, which caused the game to never show a car, but once you got it the game showed that car a lot suddenly. In UE5 Matrix demo, every car is unique, because they are loaded on the fly. I think that's the big advancement of this tech.
  • gggplaya
    alceryes said:
    I think DirectStorage will only make a mediumish splash with gaming, and only in the mid to low tier space.
    PCIe 4+ is lightning quick. Put together a PCIe 4+ performance system and you're loading up NVMe-optimized games in 7 seconds or less anyway. Yes, DS could potentially take that down to 3 seconds but, meh.

    The unexpected gains of DS will be in the mid-performance gaming systems. Not only will games load much quicker on middling-performance storage mediums but, if you were sometimes hitting a CPU bottleneck due to a mid-performance CPU, DS may be exactly what you need to relieve 3-4% of the CPU workload by moving the asset decompression stage from the CPU to the GPU.

    ...and, it's free so, yeah, good stuff all around.


    I think you'll see a benefit in higher clutter object density in scenes and larger open-world games. Also, more unique objects throughout the map as well.
  • alceryes
    salgado18 said:
    The big deal is not full level loading, but constant incremental loading of assets during games. Something like what Rage tried to do before SSDs. As an example, don't load all textures at once, load only the ones you need at the current scene, and when the player moves you load what you need. That would be very hard on the CPU, but with DS it would be a lot more efficient.
    Partial level loading has been a thing for decades(?)
    But, quicker asset access will benefit things like pop-in and more detail distant textures, definitely.
  • gggplaya
    alceryes said:
    Partial level loading has been a thing for decades(?)
    But, quicker asset access will benefit things like pop-in and more detail distant textures, definitely.

    Correct, but loading more map sections is typically disguised as a long tunnel, a long road or highway, or a warp portal etc.... Direct Storage and super fast SSD's will eliminate the need for that.
  • hannibal
    What I would like to see is Phison with and without this firmware update. What goes up, what goes down...