A number of storage vendors have jumped into the PCI Express-based SSD market during the past 18 months. With a few exceptions (notably, Fusion-io), the formula is fairly simple. Take a few SATA-based controllers, mix in a RAID controller, some RAM, set the blender to emulsify, and in a few minutes you have a storage device unencumbered by the 6 Gb/s SATA interface.
Alright, so it's not quite that simple. But those are the basic building blocks we're seeing from most vendors. Micron eschews that approach by baking up something completely different.
Meet the company's P320h half-height, half-length PCI Express-based SSD. The P320h is available in 350 and 700 GB capacities, selling for $3,495 and $6,995, respectively. That's expensive, no doubt. But for $10/GB you get very impressive performance specifications.
You should be aware that Micron sells the RealSSD P320h in one other format (aside from the half-height, half-length design we're reviewing today). Its 2.5" PCI Express RealSSD P320h is available in 175 and 350 GB capacities, but is limited to 415,000 4 KB random read IOPS and 1.75 GB/s in sequential reads. Though those numbers are still impressive, the form factor requires a server equipped with the right interface to support it.
| Micron P320h | ||
|---|---|---|
| User Capacity | 350 GB | 700 GB |
| Interface | PCI Express 2.0 x8, Half-Height, Half-Length | |
| Sequential Read | 3.2 GB/s | |
| Sequential Write | 1.9 GB/s | |
| 4 KB Random Read | 785,000 IOPS | |
| 4 KB Random Write | 205,000 IOPS | |
| Power Consumption (Active) | 25 W | |
| Power Consumption (Idle) | 10 W | |
| Write Endurance | 25 PB | 50 PB |
With specified sequential read performance of up to 3.2 GB/s and as many as 785,000 4 KB random read IOPS, Micron is basically saturating a second-gen PCI Express x8 link with the HHHL version of its P320h. The use of single-level cell NAND allows for 50 PB of write endurance on the 700 GB variant. At a high level, those are all impressive numbers. But let's take a deeper look to see how Micron achieves such ambitious specs.
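To put the "saturating the link" claim in perspective, here is a rough back-of-the-envelope sketch. It relies on the published PCI Express 2.0 signaling rate (5 GT/s per lane with 8b/10b encoding) and assumes that "4 KB" means 4,096 bytes; packet and protocol overhead are ignored, so real-world ceilings sit somewhat below the raw figure.

```python
# PCIe 2.0 signals at 5 GT/s per lane and uses 8b/10b encoding,
# so only 8 of every 10 bits on the wire carry payload.
GT_PER_S = 5e9
ENCODING_EFFICIENCY = 8 / 10
LANES = 8
BLOCK_BYTES = 4096  # assumption: "4 KB" = 4,096 bytes


def pcie2_bandwidth_bytes_per_s(lanes: int) -> float:
    """Raw one-direction bandwidth of a PCIe 2.0 link, before protocol overhead."""
    return GT_PER_S * ENCODING_EFFICIENCY * lanes / 8  # bits -> bytes


def iops_to_bytes_per_s(iops: int, block_bytes: int = BLOCK_BYTES) -> int:
    """Throughput implied by a given IOPS figure at a fixed block size."""
    return iops * block_bytes


link = pcie2_bandwidth_bytes_per_s(LANES)
random_read = iops_to_bytes_per_s(785_000)  # spec-sheet random read IOPS
sequential_read = 3.2e9                     # spec-sheet sequential read

print(f"x8 link, raw:     {link / 1e9:.2f} GB/s")
print(f"785K 4 KB reads:  {random_read / 1e9:.2f} GB/s ({random_read / link:.0%} of raw link)")
print(f"Sequential reads: {sequential_read / 1e9:.2f} GB/s ({sequential_read / link:.0%} of raw link)")
```

Both spec-sheet read figures land at roughly four-fifths of the raw link rate, which is about what you would expect once protocol overhead is taken into account.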
Micron set out to simplify PCI Express-based SSDs with its P320h. But before we can understand how the company does this, let's first have a look at a more conventional layout. The image below corresponds to Intel's SSD 910.

Starting from the PCI Express connector on the left, you have a RAID/HBA chip attached to a number of SAS- or SATA-enabled controllers. Those controllers, in turn, communicate with the NAND. In essence, you're taking a number of SSDs (in the example above, four), hooking them up to a host bus adapter, and presenting them as a single device.
In contrast, the P320h consists of a PCI Express interface, a host controller with built-in PCI Express connectivity, and the flash memory. When you take the HBA out of the equation, you alleviate bandwidth limitations and minimize latency.
Clearly, that custom ASIC is responsible for enabling Micron's design. The company partnered with IDT (Integrated Device Technology) to develop the P320h's controller, combining its extensive knowledge of NAND with IDT's leadership in high-speed serial switching. The result is a 32-channel (!) controller built into a 1517-pin FCBGA (Flip Chip Ball Grid Array) package.
Let that sink in for a second. Most of the controllers you find in SATA-based SSDs communicate across eight channels. It makes sense, then, that it'd take a quartet of "eight-channel SSDs" on a PCI Express-based add-in card to saturate the interface. Micron makes it very clear that its P320h loves workloads that push high queue depths, and we can see the drive's controller was designed to shoulder those intense environments. As a matter of fact, company representatives recommended that we test using queue depths at least as high as 256.
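The relationship between queue depth and throughput can be sketched with Little's Law (outstanding requests = throughput × time each request spends in the system). The 200 µs latency below is a purely illustrative placeholder, not a measured value for any drive, and the model treats latency as constant even though it rises under load in practice.

```python
CHANNELS = 32                  # from the controller description above
ILLUSTRATIVE_LATENCY_S = 200e-6  # hypothetical per-request latency, for illustration only


def required_queue_depth(target_iops: float, latency_s: float) -> float:
    """Little's Law: requests in flight needed to sustain target_iops."""
    return target_iops * latency_s


def achievable_iops(queue_depth: int, latency_s: float) -> float:
    """Upper bound on IOPS when only queue_depth requests are in flight."""
    return queue_depth / latency_s


print(required_queue_depth(785_000, ILLUSTRATIVE_LATENCY_S))  # requests in flight
print(achievable_iops(32, ILLUSTRATIVE_LATENCY_S))            # IOPS ceiling at QD=32
print(256 / CHANNELS)                                         # requests per channel at QD=256
```

Under these assumptions, a queue depth of 32 leaves each of the 32 channels with a single request on average, while 256 gives each channel eight to work on. That is one way to read Micron's recommendation.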

As any hardware geek can tell you, there's something about a bare circuit board that begs for examination, and the P320h's layout is incredibly clean. This half-height, half-length (HHHL) PCI Express 2.1 card is about as elegant as it gets, consisting of the passively-cooled controller flanked by NAND flash and DDR3 cache.

The Micron/IDT controller sits front and center. Immediately to its left are five Micron 256 MB DDR3-1333 memory packages, with another four around back, totaling 2.25 GB of cache. Additionally, the PCB hosts 32 NAND packages, each with 16 GB of Micron's 34 nm ONFi 2.1 single-level cell memory. There is also a pair of double-sided mezzanine boards, adding 32 more packages. All told, the 700 GB RealSSD P320h hosts 1 TB of SLC NAND.

The 32-channel controller, the 2.25 GB of cache, and the 1 TB of SLC memory should all be clear indicators that this thing is an enterprise-class piece of equipment designed for enterprise-class workloads. Fortunately, we have some of those to throw at it.
As we mentioned on the previous page, the 700 GB RealSSD P320h actually has 1 TB of on-board SLC NAND. Not all of the unaccounted-for memory is used for over-provisioning, as you might expect. Rather, Micron's RAIN (Redundant Array of Independent NAND) technology takes up 128 GB (12.5%). From there, the company reserves ~22% of total capacity for over-provisioning. This takes the total usable capacity down to ~650 GB.
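The capacity breakdown above can be tallied in a few lines. This is only a sketch built from the article's rounded figures; the exact reservations Micron uses are not published here. The last line adds one possible reading of the "~650 GB" figure, namely that it is the 700 GB advertised capacity expressed in binary units, which is an assumption rather than something Micron states.

```python
PACKAGES = 64          # 32 on the main PCB + 32 on the mezzanine boards
GB_PER_PACKAGE = 16

raw_gb = PACKAGES * GB_PER_PACKAGE   # total SLC NAND on the card
rain_gb = raw_gb / 8                 # 7+1 parity: one-eighth (12.5%) holds parity
op_gb = 0.22 * raw_gb                # "~22% of total capacity" for over-provisioning
remaining_gb = raw_gb - rain_gb - op_gb

# Assumption: 700 GB (decimal) converted to binary gigabytes.
advertised_gib = 700e9 / 2**30

print(f"Raw NAND:          {raw_gb} GB")
print(f"RAIN parity:       {rain_gb:.0f} GB")
print(f"Over-provisioning: {op_gb:.0f} GB")
print(f"Remaining:         {remaining_gb:.0f} GB")
print(f"700 GB in GiB:     {advertised_gib:.0f}")
```

The remainder from the rounded percentages lands in the same neighborhood as the quoted usable capacity rather than matching it exactly, which is consistent with the "~" in front of both figures.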
RAIN is a way of creating parity at the logical storage unit level. Stated most simply, it's a 7+1 RAID 5 configuration: for every seven blocks that are written, a single parity element is calculated.
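For readers who haven't worked with parity before, the following minimal sketch shows how a 7+1 stripe behaves using the XOR parity that RAID 5 is built on. It is a software illustration only. Micron does not detail RAIN's internal layout here, so the block size, the stripe structure, and the function names are all hypothetical.

```python
from functools import reduce

DATA_BLOCKS_PER_STRIPE = 7  # the "7" in 7+1


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_stripe(data_blocks):
    """Return the seven data blocks followed by one XOR parity block."""
    if len(data_blocks) != DATA_BLOCKS_PER_STRIPE:
        raise ValueError("a 7+1 stripe needs exactly seven data blocks")
    return list(data_blocks) + [xor_blocks(data_blocks)]


def rebuild(stripe, lost_index):
    """Recover any single lost block by XOR-ing the seven survivors."""
    survivors = [block for i, block in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)


# Seven tiny 4-byte "blocks" stand in for NAND pages.
data = [bytes([i] * 4) for i in range(1, 8)]
stripe = make_stripe(data)

# Pretend block 3 sat on a failed NAND die and rebuild it from the rest.
recovered = rebuild(stripe, 3)
print(recovered == data[3])
```

Because XOR is its own inverse, the same routine rebuilds a data block or the parity block, which is why a single failed element anywhere in the stripe is recoverable.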
This calculation is done in hardware, in real-time, and in the background, requiring no user/application interaction. Beyond ECC, which the P320h also supports, RAIN allows the card to recover from physical NAND failures. If there is a problem, the P320h logs that event and makes it viewable by way of the RealSSD Manager. You can also communicate with the P320h through a fully-featured CLI application.
The RealSSD Manager is a nice tool that allows system administrators to quickly check the health of the drive without creating custom scripts or memorizing cryptic command line switches. Not only can you view the typical SMART (Self-Monitoring, Analysis, and Reporting Technology) parameters, but you can also view live performance stats, media wear, and thermal readings.
| Test Hardware | |
|---|---|
| Processor | Intel Core i7-3960X (Sandy Bridge-E), 32 nm, 3.3 GHz, LGA 2011, 15 MB Shared L3, Turbo Boost Enabled |
| Motherboard | Intel DX79SI, X79 Express |
| Memory | G.Skill Ripjaws Z-Series (4 x 4 GB) DDR3-1600 @ DDR3-1600, 1.5 V |
| System Drive | Intel SSD 320 160 GB SATA 3Gb/s |
| Tested Drives | Micron P320h 700 GB, PCI Express x8, Firmware: B146000 |
| Graphics | AMD FirePro V4800 1 GB |
| Power Supply | OCZ ModXStream Pro 700 W |
| System Software and Drivers | |
| Operating System | Windows 7 x64 Ultimate |
| DirectX | DirectX 11 |
| Driver | Graphics: ATI 8.883 |

| Benchmark | Configuration |
|---|---|
| Iometer 1.1.0 | # Workers = 4, 4 KB Random: LBA = Full Span, varying Queue Depths |
| AS SSD | v1.6437.30508 |
| ATTO | v2.47, 2 GB, QD=4 |
| Custom | C++, 8 MB Sequential, QD=4 |

| Enterprise Testing: Iometer Workloads | Read | Random | Transfer Size |
|---|---|---|---|
| Database | 67% | 100% | 8 KB: 100% |
| File server | 80% | 100% | 512 Bytes: 10% |
| | | | 1 KB: 5% |
| | | | 2 KB: 5% |
| | | | 4 KB: 60% |
| | | | 8 KB: 2% |
| | | | 16 KB: 4% |
| | | | 32 KB: 4% |
| | | | 64 KB: 10% |
| Web server | 100% | 100% | 512 Bytes: 22% |
| | | | 1 KB: 15% |
| | | | 2 KB: 8% |
| | | | 4 KB: 23% |
| | | | 8 KB: 15% |
| | | | 16 KB: 2% |
| | | | 32 KB: 6% |
| | | | 64 KB: 7% |
| | | | 128 KB: 1% |
| | | | 512 KB: 1% |
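The workload definitions above map naturally onto a small data structure. The sketch below encodes the three profiles exactly as listed and derives the mean transfer size of each mix. The dictionary layout and function name are our own, not part of Iometer.

```python
# Transfer sizes are in KB (0.5 = 512 bytes); values are percentages of requests.
WORKLOADS = {
    "database": {
        "read_pct": 67, "random_pct": 100,
        "sizes_kb": {8: 100},
    },
    "file_server": {
        "read_pct": 80, "random_pct": 100,
        "sizes_kb": {0.5: 10, 1: 5, 2: 5, 4: 60, 8: 2, 16: 4, 32: 4, 64: 10},
    },
    "web_server": {
        "read_pct": 100, "random_pct": 100,
        "sizes_kb": {0.5: 22, 1: 15, 2: 8, 4: 23, 8: 15, 16: 2, 32: 6, 64: 7,
                     128: 1, 512: 1},
    },
}


def mean_transfer_kb(sizes_kb):
    """Request-weighted average transfer size; the mix must total 100%."""
    if sum(sizes_kb.values()) != 100:
        raise ValueError("transfer-size percentages must sum to 100")
    return sum(size * pct for size, pct in sizes_kb.items()) / 100


for name, profile in WORKLOADS.items():
    print(f"{name}: {mean_transfer_kb(profile['sizes_kb']):.2f} KB average transfer")
```

Even though 4 KB requests dominate the file server mix by count, the small share of 64 KB requests pulls the average transfer size well above 4 KB.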
The Storage Networking Industry Association (SNIA), a working group made up of SSD, flash, and controller vendors, has produced a testing procedure that attempts to control as many of the variables inherent to SSDs as possible. SNIA’s Solid State Storage Performance Test Specification (SSS PTS) is a great resource for enterprise SSD testing. The procedure does not define what tests should be run, but rather the way in which they are run. This workflow is broken down into four parts:
- Purge: Purging puts the drive at a known starting point. For SSDs, this normally means Secure Erase.
- Workload-Independent Preconditioning: A prescribed workload that is unrelated to the test workload.
- Workload-Based Preconditioning: The actual test workload (4 KB random, 128 KB sequential, and so on), which pushes the drive towards a steady state.
- Steady State: The point at which the drive’s performance is no longer changing for the variable being tracked.
These steps are critical when testing SSDs. It is incredibly easy to under-condition a drive, observe fresh-out-of-box behavior, and mistake it for steady state. The steps are just as important when switching between random and sequential writes.
For all performance tests in this review, the SSS PTS was followed to ensure accurate and repeatable results.
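To make "steady state" concrete, here is a hedged sketch of the kind of check the specification describes: over a window of measurement rounds, the data should neither swing too far from its average nor trend up or down. The 20% range and 10% slope-excursion limits reflect our reading of the SSS PTS and are exposed as parameters, so consult the specification itself before relying on them.

```python
def is_steady_state(window, max_range=0.20, max_slope_excursion=0.10):
    """Check a window of per-round results (e.g. IOPS) for steady state.

    Two conditions, both relative to the window average:
      1. max - min stays within max_range of the average
      2. the rise/fall of the least-squares line across the window
         stays within max_slope_excursion of the average
    """
    n = len(window)
    if n < 2:
        raise ValueError("need at least two rounds to judge steady state")

    average = sum(window) / n
    if max(window) - min(window) > max_range * average:
        return False

    # Least-squares slope with rounds numbered 0..n-1.
    x_mean = (n - 1) / 2
    numerator = sum((x - x_mean) * (y - average) for x, y in enumerate(window))
    denominator = sum((x - x_mean) ** 2 for x in range(n))
    slope = numerator / denominator

    excursion = abs(slope) * (n - 1)  # change of the fitted line across the window
    return excursion <= max_slope_excursion * average


print(is_steady_state([100, 101, 99, 100, 100]))   # flat: steady
print(is_steady_state([200, 170, 150, 135, 125]))  # still falling from fresh-out-of-box
print(is_steady_state([100, 103, 106, 109, 112]))  # small range, but a steady drift
```

The third case is the interesting one: the values stay within a narrow band, yet the consistent drift shows the drive is still changing, so it is not treated as steady.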
All tests employ random data, when available. Micron's RealSSD P320h does not perform any data compression prior to writing, so there is no difference in performance based on data patterns.
Notes
We did run into a few issues during our time testing the P320h, which were mainly related to the Windows driver we were provided. Initially, the sample that Micron sent to us only had Linux support. The company did a great job getting us a driver for Windows so that we could start our benchmarking, but it wasn't completely finished. Micron was also clear that it did a majority of its validation on servers. Our test bench doesn't use a server chipset, and it runs Windows. Twice during our testing the P320h entered a state where it had to rebuild during POST. We didn't lose any data, but the rebuilds took quite a while.
To make sure our issues were configuration-specific, we ran reboot testing under Linux in a 1U server for two straight days. The machine restarted literally hundreds of times without an issue. And because this issue did not affect performance, for the sake of consistency we finished our testing on our standard test bench.
Most customers will never even come close to exceeding the write endurance limits of today's desktop-oriented SSDs. Write exhaustion requires continuous writing to a drive for weeks and months on end before you completely consume the usable life of each NAND cell. In the enterprise world, however, this is a much more likely scenario. Knowing the write endurance of an SSD can help IT professionals select drives that are best suited to their tasks.
This is a metric we're expecting to set Micron apart from its competition. The company chooses to use 34 nm SLC flash in its P320h, and that decision is impactful in two ways.
The first is write endurance. Typically, SLC is capable of 100,000 program/erase cycles, while eMLC is closer to 30,000. The MLC memory most commonly used today falls between 3,000 and 5,000 P/E cycles. In our testing, SLC (even from different manufacturers) consistently performs better than its rating. Micron's NAND is no exception. It doesn't come cheap, though. The RealSSD P320h runs roughly $10/GB.
Before we dig into the results, if you are unfamiliar with the different types of NAND or the concept of write exhaustion in general, take a look at Intel SSD 910 Review: PCI Express-Based Enterprise Storage.
In order to test write endurance, we write large-block, sequential data to the drive and continuously monitor the Percentage Lifetime Used SMART attribute. This tells us, on a scale from 0 to 100, the percentage of life exhausted from the drive's NAND. We started with a clean drive and wrote to it until the attribute reached 1%.
By writing sequential data, we demonstrate the maximum usable life of the flash, removing variables like wear-leveling and garbage collection. In this configuration, write amplification should be very close to 1.0x. We did run into an issue, though, that complicated testing. Mainly, Micron does not provide a SMART attribute that reports the total amount of data written to the drive. This definitely caused some concern because we had no way of knowing how much data had been written previously, which could have skewed our results. Normally, we rely on our testing software to keep track of this, but we always like to double-check our work. This could be a concern for system administrators that want a hard number. For most, though, the SMART attributes provided should be sufficient to successfully administer the P320h.
| Endurance Rating (Sequential Workload, QD=1, 8 MB, Random) | Micron RealSSD P320h | Intel SSD 910 | Intel X25-E | Toshiba MK4001GRZB |
|---|---|---|---|---|
| NAND Type | Micron 34 nm SLC | Intel 25 nm eMLC (HET) | Intel 50 nm SLC | Toshiba 32 nm SLC |
| Raw NAND Capacity | 1,024 GB | 896 GB | 77 GB | 512 GB |
| IDEMA Capacity (User Accessible) | 700 GB | 800 GB | 64 GB | 400 GB |
| Over-provisioning | 22% (12.5% RAIN) | 12% | 20% | 28% |
| P/E Cycles Observed (IDEMA) | 276,652 | 46,339 | 237,968 | 225,064 |
| P/E Cycles Observed (Raw) | 185,700 | 41,374 | 198,307 | 175,831 |
| Host Writes per 1% of MWI | 1857.0 TB | 370.71 TB | 152.3 TB | 900.2 TB |
| $/PB-Written | $38.57 | $106.60 | $60.51 | $79.63 |
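The derived rows in the table follow from the measured "Host Writes per 1% of MWI" figure. The sketch below shows one way to reproduce them, assuming wear scales linearly from 1% to 100%. The unit conversions (1,000 GB per TB for P/E cycles, 1,024 TB per PB for cost) are our inference, chosen because they reproduce the Intel SSD 910 column and the P320h's $/PB figure. Other cells may reflect rounding or capacity conventions not spelled out here.

```python
def pe_cycles(host_writes_tb_per_pct: float, capacity_gb: float) -> float:
    """Extrapolated program/erase cycles: total writes divided by capacity.

    Assumes linear wear, so lifetime writes = 100 x the writes needed for 1%.
    """
    total_writes_gb = host_writes_tb_per_pct * 100 * 1000  # assumed 1 TB = 1,000 GB
    return total_writes_gb / capacity_gb


def dollars_per_pb_written(price_usd: float, host_writes_tb_per_pct: float) -> float:
    """Purchase price divided by extrapolated lifetime writes in PB."""
    total_writes_pb = host_writes_tb_per_pct * 100 / 1024  # assumed 1 PB = 1,024 TB
    return price_usd / total_writes_pb


# Intel SSD 910: 370.71 TB per 1%, 800 GB user-accessible, 896 GB raw.
print(round(pe_cycles(370.71, 800)))   # P/E cycles against IDEMA capacity
print(round(pe_cycles(370.71, 896)))   # P/E cycles against raw capacity

# Micron P320h 700 GB: $6,995 list price, 1857.0 TB per 1%.
print(round(dollars_per_pb_written(6995, 1857.0), 2))
```

Because the cost metric divides by lifetime writes rather than capacity, an expensive drive with very high endurance can still come out ahead of a cheaper one.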
If you only consider write endurance and cost in dollars per petabyte written ($38), ignoring all else, the RealSSD P320h is our new champion. Based on our results, the P320h is clearly the best choice for customers who need a write-caching solution with high endurance for a reasonable price.

Make no mistake about it, the RealSSD P320h is an absolute monster when it comes to random 4 KB reads. Topping out at over 500,000 IOPS, it easily beats the best from OCZ and Intel. Micron did caution us that the P320h is optimized for large queue depths and likely wouldn't perform as well with fewer outstanding I/O requests (an admission applicable to any SSD, really). In our testing, we see that at queue depths of 16 and lower, Intel's SSD 910 hangs right with the P320h, as the Micron drive's 32 channels simply go underutilized. At a queue depth of 32 and higher, though, Micron's P320h leaves its competition gasping for air. Presented with a workload truly able to push its unique architecture, the P320h rewards you greatly.

Random 4 KB writes, while still very impressive, don't impart as much shock value as read performance did. The RealSSD P320h can't keep up with OCZ's Z-Drive R4, though it easily dispatches Intel's SSD 910 at high queue depths. At low queue depths, the P320h doesn't perform as well. The Intel and OCZ offerings both establish a comfortable edge.

The P320h's average response time is excellent, beating the Z-Drive R4 by a hair.

Our maximum response time test shows the P320h flexing its muscles yet again. This is one of the clearest examples of the benefits afforded by bypassing the SATA/SAS interface entirely. Of course, arming the device with SLC NAND doesn't hurt, either. In fact, the P320h's maximum response time is only slightly higher than the SSD 910's average.
The next set of tests simulates different enterprise workloads, including database, file server, Web server, and workstation configurations.
Our Iometer database workload (also categorized as transaction processing) involves purely random I/O. Its profile consists of 67% reads and 33% writes using 8 KB transfers.

The P320h maintains a steady lead over OCZ's Z-Drive R4 across all queue depths. This should come as no surprise given what we saw on the previous page, and considering that this workload is more heavily weighted towards reads.

The file server workload consists of 80% random reads of varying transfer sizes. Because the P320h has an advantage over OCZ's Z-Drive R4 in random reads, we thought these results would look a lot more like the database chart. To our surprise, though, the Z-Drive R4 holds its own against the P320h, edging it out at almost all queue depths.
Originally, we left off at a queue depth of 256. But after reviewing the results, it looked like the P320h had a little bit left in its proverbial tank. So, we reran the numbers at queue depths of 512 and 1024, too. Under that extra load, the P320h finally came out on top.


The Web server (100% read, varying transfer size) and workstation (80% reads, 80% random) workloads don't provide nearly as much drama. The P320h easily tops the Z-Drive R4 and SSD 910 at all queue depths.

When it comes to sequential performance, the P320h lives up to expectations...so long as you throw enough operations at it. Normally, we use ATTO for evaluating sequential performance across transfer sizes. This causes a problem for us, though. ATTO's low queue depths negatively affect the P320h's outcome. At the utility's maximum queue depth of 10, Micron's drive goes severely underutilized.
Switching over to Iometer and a queue depth of 32 gives us performance results closer to what we were expecting. Although we weren't quite able to coax 3.2 GB/s out of it, we consistently saw more than 2.8 GB/s, peaking above 3 GB/s.

Sequential write operations tell a similar tale. Although the P320h responds better to writes at lower queue depths than it did to reads, dipping below 32 outstanding operations means the P320h doesn't hit its peak until it is moving 2 MB blocks (as opposed to 128 KB blocks).
Video streaming is becoming a much more demanding workload within the enterprise space. Companies want more HD streams with higher bit-rates and no stuttering. A storage solution well-suited for enterprise-class video delivery has completely different capabilities than something designed for databases. At the end of the day, you're basically looking for exceptional large-block sequential write performance. You also need a high level of consistency that traditionally isn't seen from consumer SSDs. For a more in-depth analysis, take a look at our Intel SSD 910 review.
As a refresher, once the drive is in a steady state, we write its entire capacity 100 times. We use 8 MB transfer sizes and a queue depth of four, recording timestamps for each individual write. The graph below reflects 100-point averaging so that you can better visualize the results.
Frankly, we were shocked after our first look at the data. So much so, in fact, that instead of our usual 100 full writes, we went all the way up to 250. That's over 160 TB of data written to the P320h, including over 20 million individual 8 MB writes. The graph below shows the best- and worst-case runs out of those 250 iterations.

And that's it. Two minor hiccups, each of which is easy to overcome with a modest buffer. Because we are testing the P320h as a formatted drive, just as you would use it in the real world, it's impossible to even say what caused those dips. They could very well have come from the driver or operating system doing some periodic check on the hardware. The table below shows how much memory would be required to maintain a given threshold.
| Threshold (MB/s) | Best-Case Buffer Size In MB | Worst-Case Buffer Size In MB |
|---|---|---|
| 1850 | 26 | 53 |
| 1900 | 34 | 66 |
| 1950 | 40 | 79 |
| 2000 | 52 | 92 |
| 2025 | 6310 | 6815 |
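One way to think about the buffer figures in the table is as a simple queueing problem: data arrives at the target streaming rate while the drive drains it 8 MB at a time, and the buffer must absorb whatever backlog builds up during a slow write. The sketch below implements that simplified model. It is not the custom C++ tool used for our measurements, it ignores the queue depth of four, and the per-write durations are made-up sample values.

```python
def required_buffer_mb(write_durations_s, stream_mb_per_s, write_mb=8.0):
    """Peak backlog (MB) when streaming at a fixed rate into a drive.

    write_durations_s: time each consecutive write took to complete.
    While a write is in flight, stream_mb_per_s * duration arrives;
    when it completes, write_mb leaves the buffer.
    """
    backlog = 0.0
    peak = 0.0
    for duration in write_durations_s:
        backlog = max(0.0, backlog + stream_mb_per_s * duration - write_mb)
        peak = max(peak, backlog)
    return peak


# Hypothetical trace: 8 MB writes completing in 4 ms each (2,000 MB/s),
# with a single 100 ms hiccup in the middle.
trace = [0.004] * 10 + [0.1] + [0.004] * 10

print(required_buffer_mb(trace, 1000))         # buffer needed to ride out the hiccup
print(required_buffer_mb([0.004] * 20, 1000))  # no hiccups: nothing to absorb
```

The model also shows why the required buffer explodes once the threshold passes the drive's sustained rate: the backlog never drains, so it grows with every write instead of recovering after each hiccup.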
These are not normal results, particularly for an SSD. Typically, solid-state storage has issues where, in a very small percentage of writes, the operation takes an order of magnitude longer to complete. This is normally attributed to internal SSD tasks like garbage collection. Because those background operations are inherent and unavoidable, this sort of testing is necessary to measure how the outliers negatively affect streaming performance.
The P320h is such a consistent performer, though, that it needs almost no buffer up to and beyond its rated performance. Normally, when you go beyond the average, the required buffer grows exponentially. But the P320h goes from almost no buffer at 2,000 MB/s, to an unrealistically-high number just 25 MB/s higher.
Based on our maximum latency performance test, these results probably shouldn't come as a shock. But the fact that the consistency held up over such a long period of time certainly surprised us.
As we've discussed previously, an SSD that draws 25 W is no power-saver on its own. However, when you add up the number of 2.5" SAS-based disks spinning at 10,000 RPM it'd take to match the performance of a drive like the P320h, the savings are significant.
Just like the SSD 910 we reviewed previously, Micron's RealSSD P320h is also rated at 25 W. The P320h does have a lower idle power draw, but considering the environments these drives are intended to serve, they won't be sitting idle very long.

The P320h butts right up against its power ceiling when subjected to both sequential and random operations. That's good news because any extra consumption would translate into more heat. The P320h also requires 1.5 m/s of airflow, which allows it to operate at up to 50 degrees C. As with most PCIe cards rated at 25 W, normal server airflow is enough to keep it cool. However, if your rack-mounted machines suffer from restricted ventilation, the P320h will get very hot, very fast.
The future of solid-state storage has nothing to do with the SAS or SATA interfaces, at least as we know them today. Achieving 8 and 12 Gb/s transfer rates on the desktop will likely require that the SATA software infrastructure be applied to PCI Express, yielding a technology that'll be referred to as SATA Express.
Of course, both SAS and SATA will have a place in enterprise storage for a long time to come. But if you have a chance to look at the roadmaps from Intel or Samsung, it's clear that the companies see the future in NVMe, the stylized version of Non-Volatile Memory Host Controller Interface Specification. This is going to standardize SSDs on PCI Express so that just one driver is needed to support compliant products.
We're not there yet, as evidenced by the fact that we had some proprietary driver issues with Micron's RealSSD P320h. However, the company does give us a glimpse at what is possible without legacy bottlenecks getting in the way.
Based on our benchmarks, we can say with certainty that the RealSSD P320h is a great enterprise storage option. A review of the results makes a few points abundantly clear. First, this device delivers outstanding read performance. Whether you're talking about small-block random operations or large-block sequentials, the P320h consistently outperforms Intel's SSD 910 and OCZ's Z-Drive R4. In certain examples, the results aren't even close. As a result of its sizable advantage in read operations, the P320h also comes out on top in most enterprise workloads, which are so heavily read-biased that their outcomes were sort of foregone conclusions.
Although read performance is out of this world, the RealSSD P320h's write performance isn't nearly as spectacular. That's not to say the drive doesn't do well; it's just not as impressive after looking at those massive read numbers. Achieving 200,000 random 4 KB write IOPS, Micron's drive is fast enough to smoke Intel's SSD 910, but it falls short of OCZ's Z-Drive R4.
When Micron told us that its P320h was optimized for high queue depths, the company wasn't kidding. A queue depth of 256 is the sweet spot for this device. If your application can't keep more than 32 commands outstanding, it's a toss-up whether the P320h can beat the SSD 910 or Z-Drive. Just in case you were thinking about it, wealthy enthusiast, this is the exact reason you wouldn't want to drop Micron's drive into a workstation. For it to shine, you really need to tax the P320h with an enterprise workload...
...like video streaming. The P320h is, hands down, the most consistent drive we have ever tested. Registering a write latency of <311 µs and a read latency of <47 µs, this drive does remarkably well in our streaming tests. That's partly attributable to the PCI Express to memory interface, and also a result of Micron's reliance on SLC NAND. The results speak for themselves.
Final Words
Micron's RealSSD P320h has a lot going for it. But, like most enterprise-oriented products, it does its job best in very specific environments. Fortunately, what the P320h does well is applicable to so many different types of high-end storage that recommending it becomes a lot easier. Just be aware that Micron sells the product in two form factors, available in different capacities and with varying specifications. The 700 GB RealSSD P320h is what we were testing today, and it's a true beast. If this is truly the future of SSDs, we can't wait to see what's next.





