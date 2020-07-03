Comparison Products
Today, we put the 500GB Crucial P2 up against a bunch of the best SSDs on the market. We include performance leaders like the Samsung 970 EVO Plus, Adata XPG SX8200 Pro, and Seagate BarraCuda 510. We also threw in Silicon Power’s P34A60 and Crucial’s P1, which are two direct competitors on the pricing front. Also, we included a Crucial MX500 and WD Black HDD for good measure.
Game Scene Loading - Final Fantasy XIV
Final Fantasy XIV Shadowbringers is a free real-world game benchmark that easily and accurately compares game load times without the inaccuracy of using a stopwatch.
The P2’s game loading performance takes a hit due to its DRAM-less design. Both Crucial’s Silicon Motion-powered P1 and the MX500 were faster than the P2. The Crucial P2 dishes out the slowest performance of the bunch with a total game scene load time of 13.7 seconds.
Transfer Rates – DiskBench
We use the DiskBench storage benchmarking tool to test file transfer performance with our own custom blocks of data. Our 50GB data set includes 31,227 files of various types, like pictures, PDFs, and videos. Our 100GB data set includes 22,579 files, with 50GB of them being large movies. We copy the data sets to new folders and then follow up with a read test of a newly written 6.5GB zip file and 15GB movie file.
When we copied our 50GB test folder on the half-full P2, performance exceeded that of the SATA competition but trailed the NVMe SSDs. We pushed things a bit harder by throwing a 100GB transfer at the P2, and it delivered roughly three times the performance of the P1 and outpaced the Silicon Power P34A60. Large file read performance was also very strong, placing it ahead of the P1 and P34A60 once again.
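For readers who want to approximate this kind of folder-copy measurement on their own system, here is a minimal Python sketch. It is not how DiskBench works internally; the function name `timed_copy` and any paths passed to it are hypothetical. Note that OS file caching can inflate the result unless the data set is much larger than system RAM.

```python
import shutil
import time
from pathlib import Path


def timed_copy(src, dst):
    """Copy the folder `src` to a new folder `dst` and return (seconds, MB/s).

    `dst` must not already exist. Throughput uses decimal megabytes.
    """
    src = Path(src)
    # Total payload size, so the elapsed time can be converted to a rate.
    total_bytes = sum(p.stat().st_size for p in src.rglob("*") if p.is_file())

    start = time.perf_counter()
    shutil.copytree(src, dst)
    elapsed = time.perf_counter() - start

    mbps = total_bytes / 1e6 / elapsed if elapsed > 0 else float("inf")
    return elapsed, mbps
```

Running it several times and averaging, with the drive at a fixed fill level, would bring it closer to the methodology described above.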
Trace Testing – PCMark 10 Storage Tests
PCMark 10 is a trace-based benchmark that uses a wide-ranging set of real-world traces from popular applications and common tasks to measure the performance of storage devices. The quick benchmark is more relatable to those who use their PCs lightly, while the full benchmark relates more to power users. If you are using the device as a secondary drive, the data test will be of most relevance.
Under light operation, Crucial’s P2 offers a snappy and responsive user experience that will surpass any SATA SSD. It trades blows with Silicon Power’s P34A60 and even keeps up with the Samsung 970 EVO Plus, yet it is beaten by both drives in the Quick and Full System benchmarks. Crucial’s P1, with its DRAM-based architecture, outperforms both the P2 and P34A60. That proves that DRAM-based designs provide the most responsive user experience, even with slower QLC flash.
Trace Testing – SPECworkstation 3
Like PCMark 10, SPECworkstation 3 is a trace-based benchmark, but it is designed to push the system harder by measuring workstation performance in professional applications.
SPECworkstation 3’s final score finds the Crucial P2 in fifth place, just ahead of the Silicon Power P34A60 and just a hair behind the P1, but that score doesn’t tell the entire story. On average, the Crucial P1 was just as fast as, or faster than, Crucial’s P2 in most of the workloads SPECworkstation 3 threw its way. However, the P1’s performance tanked on some of the tests because of its slow QLC flash. As a result, the P1 took over twice as long to complete the entire benchmark compared to the P2, not to mention that the MX500 completed the test in roughly an hour. Crucial’s P2 offers more consistent prosumer workload performance than most entry-level SSDs.
Synthetics - ATTO
ATTO is a simple and free application that SSD vendors commonly use to assign sequential performance specifications to their products. It also gives us insight into how the device handles different file sizes.
DRAMless SSDs aren’t the fastest at copying and reading small files, yet depending on how the SSD is designed, some can overcome this issue. The Crucial P2, however, isn't tuned quite as well for small file sizes as we would like to see. We tested Crucial’s P2 at a queue depth (QD) of 1, representing most day-to-day file access at various block sizes, and the P2 struggled to match the other high-performance NVMe SSDs. Still, it delivered over four times the throughput of the SATA MX500.
Synthetic Testing - iometer
iometer is an advanced and highly configurable storage benchmarking tool that vendors often use to measure the performance of their devices.
We typically test an SSD’s sequential read and write speeds with a 128KB block size. Although the P2’s write speed easily hit 1.8 GBps with that block size, its read speed couldn’t reach the rated 2.3 GBps, even at a QD of 32. We upped the ante with a larger 1MB block size for testing, and sequential read performance hit the rated ‘up to’ 2.3 GBps spec.
Even though we had a little hiccup with sequential performance testing, random responsiveness measured very well. At QD1, the P2 responded more quickly than the Samsung 970 EVO Plus, exceeding it by roughly 1,200/7,500 read/write IOPS.
Sustained Write Performance and Cache Recovery
Official write specifications are only part of the performance picture. Most SSD makers implement a write cache, which is a fast area of (usually) pseudo-SLC programmed flash that absorbs incoming data. Sustained write speeds can suffer tremendously once the workload spills outside of the cache and into the "native" TLC or QLC flash. We use iometer to hammer the SSD with sequential writes for 15 minutes to measure both the size of the write cache and performance after the cache is saturated. We also monitor cache recovery via multiple idle rounds.
Crucial’s P2 outperforms the P1 massively in heavy write workloads. After writing 24GB of data to Crucial’s P2 at a rate of 1.85 GBps, the dynamic write cache filled and write performance degraded to an average of 450 MBps. The Crucial P2 lags high-end competition, but it offers the fourth-best write performance out of the SSDs in the test pool, beating the other entry-level options. The write cache also recovers quickly after a minute of idle time.
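To put those figures in perspective, the following is a rough two-phase model of a long sequential write. It uses only the numbers reported above (24GB cache, 1.85 GBps in cache, 450 MBps afterward) as defaults. It assumes decimal units, an empty drive, and no cache recovery during the transfer; the function names are hypothetical.

```python
def sustained_write_seconds(total_gb, cache_gb=24.0,
                            cache_gbps=1.85, native_gbps=0.45):
    """Estimate seconds needed to write `total_gb` sequentially.

    Phase 1 fills the SLC cache at `cache_gbps`; phase 2 writes the
    remainder at the post-cache rate `native_gbps`.
    """
    in_cache = min(total_gb, cache_gb)
    beyond_cache = max(total_gb - cache_gb, 0.0)
    return in_cache / cache_gbps + beyond_cache / native_gbps


def average_write_gbps(total_gb, **kwargs):
    """Average throughput over the whole transfer, in GBps."""
    return total_gb / sustained_write_seconds(total_gb, **kwargs)
```

Under these assumptions, a 100GB write would take roughly 182 seconds, for an average of about 0.55 GBps, while anything that fits inside the 24GB cache completes at the full 1.85 GBps.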
Power Consumption and Temperature
We use the Quarch HD Programmable Power Module to gain a deeper understanding of power characteristics. Idle power consumption is an important aspect to consider, especially if you're looking for a laptop upgrade. Some SSDs can consume watts of power at idle while better-suited ones sip just milliwatts. Average workload power consumption and max consumption are two other aspects of power consumption, but performance-per-watt is more important. A drive might consume more power during any given workload, but accomplishing a task faster allows the drive to drop into an idle state faster, which ultimately saves power.
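The point about finishing faster can be made concrete with a little arithmetic: energy is average power multiplied by time. The sketch below uses made-up drive figures purely for illustration (they are not measurements from this review), and the function names are hypothetical.

```python
def transfer_energy_joules(size_mb, throughput_mbps, avg_watts):
    """Energy used to move `size_mb` at a steady rate: watts * seconds."""
    seconds = size_mb / throughput_mbps
    return avg_watts * seconds


def mbps_per_watt(throughput_mbps, avg_watts):
    """Efficiency metric: throughput delivered per watt consumed."""
    return throughput_mbps / avg_watts


# Hypothetical drives copying the same 50,000 MB:
# drive A draws more power but is three times as fast as drive B.
energy_a = transfer_energy_joules(50_000, throughput_mbps=1500, avg_watts=3.0)
energy_b = transfer_energy_joules(50_000, throughput_mbps=500, avg_watts=2.0)
```

In this illustrative case, drive A uses about 100 joules and drive B about 200 joules, even though drive A has the higher average draw.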
When possible, we also log the temperature of the drive via the S.M.A.R.T. data to see when (or if) thermal throttling kicks in and how it impacts performance. Bear in mind that results will vary based on the workload and ambient air temperature.
Efficiency and low heat output are some of the strengths of a DRAMless SSD, largely because they don’t have a DRAM package sucking down power. Crucial’s P2 proves efficient, managing to deliver a bit more MBps-per-watt than Crucial’s P1 as well as the high-end Samsung 970 EVO Plus and Seagate BarraCuda 510. It also had some of the lowest average and maximum readings (much less than the MX500). It dropped down to a low power state when it slipped into its idle state, sipping just milliwatts of power.
With such low power draw, the controller’s temperature peaked at 66 degrees Celsius while transferring 300GB of data around in a 23C environment with no airflow. Even this heavy workload didn’t trigger thermal throttling, so you shouldn’t have issues with cooling during normal use.
The indications that QLC flash may be replacing the drive's TLC in a future revision also mean the P2 might perform substantially worse than the results shown here, so I hope to see some follow-up testing if that ends up being the case. The review says the BX500 did that, and mentions "keeping us up to date" about it, but if I search for BX500 reviews, including the one from Tom's Hardware, they all appear to describe the drive as using TLC and performing better than the drive's specifications would indicate, and no update appears to have been made about the switch to QLC. Sending out faster TLC drives for review, then releasing versions with QLC under the same product name at a later date seems rather shifty.
As far as a "responsive user experience" goes, something tells me that these differences of a few millionths of a second are not going to be perceptible. The difference in latency between the fastest and slowest SSD tested here only amounts to around one ten-thousandth of a second, so I don't see how anyone would notice that when it will take a typical monitor around a hundred times as long to update the image to display the output. Maybe a bunch of these operations added together could make some difference, but then more of the drive's performance characteristics will come into play than just latency, so I don't see how that synthetic benchmark would bear much direct relation to the actual real-world experience. It's fine to show those latency benchmark results, but I don't think direct relations to the user experience can really be drawn from them.
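As a rough sanity check on that comparison, the arithmetic can be sketched as follows. The 0.1 ms latency gap is the one-ten-thousandth-of-a-second figure from the paragraph above; the 60 Hz refresh rate is an assumption, and the function names are hypothetical.

```python
def frame_time_ms(refresh_hz):
    """Time between display refreshes, in milliseconds."""
    return 1000.0 / refresh_hz


def deltas_per_frame(latency_delta_ms, refresh_hz):
    """How many back-to-back latency differences fit inside one frame."""
    return frame_time_ms(refresh_hz) / latency_delta_ms


# 0.1 ms = one ten-thousandth of a second; 60 Hz is an assumed display.
ratio = deltas_per_frame(latency_delta_ms=0.1, refresh_hz=60)
```

At 60 Hz a frame lasts about 16.7 ms, so roughly 167 such differences would have to accumulate serially before the gap spanned even a single frame, which is the same order of magnitude as the estimate above.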
I'd like to see more real-world load time results in these reviews, as that's what these drives will typically be getting used for most of the time. As the Final Fantasy test shows, just because one drive appears multiple times as fast in some synthetic benchmarks or file copy tests, that doesn't necessarily translate to better performance at actually loading things. Practically all of the synthetic benchmarks show the P2 being substantially faster than an MX500 SATA drive, but when it comes to loading a game's files, it ends up being noticeably slower by a few seconds. Is that result a fluke, or are these synthetic tests really that out-of-touch with the drive's real-world performance? These reviews should measure other load times of common applications, games and so on, and not just rely primarily on pre-canned and synthetic benchmarks that seem to be at odds with the one real-world loading test.
Also, I'd like to see test results for drives that are mostly full. Does the real-world performance tank if the drive is 75% or 90% full, and less space is dedicated to the SLC cache? These benchmark results don't really provide any good indication of that. The graphs showing how much performance drops once the cache is filled are nice, but the size of that cache will typically change as the drive is filled. A mostly-full drive may only have a handful of gigabytes of cache that gets filled even with moderately-sized write operations.
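The concern can be illustrated with a toy model of a dynamic cache. This is only a sketch under loud assumptions: it supposes that every free TLC cell can be borrowed as pseudo-SLC (one bit instead of three) up to a fixed ceiling, using the 24GB cache the review measured. Actual firmware behavior is not disclosed, and the function name and parameters are hypothetical.

```python
def dynamic_cache_gb(capacity_gb, fill_fraction, max_cache_gb=24.0,
                     bits_per_cell=3, share_of_free=1.0):
    """Toy estimate of pseudo-SLC cache size at a given fill level.

    ASSUMPTION: the cache is carved from free flash run at 1 bit per
    cell, so each cached GB costs `bits_per_cell` GB of free capacity.
    `share_of_free` is the fraction of free space the firmware is
    willing to lend (1.0 = all of it), which is a guess.
    """
    free_gb = capacity_gb * (1.0 - fill_fraction)
    return min(max_cache_gb, free_gb * share_of_free / bits_per_cell)
```

In this model a 500GB drive keeps its full 24GB cache at 75% full but drops to under 17GB at 90% full. A real drive that lends out a smaller share of its free space would shrink much earlier, which is why testing at high fill levels would be informative.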
A "few MB" is kind of vague. How much system RAM is it actually using? The Crucial P1 had 1GB of DRAM onboard for each 1TB of storage capacity. If the P2 is using 1GB of system RAM for the same purpose, then that's a hidden cost not reflected by the price of the drive itself. This might be especially relevant if one were adding such a drive to a system with just 8GB of RAM. And even on a system with 16GB, that could become more of a concern within a few years as RAM requirements rise for things like games. If one ends up needing to upgrade their RAM sooner due to DRAMless drives consuming a chunk of it, then the cost savings of cutting that out of the drive itself seems questionable, especially given the effects on performance.
And for that matter, it seems like the performance of system RAM could affect test results more than it does on drives with their own onboard RAM. I'm curious whether running system RAM at a lower speed, or perhaps on a Ryzen system with different memory latency characteristics could affect the standings for these DRAMless drives. The use of system RAM also undoubtedly affects the power test results as well. This drive appears to be among the most efficient models, but is the system RAM seeing higher power draw during file operations in its place?
The updates I am referring to typically come in news posts, or, if I am sampled the updated device, they will be reflected in an update to the review. I wasn't aware of the BX500 QLC swap until recently, the week of writing this review, and actually haven't had a moment to notify the team about it until the other day, though I think I saw a post about it somewhere at one point.
After toying with hundreds of SSDs, I do notice the difference in responsiveness between SATA and PCIe SSDs in day-to-day use. It's slight but noticeable, especially when launching apps after boot and moving a bunch of files around. You may not be able to draw conclusions from synthetic results alone, but that doesn't mean it can't be done.
The iometer and ATTO synthetic data are just a few of the data points I look at when analyzing performance. But there are relationships and patterns in these data points that carry over to real-world performance. After analyzing the strengths and weaknesses of many SSD architectures, along with their performance scores and operating habits in the same system, one can start linking synthetic differences between devices to real-world experience differences between devices. These results are also included to validate manufacturer performance ratings. Real-world benchmarks cannot do that, which is why I include the synthetic results as supporting evidence to complement the real-world data.
The only synthetic tests that I use are iometer and ATTO. PCMark 10 and SPECworkstation 3 are trace-based benchmarks that test the SSD directly against multiple real-world workloads that cater to their respective consumer and prosumer market segments.
Final Fantasy shows just a second or two of difference for a few reasons. Most SSDs are similarly responsive to this one workload simply because it isn't a demanding one; it is a rather light read test, really. Overall, the game data loading process is so well optimized for HDD usage that when you replace the HDD with an SSD, most will load the few hundred MB of data per game scene in roughly the same time, since it's such a small transfer.
The fastest SSDs can respond faster to the random and sequential requests than others due to lower read request latency, and thus they rank ahead of slower ones here - those few hundredths of an ms add up to show that difference. Ideally, I need a larger, more graphically demanding game benchmark, a better GPU, and a 4K monitor to get a larger performance delta between drives. Different resolution settings and games will perform differently. I use Final Fantasy's benchmark because it is the only one I know of that saves load time data. I hope more devs could include load times in their game benchmarks. If you have any recommendations, I'm all ears!
Ah yes, more write cache testing, my favorite! I could do more and it would be cool to include, but it is not worth doing so at this time. Most dynamic SLC write caches will shrink at a higher fill rate, but most still perform well. I perform all my testing on SSDs that are running the current OS and are 50% full as it is (except the write cache testing, which is done empty after a secure erase when possible). Most of the time, even though the cache shrinks, the drive is still as responsive as when it was empty; the write cache is smaller in size, so only larger transfers will be impacted.
Unfortunately, I do not have tools that tell me exactly how much each drive utilizes, and manufacturers will not always disclose specifics... well, I might have a tool, but I haven't been able to explore using it quite yet. For the drives I have tested with HMB that had the RAM usage disclosed to me, it has been set to around 32-128MB. However, based on some discussion with a friend, we think it could be up to 2GB-4GB based on the spec's data.
All testing is currently on an Asus X570 Crosshair VIII Hero (Wi-Fi) + Ryzen 5 3600X @ 4.2 GHz all-core platform with a kit of 3600MHz CL18 DDR4. I actually have that suspicion myself and have been planning to get a faster kit of RAM to test out how it influences both DRAMless and DRAM-based SSD performance, too.
The Final Fantasy test usually seems to be relatively in line with typical measurements of NVMe vs SATA vs HDD performance as far as game loading is concerned, but the results for this drive seem a bit off, with it not even matching the MX500, despite practically all the other benchmarks in the review saying it should perform better. Maybe it's a result of being DRAMless? If the game is heavily utilizing the CPU and RAM during the loading process, perhaps it's fighting with the drive for access to system memory? It might be something worth investigating further.
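For a sense of scale, here is a back-of-the-envelope sketch of mapping-table sizing. It assumes a flat table with one 4-byte entry per 4KiB logical page, a common rule of thumb that matches the 1GB-per-1TB DRAM ratio cited for the P1 earlier in the thread. Whether the P2's firmware actually works this way is not confirmed, and the function names are hypothetical.

```python
KIB = 1024
MIB = 1024 ** 2
GIB = 1024 ** 3
TIB = 1024 ** 4


def mapping_table_bytes(capacity_bytes, page_bytes=4 * KIB, entry_bytes=4):
    """Size of a flat logical-to-physical map (ASSUMED layout)."""
    return capacity_bytes // page_bytes * entry_bytes


def capacity_covered_bytes(buffer_bytes, page_bytes=4 * KIB, entry_bytes=4):
    """Logical capacity whose map entries fit in a buffer of this size."""
    return buffer_bytes // entry_bytes * page_bytes


full_map = mapping_table_bytes(1 * TIB)        # map for 1TiB of flash
covered = capacity_covered_bytes(64 * MIB)     # span covered by a 64MiB HMB
```

Under that assumption, a full map for 1TiB needs 1GiB, while a 64MiB host buffer holds entries for only 64GiB of logical space. That would make an HMB in the 32-128MB range a cache for the hottest part of the table, not a replacement for onboard DRAM.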
I'm not really sure about any demanding games that include load time readouts. However, there could be other methods. For example, recording video of games loading with an external capture device and using the resulting video file to check how long loading took, though I can see how that might make testing a bit inconvenient. Or maybe just using a graph of disk access to check that, which would likely work reasonably well for at least some titles. There wouldn't necessarily need to be a lot of tests, but having a few might prevent any one from making a particular drive appear better or worse than might typically be the case.
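The disk-access-graph idea could be sketched roughly like this: log read throughput at a fixed interval while the game loads, then treat the span between the first and last "busy" samples as the load window. The sample format, the 5 MB/s threshold, and the function name are all assumptions for illustration, and the resolution is limited by the sampling interval.

```python
def load_window_seconds(samples, threshold_mbps=5.0):
    """Estimate load duration from (timestamp_s, read_mbps) samples.

    Returns the time between the first and last samples whose read
    throughput is at or above `threshold_mbps`, or 0.0 if none are.
    """
    active = [t for t, mbps in samples if mbps >= threshold_mbps]
    if not active:
        return 0.0
    return active[-1] - active[0]


# Hypothetical trace sampled every 0.5 s around a level load.
trace = [(0.0, 0.1), (0.5, 120.0), (1.0, 300.0), (1.5, 80.0), (2.0, 0.2)]
window = load_window_seconds(trace)
```

For this made-up trace the window is 1.0 second. Background disk activity would need to be kept low, or the threshold raised, for the estimate to mean anything.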
They are based on real-world system interaction and are traces that are exactly the same on each SSD. It is up to the storage device to differentiate itself when running those exact same workloads. There is some more detailed data in the results, but graphing out the details isn't worth the time; most of that data ranks the drives similarly to what is already graphed anyway. For more detail on how the tests run and how scoring is calculated, you can read through the benchmark technical guides. Storage tests are near the bottom of the tech guide.
Yes, entirely so. The firmware and HMB on Phison's E13T are optimized for write performance, while the firmware and HMB on Silicon Motion's SM2263XT are optimized for reading. Going from the SSD controller through the PCIe lanes to host memory for the buffer, and then back to the controller, adds a bunch of latency. That is why the write-optimized E13T can't keep up in reading tasks: it is readier to respond to write requests than to read requests.
An M.2 PCI-E drive is not going to affect your top GPU slot, unless said board has a very weird configuration. Typically an M.2 drive will disable SATA ports or PCI-E slots that most people never really use much anyway. SLI/CF is dead, as well.