Why Your Benchmarks Are Probably Wrong
Enterprise SSDs: How to Prove Real World Performance
Iometer was originally designed by a team of Intel developers in the 1990s and discontinued in 2001, whereupon it became an open source project hosted at SourceForge.net. It remains the industry's go-to drive-testing tool, but Iometer relies on mixes of random/sequential access and read/write ratios to simulate a given type of workload. Some media outlets use custom scripts to automate and standardize these mixes internally, while other testers simply enter their own values on a test-by-test basis. In either case, it is very unlikely that these values will reflect a specific enterprise's actual workload. More importantly, many evaluators run Iometer for only three minutes per test. The more adventurous might go for 10 or even 30 minutes. As we noted in our prior article, such short runs are wholly inadequate for enterprise duty cycles. Data centers need drives that can run for thousands of hours straight, not a few minutes.
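Iometer itself is GUI-driven, but the same kind of workload mix is easy to express in a script. Below is a minimal sketch using fio, a widely used open-source I/O tester, to run a mixed random read/write workload for four hours rather than three minutes. The device path, 70/30 mix, block size, and queue depth are illustrative placeholders; a meaningful test would substitute a profile measured from the actual application.

```python
import subprocess

# Illustrative long-duration mixed workload: 70% random reads / 30% random
# writes at 4 KiB, queue depth 32, run time-based for four hours.
FIO_CMD = [
    "fio",
    "--name=mixed-70r30w",
    "--filename=/dev/nvme0n1",    # placeholder; raw-device tests are destructive
    "--ioengine=libaio",
    "--direct=1",                 # bypass the page cache
    "--rw=randrw",
    "--rwmixread=70",             # 70% reads / 30% writes
    "--bs=4k",
    "--iodepth=32",
    "--time_based",
    "--runtime=14400",            # four hours, not three minutes
    "--log_avg_msec=1000",        # one averaged latency sample per second
    "--write_lat_log=mixed_lat",  # emits mixed_lat_lat.1.log, among others
]

subprocess.run(FIO_CMD, check=True)
```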
But does it matter? Can't we infer a drive's long-term performance from a benchmarking snapshot? Not reliably.
The above charts show the response times of three different "enterprise-class" drives tested over a period of four hours with the SPC-1C benchmark suite (see below). Ideally, drive response time should remain under 30 ms at all times, and many drives sustain latencies well under 10 ms. Note how device B fares fairly well for the first hour and then jumps into a range 4x to 5x higher. SSDs are frequently assumed to offer negligible latency, but SPC testing reveals that this is not always the case. Drives that exhibit excessive response delays, or whose response times vary too widely, fail testing because they cannot be relied on under high-load enterprise conditions. With a three-minute test, devices A and B might have passed with flying colors. Useful benchmark testing must examine the consistency of performance over time.
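To make "consistency over time" concrete, here is a minimal sketch that post-processes the per-second latency log written by the fio run above. It buckets samples into one-minute windows and flags a drive that, like device B, spends part of the run far above its baseline. The 30 ms ceiling comes from the discussion above; the spread threshold, window size, and log-format assumption (recent fio releases log latency in nanoseconds) are our own.

```python
import csv
from statistics import mean

LOG = "mixed_lat_lat.1.log"   # produced by the fio run sketched earlier
CEILING_MS = 30.0             # the response-time ceiling described above
WINDOW_S = 60                 # bucket samples into one-minute windows

buckets = {}                  # window index -> list of latencies (ms)
with open(LOG, newline="") as f:
    for row in csv.reader(f):
        t_ms, lat = int(row[0]), int(row[1])
        # Recent fio releases log latency in nanoseconds; adjust the
        # divisor if your build logs microseconds instead.
        buckets.setdefault(t_ms // (WINDOW_S * 1000), []).append(lat / 1e6)

means = {w: mean(v) for w, v in buckets.items()}
worst = max(means.values())
spread = worst / min(means.values())

print(f"worst one-minute mean latency: {worst:.2f} ms")
print(f"best-to-worst spread: {spread:.1f}x")
if worst > CEILING_MS:
    print("FAIL: response time exceeded the 30 ms ceiling")
elif spread > 4:   # device B's 4x-5x jump would trip this check
    print("FAIL: latency too inconsistent over the run")
else:
    print("PASS: latency stayed low and consistent")
```

A three-minute run would populate only three windows, which is exactly why the short-test pass described above is meaningless: the check only becomes informative once it spans hours of sustained load.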
