Almost 20 TB (Or $50,000) Of SSD DC S3700 Drives, Benchmarked
1. Toying Around With 18 TB Of Solid-State Storage

For as long as SSDs have been around, power users and enterprise professionals have been configuring them in RAID arrays. Connect a few low-capacity solid-state drives, and you get one spacious and lightning-fast volume. There are a number of great reasons to build such a potent arrangement, and some compelling reasons not to. But perhaps conventional wisdom is up for review now.

You could argue that there are actually fewer reasons to team up a set of solid-state drives nowadays. Price per gigabyte continues to fall as capacity creeps higher. And folks looking for the ultimate in performance have a number of PCI Express-based options available to them. But we don't share that opinion, particularly after Intel sent us 24 of its high-end SSD DC S3700 drives to toy around with (check out our review: Intel SSD DC S3700 Review: Benchmarking Consistency).

The SSD DC S3700 family boasts impressive specs. At its peak, the largest model is capable of sequential reads of up to 500 MB/s and writes as high as 460 MB/s. Random 4 KB reads clock in up to 76,000 IOPS, while writes plateau at 36,000. Of course, the real reasons to want one of these drives are their bolstered endurance, end-to-end data protection, resilience against power loss, and a price tag just north of $2/GB. 

As we know, the SSD DC S3700 ships in capacities as low as 100 GB. Two dozen of those smaller drives could do some real damage in the right hands. After all, you'd be looking at 2.4 TB in RAID 0. But we got the 800 GB version for our little exhibition. At about $2,000 each, that's roughly 50 grand worth of flash-based storage.

That comes out to a mind-boggling 24,576 GiB, by the way. Each flagship 800 GB SSD DC S3700 features a full terabyte of flash on-board. Even after you factor in over-provisioning, we still end up with 745 GiB of usable space on each drive, giving us an astounding 18 TiB, all-told. Considering these things are designed to withstand up to 10 full writes per day for five years, the possibilities seem endless.
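
The arithmetic behind those figures is simple enough to sketch out:

```python
# Capacity math for two dozen 800 GB SSD DC S3700s.
RAW_FLASH_GIB = 1024   # each drive carries a full TiB of NAND on-board
USABLE_GIB = 745       # per-drive usable space after over-provisioning
DRIVES = 24

raw_total = RAW_FLASH_GIB * DRIVES      # 24,576 GiB of raw flash
usable_total = USABLE_GIB * DRIVES      # 17,880 GiB usable in RAID 0
print(raw_total, usable_total)
```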

If your life happens to revolve around solid-state storage, then two-dozen 800 GB SSD DC S3700s in one place are like having a bespoke Rolls Royce trimmed in fragrant stegosaurus hide. It seems too opulent to even exist. Fortunately, a conversation with the right folks at Intel made it possible for us to line this up. Now, what to do with all of our high-end hardware?

The mandate seemed clear: let's stripe these bad boys together and see what sort of performance is really possible.

Intel and LSI Hardware RAID Controllers

We're presented with a few challenges, though. If we only had eight drives to deal with, our situation would be simple. Many hardware RAID controllers offer eight ports of connectivity. An octet of SSDs would give each drive its own port, and we'd be off to the races. But 24 drives force us to consider alternative configurations. We could use three RAID cards, but then we wouldn't be able to create a single volume. We could also run dozens of drives from one controller using an expander, but that only makes sense for mechanical disks that don't saturate a 6 Gb/s link. We'll tackle this conundrum shortly.

Then there's the sad fact that so many drives and their associated connections are physically difficult to manage. For every SSD, you're looking at one power and one data cable. So, we need a backplane to provide both in one convenient package. And because we also need a lot of host resources to tax this gratuitous storage subsystem, we can address the setup side by using a server equipped with a 24-port backplane. Intel heard our request on that end, too, and followed up our package of SSDs with a dual Xeon E5 machine exposing 80 lanes of third-gen PCI Express and a number of storage-centric features.

And with that, the hardware is ready for action. Pair 24 SSD DC S3700s with a dual-processor 2U server and let 'er rip. But we're still missing one piece of the puzzle. As a result of the way these drives are set up, we must rely on oft-maligned software RAID. Depending on whose office you happen to be standing in, those two words together can get you slapped across the face. But that's alright by us. Software-based RAID functionality has come a long way over the past 15 years, and although it saps host resources, our 16-core server has plenty of horsepower in reserve.

At least for this first round of experimentation, we're skipping the most responsible, performance-robbing RAID levels (like 5 and 6) in favor of the far more exciting (and dangerous) RAID 0, which should let us get to all of the performance and capacity these drives can manage.

Member Drives | Total Capacity
1 x 800 GB DC S3700 | 745 GiB
4 x 800 GB DC S3700 | 2,980 GiB
8 x 800 GB DC S3700 | 5,960 GiB
16 x 800 GB DC S3700 | 11,920 GiB
24 x 800 GB DC S3700 | 17,880 GiB


Just one of our SSD DC S3700s is larger than 12 of Intel's original 64 GB X25-E enterprise drives. To match the capacity of our striped 24-drive array built using 800 GB repositories, you'd need more than 300 of those X25-Es. Yeah, we're pretty excited about having so much flash at our disposal.

2. The Platform: Built For Storage

Our testing platform centers on Intel’s S2600IP motherboard. The IP represents the board’s pre-release code name, Iron Pass, while the 2600 refers to the Sandy Bridge-EP-based Xeon E5 processors that drop into the platform's LGA 2011 interfaces. With 16 DDR3 DIMM slots and tons of PCI Express connectivity, there's plenty of room to scale this server up. 

If you want, you can start with a single LGA interface populated. However, we're going to want both bristling with eight-core chips. Our server sports twin Xeon E5-2665s running at 2.4 GHz, but able to hit 3.1 GHz in lightly-threaded applications.

The S2600IP Mainboard

The S2600IP motherboard employs Intel's C602 platform controller hub, formerly referred to as Patsburg. As mentioned, PCIe connectivity is excellent, as it should be, given two CPUs armed with 40 lanes each. The board exposes four x16 slots, three eight-lane slots, and a single x8 slot limited to second-gen x4 signaling. Intel also implements a proprietary eight-lane mezzanine slot able to accept some of the company's optional storage controller add-ons.

Otherwise, on-board storage is a bit of a misnomer. The Iron Pass platform maintains the two 6 Gb/s SATA ports you'd expect to find on Intel's 7-series desktop boards. The C602's integrated Storage Controller Unit then adds eight SAS ports, which can either be driven by Intel's RSTe software or LSI's MegaSR driver. Very handy, but we won't be using that subsystem today. We're merely going to boot from one of the integrated SATA ports.

Of course, a server like this needs plenty of memory to keep each processor fed with data. Kingston helped us out with our experiment by sending 64 GB of its 1.35 V DDR3-1333 ECC-capable RAM. Each 8 GB KVR13LR9D4/8HC module is part of the company's Server Premier family. The line-up consists of many memory types, and is characterized by a locked bill of materials. This is important to builders because it means not re-qualifying memory products based on different components down the road. That'd be a nice guarantee to have on the desktop side, too. Several times we've stumbled on a kit we really like for overclocking, only to find that a second kit with the same model name uses different ICs and behaves differently.

The chassis itself plays host to a few storage-specific features. Twenty-four 2.5” drive carriers and backplane slots grace the front of the 2U enclosure. These are tailor-made for the RES2CV360 expander. With nine ports, we could connect all twenty-four drives to one HBA or RAID adapter. Unfortunately, the LSI-controlled expander adds too much latency, and just chokes the life out of so many SSDs. Were you to swap in a full complement of 2.5” mechanical disks, this setup would be far more desirable. So, we’re bypassing the expander altogether, instead going direct from the backplane to the storage controller. Ideally, we wouldn't have anything in between, but in this case, the backplane can't be helped. In fact, it turns out to be a lifesaver with so many drives.

3. Test Setup And Components

HBAs and Hardware RAID

If you've looked at any motherboard based on an Intel or AMD chipset lately, you probably noticed that it didn't have anywhere close to 24 SATA ports on it. It goes without saying, but we need some help in that department to facilitate communication with our SSDs.

Intel's HBA/Integrated RAID Cards

Intel markets its RMS25KB080 and RMS25JB080 (shown above) as entry-level RAID cards. It's true that they're hardware-based RAID controllers. But really, these cards are just HBAs in disguise. The KB and JB are identical, feature-wise. The JB simply slots into that proprietary mezzanine connector we mentioned on the previous page.

Our controllers center on LSI's SAS2308 PowerPC-based silicon. It might even help to think of them as mostly rebadged LSI 9207-8i HBAs. Whereas the 9207-8i ships without RAID functionality by default (Initiator-Target mode), the RMS25KB080s do ship with firmware that enables this feature (known as Integrated RAID mode). We're not really interested in using them for their hardware RAID capabilities, but rather their ability to pass a drive through directly to the host. Then, our server can handle all of the RAID calculations and overhead in software.

We do have one Intel RMS25CB080 adapter on-hand to try a little hardware RAID action, if the need arises. But with just one card, it's hard to harness the performance of 24 drives in an appropriately speedy fashion. Based on LSI's Gen3 PCIe SAS2208 RAID offerings with 1 GB of DDR3 cache and a beefier PowerPC processor, the CB handles the computationally-intense parity RAID levels (5/6) that the lighter KB cards cannot. RAID 0 and 1 calculations aren't very taxing, but the parity calculations involved in RAID 5 and 6 necessitate more serious muscle.

It's worth pointing out that these three Intel storage products only work in the company's Xeon E5-compatible motherboards. You have to be using an LGA 2011-equipped platform, and it has to be Intel-branded, or else the cards don't even power up. The mezzanine add-in employs a proprietary form factor anyway, so that's less of an issue. The RMS25KB080 can be found for a third of the price of LSI's 9207-8i, but as far as we can tell, there's no way to cross-flash it for broader compatibility. Also, it doesn't appear that flashing the firmware from IR to IT mode is supported. Intel does sell products intended for more general compatibility. However, the models we have here are basically upgrades for this platform specifically.

Software RAID: Not Evil After All

Armed with three HBAs, we'll be using the server's operating system to create RAID volumes. Windows has long supported striping, mirroring, and even RAID 5. But its performance is generally pretty poor, and there's a complete lack of flexibility in terms of settings. Windows 8 introduces some interesting new concepts through Storage Spaces, but these aren't useful to us for this exhibition.

Linux is a different beast. Modern Linux distros include a number of RAID options. Somewhat analogous to Windows' disk management RAID modes, logical volume management provides RAID through the file system. But Linux's true ace is mdadm, which facilitates the creation of RAID 1/0/5/6 volumes (plus compound modes like RAID 50/60). We can define the strip size and even allocate system memory for cache, the same way a hardware-based RAID adapter would. This is a far more alluring prospect, essentially turning our Xeon server into one big RAID controller.
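
To make that concrete, here's roughly what creating one of our striped volumes with mdadm looks like, sketched as a command-string builder. The device names are placeholders for illustration, not our actual enumeration:

```python
def mdadm_create(md_dev, devices, chunk_kib=64, level=0):
    """Build the mdadm command line for a striped array."""
    return (f"mdadm --create {md_dev} --level={level} "
            f"--chunk={chunk_kib} --raid-devices={len(devices)} "
            + " ".join(devices))

# Hypothetical device names -- substitute whatever your HBAs enumerate.
drives = [f"/dev/sd{chr(c)}" for c in range(ord("b"), ord("b") + 24)]
print(mdadm_create("/dev/md0", drives))
```

The `--chunk` argument is where we set the strip size discussed later in this piece.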

RAID 5/6 levels require some truly sophisticated math to create and recover arrays. Those algorithms benefit from instruction extensions built into architectures like PowerPC. Fortunately, x86 processors can accelerate these calculations when the software is designed to exploit them. mdadm has been worked over to take advantage of these benefits wherever possible, and the open source community can continue to improve upon it when necessary. 
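
RAID 5's single-parity scheme boils down to an XOR across the data chunks in each stripe (RAID 6 adds a second, Galois-field-based syndrome on top). A minimal illustration of why one parity block lets you rebuild any single missing chunk:

```python
# XOR parity: parity = d0 ^ d1 ^ d2, so any one chunk can be rebuilt
# by XOR-ing the survivors with the parity block.
def xor_blocks(blocks):
    out = bytearray(len(blocks[0]))
    for b in blocks:
        for i, byte in enumerate(b):
            out[i] ^= byte
    return bytes(out)

data = [b"AAAA", b"BBBB", b"CCCC"]      # three data chunks in a stripe
parity = xor_blocks(data)

# Lose chunk 1, rebuild it from the survivors plus parity:
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])
```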


Test Configuration
Server: Intel R2224IP4LHPCBPPP
Mainboard: Intel S2600IP4 "Iron Pass", Dual Socket R/LGA 2011
Processors: 2 x Intel Xeon E5-2665 (Sandy Bridge-EP): 2.4 GHz Base Clock Rate, 3.1 GHz Max. Turbo Boost, 32 nm, 8C/16T, 115 W TDP, LGA 2011, 20 MB Shared L3 Cache
Memory: 8 x Kingston KVR13LR9D4/8HC 1.35 V, 1,333 MT/s ECC LRDIMM
Chassis: Intel Server System R2200GZ Family, 24-Drive Bay Backplane, 2U Rack Chassis
PSU: 2 x Intel Redundant 750 W, 80 PLUS Platinum, FS750HS1-00
Expander: Intel RES2CV360 36-Port SAS2 Expander
Storage Controllers: 2 x Intel RMS25KB080 Integrated RAID Modules; 1 x Intel RMS25JB080 Integrated RAID Module, Mezzanine; 1 x Intel RMS25CB080 RAID Controller, Mezzanine; Intel C600 AHCI SATA 6Gb/s
Boot Drive: Kingston 200 GB E100, SATA 6Gb/s, FW: 5.15
Test Drives: 24 x 800 GB Intel SSD DC S3700, SATA 6Gb/s, FW: 5DVA0138
Operating Systems: CentOS 6.4 x86_64; Windows Server 2012
Management: Intel RMM4 BMC Remote Management System
4. Results: 4 KB Random Performance Scaling In RAID 0

Thread Count vs. Queue Depth

If we were just testing one SSD, we could simply step through increasing queue depths. One system process would stack commands, creating various QDs. That works well for SATA-enabled SSDs, particularly since there isn't much point to testing them at queue depths greater than 32. That all goes out the window when it comes to testing PCIe-based SSDs and RAID arrays. Eventually, workload generators begin running out of steam at higher QDs with only a single thread. Instead, we have to run multiple threads at multiple QDs to extract maximum performance.

It helps to have a visual representation of this dynamic in action, and that's exactly what this chart represents. We start with a queue depth of one using one thread, and eventually arrive at 32 simultaneous threads, each bombarding the array at a QD of 32. With just one thread, 4 KB random writes peter out just north of 100,000 IOPS. If we throw 32 threads at the array, we can beat 1,000,000 IOPS. That's why we're locking the thread count down at 32 and varying queue depth from here on out.
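
The sweep itself is easy to picture as a grid: the total number of commands in flight is simply threads multiplied by per-thread queue depth. A quick sketch, assuming the usual power-of-two steps:

```python
# Outstanding I/Os at each point of the thread-count x queue-depth sweep.
points = [1, 2, 4, 8, 16, 32]
matrix = {(t, qd): t * qd for t in points for qd in points}

print(matrix[(1, 1)], matrix[(32, 32)])   # from 1 up to 1,024 in flight
```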

4 KB Random Read Performance Scaling in RAID 0

One thing you definitely want to see in a RAID array is scaling. As the number of drives increases, we want to observe a commensurate increase in performance. Then, as we apply a more demanding workload, we want performance to rise in tandem with intensity. That shouldn't be too hard to accomplish in RAID 0. After all, there isn't much overhead, and no pesky fault tolerance to slow us down.

Right out of the gate, we can glean a few tidbits from this chart. First, getting to 1,010,755 IOPS with a 24-drive RAID 0 array is extremely satisfying on a personal level. That's almost exactly 4 GB/s of throughput. Second, the gulf between 8 x SSD DC S3700s and 16 drives is enormous. It's almost 100%, or 400,000 IOPS. That's also the same percentage increase from four to eight drives. The bump from 16x to 24x should be close to 50%, but we're clearly hitting a bottleneck, achieving a comparatively modest 25% increase over the 16-drive array.

That's still excellent scaling in the grand scheme of things. It's completely unreasonable to expect exactly 4x/8x/16x/24x the performance of one drive. When we divide our peak 24-drive random read results by the number of member SSDs, each SSD DC S3700 contributes an astounding 42,114 IOs every second. A single 800 GB S3700 is rated for 76,000 4 KB read IOPS. So, at first glance it looks like we're losing a ton of performance. But considering the realities of scaling, we're still happy if we only see half of that peak. The fact that we get 42,000 IOPS/drive with 24 SSDs and 50,000 with 16 or fewer attached is pseudo-miraculous.
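
The per-drive math above works out like so:

```python
# Per-drive contribution at the 24-drive peak, using the numbers above.
peak_iops = 1_010_755
drives = 24
spec_per_drive = 76_000            # Intel's 4 KB random read rating

per_drive = peak_iops // drives    # 42,114 IOPS per SSD
efficiency = per_drive / spec_per_drive
print(per_drive, f"{efficiency:.0%}")
```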

If your application requires tons of flash and you can supply a beefy workload, this is a viable option. Of course, large RAID 0 arrays expose you to a significant risk of failure over a period of years. But being responsible is rarely this fun.

4 KB Random Write Performance Scaling in RAID 0

When we switch to writes, we get more of the same awesomeness.

The fantastic scaling is still apparent, and the parity between read and write performance yields results just a few percentage points away from our read numbers (that is, except for the four-drive array, which doesn't lose anything compared to the read results).

This is a good place to point out that these are basically out-of-the-box scores. We haven't put enough writes on the SSDs to get them into a state where garbage collection is necessary. Limitations on the amount of time Intel could let us keep these things made drawn-out write sessions impractical. However, these 800 GB flagships are steady-state rated at 36,000 IOPS. So, we wouldn't expect a massive performance drop. The 24-drive array already peaks north of 980,000 IOPS, giving us around 41,000 write IOPS per SSD. We could conceivably rip off 860,000 4 KB write IOPS for years with a 24-drive RAID 0 array. How cool is that?

If these were conventional desktop drives, we'd want to over-provision them in an array. Without TRIM, performance could be expected to degrade substantially, and the only preventative measure would be sacrificing usable capacity to maintain speed over time. Intel's SSD DC S3700 line-up, and most other enterprise-oriented SATA drives, are over-provisioned for a reason. In the 800 GB model's case, only 745 GiB of 1,024 GiB is usable. In exchange, though, you get consistent steady-state performance. Consequently, this also translates into lower write amplification, hence the 10 full random drive writes per day over five years that Intel claims the S3700s can endure.

5. Results: 128 KB Sequential Performance Scaling In RAID 0

128 KB Sequential Read Performance Scaling in RAID 0

Testing with 128 KB sequential reads, we get almost 2 GB/s from the four-drive array and more than 4.2 GB/s using 24 of the SSD DC S3700s. If we were getting an even 500 MB/s per drive, as Intel specifies, the 24x array would yield around 12 GB/s. Each of our Intel controller cards uses eight third-gen PCI Express lanes, so each one should be able to push more than 4,000 MB/s. 

Still, we're seeing a massive amount of throughput. It's almost like reading an entire single-layer DVD every second. And we can do that speed from the first LBA to the last because we're not relying on any caching. This is all-flash performance.

128 KB Sequential Write Performance Scaling in RAID 0

We get even better performance with writes. The 24x array encroaches on the 5 GB/s mark, falling just short at 4.8 GB/s. The 16x and 8x configurations group together around 3 GB/s, while the four-drive array backpedals just a few percent compared to its read numbers.

When Keepin' it Real Goes Wrong

Make no mistake; these are breathtakingly awesome numbers. But it's hard to shake the feeling that something is robbing us of achieving epic, face-melting benchmark results.

So, what gives, then?

Could it be the strip size we chose for these RAID 0 arrays? No. After extensive testing, we settled on 64 KB chunks. Each 128 KB transfer is serviced by two drives, since 128 KB divided by two equals 64 KB, our chunk size. With enough parallel requests, everything should be good to go.
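
Here's a toy model of that chunk mapping, assuming chunk-aligned transfers: RAID 0 lays chunks out round-robin across the members, so a 128 KB request lands on exactly two drives with a 64 KB chunk.

```python
# Which member drives service a transfer, given round-robin striping.
CHUNK_KIB = 64

def drives_touched(offset_kib, length_kib, n_drives):
    first = offset_kib // CHUNK_KIB
    last = (offset_kib + length_kib - 1) // CHUNK_KIB
    return {chunk % n_drives for chunk in range(first, last + 1)}

print(drives_touched(0, 128, 24))   # an aligned 128 KB read hits 2 drives
```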

As we saw on the last page, we still get great scaling with 4 KB random transfers. So, it's probable that we're encountering a throughput issue. Since each drive is capable of large sequential transfers in excess of 400 MB/s, the issue is most pronounced on this page. The SSD DC S3700s can't put out more than 300 MB/s in our 4 KB random testing, and we expect to lose much of that anyway. So it makes sense that we'd run into bandwidth-sapping limitations here, and not there.

After investigating the RMS25KB/JB IR modules, we discovered that they were running in full PCI Express 3.0 mode and fully capable of pushing data back and forth from our SSD DC S3700s with minimal performance impact. As it happens, the culprit is the one thing we really need: our 24-bay SAS/SATA backplane.

Sad but true. Twenty-four total bays are enabled by a trio of eight-drive backplanes at the front of our server. Each possesses two SFF-8087 ports (one for every four drives), and they're just not able to get data through unmolested. Whether any backplane would work in this situation is uncertain, and bypassing ours simply wasn't an option for today's experiment.

6. Results: Server Profile Testing

Workload Profiles

For as long as programs to exercise I/O activity have existed, professionals have sought to emulate specific workloads with them. We put our SSD DC S3700s through their paces with a handful of tasks based on basic enterprise-style I/O patterns.

Workload Profiles
Web Server | 100% Read / 0% Write | 0.5 KB 22%, 1 KB 15%, 2 KB 8%, 4 KB 23%, 8 KB 15%, 16 KB 2%, 32 KB 6%, 64 KB 7%, 128 KB 1%, 512 KB 1%
Database | 67% Read / 33% Write | 100% 8 KB
MS Exchange Server Emulation | 62% Read / 38% Write | 100% 32 KB
File Server | 80% Read / 20% Write | 0.5 KB 10%, 1 KB 5%, 2 KB 5%, 4 KB 60%, 8 KB 2%, 16 KB 4%, 32 KB 4%, 64 KB 10%


Web Server Workload Profile

The Web server profile is fairly complex, consisting of ten different transfer sizes. Though dominated by 0.5 KB, 4 KB, and 8 KB transfers, there are several other sizes in the mix too. This profile involves 100% reads, but our SSD DC S3700-based arrays are almost equally strong when it comes to random writes.

Scaling appears identical to what we saw previously as drives are added. Again, performance doubles from the four- to eight-drive arrays, while the 24x configuration is a bit over 100% compared to the 8x setup. We could serve quite a few webpages with two dozen SSD DC S3700s.

Database Workload Profile

The database workload is super simple. It's just an 8 KB transfer size split between 67% reads and 33% writes. 

The 24x array just touches the 500,000 IOPS mark. That's hardly surprising, since we should see almost exactly half of the I/O every second with two times the 4 KB transfer size. Since both the 4 KB read and write tests yield close to 1 million IOPS, 500,000 IOPS in the database profile is expected. Bandwidth is just IOPS multiplied by transfer size, so it stays essentially the same in all of these tests.
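
The back-of-the-envelope version of that relationship:

```python
# At a fixed bandwidth, doubling the transfer size halves the IOPS.
KIB = 1024
four_kb_iops = 1_000_000                  # ballpark from the 4 KB tests
bandwidth = four_kb_iops * 4 * KIB        # bytes per second

eight_kb_iops = bandwidth // (8 * KIB)
print(eight_kb_iops)                      # 500,000 IOPS at 8 KB
```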

MS Exchange Workload Profile

Most traditional Iometer-style email server workloads are perfectly adequate. Just to mix things up, we're emulating MS Exchange mailbox activity with 32 KB blocks split between 62% reads and 38% writes.

These are really the first results we can't adequately explain. Run after run, we're touching 200,000 IOPS with our 24x array. That wouldn't be so strange, except 200,000 IOPS is a monumental 6.25 GB/s. The 3 TB array consisting of four SSD DC S3700s maintains 40,000 IOPS, regardless of queue depth. At over 1,200 MB/s of throughput, those are totally consistent results. We double that with eight drives and quadruple it with twenty-four. Perhaps we've found the perfect storm of settings? Since the outcome seems too good to be true, take these results with a grain of salt.

File Server Workload Profile

The file server profile is almost as complex as the Web server profile, except 20% of the transactions are writes and the majority of accesses are 4 KB. It's not a terrible representation of an average consumer/client workload, either.

On average, each I/O is worth approximately 11 KB of throughput.
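
That 11 KB figure falls straight out of the mix in the table above:

```python
# Weighted-average transfer size of the file server profile
# (sizes in KB, weights as fractions of all accesses).
profile = {0.5: 0.10, 1: 0.05, 2: 0.05, 4: 0.60,
           8: 0.02, 16: 0.04, 32: 0.04, 64: 0.10}

avg_kb = sum(size * weight for size, weight in profile.items())
print(round(avg_kb, 2))   # ~11.08 KB per I/O
```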

The scaling is absolutely smooth and beautiful as more SSD DC S3700s are added to the mix. From 50,000 IOPS with the four-disk array up to 325,000 with the 24x arrangement, performance increases linearly with both drive count and workload intensity.

7. Results: Going For Broke

Up until now, we haven't been using Linux's disk caching to inflate our benchmark numbers. Unlike most mainstream and higher-end hardware RAID cards, we don't necessarily have a direct equivalent while using mdadm and Linux. At least, not in RAID 0. We do get to allocate a certain number of memory pages per drive in RAID 5/6, so that does translate into a direct analog to the high-speed DRAM caches found on cards like LSI's 9266-8i.

Linux is crafty, though. Any memory not explicitly allocated is used for caching, and if that memory is needed for other purposes, it gets released. In our case, that 64 GB of DDR3-1333 from Kingston is mostly free for caching drives or filesystems, since our server isn't using much on its own with CentOS 6.

When it comes to benchmarking drive arrays, we want to disable caching to get a truer sense of what the SSDs themselves can do. But now that we already know those performance results, let's see what happens when we mix in two quad-channel memory controllers and eight 1,333 MT/s modules.

Random 4 KB Testing, Cached I/O

When we let the OS cache 4 KB random transfers, we see different performance characteristics than what we already reported. The potential is there for much higher numbers, but only if the system is requesting data cached in memory. The software only knows to cache LBAs into DRAM if they've already been accessed during a run.

Four-kilobyte random reads are tested from queue depths of one to 32, using from one thread up to 32. The chart below reflects the results from 36 of those runs to create a matrix of 20-second tests.

In that short amount of time, the system doesn't really have a chance to access many LBAs more than once. With our workload generator randomly deciding which LBAs of the test range to pick, some addresses might get accessed several times, while others get touched one time or not at all.
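
The odds of a repeat access in one of those short runs can be estimated with a simple occupancy model. The run rate below (1M IOPS for 20 seconds) is an illustrative assumption, not a measured figure:

```python
# Expected DRAM cache-hit fraction when k uniformly random block reads
# are drawn from an n-block range: only repeat accesses can hit cache.
def expected_hit_fraction(n_blocks, k_reads):
    unique = n_blocks * (1 - (1 - 1 / n_blocks) ** k_reads)
    return 1 - unique / k_reads

# 20 s at ~1M IOPS over the whole 17,880 GiB array of 4 KiB blocks:
n_full = 17_880 * 1024**3 // 4096
print(f"{expected_hit_fraction(n_full, 20_000_000):.2%}")
```

Over the full array, well under one percent of reads can repeat an earlier LBA, which is why caching barely registers here.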

Not Cached

Testing without caching yields an orderly performance picture. If we allow our 24-drive array to cache, we see something a little different.

Cached

The first and second charts are basically the same test. The former doesn't benefit from caching, while the one directly above does. Performance at lower queue depths and thread counts is significantly better, albeit less consistent. As we encounter more taxing loads, the array just can't achieve the same number of transactions per second characterized in the direct (not cached) test run. In this scenario, maximum performance is lower, but minimum performance is better.

That dynamic gets flipped on its head when we switch to sequential transfers.

Sequential 128 KB Testing, Cached I/O

By restricting the tested range of LBAs to just 32 GB, we can actually cache the entire test area into DRAM. It doesn't take long, either, especially when we're reading and writing thousands of megabytes per second. We let the test run for 10 minutes, and then take the average bandwidth in MB/s.

It doesn't matter how many drives are in the array. The entire 32 GB LBA space is cached within seconds, and after that we get up to 23,000 MB/s of read bandwidth over the 10-minute test run. We generate a ton of throughput with writes, too, a stupefying 14 GB/s. This feels a lot like cheating, but is it? Not necessarily. That's one reason DRAM caches are used in the first place. We're simply looking at what happens when we let our OS do what it does, and then make the most of it.
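
For context, even 23,000 MB/s is well under what the memory subsystem can theoretically move. A back-of-the-envelope estimate, assuming eight bytes per transfer per channel:

```python
# Theoretical aggregate bandwidth of two quad-channel DDR3-1333 setups.
channels = 2 * 4                      # two CPUs, quad-channel each
gbs_per_channel = 1333 * 8 / 1000     # MT/s x 8 bytes, ~10.7 GB/s
peak_gbs = channels * gbs_per_channel
print(round(peak_gbs, 1))             # ~85.3 GB/s theoretical ceiling
```

The cached reads are bounded well below that ceiling by copy overhead and the workload generator itself, not by the DIMMs.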

Max IOPS

We've seen what caching can do for sequential transfers, and how it affects random performance within a narrowly defined test scenario. If we want to shoot for max I/Os, we have to mix the two methodologies together.

First, we create a 4 KB random read workload, spread out over a 32 GB LBA space, and then let it run for a few minutes. After a while, the system caches the entire 32 GB LBA area and we get the full effect of servicing requests straight out of DRAM. How high can we go?

The answer? Up to 3,136,000 IOPS, or in excess of 12,000 MB/s. At this point, we're using just about all of the processing power our dual-Xeon server can muster, pegging both CPUs. Generating the workload to push so many I/Os is hard work, and the added duty of handling RAID calculations makes it even more intense. After extensive trial and error, 3.136M IOPS is the best we can do. That's all our 19 trillion-byte array and 16 physical cores manage. It's a sufficiently gaudy number to end on.
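
One last sanity check on that headline number:

```python
# 3,136,000 reads of 4 KiB each, per second, expressed in MB/s.
iops = 3_136_000
mb_per_s = iops * 4096 / 1_000_000
print(round(mb_per_s))   # comfortably north of 12,000 MB/s
```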

8. 24 SSD DC S3700s: So Choice. If You Have The Means...

Modern SATA drives are already bumping up against the 6 Gb/s SATA interface's limits. For now, there's little way around that, though the horizon is filled with fast new interfaces and form factors. Today, circumventing the SATA performance ceiling somewhat defeats the purpose of client storage, and is entirely incompatible with the SSD DC S3700's mission. It's all about giving buyers the right mix of affordability, flexibility, and features in a SATA-based drive.

Of course, affordability is in the eye of the beholder. But the SSD DC S3700 was designed for organizations looking for a cost-effective repository for large deployments. Intel's older client drives are being used in datacenters of all sizes, and the DC S3700 is aimed at the folks making buying decisions in those environments. It gives buyers more capacity and form options at a price similar to where the 3 Gb/s SSD 320 family debuted.

Naturally, building an almost-18 TB array with 800 GB SSDs isn't cheap. Each drive sells for $2,000 online. Want your own beastly array? Be ready to spend close to $50,000 for the privilege (and that's before the price of a suitably-fast server). Sure, you're spending a pretty penny. But relative to what similar capacity and performance would have cost previously, the SSD DC S3700 clearly benefits from faster, more economical MLC NAND manufactured using a high-yield 25 nm process. 

Just because these drives aren't cheap doesn't mean we can't have fun with them. Without question, we can extract superb performance from just a few of Intel's SSDs. Scale up to 24, and you're talking about some serious speed.

All of this could have gone horribly pear-shaped without a proper server to drop the disks in. For what we wanted to do, we needed all of the performance our Xeon E5s could give us. We successfully found the limit of what the SSDs and our test platform could do together, and the result was on the (very) high side. The RAID 0 calculations performed by the host are fairly lightweight, but creating enough of a workload to stress the storage subsystem is fairly taxing.

The rewards really are worth the effort. We were able to turn those 17,880 GiB into a single high-performance volume capable of heroic performance. Getting to 1,000,000 4 KB read IOPS is nice, but witnessing almost 1,000,000 write IOPS is even better. Sequential performance is huge all-around. Software RAID has never been so nice.

And that might be the biggest surprise of all. Had we been limited to benchmarking software RAID under Windows, we would have been stuck at a tiny fraction of the performance seen using mdadm. At the very least, this is a topic worth further exploration. Faster hardware and open source development have created a software RAID solution that might be a viable alternative to hardware in many situations. Because mdadm's options are numerous, and there isn't much documentation of user experiences when it comes to SSD arrays, it's entirely possible that further tweaking could yield even better results. Sadly, our time with the SSD DC S3700s was all too brief, and there are so many unexplored areas to tackle. That's going to have to be a story for another day though, because Intel already received our little care package back.