Slow Performance: 4x OCZ SSDs and Adaptec RAID Controller

gwolfman

Distinguished
Jan 31, 2007
Ok, so something is wrong. I'm using a test system right now, so this isn't my final setup, but I should still be getting better results than I am.

Configuration:
■System Setup: Dell Precision 390, Intel E4500, Intel i975x, 4GB RAM, WinXP SP2 *Note
■System HDD: Dell SAS 5/iR controller, Cheetah 136GB 10K SAS
■RAID Controller: Adaptec RAID 3805, 128MB RAM, w/ SATA dongle (w/ newest BIOS and drivers)
■Test Drives: (4) OCZ Core 64GB SSD

Note: The system SAS controller is connected to the PCIe x8 slot (x4 wiring); I installed a PCI video card in the bottom slot and put the Adaptec card in the PCIe x16 slot where the video card used to be.

Here are my RAID controller settings:
■32KB stripe size (w/ RAID 0 setup)
■NTFS partitions w/ 4KB sector size
■Write-Back: Disabled
■Read-Cache: Disabled
■Write Cache mode: Disabled (write-through)
■Write Cache Setting: Disabled (write-through)

Here are my results with one drive:
[Image: oczcoresimplevolumeoi0.png]


And then here with 2 in a RAID 0 setup:
[Image: oczcoreraid0ft5.png]


Why are my results so poor? A RAID 0 setup with all 4 SSDs gives the same performance as RAID 0 with 2 SSDs. I originally had older drivers and an older controller BIOS and got the same results, so I checked for newer versions, found some, and updated. I tried various controller settings with little to no difference. I also tried Iometer and got similar results. I tried a RAID 5 configuration as well, to no avail. Would using the SAS connectors/dongle make a difference? What am I doing wrong?
 

gwolfman

Distinguished
Jan 31, 2007
I just ran one of the SSDs off of the motherboard's SATA connectors (from the onboard Intel chipset) and got these results:
[Image: oczcoreintelchipsetep5.png]


Something's up with the controller.
 

SomeJoe7777

Distinguished
Apr 14, 2006
1. Why are all of the cache settings disabled on the Adaptec controller? Enable these and try again.

2. Instead of using HD Tach, use HD Tune and select a 64K block size. You should get similar results. Then switch the block size to 1MB and see what you get.

3. Why a 32K stripe size? I would recommend 64K.

4. When you created the NTFS partition on the RAID array, did you use the command-line DISKPART utility to align the partition on a stripe boundary? If not, delete the partition and redo it; something like the session sketched below should work.
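Off the top of my head, the DISKPART session looks something like this (assuming a 64K stripe; the disk number is just an example, so check "list disk" first):

diskpart
DISKPART> list disk                            (find the Adaptec array)
DISKPART> select disk 2                        (example number; use yours)
DISKPART> create partition primary align=64    (align is given in KB, so 64 = one 64K stripe)
DISKPART> assign letter=E
DISKPART> exit

Then format the new partition NTFS from Windows as usual.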
 

gwolfman

Distinguished
Jan 31, 2007


1. I disabled all the caches because that would show me the raw performance of the disks without it being influenced by the cache, correct? I also disabled the cache on the disk (as configured from Adaptec Storage Manager) since SSDs don't come with a cache on the drive.

2. Ok, I'll try HDTune and let you know what I get.

3. I just picked one; it shouldn't make that much of a difference, right?

4. I did not use DISKPART and I haven't heard of "align[ing] the partition on a stripe boundary". I will try this by following this example: http://support.microsoft.com/kb/929491

Regarding #3, I'll try the 64K stripe size, but what do you recommend for the sector size? The same? Thanks for your reply!
 

gwolfman

Distinguished
Jan 31, 2007

Ok, so I went to run the command and it said:
The arguments you specified for this command are not valid.
I tried it without the align argument, and it worked. I went to microsoft.com to search for answers and found some pages that listed "align=N" as an argument and others that didn't. I found this blurb in a forum:
You need the DISKPART version 5.2 from Windows 2003 (or 6.0 from Vista) in order to use the ALIGN parameter. Windows XP does not support this feature.
Have you run into this?

I'll see if I can find my Win2k3 discs and pull diskpart off of that.
 

SomeJoe7777

Distinguished
Apr 14, 2006
Ah, yes. Apparently the Windows XP version of DISKPART does not support the align parameter. I haven't run into that; my home server runs Windows Server 2003. You should be able to use the Windows Server 2003 version to create the partition.

If you pick a stripe size of 64K, then use align=64 to align on the stripe boundary. If you use another stripe size, adjust the align parameter accordingly.

When the partition is not aligned on a stripe boundary, requests from the computer that ought to fit within one disk end up spread across two disks.

For example, take a 9-drive RAID-5 with a 32K stripe. Each 32K stripe is spread across the 9 disks, with one drive holding the parity for that stripe, so each data drive gets a 4K block and the ninth drive gets a 4K block of parity. The standard NTFS cluster size is 4K, so when the computer requests one cluster from a properly aligned partition, that request goes to exactly one drive.

If you let the Disk Management application format the array, the partition is created starting at sector 63, the first track boundary after the MBR. That puts the start of the partition 63 x 512 bytes = 31.5K into the array, which is 512 bytes (1 sector) short of the next 4K boundary. Now, when the computer requests a 4K cluster, 7 sectors of the cluster sit on one drive and the remaining sector on an adjacent drive, resulting in 2 I/Os to 2 different drives instead of 1 I/O to 1 drive. This can reduce array performance a lot for certain applications.
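If it helps to see the mechanics, here is a rough back-of-the-envelope sketch (my own illustration in Python, assuming the 4K-per-drive chunks from the example above; drives_touched is just a made-up helper name):

SECTOR = 512         # bytes per sector
CHUNK = 4 * 1024     # data written to each drive per stripe (4K in the example above)
CLUSTER = 4 * 1024   # default NTFS cluster size

def drives_touched(partition_offset_bytes, cluster_index):
    # How many per-drive chunks a single cluster read has to touch.
    start = partition_offset_bytes + cluster_index * CLUSTER
    end = start + CLUSTER - 1
    return end // CHUNK - start // CHUNK + 1

# Disk Management default: partition starts at sector 63 (31.5K into the array)
print(drives_touched(63 * SECTOR, 0))   # -> 2 (the 1-sector / 7-sector split)

# DISKPART with align=64: partition starts on a 64K boundary
print(drives_touched(64 * 1024, 0))     # -> 1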

To your other points:

1. The caching is important for array performance because the controller uses it to improve the sequencing of I/Os. Turning all the caches off won't give you a true picture of the array's "raw" performance, because the controller won't be issuing I/O commands as fast as it does with the caches turned on.

3. Some controllers are specifically optimized for 64K stripe sizes. People have posted threads here on the forum reporting that the Intel ICH controllers show performance degradation at stripe sizes other than 64K.

You cannot control the sector size - sectors are fixed by the device 99.9% of the time at 512 bytes per sector.

You can control the cluster size, which is the base allocation unit for the NTFS file system. NTFS defaults to 4K clusters, and I would leave it at that. I have experimented with other cluster sizes and found that they did nothing for performance. NTFS can address enormous volumes even with 4K clusters (around 16TB), far more than this array, so there is no compelling need for larger clusters. Further, the smaller 4K clusters make more efficient use of the controller's cache.
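If you ever do want to set the cluster size explicitly, the /A switch on the format command controls the allocation unit size (the drive letter here is only an example):

format E: /FS:NTFS /A:4096 /Q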
 

gwolfman

Distinguished
Jan 31, 2007
Hey, thanks a lot for your responses.

I tried pulling diskpart.exe off my Win2k3 discs, but the app never ran; I guess because the kernel is different. I'll boot into my Vista DVD, run diskpart from there, and see if that works :)

I'll try your suggestions and get back to you, probably sometime Friday.
 

gwolfman

Distinguished
Jan 31, 2007
Ok, some interesting results with the align=64 argument:

This is with 64KB stripe size and 64KB reads using HD Tune:
[Image: hdtune64kbstripealignedod6.png]


Now here with 64KB stripe size but with 1MB reads:
[Image: hdtune64kbstripealignedgj2.png]


This is starting to look better. Why is the CPU usage so high?

Ok, now here are the interesting results...
Intel chipset (ICH7R) with 128KB stripe size, 64KB reads, not aligned:
[Image: hdtuneintelraid0128kbstlw3.png]


And now with the same setting but align=128:
[Image: hdtuneintelraid0128kbstit7.png]


I get about 5MB/s more when it's aligned but why does the CPU usage go up so much (according to HD Tune)?

Now to look at 1MB reads...
Intel chipset, 128KB stripe, 1MB reads, not aligned:
[Image: hdtuneintelraid0128kbstaz4.png]


Intel chipset, 128KB stripe, 1MB reads, aligned:
[Image: hdtuneintelraid0128kbstoy5.png]


Once again, why are the Intel results so much better and why does the CPU usage jump up when I align the partition?

Any comments would be greatly appreciated.
Thanks!
 

ereetos

Distinguished
Jul 10, 2008
lolz you have poo SSDs!!!!

i think that is the answer to your question :)

my 5 1/4" pork-n-beans hard drive from 1983 run faster than your silly SSD ARRAY!!!!!!!!!!!!!
 

gwolfman

Distinguished
Jan 31, 2007

Thanks man, I luv u 2
 

SomeJoe7777

Distinguished
Apr 14, 2006
Hmmm ... apparently some people are eager to prove they have nothing to add to the discussion ... :sarcastic:

Anyway ...

I think what's going on here is the interaction between the stripe size, the alignment, and the physical characteristics of the SSDs. Since the SSDs have no cache, all of the reads come directly from the flash chips. To optimize this, you may have to find the combination of stripe size and alignment that gives the best performance, which would mean they correspond to the chunk size the SSDs use internally.

At first, don't worry about alignment. Try successively increasing the stripe size: 64K, 128K, 256K, 512K, 1MB. Once you find the best stripe size, then attempt to align it to get the best performance and lowest CPU utilization. For instance, if you find that 256K stripes work best, then try unaligned, and then align=64, 128, 256, 512, and 1024.
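To keep the bookkeeping straight, the sweep I have in mind looks roughly like this (a sketch only; the 256K "winner" is a placeholder):

# Pass 1: benchmark every stripe size with the default (unaligned) partition.
stripe_sizes_kb = [64, 128, 256, 512, 1024]
pass1 = [(s, "unaligned") for s in stripe_sizes_kb]

# Pass 2: take the best stripe size from pass 1 and sweep the align values.
best_stripe_kb = 256   # placeholder; use whatever wins pass 1
pass2 = [(best_stripe_kb, "align=%d" % a) for a in [64, 128, 256, 512, 1024]]

for stripe_kb, alignment in pass1 + pass2:
    print("rebuild array with %dK stripe, partition %s, run HD Tune" % (stripe_kb, alignment))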

Hard drive performance is less sensitive to alignment and the device's physical characteristics than SSD performance is, and the lack of cache on these SSDs exacerbates the problem.

This may be a lot of reformatting and retesting, but in the end you can be fairly sure you're getting the maximum possible performance out of the array. Don't forget that the RAID controller you're using is designed for hard drives, not SSDs, so there may be some optimizations that simply can't be realized.
 

Griff805

Distinguished
Jul 26, 2008
I've been having some trouble with my OCZ RAID setup as well. They don't particularly like HD Tune. Try ATTO and see what that tells you.

You might also try this: http://managedflash.com/
There's a demo you can download. I haven't had a chance to try it myself, as my drives are still being RMA'd.
 

gwolfman

Distinguished
Jan 31, 2007

Except for the fact that RAID 0 arrays don't have a degraded mode.
 

ereetos

Distinguished
Jul 10, 2008
i'm tellin ya man... its becuz of ur poo arry and crap SSSD disk!

did i evur tell u i kno whta nVidia stand for?!!!??

 

anon_reader

Distinguished
Jul 11, 2008
What's wrong?

NOTHING is wrong. You are on the leading edge of "discovery" of what will soon become known as the Flash SSD "read-write" penalty.

Congratulations for turning off the write cache -- in doing so you help expose the truth about Flash SSD.

Ask yourself, why does Intel's new "extreme performance" SSD drop from 35,000 IOPS READ to only 7,000 IOPS when there is a 2:1 read:write workload? See their spec sheet on the X25-E. Since 100% writes are 3,300 and 100% reads are 35,000 IOPS, shouldn't the combined performance at 67% read be more like 24,000 IOPS? Why only 7,000 IOPS?

You'll find out....

And Intel achieved even this miserable performance only after putting a massive (relative to disk) write cache in front of the flash, and they ran their tests with the write cache enabled. I noticed that no one is saying HOW MUCH DRAM is on the Intel device, but (oops) there goes the "non-volatility" argument for Flash!!! Lose power and you have lost a LOT of writes in the cache! Anybody wonder why Intel's SSD uses so much power? It's the massive DRAM write cache they needed to get decent write performance!!!

Sorry, but when all you guys start looking really hard at how NAND flash actually works, you'll discover the truth: uncached NAND flash is ridiculously slow whenever you are NOT doing 100.0000% reads. Insert even a few writes into the workload and the WHOLE THING slows to a crawl. A DRAM write cache can help a little, but not even as much as Intel's spec sheet shows -- note that they had to drive 32 outstanding IOs into the disk's queue to get the numbers they did. That kind of queue depth SIMPLY NEVER EXISTS IN THE REAL WORLD!!!

Oh...and OBTW...for those who keep saying "it will get better with time", actually the opposite is true. As MLC flash (on which all of the future cost reductions are based) goes from 2 bits/cell to 4, 8 and 16 bits per cell, this problem gets WORSE...not better.

Oops....(again)

By the way, another thing that never happens in the real world is 100% random IO -- and that is what ALL these ridiculous performance comparisons are based on. Disk gets MUCH better when the percentage of random-to-sequential IO is in realistic ranges (like 50/50) and so a huge chunk of the SSD performance benefits simply evaporate. This is why IDC's benchmarks recently found only a very small improvement for flash vs. 7,200 RPM disk (and also found several places where disk was substantially faster).

Keep it up guys...at this rate you will rapidly discover the truth of flash SSD.

Here's a hint. For your next test, compare a flash-based RAID-5 to a disk-based RAID-5 of equal CAPACITY. Then pull a drive and see how long it takes to rebuild parity on the flash SSD array. You'll be blown away at how much faster spinning disks are than Flash SSD -- especially in the rebuild!
 

I love this question. Why only 7000 IOPS? It sounds so terribly slow. Until you realize that even the fastest hard drives struggle to get 500-700 IOPS read or write.
 

anon_reader

Distinguished
Jul 11, 2008
RocketSci...

1) The 7,000 IOPS number is derived with IOmeter by driving the queue depth at the disk to 32 outstanding requests -- which NEVER happens in the real world. At a more realistic queue depth of 3, the Intel device will do maybe 1,000-1,500 IOPS in a 2:1 read/write profile. On an IOPS/dollar basis, that hardly justifies the $1,400 price tag (20x spinning disk) for the X25-E, UNLESS the IOPS number translated into meaningful performance. According to IDC (and MANY other application benchmarks), it does not.

2) In the REAL world, a significantly large percentage of read requests cannot be issued by the host (get into the disk queue) until a previous write has completed and been ack'd back to the host. This is called "synchronous" IO, and it is predominant in the real world but almost never modeled in benchmarks. The asymmetrical read-vs-write performance of flash is a huge problem here. The only benchmarks that reliably model this behavior are application benchmarks such as TPC-C, which is one reason you never see flash SSD used in that benchmark.

3) Now, the question I actually posed was: in the 2:1 read:write workload, why doesn't the Intel device do (35,000 x 2 + 3,300 x 1) / 3 = 24,433 IOPS instead of 7,000?

If you ponder the question I actually posed, you'll move a step closer to understanding why folks like IDC are finding only marginal performance benefits for SSD, while also finding numerous areas where spinning disk is faster.
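To spell out the arithmetic behind both figures (my own back-of-the-envelope numbers, not Intel's): the straight average is (2 x 35,000 + 1 x 3,300) / 3 = 24,433 IOPS. Weight instead by how long each IO takes (1/35,000 s per read, 1/3,300 s per write) and two reads plus one write come to about 0.00036 s, or roughly 8,300 IOPS. Ponder why the spec sheet lands below even that.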



 

anon_reader

Distinguished
Jul 11, 2008
Oh, by the way, anybody care to speculate why Intel is not saying just how big that DRAM write cache on the X25-E is?

Why on earth is Intel NOT SAYING?

http://download.intel.com/design/flash/nand/extreme/extreme-sata-ssd-datasheet.pdf

Extrapolating from the "base" write performance of SLC Flash SSD (about 130 IOPS according to the Imation white paper), I'm guessing that Intel stuffed about 128MB of DRAM (volatile) write cache onto the X25-E, which would also explain the device's high power consumption. The typical spinning disk needs only an 8MB cache, because its read and write performance are roughly balanced.

In the "Enterprise" markets at which the X25-E is aimed, customers will simply NOT accept the risk of losing 64 or 128 megabytes of writes in the event of a power failure or device failure -- this is why all the major enterprise-class disk array vendors turn off the write cache on the disk. So for Flash SSD it's back to about 130 IOPS write, which will in turn throttle back the read performance (due to synchronous IO from applications) and... well... you know the rest.

Flash SSD = WORM device!
 

gwolfman

Distinguished
Jan 31, 2007
Interesting take on all this, Anon. I can see how what you're talking about applies to SSDs. It's good to see someone understand why I disabled the write caches for the test, though most of my tests were, or should have been, read-only. However, I did run various Iometer tests with 100% reads and still could not get above ~140MB/s, even with 4 disks in RAID 0. It doesn't seem right. Anon, do you have any experience with Adaptec's RAID controllers?
 
Guest
Hello, I tried 2 OCZ Core V1 drives in RAID 0 on an Adaptec controller and I can confirm they perform badly. It's a little better with the integrated Marvell chipset on my mobo, although the best results by far are with Intel's ICH9R controller. No contest. I don't know why it's like this, though.
 

anon_reader

Distinguished
Jul 11, 2008


Natively, the Adaptec 3805 is a SAS controller and uses STP (SATA Tunneling Protocol) to talk to SATA devices. My guess is that it's a poor implementation of STP.

Try connecting a pair of the SSDs directly to the onboard SATA ports on your mobo and striping them using Windows Disk Management; a rough command-line equivalent is sketched below.
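If you'd rather script it than click through Disk Management, the DISKPART equivalent is roughly this (disk numbers are examples only; check "list disk" first):

diskpart
DISKPART> list disk
DISKPART> select disk 3
DISKPART> convert dynamic
DISKPART> select disk 4
DISKPART> convert dynamic
DISKPART> create volume stripe disk=3,4
DISKPART> assign letter=F
DISKPART> exit

Then format the new volume NTFS and rerun your benchmarks.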