X25-E / Adaptec 5805 / RAID 0 throughput problem

ihartl

Sep 3, 2009
For my application (data acquisition) I need to stream data to disk as fast as possible.

Inspired by the 2.2GB/s results posted on Tom's Hardware, I set up a RAID 0 array using 8x Intel X25-E SSDs (32GB each) and an Adaptec 5805 RAID controller, hoping to achieve about half that speed (1.1GB/s) with half the number of drives (8 vs. 16). Even though I need the storage for sequential writing, I went with SSD technology for lower power consumption and to avoid cooling issues.

Unfortunately, the sequential write throughput is far lower than expected. When streaming to a single disk, I get an average of 196MB/s write speed, with a short peak of ~400MB/s at the start (I assume this is filling up some cache). Using a RAID 0 array of two disks I get 382MB/s, which seems like correct scaling to me.

However, if I add more disks to the RAID array I stay stuck at ~400MB/s.
What could be the cause of that?
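For reference, this is the back-of-the-envelope scaling I was hoping for: just the single-disk average multiplied by the number of disks, assuming ideal RAID 0 striping (the 196MB/s per-disk figure is my own measurement from above):

for n in 1 2 3 4 5 6 7 8; do
    # ideal RAID 0 sequential write: per-disk throughput times number of disks
    echo "$n disks: $(( n * 196 )) MB/s expected"
done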

Tools used: HD Tune and h2benchw (pretty much consistent in results)

System:
ASUS P6T6 WS Revolution
12GB memory
Intel i7 920 processor @ 2.67GHz
NVIDIA Quadro FX 370
Microsoft Windows Vista 64bit


My guess is that there is a bottleneck either in the RAID controller or in the 8-lane PCIe bus?
I tried switching PCIe slots, checked all drives individually, played with the cache settings on the controller, and tried different stripe sizes. I do not see any PCIe settings in the motherboard BIOS. Only two PCIe slots are occupied: one by the graphics card, one by the RAID controller.

Thanks for any hints where to look or what further tests I could do!

Ingmar.
 

brianhj

Sep 13, 2009
Well, each port on that card can do 3Gb/s, which is 375MB/s if I'm not mistaken.
So if you have 4 drives connected to one port of that card, the most throughput you'll get is 375MB/s. If you utilize both ports of the card (say, 6 drives in RAID 0), the max throughput would double to 750MB/s.

Unless I'm missing something.
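Spelling out that conversion, purely as raw line-rate math (a real SATA link loses some of this to protocol/encoding overhead, so treat it as an upper bound):

# 3 Gbit/s per link, 8 bits per byte -> MB/s in decimal units
echo "$(( 3 * 1000 / 8 )) MB/s per 3Gb/s link"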
 

ihartl

Sep 3, 2009
Even when using both ports I only get 400MB/s (7-disk RAID 0). In the meantime I got a slight improvement by using the newest Adaptec Vista drivers and the newest Adaptec BIOS.
Now the average write speed to a 7-disk (X25-E) RAID 0 array is 465.7MB/s.
The write speed to a 3-disk RAID 0 array (with all disks on the same port of the Adaptec controller) is 456.5MB/s, and for two disks I get 385.7MB/s on average.
So using more than 3 disks, or more than one port on the controller, gives no significant improvement in throughput. I am still puzzled.
 

lma111

Sep 26, 2009
Why don't you use the onboard SATA controller on your motherboard with 6 disks? That way you can bypass a possible PCIe bottleneck, or PCIe-specific driver problems, and see how well the southbridge does.
 

kag

Dec 4, 2009
Hello

I wanted to share my experience with you.

I run a 5805 with 4 X25-E drives on a Linux 2.6.31 64-bit box with a dual quad-core and 8GB of RAM. Your setup is more powerful, so I assume the host is not a bottleneck.

A few bandwidth figures:
- the 5805 is a PCIe 8x card; the PCIe bandwidth is 8 x 250MB/s = 2GB/s
- I didn't find the official bandwidth of the SFF-8087 port; http://www.tomshardware.com/forum/247701-32-build-flexible-expandable-soho claims 24Gbit/s
- an X25-E reaches 250MB/s, which is 2/3 of the bandwidth of each SATA link (3Gbit/s)
- Adaptec claims that the 5805 reaches 1.2GB/s
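Putting those figures side by side for an 8-disk array like yours (this is just the arithmetic on the numbers quoted above, not a measurement):

# rough throughput ceilings in decimal MB/s, using the figures listed above
pcie_limit=$(( 8 * 250 ))    # PCIe x8 at ~250MB/s per lane -> 2000
card_limit=1200              # Adaptec's quoted figure for the 5805
disk_limit=$(( 8 * 250 ))    # 8 X25-E drives at ~250MB/s each -> 2000
echo "pcie=${pcie_limit} card=${card_limit} disks=${disk_limit} (MB/s)"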

So to me, the first bottleneck is actually the 5805 card itself.

I reach 1GB/s at most in the following configuration:
- 2 disks on each SFF-8087 port
- RAID 0 with default settings, configured through the menu at boot, not the software provided by Adaptec

Note that 1GB/s is pretty much the expected scaling for 4 X25-E drives at 250MB/s each. I am only interested in RAID 0; I didn't benchmark other configurations.

The benchmark procedure is to read and write 2GB of data, for various block sizes, at the block level (not the filesystem level), between the RAID array (/dev/sda) and RAM (/dev/shm). The write test kills any data on the disk, of course; don't blame me if you lose anything.

In a shell:
for bs in 64 128 256 512 1024 2048 4096 8192 16384 32768 65536 131072 262144 \
          524288 1048576 2097152 4194304 8388608 16777216 33554432 67108864 \
          134217728 268435456 536870912 1073741824; do
    # read 2GB (2147483648 bytes) from the RAID device into tmpfs at this block size
    dd if=/dev/sda of=/dev/shm/junk bs=$bs count=$((2147483648/$bs))
done

For writing tests, switch "if" and "of" values in the previous command.
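For instance, a write pass over a few representative block sizes would look something like this (assuming /dev/shm/junk still holds the 2GB file left behind by the read pass; this is the step that destroys whatever is on /dev/sda):

for bs in 65536 262144 1048576; do
    # write the 2GB tmpfs file back onto the raw RAID device at this block size
    dd if=/dev/shm/junk of=/dev/sda bs=$bs count=$((2147483648/$bs))
done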

The throughput increases from ~80MB/s at bs=64 to a little less than 1GB/s at bs=16384.
It stays at 1GB/s for the following bs values: 32768, 65536, 131072, 262144, 524288, 1048576.
Then it decreases to 600MB/s for block sizes higher than 2097152.

One remark: the throughput is more than halved if I put all 4 disks on a single SFF-8087 port (it drops to 400MB/s at most).

I hope this information will help you solve your problem. I would also appreciate it if somebody could confirm my results, of course.

Thanks
--
Sylvain