Options for RAID with 3 drives

jpdykes

Distinguished
Aug 7, 2007
594
0
18,980
It's not so often that I'm starting new threads but here is a different one for a change. I know there is good knowledge out there!

I have been given 3x 500GB Samsung F1 drives. I need to know what RAID types are available to me and how to set them up.

Aims:
- Increase overall system performance
- Decrease Vista load times
- Decrease game load times

Background:
This is a new RAID array for the following system:

Q6600 @ stock
Gigabyte P35 DS3R
Seagate 7200.10 250GB
4GB Geil Black Dragon PC8500
ATi 4870
Enermax Modu 625W PSU
Vista Ultimate

System is used for gaming, programming, academic work and general web surfing. I do little to no image or video editing. I currently have about 125GB of data stored on a fairly new install of Vista, and I have some more data to add.

Questions:

- What RAID variant should I pick given the drives available?
- Run Vista from the 250GB, use the other 3 as data stores
- Run Vista from two 500GB drives, use the other 2 as stores
- Another variation
- How do I go about constructing a RAID array?
- I have come across block sizes, and there appear to be several options. Which is most appropriate, 16KB?

The two obvious thoughts are R0 or R5.
- Is R0 considerably faster than R5?
- I have read a comment that R5 is very poor at writing data, is this likely to be an issue in my case?
- R5 gives me parity. Given that this RAID array is likely to be in this build for at least 2 years, am I going to need this facility?
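
To put numbers on the R0 versus R5 trade-off, here is a minimal sketch of usable capacity and fault tolerance for three 500GB drives. The function names are made up for illustration, and it assumes equal-sized drives and the textbook definitions of the two levels (RAID 0 stripes with no redundancy, RAID 5 gives up one drive's worth of space to parity).

```python
def usable_capacity_gb(level: str, drives: int, size_gb: int) -> int:
    """Usable capacity of an array built from equal-sized drives."""
    if level == "raid0":
        # Pure striping: every byte of every drive holds data.
        return drives * size_gb
    if level == "raid5":
        if drives < 3:
            raise ValueError("RAID 5 needs at least 3 drives")
        # One drive's worth of space is spent on parity.
        return (drives - 1) * size_gb
    raise ValueError(f"unknown level: {level}")


def drive_failures_tolerated(level: str) -> int:
    """How many drives can fail before data is lost."""
    return {"raid0": 0, "raid5": 1}[level]


for level in ("raid0", "raid5"):
    print(
        level,
        usable_capacity_gb(level, drives=3, size_gb=500), "GB usable,",
        drive_failures_tolerated(level), "drive failure(s) tolerated",
    )
```

So R0 would give 1500GB with no protection, and R5 would give 1000GB while surviving the loss of any one drive.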

I've posted this on the storage forum as well.

Jeremy
 

pinaplex

Distinguished
Aug 21, 2007
474
0
18,860
For 3 drives RAID 5 would be the way to go. But you must realize that you will not be meeting 2 of your 3 goals.

- Increase overall system performance (the RAID 5 setup will increase your overall system performance and data integrity)

- Decreasing Vista load times (RAID will not affect your load times; the only way you can decrease load times is to get a hard drive with a faster access time)

- Decreasing game load times (same as above)


In your case, I don't think a RAID setup will benefit you. I would just use the 500GB drives as data drives, and think about buying a faster boot drive than your current 250GB.
 

jpdykes

From posting on the storage forum I have heard bad things about RAID 5, generally relating to poor performance in random writes, so it is not very good for an OS.
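
For anyone wondering where the random-write reputation comes from, here is a rough sketch of the textbook RAID 5 read-modify-write cycle. It assumes a simple 3-drive stripe (two data blocks plus one XOR parity block) and ignores controller caching and full-stripe writes, which real hardware may use to soften the cost. The function name is hypothetical.

```python
def raid5_small_write(old_data: bytes, old_parity: bytes, new_data: bytes):
    """Update one data block in a stripe; return (new_parity, io_count)."""
    # XOR the old data out of the parity and the new data in.
    new_parity = bytes(p ^ o ^ n for p, o, n in zip(old_parity, old_data, new_data))
    # Read old data, read old parity, write new data, write new parity.
    io_count = 4
    return new_parity, io_count


# A 3-drive stripe: two data blocks and their XOR parity.
d0, d1 = b"\x0f\xf0", b"\x33\x55"
parity = bytes(a ^ b for a, b in zip(d0, d1))

new_d0 = b"\xaa\xbb"
new_parity, ios = raid5_small_write(d0, parity, new_d0)

# The updated parity still allows the untouched block to be rebuilt.
rebuilt_d1 = bytes(a ^ b for a, b in zip(new_parity, new_d0))
print(ios, rebuilt_d1 == d1)
```

One small logical write turns into four disk operations, which is why lots of small random writes (typical of an OS drive) hurt more on R5 than on a single disk or R0.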

I don't need large storage; 250GB is ample for non-program stuff such as music, video and docs. Consequently I have 3 drives I want to get the most from.

Is there no performance gain, in terms of OS and programs, from creating a 3 disk R0 array compared to one 250GB disk handling all the data?

Jeremy

 
I think your best results would come not from RAID, but from "short stroking".
The outermost tracks of a drive hold more data per revolution than the inner ones, yet the rotation time is the same. That means that the transfer rate of the outer parts is faster.
If you deliberately use only a small part (10%?) of each drive for data, it will be accessed much faster. The seek time will be minimal because the arm will have a much shorter range. That is "short stroking".
 

jpdykes

What I find hard to understand is this:
(Simplified setup I know)

Suppose I want to read an item of data, say 3MB.

The item is stored over 3 disks, i.e. 1MB on each. To read the data each disk has to search for that 1MB and read it back. The data can be read back in parallel, all three reading at the same time.

For 1 disk the disk has to find each 1MB part of the data and read each out sequentially. Consequently one would expect this to take 3 times as long as 3 disks finding 1MB each.

I am therefore at a loss as to why running R0 on 3 disks is going to show little or no improvement over 1 disk.

Jeremy

NB: This is a very simplified setup!
 

It is not that simple.
The data transfer rate on a hard drive is perhaps on the order of 80MB/sec. It will be higher on the faster outer parts, and slower on the inner parts. A 10,000rpm drive will be roughly 39% faster. For a 1MB block, at 7.2k, that is 12.5 milliseconds. For a 10k drive, it might be 9ms.

Arm positioning time for a 7200rpm drive is perhaps 8.5ms. That is the time for the disk arm to position itself from any random part of the drive to any other random part. If you are using only part of the drive for data, this time will be less. As the drive gets filled, the positioning time will approach the 8.5ms number. The number for a VelociRaptor 300 is 4.2ms.

Once the arm is in place, it takes an average of half a revolution to get to the actual data. For a 7200rpm drive, that is 4.2ms, and for a 10k drive, it is 3ms.

The total for 1MB is therefore about 25ms for a 7.2k drive (8.5 + 4.2 + 12.5) and about 16.2ms for a 10k drive (4.2 + 3 + 9).
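
The arithmetic behind those totals can be sketched as below. The inputs are the rough figures from this post, not measurements, and the 10k transfer rate is assumed to scale with spindle speed from the 80MB/sec baseline. The function names are just for illustration.

```python
def rotational_latency_ms(rpm: int) -> float:
    """Average wait for the data to come round: half a revolution."""
    return 0.5 * 60_000 / rpm


def access_time_ms(seek_ms: float, rpm: int, size_mb: float, rate_mb_s: float) -> float:
    """Seek + rotational latency + transfer time for one contiguous read."""
    transfer_ms = size_mb / rate_mb_s * 1000
    return seek_ms + rotational_latency_ms(rpm) + transfer_ms


# 7200rpm drive: 8.5ms seek, about 80MB/sec transfer.
t_7200 = access_time_ms(seek_ms=8.5, rpm=7200, size_mb=1, rate_mb_s=80)

# 10k drive: 4.2ms seek, transfer rate assumed to scale with rpm.
t_10k = access_time_ms(seek_ms=4.2, rpm=10_000, size_mb=1, rate_mb_s=80 * 10_000 / 7200)

print(f"7200rpm: {t_7200:.1f} ms, 10k: {t_10k:.1f} ms")
```

Note that for a 1MB read, roughly half of the time on the 7200rpm drive goes to positioning rather than to moving data.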
In RAID 0, the positioning and data transfer can be overlapped across the drives, but completion must wait for the slowest of the three drives.
In a single drive, you just have one average positioning operation and three data transfer operations. Since the data transfer only accounts for half of the basic time, you will not get 3x the performance.
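
Here is a rough sketch of that comparison for a 3MB read, using the 7200rpm figures above. It is deliberately optimistic for the array: it assumes the data is perfectly contiguous and that every drive in the array hits the average positioning time, whereas in practice you wait for the slowest one. The constants and function names are illustrative only.

```python
SEEK_MS = 8.5               # average arm positioning, 7200rpm drive
ROTATE_MS = 4.2             # average rotational latency, 7200rpm drive
TRANSFER_MS_PER_MB = 12.5   # 1MB at about 80MB/sec


def single_drive_ms(size_mb: float) -> float:
    """One positioning operation, then transfer everything from one drive."""
    return SEEK_MS + ROTATE_MS + size_mb * TRANSFER_MS_PER_MB


def raid0_ms(size_mb: float, drives: int) -> float:
    """Each drive positions in parallel and transfers only its own share."""
    return SEEK_MS + ROTATE_MS + (size_mb / drives) * TRANSFER_MS_PER_MB


single = single_drive_ms(3)
striped = raid0_ms(3, drives=3)
speedup = single / striped

print(f"single: {single:.1f} ms, 3-drive RAID 0: {striped:.1f} ms, speedup: {speedup:.2f}x")
```

Even under these friendly assumptions the array comes out at about 2x, not 3x, because the positioning time is paid either way. For smaller reads the gain shrinks further, since positioning dominates.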

It gets more complicated when you consider that 3MB of data might not be nicely located on the same part of the drive next to each other. If you will be reading and writing two files of data to the same RAID 0 array, the positioning arm will constantly be stolen by each file, greatly increasing the total time. In that case, putting a different file on each drive would be more effective.

If you are doing multitasking, there is no way to optimize things. The arm will be constantly stolen by the other tasks. It would be better to segment the multitasked data, each to its own drive.

Best performance comes from using the fastest available devices. Today, that is the Intel SSD, but at $600 for 80GB, it is price prohibitive for most of us. The VelociRaptor is next best, but that is also pricey.

My suggestion is to use a Velociraptor for the OS and highly active data, and use a larger, slower drive for bulk storage and backup.


 

jpdykes

Ok I'm still trying to digest all this:

Of course there will not be 3 times the performance using 3 drives over 1. That's a given.

"In a single drive, you just have one average positioning operation and three data transfer operations. Since the data transfer only accounts for half of the basic time, you will not get 3x the performance."

Makes sense, but the data transfer still ought to be quicker given that there are 3 drives reading in parallel?

One thing that keeps niggling me is that if there is no performance benefit to having 3 drives in R0, why do people make arrays?

Equally, if I install this, am I going to see any performance improvement over the 250GB Seagate 7200.10 drive installed?

Jeremy

P.S. Sorry for a slow reply - that took some time to digest!!