
10,000 rpm vs. RAID 0

March 23, 2004 2:36:54 PM

Hello,
consider these two set-ups:
1. Two HDs as part of a single RAID array (let's say RAID 0).
2. One 10,000 rpm HD (with the OS and frequently used software on it) + another "regular" HD (say 7200 rpm, SATA, 8 MB cache) for the rest of the applications + data storage.

Aside from the difference in total size (GB) between the two set-ups, which will give BETTER PERFORMANCE?

Thanks!


March 23, 2004 3:09:12 PM

I have a 74 GB Raptor 10k rpm drive and a 120 GB WD SATA drive.

The Raptor benchmarks just below a 2x 80 GB 7200 rpm RAID 0 configuration in raw transfer speed, and beats it in seek time. Overall it's probably faster than a RAID 0 of WD SE 80 GB drives.

Where my system is really faster though is when I'm trying to access something from both drives. For example, copying a file from c: to d: (120 gig to 74 gig) goes very very quickly as opposed to from c: to e: (partition on the same disk).

I'd recommend all applications/games/programs/etc + OS on the 10krpm drive and all the data on the large 7200 rpm drive.

March 23, 2004 3:39:14 PM

Thanks for the quick response :-)

So, in average daily usage, do you feel the 'speed'?
Is it worth the investment, compared to using a "regular" HD setup?
March 23, 2004 4:46:46 PM

When I had a 3-disk RAID 0 hooked up (it's temporarily out of commission; my Promise RAID 5 card is incompatible with other RAID cards installed at the same time), all applications loaded in 1/3 the wait time, and game levels loaded in 1/3 the time too. Windows bootup was also much faster. It definitely made the system feel speedier. Very worth it. (I'm assuming the same would be true of a 10,000 rpm drive if it benches similar to a 2-drive RAID 0: 1/2 the wait for things to load.)


While we're discussing RAID performance vs. single disk performance...
I have this theory that I want to run by you guys about raid-0 seek times. Say you have something on a RAID-0 that involves random reads and random writes. On a single drive, the drive has to seek in a linear fashion (seek to A, read, seek to B, read, seek to C, read, etc.) for a seek time we'll call 'S' where 'S' is what's rated for that drive. Here's what I'm wondering - how does this happen on a raid-0? I would assume that all drives seek in parallel (four drives seek to A,B,C,D, read, four drives seek to E,F,G,H, read, etc.) which would give you an average seek time of S/2, making a two drive raid-0 better (S/2) for random reads and writes than a single 40% faster RPM drive (S/1.4), unless the faster drive has 1/2 the seek time. Do you get a significant reduction in average seek time in raid-0? Or do controllers not do what I was describing?
March 23, 2004 9:38:26 PM

You get WORSE seek time with raid 0. The reason is that each drive has to find the information you need.

Very simple explanation: the average seek time for a single disk is 1/2 the rotation period (easy to see why). Now, if you have 4 disks, you are always waiting for the last disk to find the info.
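
A quick Python sketch makes this concrete (a minimal simulation, assuming the delay is purely rotational and uniformly distributed over one rotation, matching the simplification above; the numbers are illustrative, not from any real drive):

import random

def avg_wait(n_disks, trials=100_000, rotation=1.0):
    # The array can't deliver data until the slowest disk has
    # rotated its piece of the stripe under the heads.
    total = 0.0
    for _ in range(trials):
        total += max(random.uniform(0, rotation) for _ in range(n_disks))
    return total / trials

print(avg_wait(1))  # ~0.50 rotations: single disk
print(avg_wait(4))  # ~0.80 rotations: a 4-disk array waits for the last disk

The expected maximum of n uniform delays is n/(n+1) of a rotation, so the average wait climbs from 1/2 toward a full rotation as disks are added.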

And I highly doubt that your load times were 1/3 of a single disk. Go check storagereview.com for their raid benchmarks on raid 0 systems to see what kind of performance you really get.

March 23, 2004 9:39:38 PM

No. Seek times are worse or the same. The reason is each disk has to seek to the beginning of the stripe for file A, then to the beginning of the stripe for file B, etc. If the disks are out of sync, disk0 can be at the beginning of file A while disk1 is still seeking, resulting in an increased delay.

March 24, 2004 7:19:44 AM

Well, the load times were very significantly down. Maybe not 1/3, but they sure felt like it. Maybe they were half. Maybe the specific game I was playing just happened to get the best-case sequential read scenario when it loaded levels. I've spent too much time on storagereview.com today... I may attack it tomorrow.

Too bad about that little detail of Raid-0... I'd have thought each channel would be able to know where to look for its piece of the data and seek independently. What you're saying is that all drives must have finished seeking before any transferring can start? I suppose I can see why that is. And the average "slowest seek time" in a large array will always be larger than the average seek time for any one disk.
March 24, 2004 3:34:33 PM

Let's say you had the word "MASS" stored on a 4 disk raid array, such that each disk had a single letter. This data is spread out amongst other data as well. It might look something like this if it were arranged in a single circular track 8 cells long:

disk 1: TLRSM4NA
disk 2: N*39SA*)
disk 3: LSOOENWW
disk 4: ::AOE)NS

As opposed to a single disk setup: LAMASSNL

Let's imagine that it takes one second for the disk to look at each letter. The single disk finds the word after 2 seconds, and has it transferred in an additional 4 seconds. The raid array has disk 3 find its piece after 1 second, and the others find theirs in 4, 5, and 7 seconds each. The entire word is found only after 7 seconds have gone by. This is why raid 0 is generally slower for seeks and small file transfers.

Now imagine that the word was a million characters long, so you have 250000 in a row on the raid0 disks, and 1 million in a row on the single disk. It might still (for argument's sake) take the single disk only 2 seconds to find the file, but it will take an additional 1 million seconds to transfer it. The raid array will take 7 seconds to find the file, but will be able to transfer the whole thing in only an additional 250000 seconds. This is where raid0 is faster.
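
Here's the same toy model in a few lines of Python (a sketch assuming seeks happen in parallel and transfer then proceeds at one character per second per disk; it counts the last letter's read as a separate second, so the short word costs 8 s rather than 7, but the comparison holds):

def fetch_time(seek_times, chars_per_disk):
    # Transfer can't start until the slowest disk has found its piece;
    # after that, all disks stream their chunks simultaneously.
    return max(seek_times) + chars_per_disk

# 4-character word, one letter per disk vs. all on one disk:
print(fetch_time([1, 4, 5, 7], 1))        # RAID 0: 8 s
print(fetch_time([2], 4))                 # single disk: 6 s

# 1,000,000-character file, 250,000 characters per disk:
print(fetch_time([1, 4, 5, 7], 250_000))  # RAID 0: 250,007 s
print(fetch_time([2], 1_000_000))         # single disk: 1,000,002 s

Small reads are dominated by the slowest seek; large reads are dominated by the parallel transfer.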

You can also probably easily see why raid1 will be faster than a single disk on seeks as well...

March 24, 2004 8:44:18 PM

Is this because the array does not know the disk location of the next chunk until it has found the first chunk?

i.e. does one drive take 7 seconds to find its chunk because it took a long time for it to seek, or because it had to wait for the other drives to finish seeking before it could start seeking?
March 25, 2004 6:05:32 AM

It has to do with the fact that the heads don't read the entire disk at once. The hard drive rotates around, and the information passes under the heads to be read. There's a chance that the data could be just under the head when it's requested, resulting in it taking very little time for the disk to rotate around so the information is readable (1 second seek time in my analogy). On the other hand it might have just passed the head, in which case the disk has to rotate all the way back around in order to get at the information (a 7 second seek). Assuming it takes 8 seconds for the drive to fully rotate around once, the average time a single disk will have to wait is 4 seconds, whereas it'll be longer for an array of more disks.

When it comes to seeks think of it like this: You're with some friends, you have your baseball glove and want to go play some catch at the park. No one else has their gloves though.

Raid 0: You want all your friends to come play, so you send them all home to get their stuff, and you wait for everyone to come back, then you head to the park. You're going to be limited by whoever lives farthest away.

Single disk: You pick a friend in particular and send him home. He gets his glove and you go play.

Raid 1: You send all your friends home to get their gloves, then you go with whoever shows up first.

So Raid 1 is fastest on seeks, then a single disk, then raid 0.
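
A tiny simulation bears that ordering out (hypothetical Python sketch, again treating each disk's delay as uniform over one rotation):

import random

def avg(sample, trials=100_000):
    return sum(sample() for _ in range(trials)) / trials

def pair():
    # Delays for the two disks holding mirrored (or striped) copies.
    return random.uniform(0, 1), random.uniform(0, 1)

print(avg(lambda: min(pair())))           # RAID 1: ~0.33, first disk to arrive wins
print(avg(lambda: random.uniform(0, 1)))  # single disk: ~0.50
print(avg(lambda: max(pair())))           # RAID 0: ~0.67, last disk gates the read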

March 25, 2004 5:11:22 PM

Okay so it's the "average longest seek among many disks" scenario, which is longer than the "average seek of each disk".

Too bad we probably can't look at individual implementations in cards, since they're probably considered private intellectual property by the companies. It would be interesting to know whether any of the card makers have intelligent ways of getting around these limitations.

Like, say you had two requests en route to a RAID 0, for disk 1 and disk 2. On request 1, disk 1 seeks in 3 seconds and disk 2 in 8. On request 2, disk 1 seeks in 8 seconds and disk 2 in 3. In your system, both requests would take 8 seconds. I wonder, if they fed the data through a cache, whether the two requests would complete faster, since drive 1 could begin reading request 2 as soon as it was done with request 1 rather than waiting for all drives to sync again. As in this extremely poor graphic I just made in Photoshop: http://www.bkgrafix.net/filepile/raid0.gif where the average seek time ends up being close to the individual drive seek time. (Where the cache is yellow, it is filling with data; where it is pink, it is matching cached reads with new reads and returning them; where it is green, it is emptying cached reads by matching them with new reads from the slower disk.)
March 25, 2004 6:41:30 PM

It doesn't matter. Whatever system they use, the seek time can't ever be less than that of a single disk. Seek time is the time it takes the heads to get to the start of a block and start transferring. Even if disk1 manages to do that in 2.3ms and disk0 in 4.6ms, the seek time for the file request is still 4.6ms, not (4.6+2.3)/2.

March 25, 2004 11:22:53 PM

Yes, exactly: improving the system might make the average seek time of a RAID 0 closer to the average seek time of a single disk, instead of the average slowest seek time of all disks in the array. That's what I was thinking. If the RAID 0 average seek time equals a single disk's average seek time, that's the ideal fastest situation; currently raid cards probably use a system that makes the average seek time of a RAID 0 much slower, and I was proposing a cache might fix that.

At this point it's just a cool discussion and has left reality far in the dust, as I have no idea what methods raid cards actually use to access data. I'm just having fun with the ideas and using them to learn more about the system.
March 27, 2004 12:29:22 AM

It's not a matter of the card at all. It's a physical matter of where the info is on the disk compared to the heads at any given time. There's no way you can get around it.

March 27, 2004 1:25:53 AM

Take this: theoretically, if you have a pipelined processor, you will have huge gaps where the processor cannot take instructions because there is a branch, and the result of the branch isn't determined yet, so the processor would not know which instructions to feed in next. This would (theoretically) make pipelined processors extremely inefficient, because they constantly have to have these "bubbles" of empty space in their pipeline. This is a limitation of any pipelined system. *But* processors get around this by using intelligent prediction of which branches will be taken and which won't, and will feed in the instructions from the branch the processor *thinks* will happen; only if it guesses incorrectly will it have to empty its pipeline and endure the "bubble" effect. So intelligent design can get around a problem that exists in theory.

So, in a RAID 0, if there's a request in the queue, and each disk begins seeking to the next request immediately upon finishing the first, instead of staying idle until all other disks have had their platters rotate under their heads, the system should be able to reduce the latency of the array from the "average slowest of all drives" to something closer to the average seek time of the drives you used. If you begin processing the next request immediately and cache the reads while the other disks are still seeking, the seek times would average out. This would happen, I think, because it's statistically improbable that one disk would repeatedly have seek times far below its average. If a disk did do this, and the cache filled up waiting for one disk, the cache could simply be flushed and the system started again.

Unless, I am not understanding what you initially meant in terms of the latency of a raid-0 array being slower than the latency of a single disk.

I realize that you cannot get around the average seek time of a single disk, that is a barrier that my cache idea would not pass. But I think for an array of disks, the average "request-to-delivery" time could be made closer to the average seek time of a single disk, rather than close to the 'average slowest seek time' which can be several milliseconds longer.

Course, there's no way to test this short of building a Raid-0 card, and I'm not about to do that.
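
For what it's worth, the idea is easy to simulate (a purely hypothetical Python sketch; no claim that any real controller works this way). It compares lockstep seeking, where every request pays the slowest seek, against letting each disk run ahead through the queue and caching its finished pieces:

import random

def lockstep(queue):
    # Both disks start each request together and idle until the slower one finishes.
    return sum(max(a, b) for a, b in queue)

def buffered(queue):
    # Each disk works through its half of every request independently,
    # caching results; the queue is done when the busier disk finishes.
    t0 = t1 = 0.0
    for a, b in queue:
        t0 += a
        t1 += b
    return max(t0, t1)

queue = [(random.uniform(0, 1), random.uniform(0, 1)) for _ in range(1000)]
print(lockstep(queue))  # ~667: every request pays the max of the two seeks
print(buffered(queue))  # ~500: fast and slow seeks on each disk average out

Over a long queue the per-disk seek times do average out, which is exactly the effect described above; whether a controller could actually reorder work like this is the open question.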
March 27, 2004 2:23:20 AM

Okay, let's make this even simpler:

Don't you think someone would have thought of that by now if it was possible?

March 28, 2004 1:40:11 AM

Probably! :tongue: I'm still not convinced that they don't. Seems to me having a cache would be necessary on a raid-0, given it's impossible for all N drives to have the chunks they are supposed to be reading from all directly under the read heads at the same time. A cache seems like the only way to sync the information being read, and piece it back together.

I told you I'm just having fun poking around with the concepts. If you're convinced I'm completely lost here, I'll just let the thread die, because nobody else is reading this. :cool:
March 28, 2004 4:42:36 AM

He's thinking linearly, not circularly... i.e., a vector vs. a linked list whose iterator always starts at 0.

March 28, 2004 1:00:48 PM

Hi!
Two IBM 7K250 SATA drives in RAID 0, for a total of 500 GB, score about 70000 in Sandra 2004 (more precisely, 73000 with no data on the disks, and 67000 with about 10 GB on the first partition (32 GB, NTFS), including the OS (WXP), applications, data and other stuff, after defragmenting the array). The SATA RAID controller is the onboard Sil 3112A on an Asus A7N8X mainboard, with an AMD 3200, not overclocked.
As for access time, the IBM array is one of the fastest among 7200 rpm drives, and its performance stays just above an array of two 36 GB WD Raptors in RAID 0.
The noise is practically inaudible.
But I know that the 74 GB Raptor is much better than the 36 GB model.


March 30, 2004 10:19:34 PM

Umm, dude, cache is memory. What do you think the disk controller is doing when it reads a file? It's writing it into RAM. The controller allocates RAM space and directs the data from the hard disk into RAM. That's what controllers do.

It's simple to understand,

Read time = Access Time + Data Transfer Time

Say you have a 128KB file, 64KB on disk0 and 64KB on disk1. The Data Transfer Time (DTT) is 1/2 the DTT for a single disk, because both disks read together. The Access Time is the same or worse on average, because even if disk1 gets to the start of its 64KB block before disk0, if the blocks are out of sync there will be a latency delay before disk0 can start reading. The total Read Time can't be any faster than the slowest disk in the whole array.
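
In sketch form (hypothetical numbers: 8.0 ms and 8.9 ms access times and an assumed 50 MB/s per disk):

def read_time_ms(access_ms, file_kb, mb_per_s_per_disk=50.0):
    # Access is gated by the slowest disk; transfer is split across the disks.
    transfer_ms = file_kb / 1024.0 / (mb_per_s_per_disk * len(access_ms)) * 1000.0
    return max(access_ms) + transfer_ms

print(read_time_ms([8.0], 128))       # single disk: 8.0 + 2.5  = 10.5 ms
print(read_time_ms([8.0, 8.9], 128))  # RAID 0:      8.9 + 1.25 = 10.15 ms

For a 128KB file the halved transfer time barely outweighs the worse access time; larger files tip the balance toward RAID 0, smaller ones against it.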

March 30, 2004 11:39:52 PM

I was referring to memory on the card itself, and to situations involving a queue of read requests waiting in line. If one disk seeks to its information faster than the others on any given request, it could start reading the next request immediately when it finishes instead of waiting, and put the info into a cache, where it would wait for its matching blocks from the slower disk. Then the disk that read faster *that time* can start seeking to the next block while the other disk finishes reading, because chances are that even though that disk was faster *that time*, the other disk could very well be faster the *next time*. The array would have a good chance of staying relatively in sync overall, even though each disk begins processing the next request immediately instead of waiting for the other disks to finish.

I'm pretty sure the information a raid card reads from a RAID 0 is sewn back together on the card itself, not inside the operating system. So I don't think it has anything to do with system memory, because the info can't be sent to system RAM until after it's reassembled, and what I'm talking about concerns the reassembly process and when exactly seeking and reading could take place for maximum performance.

But really it's just a theory and I don't even know if raid cards actually do this or not, or if there are details that make it not feasible, or even what the difference could be if any.

I'm just gonna let the thread die because it's too far outside of the practical world to hold a real discussion about it, and too hard to actually communicate what I'm trying to say when all I have is plain text.

March 31, 2004 5:08:14 AM

If it's a REAL hardware raid card, then yeah, it will actually do it, but most of the raid cards a regular user would get are software.

March 31, 2004 6:34:19 PM

Quote:
I'm pretty sure the information a raid card reads from a RAID 0 is sewn back together on the card itself, not inside the operating system. So I don't think it has anything to do with system memory, because the info can't be sent to system RAM until after it's reassembled

This is clearly not the case; otherwise all drive controllers would have to have cache, which they don't, and the file size would be limited to the size of the cache, which it isn't. When a disk controller reads information from the hard disk, that information is sent directly through to system RAM. It has nothing to do with the OS; the whole thing is controlled by the system I/O architecture. With a RAID 0 array the addressing could simply be such that the appropriate block is written to the correct part of RAM irrespective of the order in which the blocks arrive. I doubt a RAID 0 array can even theoretically begin reading another file from one disk before a file has completed on another, because the controller may have to consult the MFT/FAT before it knows which cluster to start reading from.
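
The out-of-order placement is just address arithmetic. A minimal sketch (hypothetical function name; standard RAID 0 striping, with stripe s living on disk s mod n):

def buffer_offset(disk, local_block, n_disks, stripe_bytes=64 * 1024):
    # Stripe s sits on disk (s % n_disks) at local position (s // n_disks),
    # so a block read from (disk, local_block) belongs at stripe
    # s = local_block * n_disks + disk in the destination buffer.
    stripe = local_block * n_disks + disk
    return stripe * stripe_bytes

# Blocks can arrive in any order; each lands at its own offset:
print(buffer_offset(0, 0, 2))  # stripe 0 -> offset 0
print(buffer_offset(1, 0, 2))  # stripe 1 -> offset 65536
print(buffer_offset(0, 1, 2))  # stripe 2 -> offset 131072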
