Should not RAID 1 be faster than RAID 0 for reads?

MxM

Distinguished
May 23, 2005
464
0
18,790
As I understand it, a lot of time is lost while the HD is just seeking to the right sector. Consider the situation with only two drives. With RAID 1 you can have TWO independent seeks at the same time if you have two different read requests. It can ALWAYS be done, because the drives are exact copies. With RAID 0 you cannot do it about half of the time, so you cannot always perform two independent read requests.
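The "half of the time" intuition can be checked with a small sketch. It assumes a two-drive array, a made-up chunk size of 32 blocks, and two random single-block read requests; the function names are hypothetical and do not describe any real controller.

```python
import random

CHUNK_BLOCKS = 32  # assumed stripe chunk size in blocks (illustrative only)
NUM_DISKS = 2      # two-drive array, as in the question


def raid0_disk(block):
    """Disk holding a logical block in a simple striped (RAID 0) layout."""
    return (block // CHUNK_BLOCKS) % NUM_DISKS


def parallel_fraction_raid0(trials=100_000, max_block=1_000_000, seed=0):
    """Fraction of random request pairs that land on different RAID 0 disks."""
    rng = random.Random(seed)
    different = 0
    for _ in range(trials):
        a = rng.randrange(max_block)
        b = rng.randrange(max_block)
        if raid0_disk(a) != raid0_disk(b):
            different += 1
    return different / trials


def parallel_fraction_raid1():
    """In a mirror every block is on both disks, so any pair can be split."""
    return 1.0


if __name__ == "__main__":
    print("RAID 0:", parallel_fraction_raid0())
    print("RAID 1:", parallel_fraction_raid1())
```

Under these assumptions RAID 0 can serve the two requests in parallel only about half of the time, while a mirror could in principle always do so. Whether a given controller actually does this is a separate question.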

Does it actually work as I have described? (Somehow I doubt it, because I always hear that RAID 0 has better performance.)
 

darkguset

Distinguished
Aug 17, 2006
1,140
0
19,460
RAID 1 does not work that way, although I suppose it could be done for reads, but it would require a very complex and expensive controller to do individual seeks. The controller simply duplicates the commands sent to the first drive on the second as well. Hence, no, you cannot ask the head actuator on the second drive to seek somewhere else. That would only improve the seek times anyway, and probably not by much.

The rest remains as you know: RAID 0 is faster in general due to more head actuators, space, etc.
 

MxM

Distinguished
May 23, 2005
464
0
18,790
I thought that RAID 1 could perform at least at the same speed as RAID 0 for reads. I mean, most controllers support both configurations, so what prevents the controller from reading two sequential pieces of data at the same time in RAID 1, in the same way as it is done in RAID 0?
 

choirbass

Distinguished
Dec 14, 2005
1,586
0
19,780
well... the main difference is that the data is split up into smaller stripes (raid 0, giving each hdd less to do), instead of doubled over in size (raid 1, making each hdd have more to do)... given that, raid 1 should actually be slower for reads, as well as writes (compared to a single hdd)... but, it's only really slower for writes... and maybe for reads, it's still only using 1 hdd to read from really (practically), because it's reading the same data simultaneously on both hdds... you'd think it would be faster though, because it has the bandwidth of 2 hdds available (among other components, such as the actuators that darkguset brought up)... but, it's not really any faster than a single hdd for reads... the components themselves aren't moving any faster, or even alternating, they're just serving the purpose of redundancy: if one hdd fails while doing something, the other is still there doing the same thing, as if nothing had gone wrong... if they had been alternating however, or even working on separate data (similar to raid 0)... if one hdd were to fail, you would lose data as a result, as there would be no redundancy then... no component to fall back on, just in case

but, for what you're asking... raid 5, or even 1+0 would fit
 

DoubleE460

Distinguished
Oct 4, 2006
36
0
18,530
I thought Raid 0 stores/reads one part of the info on/from one hard drive and the other part(s) on/from the other hard drive(s), and gains speed by only needing to access half or less of the data per physical hard drive; performance would increase with more hard drives involved. As the files would be physically dispersed, a failure in one drive would ruin all data in a Raid 0 configuration.

Raid 1 would in contrast copy/read identical information to/from a pair of hard drives. I.e. if a file is 1M then the entire 1M would be copied/read to/from two separate drives, i.e. at least twice as much as for Raid 0. However, with one corrupt drive all data would still be maintained in the Raid 1 configuration.

The read access times would in my nut work out like....
The slowest Raid 0 read access can be equal to the quickest Raid 1.

See http://www.m-techlaptops.com/raid1.htm
 

choirbass

Distinguished
Dec 14, 2005
1,586
0
19,780
DoubleE460 wrote: "I thought Raid 0 stores/reads one part of the info on/from one hard drive and the other part(s) on/from the other hard drive(s), and gains speed by only needing to access half or less of the data per physical hard drive; performance would increase with more hard drives involved. As the files would be physically dispersed, a failure in one drive would ruin all data in a Raid 0 configuration."

yep. basically equally distributing 'sections' of all data among the hdds in the array... giving each hdd less to do for each 'file', to effectively boost read/write speed... if a hdd in the array becomes unusable however, or is even temporarily unavailable, the whole array won't work until that same drive is restored with the data in the same condition (if it was removed)
 

MxM

Distinguished
May 23, 2005
464
0
18,790
From wikipedia:
Additionally, since all the data exists in two or more copies, each with its own hardware, the read performance can go up roughly as a linear multiple of the number of copies. That is, a RAID 1 array of two drives can be reading in two different places at the same time, though not all implementations of RAID 1 do this[1].

So I thought that more or less all hardware implementations of RAID 1 have this read performance increase. Isn't that so for NV_RAID?
 

choirbass

Distinguished
Dec 14, 2005
1,586
0
19,780
from http://techreport.com/reviews/2005q4/chipset-raid/index.x?pg=1

RAID 1 — Also known as mirroring, RAID 1 duplicates the contents of a primary drive on an auxiliary drive. This arrangement allows a RAID 1 array to survive a drive failure without data loss. The enhanced fault tolerance of a RAID 1 array (or of most other RAID levels) is no substitute for a real data backup solution, of course. If the primary drive in a RAID 1 array is plagued with viruses, malicious software, or data loss due to user error, the mirrored auxiliary drive will suffer the same fate simultaneously.
Although RAID 1's focus is on redundancy, mirroring can improve performance by allowing read requests to be distributed between the two drives (although not all RAID 1 implementations take advantage of this opportunity). RAID 1 is one of the least efficient arrays when it comes to storage capacity, though. Because data is duplicated on an auxiliary drive, the total capacity of the array is only equal to the capacity of a single drive.


as far as nv_raid goes, i guess it doesn't improve read speeds:

http://techreport.com/reviews/2005q4/chipset-raid/index.x?pg=13

Although sustained read performance scales very well from one to two drives, adding a third or fourth drive yields much more modest performance gains. Unfortunately, mirroring doesn't improve read speeds much, either. There's no performance difference between our single-drive and RAID 1 configurations, and RAID 10/0+1 performance is nearly identical to that of a two-drive RAID 0 array. At least RAID 5 fares relatively well.

i guess you were right in asking... and that should actually put to rest whether raid 1 actually benefits read speeds... or more accurately, that it's possible for it to, but it also depends on whether the controller itself allows that
 

JMecc

Distinguished
Oct 26, 2006
382
0
18,780
So if Raid1 can pick the closest head (or the 1st one to arrive at the data) as you say, its seek time would drop below that of a raid0 set of the same drives that is also full (2x capacity), although for the same # of gigs you could [theoretically] half-fill the raid0 drives to get better seek times.

Thing is really though that even if you get a better average seek time with raid1, raid0 will kill it in read speed. You want a contiguous 250MB file, you need to seek it 1st, then read it. So perhaps raid1 seeks in 7ms and raid0 in 9ms (just making up #'s here), but then raid1 reads at 40MB/s (6.25s) and raid0 at 75MB/s (3.33s), so raid0 still wins by a lot unless the file size is tiny. For the #'s above, any read above about 43 pages of 4k each (roughly 170KB) would be faster on raid0. I don't know if the #'s are accurate, but anyway for a billion tiny tiny tiny reads raid1 may be faster, but for any time-consuming read (the ones we get ticked waiting for), raid0 is faster.
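The break-even point for those made-up numbers can be worked out with a short sketch. It assumes the simple model used in the post (total time = one seek + size / transfer rate); the function names are hypothetical.

```python
def read_time(size_mb, seek_s, rate_mb_s):
    """Total time in seconds: one seek plus a sequential transfer."""
    return seek_s + size_mb / rate_mb_s


def break_even_mb(seek_a, rate_a, seek_b, rate_b):
    """Size in MB at which both setups take the same time.

    Solves seek_a + S / rate_a == seek_b + S / rate_b for S.
    """
    return (seek_b - seek_a) / (1.0 / rate_a - 1.0 / rate_b)


# Illustrative figures from the post, not measurements.
RAID1 = (0.007, 40.0)  # 7 ms seek, 40 MB/s
RAID0 = (0.009, 75.0)  # 9 ms seek, 75 MB/s

if __name__ == "__main__":
    size = break_even_mb(*RAID1, *RAID0)
    print(f"break-even size: {size * 1000:.0f} KB")
    print(f"250 MB on raid1: {read_time(250, *RAID1):.2f} s")
    print(f"250 MB on raid0: {read_time(250, *RAID0):.2f} s")
```

With these figures the break-even lands at about 0.17 MB, so the faster seek only pays off for reads smaller than that.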

Jo
 

MxM

Distinguished
May 23, 2005
464
0
18,790
JMecc wrote: "Thing is really though that even if you get a better average seek time with raid1, raid 0 will kill it in read speed. [...] for any time consuming read (the ones we get ticked waiting for), raid0 is faster."
There is nothing to prevent RAID 1 from having the same read speed as RAID 0, because the two RAID 1 drives can still read the data in exactly the same way as RAID 0. A particular implementation, though, may not use this advantage, and that looks to be the case for NV RAID.
 

JMecc

Distinguished
Oct 26, 2006
382
0
18,780
Good point MxM; now I get what the debate is about. The controller would have to get the length of the block to read, divide it by 2, and have one disk start from the beginning and the other from the halfway point, to avoid skipping around with the actuator arm (even a small seek hurts: when we move over 2 tracks we have missed the start point of our next partial read and have to wait out the rotational latency for it). If we do read the 1st half from one and the 2nd half from the other, we just need the OS to know this so it can store the data in the right place in mem (since our HD buffer will not be large enough to mimic mem). So with OS support it may be possible to do this fast read - that would be pretty cool.

Jo
 

russki

Distinguished
Feb 1, 2006
548
0
18,980
I think you guys are wrong. Hard drives achieve peak performance with sequential transfers, and that's what RAID 0 exploits: it breaks up the file among the different hard drives, where it is arranged for sequential access. To read a file, you need one seek and then a parallel sequential read, which gives peak performance for each of the drives in the array; this is possible because you can process the data faster than the mechanics allow the drives to read it.

With RAID 1 the file is not broken into blocks, so if you wanted to do parallel reads you'd have to make the heads jump quite a bit, which is a seek operation and takes a relatively long time, so you will not be able to parallelize effectively. I think all you can shoot for is matching single-drive performance, which is pretty much what happens in reality.
 

SomeJoe7777

Distinguished
Apr 14, 2006
1,081
0
19,280
Whether a RAID 1 controller can accelerate disk reads in the same manner as a RAID 0 controller comes down to whether the RAID 1 controller has enough on-board cache to perform the block reassembly.

First, let's be clear that you cannot think about what a RAID controller is doing in terms of files. The controller knows nothing about files and file systems, it only knows about blocks.

If the computer wants to do a 25MB read from the controller (about 50,000 blocks of 512 bytes), the controller receives a request similar to: "I need blocks 100,000 through 150,000 from the disk". Let's see how a RAID controller could answer this request in each case.

Let's assume we have a 2-drive RAID 0 with a stripe size of 32K. That means that each successive 16K chunk (32 blocks) is stored on the other disk, e.g. blocks 100,000 through 100,031 are on one disk, blocks 100,032 through 100,063 are on the other, and so on.

To answer the computer's read request, the controller will begin sequentially reading blocks on both drives. As it gets each group of 32 blocks from drive 0 and the 32 following blocks from drive 1, it assembles them together into 64 blocks in the correct order and sends them to the computer. The computer believes it is getting blocks from a single disk, in order, and is none the wiser. The RAID controller needs at maximum a chunk of memory equal to the stripe size (32K) to perform the block reassembly.
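The striping and reassembly described above can be sketched roughly as follows. This assumes 512-byte blocks, 32-block chunks, and that chunk 0 lives on disk 0, so the exact disk numbers may differ from the example in the post; `locate` and `plan_read` are hypothetical names, not part of any controller firmware.

```python
BLOCK_BYTES = 512   # assumed sector size
CHUNK_BLOCKS = 32   # 16K per disk per stripe, as in the post
NUM_DISKS = 2

# Memory needed to reassemble one full stripe (32K with these values).
STRIPE_BUFFER_BYTES = CHUNK_BLOCKS * NUM_DISKS * BLOCK_BYTES


def locate(block):
    """Map a logical block to (disk, block offset on that disk)."""
    chunk = block // CHUNK_BLOCKS
    disk = chunk % NUM_DISKS      # assumption: chunk 0 is on disk 0
    stripe = chunk // NUM_DISKS
    return disk, stripe * CHUNK_BLOCKS + block % CHUNK_BLOCKS


def plan_read(first, count):
    """Split a logical read into per-disk runs, in logical order.

    Each run is (disk, offset on that disk, length in blocks).
    """
    runs = []
    block = first
    end = first + count
    while block < end:
        chunk_end = (block // CHUNK_BLOCKS + 1) * CHUNK_BLOCKS
        run_len = min(chunk_end, end) - block
        disk, offset = locate(block)
        runs.append((disk, offset, run_len))
        block += run_len
    return runs


if __name__ == "__main__":
    for run in plan_read(100_000, 64):
        print(run)
```

Each disk sees one long sequential run of its own offsets, while the controller only needs to hold about one stripe at a time to hand the blocks back in logical order.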

Now let's look at a 2-drive RAID 1. In this case, all blocks are duplicated on both drives. Blocks 100,000 through 150,000 exist on both drives. Like russki said, you do not want the actuator arm jumping around on each drive because that will destroy any read time advantage. Thus blocks must be read sequentially if you're to have any hope of improving the read speed.

What you can theoretically do is take the 50,000 block read request and divide it in 2. Have drive 0 seek to and read blocks 100,000 through 125,000, and have drive 1 seek to and read blocks 125,001 through 150,000.

The problem is that the computer must receive the blocks in order. Thus, you must buffer drive 1's read data in the RAID controller's cache, waiting for both drives to complete the 25,000 block reads. Once drive 0's read is finished and you send blocks 100,000 through 125,000 to the computer, you can now send blocks 125,001 through 150,000 to the computer from cache, which is much faster.

However, in this case, the RAID controller needs 12.5 MB of on-board cache to perform this trickery.
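A minimal sketch of that split and the cache it implies, assuming 512-byte blocks and a hypothetical controller that hands the first half to drive 0 and buffers the second half from drive 1 (the function names are invented for illustration):

```python
BLOCK_BYTES = 512  # assumed sector size


def split_mirror_read(first, count):
    """Split a read of `count` blocks into two sequential halves.

    Returns ((start, length) for drive 0, (start, length) for drive 1).
    """
    half = (count + 1) // 2
    return (first, half), (first + half, count - half)


def buffer_bytes_needed(count):
    """Cache needed to hold drive 1's half until drive 0's half is sent."""
    _, (_, second_len) = split_mirror_read(0, count)
    return second_len * BLOCK_BYTES


if __name__ == "__main__":
    print(split_mirror_read(100_000, 50_000))
    print(buffer_bytes_needed(50_000) / 1e6, "MB")
```

For a 50,000-block request this comes to about 12.8 million bytes under the 512-byte assumption, in line with the roughly 12.5 MB quoted above.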

Thus, a typical motherboard RAID 1 implementation (like the NVidia or Intel south bridges) cannot perform accelerated reads in RAID 1 due to the lack of cache. But high-end RAID cards with on-board cache can and do use parallel reads in RAID 1 to achieve faster read speeds. The MegaRAID controllers from LSI (which makes many of Dell's PERC series of RAID controllers) can do this.
 

MxM

Distinguished
May 23, 2005
464
0
18,790
I am not sure that jumping from the end of block 109 to the beginning of block 120 (i.e. skipping ten) is an expensive operation. It is probably faster than actually reading those blocks.

So in principle the controller could read blocks 100-109, 120-129, 140-149 with one head and 110-119, 130-139, 150-159 with the other. It will be slightly slower than RAID 0, but still should be very close to it. At the same time, you need only 10 blocks (about 5 KB at 512 bytes per block) of memory for that.
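The interleaved schedule can be sketched like this. It is only an illustration of the idea, with a hypothetical function name and an assumed 512-byte block size, and it says nothing about how costly the skips are on a real drive.

```python
BLOCK_BYTES = 512  # assumed sector size


def interleaved_schedule(first, count, run_blocks=10, drives=2):
    """Deal out consecutive runs of blocks to the mirrored drives in turn.

    Returns one list per drive of (first block, last block) ranges.
    """
    schedule = [[] for _ in range(drives)]
    block = first
    end = first + count
    index = 0
    while block < end:
        run_len = min(run_blocks, end - block)
        schedule[index % drives].append((block, block + run_len - 1))
        block += run_len
        index += 1
    return schedule


if __name__ == "__main__":
    for drive, runs in enumerate(interleaved_schedule(100, 60)):
        print("drive", drive, runs)
    print("buffer per run:", 10 * BLOCK_BYTES, "bytes")
```

Each drive reads a run, skips the run the other drive is reading, and reads the next, so only about one run has to be buffered at a time.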

Where RAID 1 may have an advantage is in processing many independent requests at once. In this case it will have much shorter seek times!
 

Illumynization

Honorable
Nov 12, 2013
817
0
11,360
Raid 1 reads - two twins are going on an easter egg hunt. They are given the same map and the same setup in two different locations. They both come back at the same time with the same number and color of eggs.

Raid 0 reads - two twins are given half of the same setup and half of the map. They take half as long finding their half of the eggs and come back at the same time.

This is probably a bad example I just thought of off the top of my head. Makes sense to me though.