RAID Scaling Charts, Part 2

FITCamaro

Distinguished
Feb 28, 2006
700
0
18,990
One question: you mention that with RAID 5 and 6, performance in your test starts to lag with more drives due to the limitations of the card's XOR processor.

Is there any benefit to using the system processor with software RAID 5 or 6 when you have a fast processor? If you can devote a full core to XOR operations, would that perform said operations faster than a chip onboard the RAID card?

Obviously this would only be good for an instance where nothing else is taxing the system like in a file server. For a media encoding machine this wouldn't be good since a multi-threaded encoder would eat up all the CPU time and leave little for the XOR operations.
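Just to be clear about what I mean by XOR operations, here is a rough sketch (Python with NumPy, made-up chunk sizes, purely for illustration) of the parity math a RAID 5 implementation has to do per stripe; the real drivers do the same thing in optimized assembly:

```python
import numpy as np

# Hypothetical stripe: 4 data chunks of 64 KiB each (sizes are assumptions).
CHUNK_SIZE = 64 * 1024
rng = np.random.default_rng(0)
data_chunks = [rng.integers(0, 256, CHUNK_SIZE, dtype=np.uint8) for _ in range(4)]

# RAID 5 parity is the XOR of all data chunks in the stripe.
parity = np.zeros(CHUNK_SIZE, dtype=np.uint8)
for chunk in data_chunks:
    parity ^= chunk

# If one data chunk is lost, XOR-ing the parity with the survivors recovers it.
lost_index = 2
recovered = parity.copy()
for i, chunk in enumerate(data_chunks):
    if i != lost_index:
        recovered ^= chunk

assert np.array_equal(recovered, data_chunks[lost_index])
print("Chunk", lost_index, "recovered from parity and surviving chunks.")
```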
 

robaustin

Distinguished
Aug 7, 2007
1
0
18,510
"RAID 1 does the opposite, as it cuts the chances of data loss by half". This statement is only correct if one has a hard drive fail and never replaces it. The chances of both hard drives failing within such a narrow window of time that the first defective hard drive has not been replaced is vastly smaller than 1/2.
 

sandmanwn

Distinguished
Dec 1, 2006
915
0
18,990
people make mistakes: It happens all the time that an administrator replaces the wrong drive, accidentally eliminating the RAID array.

:lol: I got a good chuckle out of this. If you really think about it, people using 4-8 drives in a system generally go with hot-swap drive bays that typically have indicators for each hard drive. You would have to be one really thick-headed person to pull any drive other than the one with the red dot in the field of green dots.

Secondly, I take issue with this test. If its performance was limited by the XOR processor on the card, then the numbers are inherently skewed, which leaves the test open to criticism and ultimately discredits it. This error should have been rectified, as no intelligent IT person would purchase an eight-drive system with an inferior controller that cannot do the job properly or artificially lowers the performance of the system. It just doesn't make any sense.
 

marks_

Distinguished
Aug 7, 2007
1
0
18,510
Pardon my ignorance, but I have a question. With Windows 2003 Enterprise Edition, are we limited to a maximum total drive size of 2TB? If I made a RAID 6 array with 8 drives (on a RAID card that supports XOR in hardware and large drives), each drive being 500GB, I would have a usable space of 3TB. Will I be able to have a single 3TB drive? If not, is there anything that can be done to have one large array (with Windows products)?

Someone advised me to simply use 64-bit Vista or XP and share out the one large array. Does anyone have any idea what performance impact going from Windows 2003 Server to Vista 64-bit would have overall?
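For reference, the capacity arithmetic I am working from looks like this (a quick Python sketch using the standard RAID overhead rules; the drive count and size are from my example above):

```python
drives = 8
drive_size_gb = 500

raid5_usable = (drives - 1) * drive_size_gb   # one drive's worth of parity
raid6_usable = (drives - 2) * drive_size_gb   # two drives' worth of parity

print(f"RAID 5: {raid5_usable} GB usable")    # 3500 GB
print(f"RAID 6: {raid6_usable} GB usable")    # 3000 GB, the 3 TB in my question
```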
 

FITCamaro

Distinguished
Feb 28, 2006
700
0
18,990


Yes, it will appear in Windows as a single drive. If there is a 2TB limit, it will still appear as a single 2TB drive; the other TB would be lost.
 

davidgrenier

Distinguished
Aug 7, 2007
13
0
18,510
Can you please compare these results with a 3-drive RAID 0 setup of SanDisk's SSD SATA 5000 2.5"...

here

You could probably lay hands on those for a grand or two? They used to be sold in laptops for an extra $500 US, I think.

Do it,

Thanks!
 

zenmaster

Splendid
Feb 21, 2006
3,867
0
22,790


Well, I've worked in IT for quite a long time and I've seen RAID arrays lost to dumb human mistakes.
I've seen RAID lost due to bugs in RAID software.
This is not to say RAID is bad; I would not install a server without RAID.
I'm just saying that "Every time man creates something idiot-proof, God makes a bigger, better idiot."


I did not really analyze the XOR processor results closely enough to agree or disagree with their conclusion, but just because their system hit a limit does not mean they would have to redo the test. They simply report and analyze the results.

Yes, you could buy a faster controller, but then why not get faster drives to match the performance?
Better yet, why not use Linux and one of the faster file systems such as EXT3?

The answer is that you analyze your needs and build your system to match it.

Often with an 8-drive RAID, your biggest concern is space. There is never enough in many cases, so speed may not matter.

Why use 7,200 RPM and not 10,000 RPM SATA drives? It could be cost or the need for space.
Why use a SATA RAID controller at all? Clearly 15K SCSI drives would be the way to go if it were only about speed, but we know that is not the case. We all work with budgets.

Why use Windows, and presumably an NTFS file system, when EXT3 is faster?
Well, perhaps we want some of the features in NTFS, or our program only runs on Windows and we prefer it to the other options, or...
 

SomeJoe7777

Distinguished
Apr 14, 2006
1,081
0
19,280


All versions of Windows Server 2003 with SP1 or higher (including R2), Windows XP Pro x64, and all versions of Windows Vista can use GPT disks, which can be any size up to the implemented limit of the NTFS file system (currently 256 TB).

The controller you're using must support >2TB arrays for that to work.

You can only use GPT disks on Windows Server 2003 and Windows XP x64 as data disks, not boot disks.

Windows Server 2003 IA-64 (for the Itanium processor) and Windows Vista can boot from a GPT disk, but only if the system has an EFI BIOS.

See the Windows and GPT FAQ for a more detailed explanation.
 

gwolfman

Distinguished
Jan 31, 2007
782
0
18,980
Why didn't you include a single drive in the IO charts? That would clearly show the benefits of an array.

I don't deal much with this personally since I don't work with servers (I work with networking devices) but I always like to learn new things and this clearly shows how RAID scales. Thanks for the work.
 

FITCamaro

Distinguished
Feb 28, 2006
700
0
18,990


They did... it's the first thing in every chart.
 

enlightenment

Distinguished
Mar 9, 2007
111
0
18,680

Actually, XOR is such a simple operation that your CPU could do more than 1 GB/s easily; it might even be bottlenecked by memory speed (~8 GB/s). The real CPU eater is I/O request combining to achieve high throughput: waiting for requests to come in, combining them, splitting them, and writing them out. If an I/O request cannot be combined, it goes out as a two-phase write, causing 5 separate I/O requests in itself. You can see that this puts quite some overhead on the system performing the RAID. Your CPU is actually much faster than those "XOR chips" (which are I/O processors, not just XOR!).

Obviously this would only be good for an instance where nothing else is taxing the system like in a file server. For a media encoding machine this wouldn't be good since a multi-threaded encoder would eat up all the CPU time and leave little for the XOR operations.
It's senseless to save CPU cycles for storage in this scenario: with a slow storage backend, the encoding software would start idling, waiting for writes to complete before it can process a new chunk. Your storage should not be the bottleneck in this case.
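If you want to sanity-check the "more than 1 GB/s" claim yourself, a rough micro-benchmark sketch like this will do (Python with NumPy; the buffer size and loop count are arbitrary, and an optimized SSE loop in the kernel would be faster still):

```python
import time
import numpy as np

# Two hypothetical 256 MiB buffers standing in for a data stripe and a parity buffer.
SIZE = 256 * 1024 * 1024
a = np.random.randint(0, 256, SIZE, dtype=np.uint8)
b = np.random.randint(0, 256, SIZE, dtype=np.uint8)

rounds = 10
start = time.perf_counter()
for _ in range(rounds):
    np.bitwise_xor(a, b, out=b)   # in-place XOR, roughly what parity generation does
elapsed = time.perf_counter() - start

gb_processed = rounds * SIZE / 1e9
print(f"XOR throughput: {gb_processed / elapsed:.2f} GB/s")
```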
 

frr

Distinguished
Feb 16, 2007
6
0
18,510
I have a couple of notes:

I believe the Areca ARC-1220 uses an IOP-333, rather than IOP-332.
Then again, Areca do seem to update their board designs during the years.

Specifically, all Areca controllers used to have one common outstanding feature:
the separate onboard RAID5/6 XOR ASIC chip, which provided sustained sequential
transfer rates of up to 250 MBps (writes). The reads were actually a tad slower,
maybe 220 MBps. All of this was measured using "cp /dev/zero /dev/sda"
or "cp /dev/sda /dev/null" in Linux, i.e. at the raw block layer.
All of this was available several months before Intel managed to add
RAID6 capability to the later revisions of their IOP333/331 silicon.

This sustained performance level used to be fairly uniform across all the
Areca controllers of that generation. Areca even used to make (or still makes)
external controllers based on the Intel IOP219, which has no on-chip XOR
accelerator at all, and still the performance is the same, owing to the
Areca XOR ASIC.

Looking at the current version of the Areca product web pages, I have to admit
that it seems as if they ditched their trademark auxiliary XOR chip in favour of
the Intel IOPs' on-chip "application accelerator" subsystem, even in that
traditional product line.
This would explain the different figures that the THG benchmarks report,
specifically the lower write throughput. The figures are now closer
to competing products that are based on the bare IOP331/333 silicon.

Note that in the meanwhile, there's a new generation of Areca RAID
products - the various MultiLane boards with SFF-8087 connectors,
such as the ARC-1261ML.
http://www.areca.com.tw/products/pcie341.htm
They're based on the Intel IOP341, and though I've never had a chance
so far to benchmark any specimen of that family, based on analogy
with recent external Areca controllers I am fairly confident that with
SATA drives, these controllers will achieve between 500 and 600 MBps
sustained sequential transfer rate into RAID5 and RAID6.

The external ARC-8100 model actually exceeds 800 MBps,
http://www.areca.com.tw/products/sas_to_sas.htm
but that's with SAS drives. It seems that SATA-II drives come with a
performance penalty on RAID; it's difficult to say whether this is due to the
electrical interface or to other factors (latency?), generally related to the
"desktop vs. enterprise" split between SAS and SATA-II
disk drives on the market today.

For comparison, in my tests, the recent Adaptec 3805 has maxed out at some
380 MBps (read rate from 5 SAS drives), I believe I've observed a write rate
of about 320 MBps. The Adaptec doesn't seem like anything special,
with its fixed cache size - but it's actually pretty powerful at the price.

The new Arecas with 12 ports and above can again have the cache size upgraded.
Also, despite being SATA-II only (due to Marvell HBA chips), the SAS connectors
also contain the SGPIO signals, and the controllers should thus be compatible
with a number of standard SAS hot-swap backplanes (passive = expander-less).

Historically, RAID hardware based on the IOP321 alone would achieve about
100-150 MBps into RAID5, and IOP302/303 based controllers used to provide
about 50 MBps into RAID5 (Adaptec 2120 or LSI MegaRAID 320-1).
These controllers are still on the market today, there are new RoHS versions,
and the prices haven't gone seriously down. Apparently there are regular
customers for whom the appalling performance isn't a problem, and who are
willing to pay a premium for the 100% backwards compatibility.
Industrial process control hardware is all about this.

As for software RAID: it's indeed possible to achieve higher XOR/RS throughput
with a generic x86 CPU - that is, if the Linux native software RAID's boot-up
benchmarks are to be believed :) I seem to recall MMX/SSE throughput figures
of around 3 GBps on a Netburst CPU around 3 GHz. Obviously this only makes
sense if your CPU is generally idle, so you can afford to spend some of its horsepower.
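If you want to see those boot-up benchmark figures on your own box, a minimal sketch like this (assuming a Linux host where the md RAID modules have been loaded and the kernel log is readable) just filters them out of dmesg:

```python
import subprocess

# Pull the kernel log and keep the software-RAID benchmark lines.
# The exact wording varies between kernel versions, so we match loosely.
log = subprocess.run(["dmesg"], capture_output=True, text=True).stdout

for line in log.splitlines():
    if "raid6:" in line or "xor:" in line:
        print(line)
```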

An important issue is management comfort. The Areca firmware seems to provide
pretty good comfort, both during array configuration and during critical situations.
It's actually fairly difficult to rip out the wrong drive, if you have a comfortable
management utility for the RAID and a red failure LED flashing at the front
of the respective drive bay.
The Linux RAID can also be dealt with, but it may take more study to get
the drive replacements right. Some commercial software RAID solutions
provide a better level of comfort, but hardly any are on par with Areca.
Don't get me wrong, I'm not talking about wizards and other gimmicks
- I prefer simplicity, snappiness, clarity and straight-forward presentation.

Then there's the autonomy of operation - if you have a hardware RAID,
your RAID controller's embedded OS is alive regardless of the state
of your host PC's OS.

Given the current state of compatibility with hot-swap backplanes,
you definitely want a hardware RAID to get the failure LEDs to work at all,
and even with a HW RAID, you have to carefully select/integrate your RAID
controller with your hot-swap backplane, to make the LEDs work just right.

As far as video editing is concerned, I believe that whenever you need
to get some transcoding done, you won't have a problem with the disk throughput,
because your data flow will be throttled by the CPU. There are encoding
accelerators, but still. Only at some HD video or cinema-level resolutions,
while grabbing raw video with (nearly) zero compression, will you ever exceed
400 MBps. And, in a video editing station, you definitely want to have all the CPU
horsepower available for the video encoding jobs. Get an Areca 1261/1280,
and you'll never be starved of disk IO bandwidth.

Let me finish off with a funny story:
I seem to recall two or three drive replacement accidents involving various
Adaptec AACRAID controllers. Suppose you have a mirror of two drives,
and one of the drives gets out of sync for some silly reason (cabling mess,
power plug pulled out), but is really still healthy. What I did was attach
the "rejected" drive alone to an AACRAID controller and remove the
logical array from that drive only, by unconfiguring it in the AACRAID BIOS.
I then attached the now "empty" drive to the RAID controller together
with the drive that had survived, with the idea that I'd rebuild the RAID
from the survivor to the emptied drive. Guess what: the AACRAID
purged the logical array from the survivor drive, right at boot :)
The apparent logic is that my deletion of the logical array was the
last known operation on the array, and it was therefore replicated
onto the "survivor" drive...

Never had anything like that happen with an Areca controller.
You plug the drive back in and you can forget about it.

Areca seem to maintain a uniform firmware code base across all
the hardware platforms that they're using. A new firmware version
is usually available for all the platforms in sync. Compared to the
competition, this hints at pretty solid software engineering.
I've never discovered a serious firmware bug in their products
- I've only ever read about bugs fixed in the firmware release notes.

It certainly is silly to use a headline speaking of "RAID scaling of different
RAID levels vs. the number of drives", and then use a particular older
off-the-shelf RAID controller implementation with a pretty deterministic
cap on throughput, characteristic of the particular embedded CPU :)


Regarding volumes over 2 TB: there are several ways to achieve
this, not all of them available in all Windows versions.

Windows 2000 (Server?) can work with multiple <2TB volumes striped together
in software, using the Windows disk management. Generally if your Windows
version can do software striping, it can merge several <2TB disks with a normal
sector size (512B) into a volume of >2^32 blocks. The complete workaround
is that you set up your RAID controller to present several volumes
smaller than 2 TB and stripe them together in software.

Windows 2000 and above can also work with non-standard sector sizes,
namely 2^n multiples of the standard 512B, up to 4 kB per sector.
This is the #1 choice for "volumes over 2TB" with most RAID controllers
out there. Some of them just call this option the "Windows solution
to the 2TB problem". This workaround gives a maximum disk volume of
16 TB (4 kB * 2^32).
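The arithmetic behind these limits is simply the 32-bit block address multiplied by the sector size; a quick sketch:

```python
# A 32-bit LBA can address 2**32 blocks; the sector size sets the volume cap.
blocks = 2 ** 32

for sector_bytes in (512, 1024, 2048, 4096):
    limit_tib = blocks * sector_bytes / 2 ** 40
    print(f"{sector_bytes:4d} B sectors -> {limit_tib:.0f} TiB limit")
# 512 B sectors give the familiar 2 TB ceiling; 4 kB sectors stretch it to 16 TB.
```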

And the most progressive way is to use the standard 512B blocks,
but with a greater than 32bit address. On SCSI, this is called LBA64
(64bit LBA address), which is encapsulated in CDB16 SCSI frame format.
This compares to the older SCSI standards of LBA32/CDB12.
(Note that on IDE/ATA there's LBA48.)
As all the PCI-based internal HW RAID controllers are really ported to
the SCSI subsystem in the host PC's OS, and likely even use SCSI
as the transport framing to the controller's private CPU (IOP) "mailbox",
I believe that LBA64+CDB16 is a valid label for this, even with the internal
PCI RAID controllers.
This 64bit address length and CDB16 need to be supported by the host
operating system. AFAIK, this is only true in Windows 2003 Server SP1
(including 32bit) and the 64bit versions of W2K3/XP/Vista (not sure about
Vista 32bit). And of course any recent version of FreeBSD or Linux.
Note that a modern Linux kernel alone is not enough if your distro is out of
date - the user-space utilities (and glibc?) may have a problem too.

Also, as the classic PC BIOS partition table is limited to 2 TB,
your OS either has to use the volume >2TB raw ("dangerously
dedicated"), or has to support the GPT partition tables.
Support for GPT partition tables generally goes hand in hand
with support for LBA64/CDB16, but don't expect your PC BIOS
to boot from that :)
 

kmagill

Distinguished
Sep 6, 2007
4
0
18,510
Hi,

This is quite a well-timed article, as I am specifying a file server this week. From the information I've read today, I'll be going for a 64-bit Windows dual-core system with a RAID 0 boot disk and a 3TB RAID 5 storage array.

Great, massive amounts of redundancy, and I don't believe more than one HD will fail within the time it takes me to get a replacement... BUT:

What if it's the system motherboard, or CPU, or RAID controller that fails? What should be my plan of action then? And what if one or more of the system components have been phased out? Should I keep a spare of each component in the redundancy cupboard, just in case?

Your thoughts please…