Raid5: 6x 1TB vs 4x 2TB

June 12, 2011 10:09:44 AM

Hello everyone,
I'm looking to upgrade my current hard drives. I've been running various RAID0s on my PC for around 8 years now (current setup is 2x 500GB RAID0 Samsung HD501LJ + 2x 320GB RAID0 Samsung HD321KJ). Since I've started working, I'm thinking of keeping my data more secure in addition to the performance for gaming and video recording.

My mobo is a Gigabyte GA-MA790XT-UD4P, which supports RAID5 on its AMD SB750 southbridge (fun fact: Linux dmraid uses the nvidia kernel driver for this). I was looking into today's 1TB and 2TB drives, and as I'm really satisfied with Samsung (not only for HDDs, but I also use their DDR, monitors, HD TV, Android phone and microwave :D ), I chose 2 disks which may fit:

SAMSUNG EcoGreen F4 2TB 5400RPM 32MB - HD204UI
SAMSUNG SpinPoint F3 1TB 7200RPM 32MB - HD103SJ


My budget isn't unlimited, so buying 2TB 7200rpm drives is out of the question, as they are almost twice the price of the 5400rpm ones. Buying 4x 2TB or 6x 1TB drives will come out about the same (240-280€). So my dilemma looks as follows:

4x 5400rpm 2TB = 6TB data on "3x" increased speed of a single hdd
6x 7200rpm 1TB = 5TB data on "5x" increased speed of a single hdd

My question is, which of those 2 setups will give me a bigger performance boost? From what I've read, the F4 series is faster, cooler and quieter than the F3 series. Would I gain the "full" speed of a 6-drive RAID5, or will the motherboard be a bottleneck?

Thanks in advance for the answers and ideas :) 


June 12, 2011 2:38:26 PM

NEITHER! "Green" drives should not be used in RAID-5. Such drives take too long to report read errors (even the common, correctable ones), which can cause them to be dropped from the RAID. For RAID-5, your first choice is a drive with something called TLER, or Time-Limited Error Recovery. These are usually enterprise-class drives, 2x-3x the price of most consumer drives.
Your second choice, which is riskier, is to find non-TLER drives that others have nonetheless reported to work well in RAID-5.
Your third choice is to use RAID-1 for data security and continue to use RAID-0 as the scratch area for your video work, although I was unaware that video recording itself needed that sort of speed. As for gaming, only level load times are likely to improve; if you can, $200 on a 100GB-120GB SSD will make more sense for your games.
June 16, 2011 1:48:59 AM

I'm sure if you spent enough money on the controller(s), you could overcome the limitations of non-TLER drives. But the OP says he cannot afford 7200RPM drives over 5400RPM ones, so suggesting he buy a $220+ controller to make the low-RPM drives work does not make sense.
June 16, 2011 3:19:56 AM

Thanks for the replies. I wasn't aware of the limitations of a RAID5 setup. I can't find much information about TLER on the wiki or Google, so I'm not really sure what exactly it does. Could you give me some pointers?

The 2nd drive I listed is 7200RPM; would that be more suitable than the first one then? How does the RPM affect reliability anyway?
June 17, 2011 12:55:55 AM

No! That is not the point!...

I don't believe in the TLER issue you're mentioning... I have too many RAIDs running on desktop HDDs.

My conspiracy theory is:
1 - HDD manufacturers want to milk money from customers: they flash the firmware to disable TLER on desktop drives, and the TLER version costs double.

2 - Most RAID controller manufacturers are just too lazy to change their firmware to wait longer for the drive's error-recovery cycle.

But there are some RAID manufacturers that specialize in desktop RAID, so regular desktop hard drives can be used, like the SPM394/393.

BTW, the SPM393 is only $119.00 on Amazon.
June 17, 2011 1:19:17 AM

Well, back in the days of 4.5GB-9GB SCSI drives, I wasted more than one weekend rebuilding RAID-5s (on Adaptec controllers) because a random drive in the array would drop during the week. If the OP can afford it, a better controller is certainly a great idea, but I just don't think he has that option. In that case, he should forgo RAID5 unless he gets TLER drives.
June 17, 2011 2:49:26 AM

TLER is for enterprise data that is worth millions of dollars if lost or otherwise unavailable. It's something to protect you from the 0.00000001% chance of an unreported/unreportable error happening. For desktop use it's about as useful as ECC memory.

emsy,
Go with the 1TB setup, although I'd recommend against RAID5 if you're using a "software" implementation. Without a dedicated I/O processor, the parity information must be calculated on the main CPU. Normally this isn't an issue, as modern CPUs are more than capable of doing this with minimal performance impact. Unfortunately, without an I/O controller you also don't have local HBA caching memory. This means every read/write operation must go back and forth from the CPU to the HBA chip to the disk controllers, which cripples your performance during actual use. I've seen this on every "fake raid" implementation ever done. The only one I've seen with high performance is Solaris ZFS raid-z, which handles I/O differently due to how ZFS is structured.

Short story: go with 4~5 1TB 7200 RPM disks in RAID0. Since you're on a budget, this is the way to go. Ensure your data is frequently backed up. How much data are we talking about here? I'm running 4x 320GB 7200 RPM disks in RAID0, but I have a 2TB 5400RPM disk for backup. I'm also running 4x 1TB 7200 RPM disks in an external enclosure as RAID0; this is shared storage.
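If you build that stripe with Linux software RAID (md) rather than the motherboard BIOS, it's roughly this - just a sketch, and the device names are placeholders for your actual disks:

  mdadm --create /dev/md0 --level=0 --raid-devices=4 /dev/sda /dev/sdb /dev/sdc /dev/sdd   # 4-disk stripe
  mkfs.ext4 /dev/md0      # any filesystem you like
  mount /dev/md0 /data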
June 17, 2011 3:55:05 AM

I'd be careful running RAID 0 ANYWHERE if you care about that data at all. RAID 0 INCREASES the likelihood of losing data compared to a single drive; 4 drives in RAID 0 are MORE likely to fail than a single drive. Why not run 4x 1TB in RAID 10 if you care about performance AND data integrity?
June 17, 2011 4:09:26 AM

tokencode, there is some misunderstanding in what you're saying. The chance of a drive failing in RAID0 depends on the quality of the drive; modern drives rarely fail, although some still do.

But here is what we've discovered about RAID 1/5/10: if one drive failed, chances are that another drive failed too. If the cause of failure was improper maintenance of the enclosure (dust/dirt/etc.), then all drives will be affected. If the cause was a manufacturing defect, then chances are all the other drives in the same lot were also affected. Those 2~3 other drives you purchased at the same time as the failed drive are most likely from the same lot and thus will fail at the same time. So basically, the only way to really protect your data is to use drives from different manufacturing lots, preferably different factories. Or just keep backups on hand. That is honestly the best advice possible: back up as often as you can.

I've been running RAID 0 at home for years and never had a problem. Just choose high-quality drives, don't go with "bargain bin" drives, and NEVER use refurbished drives. Otherwise your problems will stem from the system, not from the drives.
June 17, 2011 9:02:08 AM

Hi palladin9479, thanks for the opinions, they seem really rational. FireWire2 pretty much sums up the point about TLER suggested by jtt283.

Currently I've got around 1.5TB of data on my 2 RAID0s, and it tends to increase. The important data is less than 100GB; for the most part it's videos/movies. I just wanted to try new things, and it would save me downloading terabytes of TV series after a disk failure :p  I don't use Windows on my desktop, so I might consider ZFS, although from what I read it is a "software raid"?

The motherboard I mentioned has RAID 0/1/5/10 built into the SB750; is that enough to cover the issues you mentioned, or is that still a crappy controller? <--- this is what I wanted to ask, but searching for an answer led me to an article I hadn't seen before:

http://www.tomshardware.co.uk/ich10r-sb750-780a,review-...

and... wow, my southbridge is AWFUL. The article also clearly shows how RAID5 is slower than RAID0 (with the same number of drives). What is more, for most tests the increase from a 2-disk array to a 6-disk array isn't that big. (I get over 100MB/s read on mine.)

So all in all, the speed increase (if any) on desktop RAID controllers isn't really worth going for RAID5. Rather than going for a 6-disk RAID0, I'll probably switch to (2x 2-disk RAID0, or 1x 4-disk RAID0, or a 3-disk RAID0 + 1 SSD for the system) and 1x RAID1 for the important data. RAID 0 hasn't failed me so far (and I've had a few), and with increasing HDD reliability I don't expect it to happen any time soon, especially when swapping drives every 2-3 years.

June 17, 2011 9:59:30 AM

RAID5 on just about every "fake RAID" setup is going to have awful performance. The R/W operations have to cross too many subsystems and go back again for it to be efficient. Either get a discrete RAID card or just use RAID 1 / 0 / 10.

ZFS is completely different. Being a Solaris admin, I've used it extensively. For one, it's not "RAID" like the other devices; it doesn't try to fake hardware abstraction. It's a file system that recognizes that data can be lost and chooses to build in its own parity storage. Technically it's an object file system: metadata and real data are different entities inside ZFS. For every block of data there is a block of parity data built and stored across member volumes. Should a member volume become unavailable, there are sufficient parity blocks on the other volumes to reconstruct the member volume and rebuild if necessary. It's very similar to RAID5, except ZFS doesn't require all the I/O requests to be sent to and from the HBA, nor for there to be a virtual "RAID device" faked to the kernel. It even scales up to triple parity: you can do zpool create tank raidz3 disk1 disk2 ... and get three separate parity blocks per stripe, though of course you'd lose capacity equal to three member volumes.
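For instance, with the 6x 1TB option a double-parity pool would look something like this (just a sketch; "tank" and the disk names are placeholders):

  zpool create tank raidz2 disk1 disk2 disk3 disk4 disk5 disk6   # double parity: ~4TB usable out of 6x 1TB
  zpool status tank                                              # shows the raidz2 vdev and its members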

If you have access to ZFS, do not use software RAID; instead create a ZFS storage pool. I was assuming you were running a Windows NT based OS. If you're using a Unix OS or a Linux variant with ZFS support, then the smartest move would be to use ZFS.

Store your data on four 1TB disks and use them in a raidz (single-parity) setup. You won't experience the penalties associated with "fake raid" parity generation, and you still get data redundancy. Also spend the money on a large, slow 5400 RPM backup disk and store compressed dumps on it.
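Roughly like this (a sketch only; "tank" and the disk names are placeholders for whatever your OS calls them):

  zpool create tank raidz disk1 disk2 disk3 disk4   # single parity: ~3TB usable from 4x 1TB
  zfs create tank/data                              # a dataset to actually put files in
  zfs set mountpoint=/data tank/data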
June 17, 2011 11:19:30 AM

Wow, I'm quite impressed by ZFS. http://en.wikipedia.org/wiki/ZFS I really like this part: "It is possible to swap a drive to a larger drive and resilver (repair) the zpool. If this procedure is repeated for every disk in a vdev, then the zpool will grow in capacity when the last drive is resilvered."
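If I read the docs right, the swap would be something like this (pool and device names are just placeholders):

  zpool replace tank small_disk bigger_disk   # resilvers the data onto the larger drive
  zpool set autoexpand=on tank                # newer ZFS versions: grow the pool once the last member is swapped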
There even seem to be projects porting it to Linux (I'm using Gentoo with Reiser4, booting into those RAID0s, and NTFS for large data - a leftover from Windows times):

http://zfs-fuse.net/
http://zfsonlinux.org/

However, a few questions arise about raidz:
1) I couldn't find anything about performance. Do reads/writes utilize all member disks, thus increasing the speed? I.e. is the data striped across the physical drives, or is it treated like a JBOD?
2) With 4x 1TB disks I'll get 3TB of data space, correct?
3) Why would I need the extra drive for backups if raidz already provides data redundancy and recovery?
June 17, 2011 5:05:43 PM

palladin9479 said:
tokencode, there is some misunderstanding in what you're saying. The chance of a drive failing in RAID0 depends on the quality of the drive; modern drives rarely fail, although some still do.

But here is what we've discovered about RAID 1/5/10: if one drive failed, chances are that another drive failed too. If the cause of failure was improper maintenance of the enclosure (dust/dirt/etc.), then all drives will be affected. If the cause was a manufacturing defect, then chances are all the other drives in the same lot were also affected. Those 2~3 other drives you purchased at the same time as the failed drive are most likely from the same lot and thus will fail at the same time. So basically, the only way to really protect your data is to use drives from different manufacturing lots, preferably different factories. Or just keep backups on hand. That is honestly the best advice possible: back up as often as you can.

I've been running RAID 0 at home for years and never had a problem. Just choose high-quality drives, don't go with "bargain bin" drives, and NEVER use refurbished drives. Otherwise your problems will stem from the system, not from the drives.



I respectfully disagree. If he has the budget for 6x 1TB drives, he should be using RAID 10, not RAID 0 across 6 drives. RAID 10 will give him 3x the performance of a single drive plus redundancy. I ran RAID 0 on my home machine, and only did so because I imaged my drive every night. That RAID 0 array was built from VelociRaptors, hardly "bargain" drives, and it still failed. As with all things, it comes down to budget: if you have 4 or 6 drives, I'd do RAID 10; if you have 2 drives, do RAID 0 and image it.
June 17, 2011 5:58:05 PM

One thing most end users fail to take advantage of is S.M.A.R.T.

The SPM394/393 has email notification built in, so you can get an email report if there is any read/write error - then you know the drive is just about to die.

Replace it; it's that simple.

This is the lowest-cost, easiest, most pain-free way to get hardware RAID5.
June 17, 2011 6:43:29 PM

Yes, or if you want good write performance, do the same thing in RAID 10 - far more reliable than RAID 0.
June 18, 2011 12:46:49 AM

Why would you want to lose 50% of the storage space with a 4-drive RAID10, with not much of a speed gain compared to RAID5?

June 18, 2011 1:25:31 AM

RAID 5 is MUCH slower when it comes to writes, especially if you are using software RAID. 4 Drives in RAID 10 would perform better in most circumstances than 4 drives in RAID 5, especially if your page file is going to reside on this volume. Reads on the RAID 5 would be faster, but writes would be much slower.
June 18, 2011 6:36:24 AM

I ruled out RAID10 because I'd lose too much space compared with the RAID5 solution. I really thought RAID5 write performance was much better than it is in reality.

SMART: I'm not able to read SMART from the separate disks in my RAID0, neither on Windows nor on Linux. Is there any special trick to getting it?
June 20, 2011 2:38:13 AM

emsy said:
Wow, I'm quite impressed by ZFS. http://en.wikipedia.org/wiki/ZFS I really like this part: "It is possible to swap a drive to a larger drive and resilver (repair) the zpool. If this procedure is repeated for every disk in a vdev, then the zpool will grow in capacity when the last drive is resilvered."
There even seem to be projects porting it to Linux (I'm using Gentoo with Reiser4, booting into those RAID0s, and NTFS for large data - a leftover from Windows times):

http://zfs-fuse.net/
http://zfsonlinux.org/

However, a few questions arise about raidz:
1) I couldn't find anything about performance. Do reads/writes utilize all member disks, thus increasing the speed? I.e. is the data striped across the physical drives, or is it treated like a JBOD?
2) With 4x 1TB disks I'll get 3TB of data space, correct?
3) Why would I need the extra drive for backups if raidz already provides data redundancy and recovery?


Ok, if you have access to ZFS then don't even think about going to RAID10. ZFS stores its own parity data for the volume. You don't need to "abstract" a fake disk for there to be data redundancy; the filesystem itself assumes responsibility for tracking parity and moving file pieces around.

Files are stored as blocks, and ZFS will try to scatter the blocks evenly across all member volumes. This isn't always guaranteed, and sometimes you get more of a file on one volume or another, but over time ZFS will eventually even out your data across all volumes. Resilvering will force that to happen immediately, but your system will take a large performance hit while it does so, and the operation can take a long time depending on how much data you have.

ZFS does a few interesting things. First you create the storage pool; it is spread out across all member volumes and can have one, two or three parity blocks per stripe (raidz1/2/3), with each parity block consuming the equivalent of one member volume, so be careful. You then slice the storage pool into smaller sub-pools that you mount and store files in. There is also no fixed "stripe" size; instead each file gets a stripe size chosen to maximize performance for that file. Finally, data isn't duplicated until something changes: cloning a snapshot of a 1TB filesystem doesn't create another 1TB copy, it creates a new pointer and metadata set that points to the same data blocks as the original. As data is "written" to the clone, those writes are saved off into new blocks and recorded in its metadata. Over time, if there are enough writes/changes, you end up with a fully separate 1TB copy.
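A sketch of that copy-on-write behaviour (the dataset names are placeholders):

  zfs snapshot tank/data@before           # instant, takes no extra space until blocks change
  zfs clone tank/data@before tank/data2   # writable "copy" that shares the unchanged blocks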

Let's say you have 4x 1TB in a RAID-Z1 array as a single pool "tank" with roughly 3TB (3072GB) usable.
You then create two ZFS sub-volumes, Tank1 and Tank2. All sub-volumes share the storage space, but you can set quotas and different mount points on each volume, and the host OS will treat them as separate file systems. You can resize quotas dynamically without harming the data.

Tank1 = /export/home, quota 512GB
Tank2 = /export/share, quota 1024GB

You can create as many sub-volumes as you want, and each appears as a new volume to the host OS, although they're all sharing space from the same ZFS pool. That ZFS pool is what manages the actual physical location of the file blocks and their parity data.
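In commands that would look something like this (pool/dataset names follow the example above; the sizes are just the placeholders from that example):

  zfs create tank/home
  zfs set quota=512G tank/home
  zfs set mountpoint=/export/home tank/home
  zfs create tank/share
  zfs set quota=1024G tank/share
  zfs set mountpoint=/export/share tank/share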

Now, I keep saying "volumes" instead of "disks" for a very good reason: ZFS doesn't have to be used on physical disks. A member volume can be a physical partition, an entire disk, or a file placed somewhere accessible, like on an NFS or SMB share. The member volumes of a RAID-Z pool don't all have to be the same size, but ZFS will use the lowest common denominator when determining the pool size; excess unused blocks are marked but not used for data storage. As a test you can create four 1GB files, put them in different places, then do zpool create tank raidz file1 file2 file3 file4 and you'll get a small ZFS storage pool with about 3GB usable after parity. Sun doesn't recommend this for production, only lab/testing environments. Ultimately ZFS doesn't care where its volumes come from; it treats them all the same. This is how resilvering onto larger disks results in more space once you've swapped them all out.
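For example (purely for playing around, as noted; the file paths are placeholders):

  mkfile 1g /var/tmp/f1 /var/tmp/f2 /var/tmp/f3 /var/tmp/f4   # Solaris; on Linux use: truncate -s 1G /var/tmp/f1 ...
  zpool create testpool raidz /var/tmp/f1 /var/tmp/f2 /var/tmp/f3 /var/tmp/f4
  zpool list testpool     # ~4GB raw, roughly 3GB usable after single parity
  zpool destroy testpool  # throw it away when done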

In short: use it, it's an amazing file system. It single-handedly makes RAID obsolete. Sun actually recommends that you don't use hardware RAID with ZFS; instead configure all your disks as JBOD and let ZFS control them directly.
June 20, 2011 11:15:58 AM

wow thanks

I was doing some reading and found btrfs, which is a next-gen filesystem similar to ZFS but developed primarily for Linux. Unfortunately it is still in an unstable beta phase. Booting ZFS from GRUB seems to be in beta as well, with some patches and license conflicts :p

http://www.h-online.com/open/news/item/GRUB-1-99-enable...

(the current stable GRUB is 0.97).

Considering all this, I think I'll go for one 100-120GB SSD for the system with reiser4/ext4, and then 4-5 1TB disks with ZFS for everything else.
June 21, 2011 1:45:13 AM

The way the licensing is set up, you have to build ZFS in as a kernel module yourself, or use FUSE.

I use Solaris 10 as my Unix OS of choice, so ZFS is integrated natively. It's really a legal mess right now. The license the Linux kernel is released under makes commercial involvement nearly impossible: if a company develops some piece of software and releases it as free, they need to relinquish all rights to it before it can be distributed with Linux.
June 21, 2011 5:48:08 PM

emsy said:
I ruled out RAID10 because I'd lose too much space compared with the RAID5 solution. I really thought RAID5 write performance was much better than it is in reality.

SMART: I'm not able to read SMART from the separate disks in my RAID0, neither on Windows nor on Linux. Is there any special trick to getting it?


To read SMART parameters from an HDD, the controller has to support it. Most motherboard controllers have this feature. Then you need to run a utility to display the data or email it to you...

Something like this: http://sourceforge.net/projects/smartmontools/
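For example, on Linux the member disks of a motherboard fake-RAID usually still show up as plain /dev/sd* devices, so something like this should work (the device name is a placeholder):

  smartctl -H /dev/sda          # quick overall health verdict
  smartctl -a /dev/sda          # full SMART attribute report
  smartctl -a -d sat /dev/sda   # force SATA pass-through if auto-detection fails

The smartd daemon from the same package can then be configured to email you when an attribute goes bad.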
September 17, 2011 11:31:24 PM

Reporting in with the results:

I've got an 80GB Intel SSD for the root filesystem with ext4, and 5x 1TB Samsungs in a zfs-fuse zpool for data. Managing the partitions and the "backup" properties of ZFS are brilliant. However, performance is an issue I'm a bit worried about.

Reading/writing large files is quite good, but with small files the speed degrades vastly.
Also, any intensive HDD r/w rockets the CPU usage of zfs-fuse to 100% (of 1 core - fortunately I have 6).

  hroch ~ # dd if=/dev/zero of=/zfs/testfile bs=1024000 count=3000
  3000+0 records in
  3000+0 records out
  3072000000 bytes (3.1 GB) copied, 21.4192 s, 143 MB/s

  hroch ~ # dd if=/dev/zero of=/zfs/testfile bs=1024 count=3000000
  3000000+0 records in
  3000000+0 records out
  3072000000 bytes (3.1 GB) copied, 109.56 s, 28.0 MB/s

(for comparison, the same tests on the SSD run at 92/121 MB/s)

Is this a general issue, or is there a way to tweak it? Running some games is a major pain in the ass at loading/saving, and it often leads to lockups.
September 19, 2011 2:45:13 AM

That's not so much a large/small file difference as it is the way the file subsystem works.

Even on magnetic disks, using a block size of 1K (1024 bytes) has dramatic performance impacts. I frequently have to clone disks, and we try to use an 8M (8192K) block size to optimize the I/Os.

It's the disk controller that has the issue: every block is a new I/O request that it has to process. And if our T5220s and T6340s have issues with tiny blocks, then any home system will definitely have issues.
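For example, re-running your second test with a bigger block size (roughly the same ~3GB of data, same test file path) should land much closer to your first result:

  dd if=/dev/zero of=/zfs/testfile bs=8M count=375 conv=fdatasync   # ~375 large I/O requests instead of 3 million tiny ones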