Poor raid 5 performance on Dell Poweredge R710

rcfant89

Distinguished
Oct 6, 2011
546
3
19,015
I have a Dell PowerEdge R710 with a RAID 5 array (5 disks) on the PERC H700 RAID card. For some reason I am overloading my drives during NZB data transfers. This never used to happen; it just started, where the transfer spins up, hits around 13 MB/s, and then pauses with this message:

[screenshot of the error message]


The thing is, each drive should be able to read and write at roughly 100-150 MB/s, and RAID 5 performance should be at least marginally better than a single drive. This thing is crapping out at barely 10% of that.



This array is 20 TB raw (5 disks at 4 TB each) with about 14.5 TB of usable space. I ran Dell Command Update and updated everything available, and did the same with Windows Update.

I had a few VMs running but I shut those down and rebooted the server and nothing changed. So off a fresh boot, basically nothing running, it still craps out at 13 MB/s.

I did do a disk replacement a little while ago (maybe 2 months?) and overall performance seems to have dropped since then. Perhaps the array is misconfigured, but everything looks good to me.

Any ideas?






 

Doctor Rob

Distinguished
Jul 21, 2008
676
3
19,160
It is one reason I use RAID 10 on my servers... it may be more expensive, but it's much faster. I would check your RAID build to be sure it's running properly. I don't have that card; I have a different one with 1 GB of cache and have so far not run into that issue.
 

stdragon

Admirable
That PERC H700 controller should suffice. I assume it has 512MB of cache per the specs? Also, SATA or SAS drives? What make/model and RPM? I'm assuming the replacement drive you installed isn't rated any slower in specs?

A few things you can do. First, make sure you have the 64-bit Dell OpenManage Server Administrator Managed Node (OMSA), v8.5, installed. It's about 233 MB if you have to download it from support.dell.com. It's not required, but it does make it easy to manage the PERC options from within Windows; otherwise you'll be going into the controller manually on a reboot. Specifically, you'll want to review the hardware logs and run a consistency check across that RAID 5 array to validate and/or correct parity issues. Don't worry, it will do this for you, but it is a lengthy process that can take hours to a few days. You can still use the server in production, but there will be a performance hit until the process completes.
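
If you'd rather not click through the GUI, OMSA also installs the omreport/omconfig command-line tools, and from an elevated prompt something along these lines should kick it off (I'm assuming controller 0 and virtual disk 0 here; the first command shows you the real IDs and the current vdisk state):

omreport storage vdisk controller=0
omconfig storage vdisk action=checkconsistency controller=0 vdisk=0

The first lists the virtual disks with their state and layout; the second starts the consistency check on whichever one you pick.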

Also, be sure to update the BIOS, iDRAC + Lifecycle Controller, PERC H700 firmware and H700 driver if needed. Depending on what version of the Lifecycle you have, you should be able to reboot the server and initiate its menu (F10 I think). If given the option, choose to update via FTP, and plug in ftp.dell.com. Leave all authentication to defaults. It should connect and provide a list of updates.

FYI, it's been awhile since I've worked on a generation of PowerEdge units that old.
 

rcfant89

Distinguished
Oct 6, 2011
546
3
19,015
It is 512 MB of cache on the RAID card. The disks are Toshiba MG03ACA400: 4 TB, 7200 RPM, 3.5-inch, SATA 6.0 Gb/s, 64 MB cache, enterprise hard drives.

I replaced all 5 HGST hard drives with 5 of the above drives. I never had any issues with the HGST drives, but they were only 2 TB each. They are technically the same specs (7200 RPM, 64 MB cache, "enterprise grade"), but these just seem way worse.

I have OMSA v 9.1.0.

I think Dell Command Update covers all of those things (BIOS, iDRAC, etc.), so I believe they are all up to date. (At least on Dell laptops it updates all drivers and the BIOS; I would assume servers are the same.)

So I guess I will start with a consistency check. Thanks.
 

stdragon

Admirable
In theory, the Toshibas should provide more throughput than the HGST drives, given the higher areal density of 4 TB versus 2 TB drives. That said, neither drive in an array should be hobbling along at 13 MB/s.

I got to thinking about a similar issue I had once with a RAID array. It turned out one of the drives in the array was failing, but still hadn't flagged a SMART failure (honestly, SMART isn't very reliable at proactive failure detection like it's supposed to be, but I digress). The performance drop was substantial and lasted a week. Eventually the drive failed, and performance shot back up to normal, minus the array now being left in a degraded state. Good news: we replaced the drive, the array rebuilt back to full fault tolerance within 6+ hours, and all was back to normal. I suspect you may be dealing with the same issue, where performance is dragged down to the slowest common denominator. One failing drive in a RAID array can hobble the entire system as the other drives wait on it.

As for the Dell Command Update, I'm not sure for servers. But when in doubt, check the firmware revisions of the hardware as reported in OMSA and then compare that to what's available for your R710.
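
If OMSA is installed, a quick sanity check on each spindle from the command line (again assuming controller 0) would be:

omreport storage pdisk controller=0

That lists every physical disk with its state and failure-predicted flag, so a drive that's limping along but not officially dead will sometimes show up there before SMART ever admits to anything.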
 

rcfant89

Distinguished
Oct 6, 2011
546
3
19,015
That's a good thought. I guess all I can do is use the available tools to check disk health and wait for the failure.

The consistency check finished but didn't say anything:
[screenshot of the consistency check result]


Edit: I ran CrystalDiskMark on my laptop's SSD (the result on the left) and on my RAID 5 array (the result on the right).

[CrystalDiskMark results screenshot]
 

stdragon

Admirable
Very interesting. So clearly the problem isn't the combined throughput of all spindles in the array; rather, the issue is IOPS. Those Toshiba MG03ACA400 drives are rated for a maximum of 155 MB/s on the outer track, so 5 drives gives a combined 775 MB/s of theoretical sequential reads across the array. Given overhead and where exactly on the platters the sequential reads are occurring, 669 MB/s is about right and what I would expect.

By chance, did you change the default stripe size when you created your array? If so, what did you set it to? If not, then I still stand by the theory of a failing drive. It's possible there's an issue with an actuator or R/W head someplace causing problems with seeking.

Another possibility is a problem with the cache module itself on the PERC controller. I'm guessing you've already run a diagnostic on it to validate?
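
If OMSA is handy, two quick reads on the controller side are worth a look (controller 0 assumed, and I'm going from memory on the exact syntax):

omreport storage controller controller=0
omreport storage battery controller=0

The first shows the controller and firmware details, the second the cache battery. If the BBU has failed or is sitting in a learn cycle, the H700 typically falls back to write-through and write performance drops hard, which would look a lot like what you're seeing.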
 

rcfant89

Distinguished
Oct 6, 2011
546
3
19,015


I don't *think* I changed the stripe size; if I did, it's whatever the default was. What should it be?

I haven't run a diagnostic on the PERC controller. Can I do that the same way as with Dell laptops? Hit F10 on boot and run hardware diagnostics? I have done that on many Dell laptops but not on a Dell server. Thanks.



Yes, that's right. It depends on the transfer, but yes, it's usually a lot of small files, around 50 MB or so each.

That's a good idea, I'll try the ISO transfer. Thanks.
 

stdragon

Admirable
I don't *think* I changed the stripe size; if I did, it's whatever the default was. What should it be?

Oh, just asking in the event either you or someone else changed it. If you didn't specify, it's already at defaults.

FYI, changing the RAID stripe size biases the volume toward either more throughput or better IOPS, but not both. It's a sliding scale with a trade-off. It's best to just leave it at the default unless you have a specific corner-case need.
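
To put rough numbers on it (purely as an illustration, not your actual settings): with a 64 KB stripe element across the four data disks of a 5-drive RAID 5, a 256 KB sequential read touches all four spindles in one pass, while a 4 KB random read lands on a single disk. A larger element keeps more small IOs confined to one disk (better for lots of concurrent small requests), while a smaller element slices large transfers across more spindles (better for single-stream throughput). That's the sliding scale.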
 
May 10, 2018
2
0
10
Hello

You must not use RAID 5 with 4 TB drives; that's insane.

If one drive fails, the rebuild time will be so long (days) and the probability of encountering a URE is so high that you risk another drive failure and losing the whole array.

You should immediately go for RAID 6 or RAID 10; but given that you seem to need good read/write performance, go for RAID 10.

So you need to buy 3 extra 4 TB drives.
 

stdragon

Admirable
He can't do that. The Dell PowerEdge R710 is a 2U rack-mount server with 6x 3.5-inch drive bays, and he's already using 5 of them in the array.

Assuming the remaining bay isn't populated with an OS drive, he could either increase capacity by adding a sixth drive to the RAID 5, or keep the volume at the same capacity and gain extra fault tolerance by moving to RAID 6 with a sixth drive. FYI, he would have to blow away the array to do either upgrade regardless.
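
Back-of-envelope with 4 TB drives: his current 5-disk RAID 5 is (5-1) x 4 = 16 TB raw, which lines up with the ~14.5 TB usable he reported. A 6-disk RAID 5 would be (6-1) x 4 = 20 TB, a 6-disk RAID 6 is (6-2) x 4 = 16 TB (same capacity as today, with an extra disk's worth of fault tolerance), and a 6-disk RAID 10 is 3 x 4 = 12 TB.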
 
May 10, 2018
2
0
10
The drawback of RAID 6 is that he will lose performance compared to RAID 5.
Personally, I use RAID 6 for pure storage (backup).

I see that he could go for RAID 10 with 6 disks (~11 TB of usable space).

I know that he will have to blow away his array, but since it's currently not working correctly, he might as well test/benchmark each drive separately.
If each drive works on its own, then it's the RAID controller that's failing.
 

stdragon

Admirable
With Dell PERCs you don't have to deal with URE issues much, because background patrol reads are performed automatically during idle periods with the default settings. Meaning, if a patrol read encounters a parity error within a segment of data, it will identify and correct it. That said, it's generally best practice to perform a volume consistency check periodically to sweep the entire array for that extra level of confidence.
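
If you want to confirm patrol read is actually enabled and running, the controller detail in OMSA reports it; from the command line (controller 0 assumed, and I'm going from memory on the exact output labels):

omreport storage controller controller=0

Look for the patrol read mode/state lines in that output. There's also an omconfig action to kick one off manually if the mode has been switched to manual, but on the defaults (auto) it should just run on its own during idle time.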

As for changing read cache settings, here's a breakdown of what options are typically available (based on the model of PERC) and what they do. I'll cut and paste from Dell:

• Read-Ahead—When using read-ahead policy, the controller reads sequential sectors of the virtual disk when seeking data. Read-ahead policy may improve system performance if the data is actually written to sequential sectors of the virtual disk.

• No-Read-Ahead—Selecting no-read-ahead policy indicates that the controller should not use read-ahead policy.

• Adaptive Read-Ahead—When using adaptive read-ahead policy, the controller initiates read-ahead only if the two most recent read requests accessed sequential sectors of the disk. If subsequent read requests access random sectors of the disk, the controller reverts to no-read-ahead policy. The controller continues to evaluate whether read requests are accessing sequential sectors of the disk, and can initiate read-ahead if necessary.

• Read Cache Enabled—When the read cache is enabled, the controller reads the cache information to see if the requested data is available in the cache before retrieving the data from the disk. Reading the cache information first can provide faster read performance because the data (if available in the cache) can more quickly be retrieved from the cache than from the disk.

• Read Cache Disabled—When the read cache is disabled, the controller retrieves data directly from the disk and not from the cache.


Typically, I just keep it at Adaptive Read-Ahead. But by all means, I encourage changing the policy settings and then benchmarking to see whether the changes provide any benefit.
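
If you do want to experiment, OMSA can flip the read policy without a reboot; something like this should do it (controller 0 / vdisk 0 assumed, and ra / nra / ara are the usual values for read-ahead, no-read-ahead, and adaptive read-ahead):

omconfig storage vdisk action=changepolicy controller=0 vdisk=0 readpolicy=ara

Then re-run your CrystalDiskMark / NZB transfer test after each change and compare.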
 
Solution