Linux Fileserver - Hardware Questions - All Help appreciated

August 1, 2006 6:45:30 PM

I've got several projects that I am about to begin. Many of them are going to need a lot of disk space. So, I wanted to have a box (that would be shared by all projects) that was only responsible for storage. I'm looking to keep costs low, but not cut corners. Plus, I really like building things myself!

That said, here are my questions.

1) I want to keep power requirements down, so I want the most efficient processor for the task. If I am ONLY serving files and use a dedicated RAID card and GigE card, how much processing power would I really need?
2) In regard to question 1: with dedicated hardware handling more or less all the heavy lifting, does the motherboard make a difference? Any suggestions? Expandability?
3) I'm pretty much set on SATA for price/performance; any recommendations for a RAID controller? I'll probably start out with a 4-disk array, but could move up to 8+ in the future... That said, one that allows for multiple controllers per system would be a plus.
4) I'm familiar with Red Hat and Ubuntu, but am thinking of going with Debian for this box. Any suggestions for a different distro?
5) This will be my first RAID setup, so are there any good resources on how to scale this from initially ~1TB to eventually ~8-10TB?
6) This will be a rackmount system, so any recommendations for cases? I would prefer hot-swappable drives, but that's not an absolute must, especially if the drives are easily accessible.

The main reason that I want this dedicated barebones fileserver is that I want it to be rock solid and independent of the different projects.

Any help on any of the questions is greatly appreciated. Also any other advice in general is welcomed as well!
August 2, 2006 12:46:12 PM


Ok, sounds like fun.

My answers ( not always the right ones :)  )

1. The amount of processor power you will need depends in part on what type of GigE card and RAID controller you use. Onboard controllers eat CPU cycles, so if you're using both onboard RAID and onboard Ethernet, you'll want something starting at a P4 2.8+ or an Athlon 64 3000+. If you use add-in cards for your Ethernet and RAID, you won't need much CPU power at all; a PIII, a Barton core, or a low-voltage AM2 chip would work here for sure. I give those as examples hoping you'll find some of the parts lying around. That's just a recommendation, so no one get all snippy.

2. Good question. File servers don't need much if you configure them correctly. You can go two routes: used and cheap, or new and upgradeable. If you don't already have parts on hand to begin your build (like RAM and a CPU), then you would benefit from going new. In that case I think you could get a cheap AM2 board and CPU with integrated video, which is more similar to real-life servers.

3. Tom's did an article on this a while back. Those cards are good and cheaper now. Don't mind the PCI-X format; those cards are compatible with PCI 2.0, they just won't run at 133MHz. In PCI or PCIe formats you'll only see about 5 channels max on one card (among the affordable ones). Best practice would say that 5 drives is the max you want to put in one array.

4. I would suggest a FreeNAS setup or a Fedora build, only because Fedora is a launch pad for Red Hat builds.

5. You're typically not going to go from 1TB to 10TB in one box, so you'll want to start small and work your way up, maybe with mirrored boxes.

6. Curve ball! If that's the case, then you'll want to get a chassis for the array. Yours is going to be a large setup; I might suggest just buying an NSM 160 or some type of iSCSI array. You'll save money. But you could always build and upgrade your own. If this is your first time, start small.


Good luck
August 2, 2006 1:37:42 PM

I had the same issue some time ago... How did I solve it?

1) had a used computer (Tbird 1100) and added a lot of memory - 2GB (it had a "fast" board - KT266A)
2) used LVM + Linux RAID (only used hardware controllers to add ports, since I was limited to the 4 ATA ports) - most of the cheaper solutions are software RAID anyway (even when using a RAID controller), and the Linux implementation is very good.
3) SATA seems the way to go
4) the best way to choose the distro is to choose the one you know best... and the one your geek friends use ;)  also Google is very helpful
5) the biggest advantage of using Linux LVM + RAID is that it isn't dependent on a single controller's capacity. It can grow as long as you can add drives (see the sketch at the end of this post)
6) hot-swapping depends on the controller BIOS/RAID controller. As for the case: you'll be adding a lot of drives (LVM + RAID of 10x750GB ~= 6TB), so remember drive bays, power cabling, PCI slots for extra controllers... it all counts...

you can always have a look at the Sun X4500 ;)

In this type of machine the CPU is not very important; what's really important is memory (to cache disk reads) and I/O (fast HDs on fast controllers with enough bandwidth).
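To make point 5 concrete, here's a minimal sketch of the md + LVM stack (device names and sizes are made up; mdadm --create and mkfs will of course wipe the disks):

Code:
# one RAID5 array across four whole disks
mdadm --create /dev/md0 --level=5 --raid-devices=4 /dev/sda /dev/sdb /dev/sdc /dev/sdd
# LVM on top, so the volume isn't tied to one controller or one array
pvcreate /dev/md0
vgcreate vg_storage /dev/md0
lvcreate -L 2000G -n share vg_storage   # grow later with lvextend + resize2fs
mkfs.ext3 /dev/vg_storage/share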
August 2, 2006 2:19:19 PM


Agree with most points, especially the Linux RAID implementation. He won't need any RAID controller, and if he's going to start from scratch, there are motherboards with something like 8 SATA ports; he could go 8x750GB, for instance (I'm too lazy to calculate the total), with only onboard ports.
But I kind of disagree about the CPU. Depending on the RAID level he will need a decent CPU - I guess an Athlon 64 3000+ would be good, but nothing like the dual cores I've seen people suggesting in other topics.
August 2, 2006 2:40:57 PM

OK, for something in the 10TB area I would use SAS (Serial Attached SCSI) with SATA drives. It will allow you to have all the benefits of SCSI and the cheaper side of Serial ATA. They can also hot-swap. Since you would be connecting through SCSI, you can put the storage core anywhere you can make the SCSI cable reach. Any of the newer SCSI controllers should be able to do this; the main problem would be finding the interfaces between the SCSI controller and the SATA drives. This is moving out of the NAS area and into a SAN-type storage design: greater expandability, greater data transfer speeds, but also greater expense.

If you want to keep things cool, the best way to do it would be to go with an AMD Opteron dual core or with a new Conroe; either way the wattage will be lower, therefore heat will be lower, and the dual core would be put to good use, esp. with file serving, and both are 64-bit.

Last, your greatest investment will be in RAM: the more RAM, the better your computer will serve the machines around it. Depending on the motherboard you invest in, you can put up to 8GB of RAM in the machine. This would be the biggest factor in choosing a motherboard. You'll do well with either of the processors mentioned above, but RAM and page swapping are the greatest concern.

As far as the OS, I would stick with what you have the greatest knowledge in. I use Fedora Core 5 for my cluster; Red Hat does best as far as I'm concerned, but again that will be your choice.
August 2, 2006 2:58:59 PM

Thanks for all of the input so far. I've been reading up on all of this and there is just so much information. For future reference for anyone going through this same process: the only way you are going to keep from going crazy is to start pinning down the aspects you know for sure, because otherwise there are so many variables that you won't ever make any progress.

A couple of the responses mentioned using Linux software RAID. I'm currently using it on one of my boxes and don't have any complaints, but it is just a simple RAID1 setup for basic fault tolerance.

I was pretty sure that I wanted to go with a hardware solution, but if I'm going to reconsider software then I've got a couple of questions:

1) Has anyone really used this for a multi-TB setup? If so, how is the performance? One of the functions this box will be supporting is hosting MythTV (a Linux version of TiVo) files (MPEG-2, I think). There will be at most 3 front-ends streaming from it, 2 of which could be HD... So, will it be able to handle that type of load?

2) How well does it scale with RAID5? If I started with say 3x750GB drives, can I just add in a 4th, 5th, etc and have the disk incorporated into the array?

3) How well does it handle failures? Can it support hot-spares and hot-swapping?

4) If I had 8 on-board SATA ports, would there be any way to add additional disks down the line?

5) An advantage of hardware RAID is that the host CPU isn't affected, but with software RAID it is. I've heard that any new processor can handle the load, but that's usually with smaller numbers of drives and smaller arrays. If I did get this thing up to several TBs, is that going to be an issue?

Some good/positive answers to those questions could get me looking further into a software solution, but does anyone have a good rule of thumb for when it becomes a better idea to go the hardware route?
August 2, 2006 3:11:38 PM

Quote:

1) Has anyone really used this for a multi-TB setup? If so, how is the performance? One of the functions this box will be supporting is hosting MythTV (a Linux version of TiVo) files (MPEG-2, I think). There will be at most 3 front-ends streaming from it, 2 of which could be HD... So, will it be able to handle that type of load?

I've used it just like you, as a RAID1 mirrored setup, and I noticed it's noticeably faster than an onboard SATA controller. Considering that a dedicated card provides better performance than onboard controllers, and the Linux implementation is faster than onboard, it sits at least between a dedicated card and the onboard controller, with the advantage of cost.
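If you want numbers instead of impressions, a quick and dirty way to compare setups is hdparm's read test (crude and sequential-only, but enough for a rough ranking):

Code:
hdparm -tT /dev/md0   # the software RAID device
hdparm -tT /dev/sda   # a single member disk, for comparison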

Quote:
2) How well does it scale with RAID5? If I started with say 3x750GB drives, can I just add in a 4th, 5th, etc and have the disk incorporated into the array?

yes (http://www.tldp.org/HOWTO/Software-RAID-HOWTO-3.html#ss...)
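For what it's worth, recent mdadm can do the RAID5 reshape natively (you need mdadm 2.4+ and a 2.6.17+ kernel, I believe). A sketch, assuming the filesystem sits directly on /dev/md0 and the new disk is /dev/sde (names made up):

Code:
mdadm /dev/md0 --add /dev/sde            # new disk joins as a spare
mdadm --grow /dev/md0 --raid-devices=4   # reshape from 3 to 4 members
cat /proc/mdstat                         # watch the reshape progress
resize2fs /dev/md0                       # then enlarge the ext2/ext3 filesystem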

Quote:
3) How well does it handle failures? Can it support hot-spares and hot-swapping?

Same as above. As for hot-swapping, it depends on the controller chip; if you go SATA II, it supports hot-swapping.
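A hot spare is just an extra disk added to a healthy array. A sketch of adding one and testing the failover (device names made up):

Code:
mdadm /dev/md0 --add /dev/sdf      # extra disk on a healthy array = hot spare
mdadm /dev/md0 --fail /dev/sdb     # simulate a member failure
cat /proc/mdstat                   # the spare should now be rebuilding
mdadm /dev/md0 --remove /dev/sdb   # pull the "failed" disk out of the array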

Quote:
4) If I had 8 on-board SATA ports, would there be any way to add additional disks down the line?

Through a dedicated card, I guess. For example: 8 onboard ports plus 4 on a dedicated card.

Quote:
5) An advantage of hardware RAID is that the host CPU isn't affected, but with software RAID it is. I've heard that any new processor can handle the load, but that's usually with smaller numbers of drives and smaller arrays. If I did get this thing up to several TBs, is that going to be an issue?

I guess it depends on the load, not the size: how many people are accessing it at the same time, and how often? Anyway, I think a setup that can handle high I/O matters more than the parity calculations, for instance.

oh, at this site there's some good info:

http://www.tldp.org/HOWTO/Software-RAID-HOWTO.html
August 2, 2006 4:06:59 PM

That does sound promising, but on that site, in the section concerning reconfiguration and adding disks, the recommended tool is described as not "production ready."

I'd hate to lose several TBs. So, has anyone used that tool? Because, if I understood correctly, that is what I would use to add additional drives *after* the original array was created.

Further clarification of a scenario that I would want to be able to do incrementally (RAID5 with 750GB drives):
1 ) Create an initial 3 drive array (1.5TB)
2 ) Add 4th drive and increase capacity (2.25TB)
3 ) Add 5th drive as a hot spare
4 ) Add 6th drive and increase capacity (3TB)
5 ) Add 7th and 8th drives and increase capacity (4.5TB)

If I can do that, then I'm pretty impressed. But if I can:
6 ) Add a 4-port card
7 ) Add 2 drives to the card and have them incorporated into the existing array (6TB)
8 ) Add 2 more drives to card (7.5TB)

Then I'd be really impressed and am really going to start considering this as an option.

I did a pretty good amount of reading on software RAID when I set my other box up, and it is stable and has good performance. But since I wasn't looking at adding capacity I didn't read that much about its expansion capabilities.
August 2, 2006 4:52:41 PM

hmmmmm i havent read that.
well, but the text below, about backups is really important too, and totally complements the text above. In my opinion, it's pretty much like what partition magic does in windows. It resizes an existing partition on the fly, and although some people say it's safe, on the other side i've seen people losing all their data. Partition magic is supposedly "production ready" and still has its risks.
Changing disks/partitions WITH data on them ir a risky task on it's own, and even raid being a solution which can increase data security, it does not eliminate the need for backups when you deal with it.
I, personally, never used any tool of this kind, neither on windows or linux. Even though i know there are programs that can resize partitions on the fly for both. When i want to change my partitions i simply backup all my data and re-partition from scratch, no resize.

By the way, when you said about adding disks i took a different approach. I thought about adding the disks on another volume, not to the existing one, that's why i haven't realized about the risk of using that tool
August 2, 2006 4:57:15 PM

1) yes
2) yes
3) yes
4) yes
5) yes
6) yes
7) yes
8) yes
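
For 6 ) and 7 ): md doesn't care which controller a disk hangs off, so the drives on the add-in card just show up as more /dev/sd* devices and get added the same way. Roughly (device names made up, and you need a recent kernel/mdadm for the reshape):

Code:
mdadm /dev/md0 --add /dev/sdi /dev/sdj   # two drives on the new 4-port card
mdadm --grow /dev/md0 --raid-devices=9   # 9 active members, the hot spare stays a spare
resize2fs /dev/md0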
August 2, 2006 5:08:37 PM

Quote:
1 ) yes
2 ) yes
3 ) yes
4 ) yes
5 ) yes
6 ) yes
7 ) yes
8 ) yes


Haha, "8 )" (without the space) is the code for a smiley. I went back and added a space to all of mine to keep it from doing that...

Not doubting you, but have you (or someone you know) done it? Theory and practice are two different things. What utilities did you/they use?

It's just a good chunk of money (not to mention time and effort), and I would hate to go down a path only to realize that it was a dead end.
August 2, 2006 5:14:06 PM

Just a little input on CPU: my home file server and firewall is an... AMD K6-2 @ 350MHz!
192MB of RAM, 2x 100Mb Ethernet, 1x 1Gb Ethernet, 1x 80GB HDD (OS drive), 3x 300GB HDD (RAID5), and it sits idle 96% of the CPU time while drawing only 65W!

You absolutely don't need fast or modern hardware for a file server; if you can get your hands on an old and cheap PC, grab it immediately!
August 2, 2006 5:48:04 PM

Cool, I like the possibilities of going the software route.

Now, onto the next piece of the puzzle, motherboards.

1) What motherboards do people recommend?
2) I've found plenty that can support 8 SATA drives, but it is usually 4xSATA I and 4xSATA II. Are there any that support 8xSATAII?
3) What are the differences between "server" and "gaming" oriented motherboards? Am I going to notice a performance difference from one of the other?
4) For future expansion, what type of PCI (and its derivatives PCIe, PCI-X, etc) do you recommend?

Thanks, I finally feel as though I am making some progress on this journey!
August 2, 2006 6:34:25 PM

Quote:
I did a pretty good amount of reading on software RAID when I set my other box up, and it is stable and has good performance.


I appreciate all of the good input here. I'm planning to take a 2.8GHz P4 box and turn it into a file server in the near future. Its mobo has no SATA ports, so I'm shopping mobos for it. I don't need many onboard SATA ports, since I'm planning to use an 8-port PCI controller for RAID10. I'm looking forward to being able to consolidate various libraries that are strewn about the network.
August 2, 2006 7:08:15 PM

Quote:
I'm looking forward to being able to consolidate various libraries that are strewn about the network.

That is what I am trying to prevent with this effort. I've got several projects coming up and don't want that to happen.

Quote:
I'm planning to take a 2.8GHz P4 box and turn it into a file server

If that is all the box is going to be doing, then you could look into underclocking it and save some cooling and power. The P4s are power-hungry beasts.
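If BIOS underclocking isn't an option, kernel frequency scaling gets you part of the way. A sketch with cpufrequtils, assuming the CPU and kernel support it (P4 clock modulation only throttles, so the savings are modest):

Code:
modprobe p4-clockmod            # cpufreq driver for P4 clock modulation
modprobe cpufreq_powersave      # the "powersave" governor
cpufreq-set -c 0 -g powersave   # pin CPU 0 at its lowest speed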
August 2, 2006 7:12:25 PM

Quote:

1) What motherboards do people recommend?

Dunno :p  I think any board that has the 8 SATA ports you want.

Quote:
2) I've found plenty that can support 8 SATA drives, but it is usually 4xSATA I and 4xSATA II. Are there any that support 8xSATAII?

I've seen a few, but they're expensive ones. Some I've found:
Epox EP-MF570 SLI
ASUS P5WD2-E Premium

There must be others around

Quote:
3) What are the differences between "server" and "gaming" oriented motherboards? Am I going to notice a performance difference from one of the other?

Server boards usually have PCI-X slots and dual CPU sockets, require ECC memory, sometimes have onboard SCSI, and support a lot of memory, like 8 or 16GB. For a file server you probably won't notice any difference, as the differences that matter would be in the disks you choose.

edit: game oriented boards have stuff like sli, 666-channel audio, tons of overclocking features

Quote:
4) For future expansion, what type of PCI (and its derivatives PCIe, PCI-X, etc) do you recommend?

From what I've seen on Newegg, for instance, there are more PCI-X cards, followed by PCIe, then PCI. So I think PCI-X would be interesting to have.
August 2, 2006 7:27:28 PM

Quote:
I'm planning to take a 2.8GHz P4 box and turn it into a file server in the near future. Its mobo has no SATA ports, so I'm shopping mobos for it.

Depending on the storage size you need, you won't even need a new board. Linux's RAID works with PATA drives. In principle, 4 drives is enough for RAID10, so if you don't need more than that you won't have to buy a new motherboard or controller card...
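
For the 4-drive PATA case, a minimal md RAID10 sketch (device names made up; ideally one drive per IDE channel so master/slave pairs don't fight over bandwidth):

Code:
mdadm --create /dev/md0 --level=10 --raid-devices=4 /dev/hda /dev/hdc /dev/hde /dev/hdg
mkfs.ext3 /dev/md0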
August 3, 2006 1:43:13 AM

Quote:
Depending on the storage size you need, you won't even need a new board. Linux's RAID works with PATA drives. In principle, 4 drives is enough for RAID10, so if you don't need more than that you won't have to buy a new motherboard or controller card...


Thing is, I got a great deal on some 400GB SATA HDs, so they are in hand and ready to go. The mobo won't cost that much, and I need to reinstall the OS anyway because I'm tossing the current PATA OS drive when I reconfigure. It's developed some problems that look to be permanent, and although the box is limping along, it's time to give it a fresh shot of love and get it going strong again.

The plan is to have the OS on an 80GB drive, then the library on an 8-drive RAID10. I'd also like to have two or three fairly large drives running off the mobo SATA ports just for project work space, intermediate backup space, etc. All those drives will require a new case too, so I'll probably use the Armor I have. That sucker will be hefty with 12 HDs in it. Time to do more homework.

Thanks for the advice.
August 7, 2006 5:58:59 PM

First of all, "RAID is not a backup." I wouldn't think about putting that much data on a server without any form of backup. Think about how many different ways there are for you to lose your data: a virus, a bad OS patch, hacking, wrong user operations, software bugs, a mistake during software reconfiguration, a mistake during hardware reconfiguration, a mistake during storage expansion, a mistake during hardware failure recovery, a PSU failure damaging the computer...

While of course you can go a long, long time without any of these happening, all it takes is one bad failure to cost you all your data.

Multi-TB storage doesn't have any good affordable backup solutions, so I'd budget for 2 servers, the second one cheaper and optionally less reliable, but with enough capacity to hold full backups of the primary and perhaps incremental ones in addition. You should also shut this second server down when not in use so it can serve its role as an independent backup.
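One cheap way to do the full-plus-incremental scheme on the second box is rsync with hard links: each run produces what looks like a full backup, but unchanged files are hard links into the previous snapshot, so only the changes consume space. A sketch, with the hostname and paths made up:

Code:
rsync -a --delete --link-dest=/backup/2006-08-06 \
    primary:/srv/storage/ /backup/2006-08-07/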