School server environment storage problem

petchap

Honorable
Oct 20, 2013
I am working at a school and have limited knowledge of some server capabilities. Here is the problem:
With the new media courses, a bottleneck is forming on the network drives when pupils access videos. There are also wider network problems developing, with slow profile loading and other network storage access faults. Currently the Fujitsu DX60 has 4 x 7,200 rpm SAS drives (I know this is not good), hence the bottleneck. The options I can think of to improve the bottleneck are as follows; I would like your opinion on the pros and cons of each route:
1. Lowest cost: purchase more 7,200 rpm drives to improve space and performance (probably another 4).
2. Purchase a set of 15K rpm drives and replace the 4 x 7,200 rpm (I think I cannot mix the drives, as they have to be matched).
3. Purchase a new rack-mount SATA bay and fit 6 x 480 GB SSDs for performance.

Please comment on these suggestions or tell me the best solution.
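For rough context, here is a back-of-envelope sketch comparing the three options. The per-drive throughput numbers are only assumptions (real results depend on RAID level, controller, and workload), not measurements from our DX60:

```python
# Back-of-envelope only: per-drive MB/s figures below are assumed ballpark values,
# not measurements, and ignore RAID overhead.

GIGABIT_USABLE_MB_S = 117  # roughly what a single 1 GbE link delivers after overhead

options = {
    "1. 8x 7,200 rpm SAS": {"drives": 8, "mb_s_each": 130},
    "2. 4x 15K rpm SAS":   {"drives": 4, "mb_s_each": 200},
    "3. 6x 480 GB SSD":    {"drives": 6, "mb_s_each": 450},
}

for name, opt in options.items():
    array_mb_s = opt["drives"] * opt["mb_s_each"]
    through_link = min(array_mb_s, GIGABIT_USABLE_MB_S)  # capped by the single gigabit link
    print(f"{name}: ~{array_mb_s} MB/s at the array, ~{through_link} MB/s over one 1 GbE link")
```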
 

petchap

Honorable
Oct 20, 2013
Not the network. We installed 1000 Mbps with fibre to each switch to ensure network speed, and storage access is still slow. The storage has a Gbps connection to the server in the same cabinet.
 

petchap

Honorable
Oct 20, 2013
Sorry not to give you this in the last post: it is one server running 2 virtual servers under VMware, with ~300 client computers on Windows 7 and Server 2010. There is Broadcom wireless as well, but everything except ~20 laptops is wired. The problem is never experienced on a laptop, as they see light usage and run most applications locally.
 

COLGeek

Cybernaut
Moderator
Sounds like a very capable network. I agree that your current bottleneck seems to be the single server. I assume that the load spikes at the beginning of each class and the server just bogs down.

You may want to consider a second server to split the load. Even with a faster storage sub-system, you will still have a single server (faster drives would help).
 

petchap

Honorable
Oct 20, 2013
The server isn't spiking too badly, but the server storage is slow. Looking for advice on whether more of the slower drives would help, and what else to do. I think a SATA bay with SSDs is the mid-cost, high-speed way to go, but how many SSDs should I buy? Or would more 7,200 rpm SAS drives be a better bet?
 

COLGeek

Cybernaut
Moderator
I like the 15k HDD option (your original option 2), if you can maintain your needed capacity. If not (capacity-wise), I would expand with the SAS 7200rpm HDDs.

As fast as they are, I am not a huge fan of SSDs in a server environment. The cost of storage space needed can be prohibitive.
 

popatim

Titan
Moderator
You said the storage has a Gb connection to the server?
So basically your server's storage is a gigabit-connected NAS - to me this looks like a huge bottleneck.
Who knows, maybe you've looked into it and the current drives cannot saturate this link.
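A quick sanity check on that, with assumed per-drive figures (sequential and random behave very differently):

```python
# Can 4x 7,200 rpm SAS drives saturate a single gigabit link?
# Per-drive numbers are assumptions; random throughput with many clients seeking
# at once is far lower than sequential.

link_mb_s = 1000 / 8 * 0.94     # usable MB/s on 1 GbE, assuming ~94% efficiency
drives = 4
sequential_each_mb_s = 130      # assumed large-block sequential per drive
random_each_mb_s = 1.5          # assumed small-block random per drive under heavy seeking

print("Sequential array throughput:", drives * sequential_each_mb_s, "MB/s")
print("Random array throughput:    ", drives * random_each_mb_s, "MB/s")
print("Usable gigabit link:        ", round(link_mb_s), "MB/s")
# Sequential reads can saturate the link, but mixed random access from many
# pupils probably cannot - so either end could be the limit depending on workload.
```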
 

dgingeri

Distinguished
I have seen major storage issues with ESXi servers and 7200rpm drives. It seems as if ESXi doesn't want to use all the drive bandwidth. I've seen the same with 1Gb networks. (I work in a software test lab with 5 ESXi test hosts, where the testers are working with backup software.) I've seen VMs set up on 8 drive RAID6 sets of 7200rpm drives have a transfer bandwidth of less than 80MB/s, and network transfers of single VMs get limited to 50MB/s. The limitations aren't the hardware, but something in ESXi.

However, many of my testers have gotten past this by running more VMs. Instead of testing with one VM backing up to our disk products at a time, they run 4 or 6 and are pushing 200MB/s for their tests. Perhaps your answer isn't in expanding the hardware, but running 4 VMs instead of 2.
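To illustrate that point with rough numbers (the per-VM cap is what I've observed; the array ceiling is just an assumed figure):

```python
# Per-VM throughput is capped well below the array's limit under ESXi,
# so aggregate throughput grows by running more VMs in parallel.

per_vm_cap_mb_s = 50      # observed network-transfer cap for a single VM
array_cap_mb_s = 250      # assumed ceiling for the whole RAID set

for vms in (1, 2, 4, 6):
    aggregate = min(vms * per_vm_cap_mb_s, array_cap_mb_s)
    print(f"{vms} VM(s): ~{aggregate} MB/s aggregate")
```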
 

petchap

Honorable
Oct 20, 2013
Still have the problem of the network accessing the SAN very slowly. I think increasing the number of servers will help with the login speed problems I am having. My plan is to purchase another 4 or 5 drives for the SAN and see what speed increase I get. Or is this throwing good money after bad?
 

choucove

Distinguished
May 13, 2011
I believe there are several pieces here causing your problem; it can't ONLY be narrowed down to drive speed.

First, you have three hundred or so end client computers, and ALL of these are potentially requesting access to the same storage server. Split your gigabit connection across that many client computers and each one is crawling. Of course I'm not saying they are all simultaneously pulling data through that server, but you are still talking about a ton of access having to go through a single gigabit channel.

Even if you do increase the throughput of your server storage, you're still limited by that gigabit connection. A single 7K SATA hard drive can saturate a single gigabit Ethernet channel, so while faster hard drives will give you greater data throughput at the array, they don't address the bottleneck at your network interface. You should also look at increasing network throughput for individual connections, either by using link aggregation with a team of ports or by investing in 10 GbE.

While a group of 7K SAS hard drives isn't great for performance, it's still pretty good depending on the other underlying hardware (RAID controller, OS, RAID type, etc.). You may wish to consider separating out to additional storage servers (or a NAS) to help distribute the load. Putting additional or faster storage into a single server can get you by, but again that depends on the rest of your hardware.

The reason I might recommend an additional server, instead of piling that workload onto the same server, is that you're still running into bottlenecks: even with more 7K or faster 15K drives, everything still goes through that single RAID controller, that single processor, that single network adapter, and so on. With two separate servers you have twice the flexibility and roughly half the load on each.

This also means you can specialise your servers for their storage needs. For example, keep your current storage server exactly as it is, but put in a second server with new, faster 15K SAS drives and store shared video or database files there, while photos, documents, and other "lightweight" data that doesn't need much throughput stays on the original server with the 7K SAS drives.
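To put rough numbers on the "split the gigabit connection" point (the fraction of clients hitting storage at once is just an assumption):

```python
# Rough per-client share of the storage link during a busy spell.

clients = 300
active_fraction = 0.25                      # assumed share of clients hitting storage at once
active = int(clients * active_fraction)

for label, link_mb_s in [("1x 1 GbE", 117), ("2x 1 GbE team", 234), ("10 GbE", 1170)]:
    per_client = link_mb_s / active
    print(f"{label}: ~{per_client:.1f} MB/s per active client ({active} active)")
```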
 

FireWire2

Distinguished


I would troubleshoot the problem first. Here is what I would do:
Test RANDOM access of your SAS array using IOmeter.
You need over 100 MB/s in RANDOM read.

Assume you get 100 MB/s random, and each video stream requires 1.5 MB/s (DVD quality).
You are looking at streaming 60+ clients in theory, but in a real application you are lucky to get 40+, without ANY other computers accessing the network.

Like the other posts say, you need to know where the bottleneck is - IMO a shotgun approach is not a good one.
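The arithmetic behind that estimate, for reference (the derating factor is a rough assumption):

```python
# Streams supported by ~100 MB/s of random read, at ~1.5 MB/s per video stream.

random_read_mb_s = 100          # target random-read throughput for the SAS array
stream_mb_s = 1.5               # per-client bitrate, roughly DVD quality

theoretical = random_read_mb_s / stream_mb_s
realistic = theoretical * 0.6   # assumed derating for seek contention and other traffic

print(f"Theoretical concurrent streams: ~{theoretical:.0f}")   # about 67
print(f"Realistic concurrent streams:   ~{realistic:.0f}")     # about 40
```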
 

petchap

Honorable
Oct 20, 2013
Light login load
4.87 Mbps
failed due to time-out

Light load
36 Mbps writing
69.6 Mbps reading

During low load (between class logins)
51 Mbps writing
39.8 Mbps reading (odd result)

Standard login (class start)
38.7 Mbps writing
45 Mbps reading

Normal class load
19 Mbps writing
110 Mbps reading

These seem like confusing results. I think it might be a network broadcast storm. All switches are gigabit HP units linked via fibre optic.
 

FireWire2

Distinguished
Work has been a bit crazy! Could not check back in.

Can you confirm the transfer figures are Mbps, not MBps - there is an 8-to-1 difference between MB and Mb.
If your data really is in Mb <--- you do have a major issue in your network.
I would plug in directly (bypassing the router/switches) to troubleshoot further.
Can you list your router and switch models?
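If those figures do turn out to be megabits per second, the conversion looks like this (labels taken from your post):

```python
# Converting the reported figures from megabits to megabytes per second.
# If these really are Mbps, the MB/s equivalents are tiny for a gigabit network.

readings_mbps = {
    "Light load, writing": 36,
    "Light load, reading": 69.6,
    "Normal class load, reading": 110,
}

for label, mbps in readings_mbps.items():
    print(f"{label}: {mbps} Mbps = {mbps / 8:.1f} MB/s")
```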