How to monitor HD use? PCIe SSD vs SATA RAID 0

RS82

Honorable
Oct 2, 2012
37
0
10,530
For work I need to do matrix operations on really, really huge data sets: 30 GB and more.
We have some budget, but not nearly as much as would really be required.
If we put all our RAM in one basket, we can have a 64 GB machine (667 MHz DDR2 FB-DIMM RAM). Even that is still not enough (additional memory is required for intermediate results and operations), so our programs still have to use scratch space on the HD, which of course slows things down.

Now the idea was to use some SSDs in RAID or a RevoDrive as a huge scratch disk (like putting the Windows page file on the complete drive), in order to minimize the HD bottleneck.

Now my question is which would be best: putting two 120 GB drives in RAID 0, or buying a RevoDrive 3 (or a 3 X2, if lucky) on eBay? (We are limited to SATA2 and PCIe 2.0.)

My googling suggests that it all depends on the use. But how can I monitor my use?

I mean, if I had both disks I could run my problem as a benchmark. But I don't have the disks, and I would still like to know which benchmarks my problem compares to best.

How do I know how the programs are writing to the disk? Are they doing it in small 4 KB blocks or huge 500 MB blocks? Is the program regulating that, or is Windows/Linux regulating it? Can I change such settings somewhere? Is there some kind of monitoring software that I can run alongside my problem, and that then gives me statistics about how the hard disk was used?

Any suggestions are welcome! Thanks in advance.
 
1) Why on earth would you even consider RAID 0 for a critical workload? You should be fired even for thinking the thought! RAID 1 or above. RAID 0 is only for schleps like myself who need a larger partition on the cheap and are not running critical workloads.

2) A pure SSD RevoDrive is going to have MUCH more bandwidth than a pair of SSDs, so if you can find/afford one, then this would absolutely be the way to go. A RevoDrive that is caching an onboard laptop HDD will not be quite as impressive. Moving to a pair of SSDs (in RAID 1) is also going to be significantly faster than your HDDs, likely at a much better price point.

3) Hardware is cheap. Go up to your manager, or finance officer, or whoever holds the purse strings in your company and show them the amount of manpower that goes into fiddling about with a server/workstation attempting to do a workload that it is not designed for compared to the cost of simply buying a better machine made for your workflow. It is cheaper to buy a better machine every single time, and will free you up to do other things on your to-do list. They just don't get it until they see numbers.
 

RS82



Well, I am working at a not-so-rich university, and as I said, budget is an issue. We can spend 250 euro on a RevoDrive, but what do you think a DDR3 system with 256 GB of RAM would cost?
For problems that exceed your RAM size, the RAM size relative to the problem and the HD speed are much more critical than the memory speed, because the HD becomes the bottleneck. Do you have MATLAB or Octave? Just invert a matrix that is larger than your RAM and you can test it yourself. I am quite sure our old dual quad-core Xeon E5472 (3 GHz, 1600 MHz FSB) with 64 GB of 667 MHz DDR2 still beats a flashy new DDR3 system with 16 GB of RAM (provided they have the same mechanical hard drive). So, to respond in short: yes, winning the lottery and spending lots of money would solve the problem, but since we are on a budget, I am looking for help with a cost-effective solution.
 
ah, university level... that is a bit different.

DDR3 has been on the market a long time now. A few phone calls to large local businesses may net you a much newer used system for free which could fix a lot of your problems. Otherwise look online for used servers, they tend to be quite cheap and will still run a lot faster than what you currently have.
 

RS82



Thanks for your reply.
1) I think RAID 0 for a scratch drive is perfectly OK! There is no critical data stored there; it is just for temporary data during the matrix operations. It is just like putting your Windows page file on that RAID 0. The worst thing that could happen is that you get a bluescreen and have to restart your computer. All crucial data is stored in a safe place, don't worry. Sorry if that was not clear. Thanks for worrying so much that you would fire me for that.

3) I completely agree, the budget just doesn't cooperate. It's not really the fault of my boss/professor.

2) This is what I would like to know, but in hard (or if necessary soft) numbers.

For example, the RevoDrives have excellent speed, but especially when the data is compressible.
Random write performance is not much better than that of a normal SSD or an SSD RAID.
How can I find out how the hard drive is used during the calculations? I mean, it is difficult for me to determine which of the specs are the most important for this type of application: sequential read/write speed? Random 4 KB read/write speed? IOPS? Raw transfer rates? How compressible is that temporary work data? I mean, I can theoretically save it on disk and zip it and see the data reduction, but I don't think that would be really representative of the type of compression used here. Is there any further advantage to the PCIe drive being on the bus and not behind a SATA controller? How do Windows or Linux use the HD when they use it as a page file/swap partition? Are they writing little bits as they come from the processor, or are they collecting the little bits into larger chunks in RAM and then writing them to the disk? Can I influence how they handle this? Can I measure how they handle this?
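
On the compressibility question, here is a rough sketch of how one could at least get a ballpark number. It compresses a dumped sample of the work data in small independent chunks instead of zipping the whole file, on the assumption that a drive controller only ever sees small blocks at a time. zlib is only a stand-in here, not whatever the drive actually uses, and the 4 KiB chunk size and the function names are my own choices for illustration. Dense floating-point data often compresses poorly, but that is worth measuring rather than assuming.

```python
import zlib


def compression_ratio(data: bytes, chunk_size: int = 4096, level: int = 1) -> float:
    """Return compressed size divided by original size.

    Each chunk is compressed on its own (assumption: the drive works on
    small independent blocks), so a ratio near 1.0 means "incompressible".
    """
    if not data:
        raise ValueError("no data to measure")
    compressed = 0
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        compressed += len(zlib.compress(chunk, level))
    return compressed / len(data)


def file_compression_ratio(path: str, sample_bytes: int = 64 * 1024 * 1024) -> float:
    """Measure the ratio on the first sample_bytes of a dumped scratch file."""
    with open(path, "rb") as handle:
        return compression_ratio(handle.read(sample_bytes))
```

A ratio close to 1.0 would suggest the "compressible data" headline numbers do not apply to this workload; a ratio well below 1.0 would suggest they might.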

I was kind of hoping there would be disk-monitoring software that tells me how the drive is used during my workload, so that I know which of the benchmark criteria I should prioritize for a particular use.
Another simple solution would be a remote machine with a RevoDrive for rent or trial, so that one could test it for a particular application.
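
On the monitoring question: on Linux the kernel already keeps per-device counters in /proc/diskstats, and tools like `iostat -x` report from them. On Windows, Performance Monitor has PhysicalDisk counters such as "Avg. Disk Bytes/Read" and "Avg. Disk Bytes/Write". Below is a minimal sketch for the Linux side, assuming the standard /proc/diskstats field layout and its 512-byte sector unit; the function names and the example device name are made up for illustration. It takes two snapshots while the job runs and reports the average request size, which hints at whether the workload looks more like "4 KB random" or "large sequential".

```python
import time

SECTOR_BYTES = 512  # /proc/diskstats counts sectors in 512-byte units


def parse_diskstats(text: str) -> dict:
    """Map device name -> selected counters from the text of /proc/diskstats."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 14:
            continue  # skip blank or unexpected lines
        stats[fields[2]] = {
            "reads": int(fields[3]),
            "sectors_read": int(fields[5]),
            "writes": int(fields[7]),
            "sectors_written": int(fields[9]),
        }
    return stats


def average_request_kib(before: dict, after: dict) -> dict:
    """Average size in KiB of reads and writes completed between two snapshots."""
    result = {}
    for kind, ops_key, sec_key in (("read", "reads", "sectors_read"),
                                   ("write", "writes", "sectors_written")):
        ops = after[ops_key] - before[ops_key]
        sectors = after[sec_key] - before[sec_key]
        result[kind] = (sectors * SECTOR_BYTES / 1024 / ops) if ops else 0.0
    return result


def sample(device: str, seconds: float = 10.0) -> dict:
    """Snapshot /proc/diskstats twice (Linux only) for one device, e.g. "sda"."""
    with open("/proc/diskstats") as handle:
        before = parse_diskstats(handle.read())[device]
    time.sleep(seconds)
    with open("/proc/diskstats") as handle:
        after = parse_diskstats(handle.read())[device]
    return average_request_kib(before, after)
```

Running something like `sample("sda", 60)` while the matrix job is swapping gives the average read and write sizes over that minute. The averages hide the distribution and say nothing about how random the access pattern is, so treat the result as a first hint, not a full profile.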

 

RS82



Thank you for the suggestions. Asking some local businesses in particular could be a good idea, but I fear they are not swimming in money and hardware either. Believe it or not, this system was actually already bought used.
3 server blades, each with 8 Xeon 3.0 GHz cores and 32 GB, for a total of about 1000 euro (1.5 years ago, it was the best performance for the buck that we could find on the budget). Honestly, they are old, but they deliver about 80 GFLOPS per machine; newer i5s don't do that much better. This year we went wild, and there is a used GTX Titan coming for GPGPU programming. When I said budget, I meant like 1000 euro a year :(. Many of the guys here even bring in their own money; we were even thinking about doing collections in town. Maybe poverty in some ways stimulates creativity, but it also takes a lot of time.
 
lol, I hear ya! I work for a nonprofit that relies on equipment donations to keep things running all of the time. You certainly hear a lot of "no" before you hear a "yes" when begging for equipment, but it is still worth doing. Moving from a SATA2 and DDR2 based system to a SATA3/DDR3 based system will do wonders. Yes, the processors may not be all that much faster, but they will be fed much more efficiently which will make for some nice performance gains if you can find it. Certainly something worth pursuing, even if you have to pay the shipping.