Simulating Neural Networks using GPGPU computing

Last response: in Graphics & Displays
May 7, 2011 10:37:04 PM

Hi all,

I have a few questions regarding GPGPU computing that I am very hopeful you all will be able to help me with. I am a graduate student getting my PhD in neuroscience and a major component of my work is simulating large scale neural networks that represent recordings I have performed on real neurons. Our lab is currently investing in a few Tesla systems that will be used to perform highly parallel simulations that we used to outsource to larger computational resources on campus.

My major goal here is to build myself a new system where I will be able to test my CUDA code before I run it on the bigger system. I would also like to not spend my entire life in the lab, so having a rig at home where I can do some of my work would be favorable. In addition, I occasionally have time to play games and would like my system to be able to perform decently with most modern titles.

I am going to lay out a couple of the options that I have put together so far and would really appreciate any comments the community could provide on the viability of any of the combinations. I am mostly torn between deciding on the GTX 400 or 500 series and determining whether I want to throw more money down for an Intel CPU or to stick with the lower cost AMD solution.

Theoretical System 1:

2 x EVGA GeForce GTX 460 (Fermi) SSC+ 1GB 256-bit GDDR5

1 x ASUS P8P67 WS REVOLUTION LGA 1155 Intel P67 / NVIDIA NF200 (16x, 16x, 8x, 8x)

1 x Intel Core i5-2500K Sandy Bridge 3.3GHz (3.7GHz Turbo Boost)

1 x G.SKILL Ripjaws X Series 16GB (4 x 4GB) 240-Pin DDR3 SDRAM DDR3 1600

1 x CORSAIR CMPSU-750TX 750W


Theoretical System 2:

2 x EVGA SuperClocked GeForce GTX 560 Ti (Fermi) 1GB 256-bit GDDR5

1 x ASUS M4N98TD EVO AM3 NVIDIA nForce 980a SLI ATX AMD Motherboard *listed wrong mobo

1x AMD Phenom II X6 1090T Black Edition 3.2GHz

1x G.SKILL Ripjaws X Series 16GB (4 x 4GB) 240-Pin DDR3 SDRAM DDR3 1600

1 x CORSAIR CMPSU-750TX 750W




Those are pretty much the two options that I am trying to choose between. I will most likely have one SATA 6 Gb/s drive in either system.

I would really appreciate any suggestions anyone might have based on experience with components of GPGPU computing in general. Thanks everyone!

Harrison
May 8, 2011 12:56:16 AM

I don't think CUDA works well with SLI; wouldn't it be better to get a faster single card?
May 8, 2011 1:14:52 AM

I'd go with the Phenom II if you need more physical cores/threaded apps. The Sandy Bridge is better in gaming, but I'm not too sure about the difference in threaded apps with HyperThreading off if you are going to overclock it.
May 8, 2011 1:33:55 AM

You're not going to OC a CPU when doing science. Bit errors will kill you.
May 8, 2011 2:41:04 AM

Thanks for the replies guys.

Math1337:

I was thinking the same thing, and I have read that SLI and CUDA coding can be considered orthogonal: SLI makes all the cards one addressable computing block, whereas CUDA gives you the flexibility to discretely address each Streaming Multiprocessor (SM) on the cards. With the two cards I was planning on running independent simulations that occasionally communicate through the CPU, but I have also read that this sort of path can give rise to thread pile-ups. I want to get as many SMs out of this rig as possible.


Regarding the AMD CPU, I had a similar inclination, but I have read so many reviews about how good Sandy Bridge is that I was beginning to doubt my judgement. I have always leaned toward AMD and I think the X6 will definitely provide me with what I am looking for.

If you guys have any more advice as to whether I should choose a single high-end card or dual mid-range cards, I would really appreciate it.

Harrison
May 8, 2011 4:12:47 AM

The 460/560 are relatively poor CUDA cards. Their architecture is better for gaming than CUDA (they will work, but at about 2/3 of the performance you'd expect). I'd suggest a 470 or 570 for a CUDA comp. Another thing to remember: NVidia artificially limited double precision on the gaming cards to 1/8 of SP (vs. the hardware limit of 1/2 that the Teslas get), so if you are doing DP, the gaming card will be a lot slower than your Teslas at the lab.
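
To put rough numbers on that double precision gap, here is a back-of-the-envelope sketch in Python. The 1/8 and 1/2 ratios are the ones quoted in the post above; the 1000 GFLOPS single precision figure and the function name `dp_gflops` are made up purely for illustration, so plug in the spec-sheet number for whichever card you are considering.

```python
from fractions import Fraction


def dp_gflops(sp_gflops, dp_ratio):
    """Estimate peak double precision throughput from the single precision peak."""
    return sp_gflops * dp_ratio


# Hypothetical single precision peak, NOT a real card's spec.
SP_PEAK = 1000

gaming_dp = dp_gflops(SP_PEAK, Fraction(1, 8))  # ratio quoted for the gaming cards
tesla_dp = dp_gflops(SP_PEAK, Fraction(1, 2))   # ratio quoted for the Teslas

# How many times faster the uncapped part is at the same SP peak.
advantage = tesla_dp / gaming_dp

print(f"gaming card: {float(gaming_dp):.0f} GFLOPS DP")
print(f"Tesla:       {float(tesla_dp):.0f} GFLOPS DP")
print(f"Tesla advantage at equal SP peak: {float(advantage):.0f}x")
```

At an equal single precision peak the cap alone costs a factor of four in double precision, before any differences in clocks or memory bandwidth are taken into account.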

Multi cards just got a lot easier with the 4.0 Toolkit thanks to the addition of direct GPU to GPU copy and the memory addressing you mentioned. However it is still harder than doing a single card.

Sandy Bridge is the best processor choice; however, if going AMD allows you to upgrade the GPU to a 470/570, then the slight downgrade will be worth it. As the host processor is primarily driving the client CUDA GPUs, fast single-thread performance is important (more so than having more threads).
May 10, 2011 1:32:47 AM

What do you think about the 465?

Thanks for bringing up the FP64 point, as I had completely forgotten that they didn't have the same double precision capability. What do you think about going with a single lower-end Quadro to up the FP64 processing speed? Most of the work I am doing uses vector arrays where each variable is expressed as a double precision number. Thus, double precision speed is going to be a pretty big factor. The Quadros are just so darn expensive.

Thanks Again

Harrison
May 10, 2011 2:57:14 AM

Might be better off setting up a remote desktop with the lab computers. Still waiting for a cloud based neural network to go up. Skynet anyone?

Best solution

May 10, 2011 3:36:10 AM

Well, I often use double precision (except when folding) and the 470 does pretty decently. The 465 is the same architecture as the 470/480, so it will scale down in speed by the number of cores you lose. And one thing I misspoke on earlier, the 460/560 is great in single precision, but poor in double (so what I said was true for your case, just wanted to clarify).
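
A quick sketch of that scale-by-cores estimate, in Python. It assumes throughput is proportional to CUDA core count times clock, which ignores memory bandwidth and is only a first guess. The core counts (352 for the 465, 448 for the 470) and the shared 607 MHz reference clock are the commonly listed reference specs and do not come from this thread, so double-check them against the card you actually buy; `relative_throughput` is just an illustrative name.

```python
def relative_throughput(cores, clock_mhz, ref_cores, ref_clock_mhz):
    """Ratio of (cores * clock) against a reference card of the same architecture."""
    return (cores * clock_mhz) / (ref_cores * ref_clock_mhz)


# Assumed reference specs (not from the thread); verify before relying on them.
GTX_465 = {"cores": 352, "clock_mhz": 607}
GTX_470 = {"cores": 448, "clock_mhz": 607}

ratio = relative_throughput(
    GTX_465["cores"], GTX_465["clock_mhz"],
    GTX_470["cores"], GTX_470["clock_mhz"],
)

print(f"465 vs 470, compute-bound estimate: {ratio:.2f}x")
```

The same function can be reused for an overclocked card by changing `clock_mhz`, but it only makes sense between cards of the same architecture and the same double precision ratio.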

On Newegg at least, the 465 is about $175 after MIR, which is in the same ballpark as the 460. Unfortunately they are all the reference cooler, which gets a little hot for my tastes.

The cheapest Quadro that has good double precision is the Quadro 4000 (the rest are handicapped as bad as the gaming cards). http://www.nvidia.com/content/PDF/product-comparison/pr...
And it is pretty expensive ($700+).

NVidia has kind of dropped the ball (purposefully) on us scientific programmers on a budget.

The 465 has very limited models left (all reference and hot), the 470 is gone, the 480 has a couple models left but is $280+, the 460/560 were not designed for double precision thus are relatively slow at it, and the 570/580 is too much $$.

With all that, I'd suggest looking into the 560 Ti. I know I said it has slow DP; however, it has a much higher core clock than my 470, which should make up for it some. If that costs too much, it is a close race between taking a performance hit and getting the power-efficient 460 or getting the hot and power-hungry 465.
May 15, 2011 4:36:19 AM

Best answer selected by Harrison_013.
May 15, 2011 9:35:36 AM

This topic has been closed by Maziar