I have a few questions regarding GPGPU computing that I am very hopeful you all will be able to help me with. I am a graduate student getting my PhD in neuroscience, and a major component of my work is simulating large-scale neural networks based on recordings I have performed on real neurons. Our lab is currently investing in a few Tesla systems for highly parallel simulations that we used to outsource to larger computational resources on campus.
My major goal here is to build myself a new system where I will be able to test my CUDA code before I run it on the bigger system. I would also like to not spend my entire life in the lab, so having a rig at home where I can do some of my work would be favorable. In addition, I occasionally have time to play games and would like my system to be able to perform decently with most modern titles.
I am going to lay out a couple of the options that I have put together so far and would really appreciate any comments the community could provide on the viability of any of the combinations. I am mostly torn between deciding on the GTX 400 or 500 series and determining whether I want to throw more money down for an Intel CPU or to stick with the lower cost AMD solution.
I'd go with the Phenom II if you need more physical cores for threaded apps. The Sandy Bridge is better for gaming, but I'm not too sure about the difference in threaded apps with Hyper-Threading off if you are going to overclock it.
I was thinking the same thing, and I have read that SLI and CUDA coding can be considered perpendicular processes: SLI makes all the cards one addressable computing block, whereas CUDA gives you the flexibility to discretely address each Streaming Multiprocessor (SM) on the cards. With the two cards I was planning on running independent simulations that occasionally communicate through the CPU, but I have also read that this sort of path can give rise to thread pile-ups. I want to get as many SMs out of this rig as possible.
Regarding the AMD CPU, I had a similar inclination, but I have read so many reviews about how good Sandy Bridge is that I was beginning to doubt my judgement. I have always leaned toward AMD and I think the X6 will definitely provide me with what I am looking for.
If you guys have any more advice as to whether I should choose a single high-end card or dual mid-range cards, I would really appreciate it.
The 460/560 are relatively poor CUDA cards. Their architecture is better suited to gaming than CUDA (they will work, but at about 2/3 of the performance you'd expect). I'd suggest a 470 or 570 for a CUDA box. Another thing to remember: NVidia artificially limited double precision on the gaming cards to 1/8 of the single precision rate (vs. the hardware limit of 1/2 that the Teslas get), so if you are doing DP, the gaming card will be a lot slower than your Teslas at the lab.
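To put rough numbers on that ratio, here is a back-of-the-envelope sketch in Python. The 1000 GFLOPS single precision peak is a made-up round number, not the spec of any real card; only the 1/8 and 1/2 ratios come from the post above.

```python
from fractions import Fraction


def peak_dp_gflops(peak_sp_gflops, dp_ratio):
    """Theoretical double precision peak from a single precision peak and the DP:SP ratio."""
    return float(peak_sp_gflops * dp_ratio)


# Hypothetical SP peak, chosen only to keep the arithmetic easy to follow.
HYPOTHETICAL_SP_GFLOPS = 1000

GAMING_DP_RATIO = Fraction(1, 8)  # capped ratio on the gaming cards, per the post above
TESLA_DP_RATIO = Fraction(1, 2)   # ratio the Teslas get, per the post above

gaming_dp = peak_dp_gflops(HYPOTHETICAL_SP_GFLOPS, GAMING_DP_RATIO)
tesla_dp = peak_dp_gflops(HYPOTHETICAL_SP_GFLOPS, TESLA_DP_RATIO)
slowdown = tesla_dp / gaming_dp

print(f"gaming-card DP peak: {gaming_dp:.0f} GFLOPS")
print(f"Tesla-ratio DP peak: {tesla_dp:.0f} GFLOPS")
print(f"slowdown at equal SP peak: {slowdown:.0f}x")
```

So at the same single precision peak, the cap alone costs about a factor of four in theoretical DP throughput. Real kernels that are limited by memory bandwidth will likely see a smaller gap.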
Multi-card setups just got a lot easier with the 4.0 Toolkit thanks to the addition of direct GPU-to-GPU copy and the memory addressing you mentioned. However, it is still harder than programming for a single card.
Sandy Bridge is the best processor choice; however, if going AMD allows you to upgrade the GPU to a 470/570, then the slight downgrade will be worth it. As the host processor is primarily driving the client CUDA GPUs, fast single-thread performance is important (more so than having more threads).
Thanks for bringing up the FP64 point, as I had completely forgotten that they didn't have the same double precision capability. What do you think about going with a single lower-end Quadro to up the FP64 processing speed? Most of the work I am doing uses vector arrays where each variable is expressed as a double precision number. Thus, double precision speed is going to be a pretty big factor. The Quadros are just so darn expensive.
Well, I often use double precision (except when folding) and the 470 does pretty decently. The 465 is the same architecture as the 470/480, so it will scale down in speed by the number of cores you lose. And one thing I misspoke on earlier: the 460/560 is great in single precision but poor in double (so what I said was true for your case, I just wanted to clarify).
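A rough sketch of that scaling in Python, assuming the core counts from public spec listings (352 CUDA cores on the 465, 448 on the 470; worth double-checking) and assuming equal clocks, so that throughput scales linearly with cores:

```python
def relative_throughput(cores, reference_cores):
    """Fraction of the reference card's compute throughput, assuming equal clocks
    and performance that scales linearly with core count."""
    return cores / reference_cores


# Core counts are assumptions taken from public spec listings; verify before relying on them.
GTX_465_CORES = 352
GTX_470_CORES = 448

ratio = relative_throughput(GTX_465_CORES, GTX_470_CORES)
print(f"465 vs 470: {ratio:.1%} of the throughput")
```

Under those assumptions the 465 lands a bit under 80% of a 470. Code that is limited by memory bandwidth rather than arithmetic won't follow this exactly.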
On Newegg at least, the 465 is about $175 after MIR, which is in the same ballpark as the 460. Unfortunately, they all use the reference cooler, which runs a little hot for my tastes.
NVidia has kind of dropped the ball (purposefully) on us scientific programmers on a budget.
The 465 has very few models left (all reference and hot), the 470 is gone, the 480 has a couple of models left but is $280+, the 460/560 were not designed for double precision and thus are relatively slow at it, and the 570/580 is too much $$.
With all that, I'd suggest looking into the 560 Ti. I know I said it has slow DP, but it has a much higher core clock than my 470, which should make up for some of it. If that costs too much, it is a close race between taking a performance hit and getting the power-efficient 460, or getting the hot and power-hungry 465.