
Home-Built Supercomputer for Scientific Computing

March 12, 2009 9:54:38 PM

Hello all!

I am an aerospace engineer looking to build a homemade "supercomputer", for lack of a better name. The computer's sole purpose will be to perform computational fluid dynamics (CFD). As you may know, CFD is very demanding when it comes to CPU performance. I would like the system to run a Linux OS. It will primarily be running FORTRAN 90, C, and C++. It will also be running MATLAB, but in a less demanding manner. Along with Fortran and C++, GAMBIT will be used for CFD grid generation and Fluent will be used for its CFD codes.
Complex CFD programs can produce data files in the terabyte range, although I don't plan on getting that in depth. Speed is my primary concern. What can I build for under $10k USD?

Thanks Guys
March 12, 2009 10:21:47 PM

How about a VX50? I know it's eBay... but this guy has some real high-end stuff with warranty. Check out the double-wide chassis S4985 he's got listed if you need a lot of drive space too. About as close as you'll get to a "personal supercomputer".

http://shop.ebay.com/merchant/marsbrode_W0QQ_nkwZQQ_arm...

If you have your heart set on building one yourself, you can always pick up a barebones VX50 or the "cheaper" FT48. I don't know if this is the type of thing you have in mind, or if you're looking for something a little more traditional.
March 12, 2009 10:39:45 PM

To be honest, I'm not exactly sure what I'm looking for. I'm not really familiar with the VX50, although I will look into it. My knowledge of computer hardware is quite limited; I just know how to use them :) . I am not opposed to buying a system like this, I just figured that I could build something tailored specifically to my needs, and do it more cost-effectively. Oh, and I don't know if this is important, but the system needs to be able to perform double-precision calculations. Again, speed speed speed...
March 12, 2009 10:59:58 PM

I actually don't know anything about that type of software :(  I've used Tyan stuff for years for CAD. I'm actually in the process of building an FT48 (16-core) right now as a virtualization/CAD rig. I do know that the advertised primary use for these types of boxes is CFD, and they definitely will run Linux. From what I understand, this type of work usually requires a good number of cores, a large amount of RAM, and some fast HDDs, or some sort of clustering setup using InfiniBand networking. I did some quick Googling, and apparently results can differ drastically with slightly different types of calculation, depending on how you set it up to run. Most of the info I've found is waaayyyy over my head. Sorry I can't be of more help...
March 12, 2009 11:03:54 PM

Any and all input is helpful! Thanks shadow
March 13, 2009 12:23:12 AM

I would start with a careful inventory of all production software
you intend to run on a regular basis.

Let me give you a good example from our daily workload:
after we installed COPERNIC desktop search software
on a quad-core Q6600 workstation, we compared it
head-to-head with a dual-core D 945 in a similar workstation.

The Q6600 finishes just about twice as fast as the D 945
updating the very same 5GB database.

That is almost certainly the result of parallel programming in COPERNIC.

If your software is not coded to exploit multiple cores,
you're wasting your time bulking up on multi-socket systems.
Those are intended chiefly for busy servers with lots of
multi-tasking to process.
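That decomposition requirement can be sketched in miniature (a hypothetical toy workload in Python, not the Fortran/C CFD codes under discussion):

```python
# Toy illustration: software only benefits from multiple cores if the
# work is explicitly decomposed into independent parallel chunks.
from multiprocessing import Pool

def partial_sum(bounds):
    """Sum i*i over the half-open integer range [lo, hi)."""
    lo, hi = bounds
    return sum(i * i for i in range(lo, hi))

def make_chunks(n, workers):
    """Split [0, n) into `workers` contiguous, non-overlapping ranges."""
    step = n // workers
    return [(w * step, n if w == workers - 1 else (w + 1) * step)
            for w in range(workers)]

def parallel_sum(n, workers=4):
    """Farm the chunks out to worker processes and combine the results."""
    with Pool(workers) as pool:
        return sum(pool.map(partial_sum, make_chunks(n, workers)))

if __name__ == "__main__":
    n = 100_000
    assert parallel_sum(n) == sum(i * i for i in range(n))
    print("parallel and serial results agree")
```

If the inner loop were not split up like this, extra sockets and cores would sit idle, which is the point being made above.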

After you've completed your analysis of your production software,
the real choice for you is between a single-socket Core i7 machine,
or a twin-socket Xeon with Core i7 architecture.

A single-socket Core i7 machine has 4 CPU cores with hyperthreading:
each socket is THUS capable of running 8 threads simultaneously.

That should be enough for any software you want to throw at it,
particularly if you are only running one copy of your fluid dynamics model
at any given point in time.

Next decision is the clock speed of the Core i7: go for the fastest
because you've got plenty of budget for it, and it was just reported
on the Internet that all Intel Core i7 CPUs are "unlocked" --
meaning they can all be overclocked.

So, buy a motherboard that makes overclocking easy, e.g. ASUS P6T.

As for RAM, X58 chipsets now support either 12 or 24GB of DDR3 RAM:
look into Corsair's high-end Dominator series, with the memory module cooler.
Kingston makes engineering-sample 4GB DIMMs (x 6 = 24GB total), e.g.:

http://www.hexus.net/content/item.php?item=17187


The Corsair power supplies w/ 850 Watts and up should be enough:
they are very highly regarded for high-end workstations like the
one you want to build:

http://www.newegg.com/Product/Product.aspx?Item=N82E168...

I'll leave your graphics hardware choices up to your own good research.

If you are really serious about writing very large data files,
then be sure to give serious consideration to motherboards
and/or RAID controllers that can pump data quickly to and from
SAS (Serial Attached SCSI) hard drives: these come in both
10,000 and 15,000 rpm e.g. Seagate.

Despite what lots of amateurs claim, in ignorance,
a RAID 0 with multiple (4 x or 8 x) HDDs can really move a lot of raw data
very fast: 250-300MB/second is quite easy, and 500MB/second
is within reach, if you know what to buy and how to configure
your storage subsystem. The key is choosing a RAID controller
that does parity computations in hardware, e.g. Areca, 3Ware
or Highpoint's "enterprise" class controllers (there are others).
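In the ideal case, RAID 0 striping throughput is simply additive across drives; a back-of-the-envelope sketch (the per-drive figure is an illustrative assumption):

```python
# Ideal RAID 0 read throughput scales roughly with drive count,
# ignoring controller and bus overhead (real arrays fall short of this).
def raid0_throughput(drives, mb_per_s_per_drive):
    return drives * mb_per_s_per_drive

# Assume ~75 MB/s sustained per 2009-era drive (illustrative figure).
assert raid0_throughput(4, 75) == 300   # in the "250-300 MB/s is easy" range
assert raid0_throughput(8, 75) == 600   # 500 MB/s "within reach" after overhead
```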

Out on the "bleeding" edge you will find Fusion-io's ioDrive Duo
(Google that one), or OCZ's new Z-drive (actually just a
Highpoint RocketRAID with 4 x OCZ MLC SSDs inside the
plastic shell).

Lastly, we've had enormous success with ramdisks
by installing RamDisk Plus from SuperSpeed, LLC
in Sudbury, Massachusetts: www.superspeed.com

A little bit of intelligent memory management, tailored
to your application software, can pay enormous dividends:
Core i7 memory bandwidth BEGINS at 25,000 MB/second
(25GB/second), and goes up from there when triple-channel
DDR3 is overclocked.


This is THE hottest computer hardware available
at the present time for workstation-class machines.
AMD still can't even come close.


hope this helps


MRFS
March 13, 2009 1:00:11 AM

MRFS, WOW! What a reply! It will take a while for all that information to settle into my mind; like I said, I'm not too knowledgeable on computer hardware. I will have to do some more research on your post. If you don't mind me asking, what field are you in?
March 13, 2009 1:53:59 AM

I first began working with computers during the Summer
before my first year of grad school at U.C. Irvine in 1971.

I've been using and developing advanced computer systems
ever since then, now 38+ years.

I build about one new workstation every year or so,
with the "trickle-downs" going to friends and neighbors.

I have also submitted 3 patent applications to the U.S.
Patent Office, with a fourth ready to submit, in the area
of high-speed solid-state data storage subsystems.

(I've also been banned by certain websites merely
for asking what "VISTA" stands for: nevertheless,
witness how fast MS is now pushing Windows 7.)


MRFS (= Memory Resident File Systems)
March 14, 2009 12:08:02 AM

@OP: Take a look at nVidia CUDA. I believe they have a plug-in for MATLAB acceleration.
March 14, 2009 1:47:20 AM

http://www.tomshardware.com/news/Tesla-C1060-S1070,5672...

[begin quote]

Besides performance improvements, the T10P also delivers 64-bit or double-precision capability, which is required for most fluid dynamics and financial stream processing applications. Double precision is substantially more intensive than single-precision calculations and will decrease the performance of the card dramatically. Nvidia told us that double-precision calculations will result in a 90% speed penalty and deliver only 100 GFlops per T10P processor.

[end quote]


Also:

http://www.tomshardware.com/reviews/nvidia-cuda-gpu,195...


MRFS


March 14, 2009 2:24:51 AM

http://news.softpedia.com/news/Intel-Nehalem-EP-Gets-Ea...

http://www.techradar.com/news/computing-components/proc...


[begin quote]

Just to confirm that our eyes did not deceive us, we also gave Nehalem EP a quick going over with the Stars Euler3D benchmark. It's a computational fluid dynamics simulation that majors on floating point performance.

Sure enough, Nehalem EP roasts all comers in this benchmark, too – it's twice as quick as a pair of 2.7GHz Shanghai processors (14.34 seconds to complete five instances versus 30.32 seconds).

[end quote]


MRFS
March 14, 2009 2:36:54 AM

Re: Nehalem EP (multi-socket Core i7)

http://www.tweaktown.com/news/10575/clearer_picture_of_...

http://www.digitimes.com/news/a20081112PD218.html

[begin quote]

Intel is planning to launch Xeon 5500 (Nehalem-EP) and Xeon 3500 series (Nehalem-WS) server CPUs in the first quarter of 2009, according to sources at server makers.

Intel will launch ten CPUs for the Xeon 5500 series: quad-core W5580 (3.2GHz), X5570 (2.93GHz), X5560 (2.8GHz), X5550 (2.66GHz), E5540 (2.53GHz), E5530 (2.4GHz), E5520 (2.26GHz), E5506 (2.13GHz), E5504 (2GHz) and dual-core E5502 with prices at US$1,600, US$1,386, US$1,172, US$958, US$744, US$530, US$373, US$266, US$224 and US$188 in thousand-unit tray quantities.

For the Xeon 3500 series, Intel will launch three CPUs: quad-core W3570, W3540 and W3520 priced at US$999, US$562 and US$284.

In additional news, Intel is planning to phase out seven notebook CPUs including the Core 2 Extreme X7900 and X7800, and Core 2 Duo T7800 and L7700 in January next year.

[end quote]


http://www.theregister.co.uk/2009/03/09/intel_nehalem_m...

Intel 'Nehalem' Xeons poised for March 31 launch

[begin quote]

With the quad-core Nehalem EP processors and their QuickPath Interconnect offering between three and four times the memory bandwidth of the current quad-core Xeon 5400 series processors and their antiquated front side bus architecture, it will be easy to make the technical case for an upgrade to the new chips, code-named "Gainestown" and paired with the "Tylersburg" chipset. The Nehalem EPs will be sold as the Xeon 5500 series. They were outed last week by Apple, which plunked the Nehalem 3500 (for single-socket machines) and 5500 (for dual-socket boxes) into its Mac lineup. It is not clear how much more raw oomph the Nehalems will have, but on some early benchmarks, system performance has increased by nearly 80 per cent.

[end quote]


http://www.hardware.info/en-US/news/ymiclZqXwpeaaZY/15_...


Excellent photos and block diagrams here:

http://forum.***.com/mainboards-chipsets/16074-nehalem-ep-discussion-thread.html
("***" should be "*** above (no spaces): don't know why the Forum is hacking this URL ??)

So, Google "With the impending release of the Dual Socket Nehalem EP platform I figured we should start a discussion of what is coming down the pike."


http://www.ewiz.com/detail.php?name=MB-Z8PD12X#


Google "ASUS Z8PE-D12X"


MRFS
March 14, 2009 4:12:02 PM

http://www.crn.com/white-box/215900275


[begin quote]

Intel To Launch Nehalem Server Chips March 30

By Damon Poeter, ChannelWeb
8:08 PM EDT Fri. Mar. 13, 2009

Intel (NSDQ:INTC) is prepping its channel for the wide release of its Nehalem-class Xeon server microprocessors and platforms to whitebox partners on March 30, Channelweb.com has learned.

Code-named Gainestown, the first quad-core Xeon parts featuring the company's next-generation Nehalem microarchitecture hit the market in early March with the launch of Apple's new Mac Pro workstations.

[end quote]


MRFS
March 14, 2009 4:31:55 PM

Two things, one observation and one question.

First and foremost, I definitely need to do a TON of research and get a better understanding of all of this hardware.

And the question: with all of the hardware that you are presenting, would I need to alter my code to run on these platforms, or will it compile and run the same way it would on a desktop (hopefully a lot faster)?
March 14, 2009 4:46:25 PM

"Nehalem EP" is the code phrase for multiple Core i7 CPUs
running in the same motherboard e.g. dual-socket
(e.g. see photos above of the ASUS MP version).

"MP" means multi-processor (i.e. multiple CPUs).

As far as I know, the instruction sets are the same,
with extensions.


You may need to re-compile, but the languages you
mentioned fully support the x86 instruction set by now.

AMD's 64-bit implementation was designed from the outset
to natively support the 32-bit instructions of that x86 set.


What machine(s) are your fluid dynamics codes
running on presently?


I used to do a LOT of FORTRAN conversions (long time ago):
the main problems resulted from system-specific
SUBROUTINE calls, and different compiler defaults
e.g. INTEGER defaulted to *2 on some machines
and *4 on other machines.

In your case, you should look into the
defaults for DOUBLE PRECISION:
is the default REAL*8 or REAL*16?

You may need to hard-code REAL*8 to achieve maximum performance,
because REAL*16 may cause a severe performance penalty.
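To see what is at stake between precisions, here is a quick sketch of how much a 32-bit (REAL*4-style) representation discards relative to 64-bit (a Python illustration: `struct` rounds a value through IEEE 754 single precision, while Python floats are already double):

```python
import struct

def to_single(x):
    """Round a 64-bit Python float through IEEE 754 single precision."""
    return struct.unpack('f', struct.pack('f', x))[0]

pi64 = 3.141592653589793   # double precision keeps ~15-16 significant digits
pi32 = to_single(pi64)     # single precision keeps only ~7 significant digits

assert pi32 != pi64
assert abs(pi32 - pi64) < 1e-6   # the difference shows up past the 7th digit
```

The REAL*8 vs REAL*16 question is the same trade one level up: quadruple precision buys digits you rarely need for CFD at a severe speed cost, which is why hard-coding REAL*8 is the usual advice.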


Do you have any colleagues currently running Core i7 machines?


From what I've gleaned after a lot of reading,
ASUS and Gigabyte are the only games in town
for a serious single-socket Core i7 machine.

"Core i7" is synonymous with "Nehalem".

With your budget, I would strongly suggest that
you start with a modest "test" machine, e.g.:

http://enthusiast.hardocp.com/news.html?news=MzgzNTIsLC...

http://www.asus.com/Product.aspx?P_ID=6i86Hj0lGriFHfY9

Then, when your codes are running, you can either demote
this mATX motherboard to the role of a backup server, via a Gigabit switch;
or, cannibalize the parts for your dream workstation e.g. dual-socket.
The single-socket Core i7 CPU will need to stay in the mATX motherboard, however.

SDRAM vendors are competing fiercely right now for this Core i7 market,
and you wouldn't need to buy THE fastest DDR3 for such a "test machine".
Start with 6GB of triple-channel RAM, 1 x VelociRaptor for your OS,
and 1 x 1TB SATA/3G hard drive for data storage e.g. Western Digital.


Dual-socket Core i7 machines are now or will very soon
be available from ASUS, Tyan, Supermicro and Intel.
I don't know enough about these MP motherboards
to make a recommendation to you, however.


If you like, I can put you in touch with the ASUS
Marketing Manager for North America. Let me know.


MRFS
March 14, 2009 8:18:15 PM

Well right now, I am a graduate aerospace engineering student at Arizona State. Everything I have done thus far has been on my personal laptop (not a high-performance machine at all...) and on standard desktop computers. Although ASU does have a very good HPC facility, I haven't had the privilege of using it yet (I think the Saguaro supercomputer is number 150 or so in the world).

Anyway, I'm not immediately in the market for this hardware as I won't be graduating for another year or two. But I'm starting to look into formulating a business plan for a CFD outsourcing business. For this type of thing, computing power is a major consideration, and I'm trying to figure out if it is feasible for me to acquire the necessary hardware at a decent price. I'll admit, I'm still new to this type of computing and there are still large gaps in my knowledge. What I do know is, I enjoy the coding involved with CFD and I enjoy the results even more! That being said, I think there is a market for CFD simulations.

Sorry, kind of got off topic there, but you should know what my intentions are and a little bit about my background.

Answering your questions...

Machine currently being used:

Gateway tablet PC, Win XP, 1.5 GB RAM, 1.74GHz Intel mobile CPU.
Like I said, not a great machine, but it does the job for very simple code. I run MATLAB most of the time and it is a very slow-running language in my experience. It would be completely inadequate for anything really meaningful.

REAL*8 or REAL*16:
No clue! But I'll look into it. I'm not a Fortran guru (yet), but if it's as simple as hard-coding REAL*8, no problem, right?

Colleagues running i7?:
Right now my colleagues are fellow students, so no, they are running the same type of garbage I am.

No need to put me in contact with the marketing manager as I'm not immediately in the market, but thanks for the offer!

Stupid question for you: so would this type of system consist of multiple i7's or just one? I know they are multi-core processors, but I'm confused about whether more than one processor is actually used.

I very much appreciate all of the help and insight you are providing. You are helping me learn a lot.

March 15, 2009 1:15:59 AM

The Intel Core i7 CPU aka "Nehalem" is a quad-core chip
with hyperthreading, that installs into a single LGA1366 socket.

Dual-socket motherboards are being called Nehalem EP.

I think you should start out with a single socket motherboard
like the ones built by ASUS or Gigabyte (see above).


MRFS


April 12, 2009 5:05:15 PM

Guys, I understand the traditional interest in a hotter CPU, but that misses the point of the last two years of personal supercomputing innovation: it all revolves around NVIDIA CUDA. Using the hundreds or thousands of stream processors in NVIDIA G200-series GPUs costs almost nothing compared to the high price of Intel CPUs. It also makes it clear that old-fashioned silicon like the Nehalem CPU has very little role in computation anymore. They are so slow by comparison that they mostly get in the way, even if all you ask them to do is be the modern equivalent of a keyboard or disk controller.

Intel no doubt will get around to getting competitive sooner or later (only a fool thinks the Empire won't try to strike back) with things like Larrabee, but for now they are very, very far behind NVIDIA.

To get an idea of what can be accomplished for far under $10,000, see http://www.manifold.net/info/pr_gpu_record.shtml where there are 1440 stream processors in use and the Core i7 serves mainly as a disk controller. The eight hyperthreads in the Core i7 are just loading up the GPUs, where the real computation is being done.

As a cost-effective strategy, it doesn't even make sense to buy a Core i7 Extreme. Get a Core i7 920, and with the $750 you save you have one and a half times the cost of a GTX 295, that is, about 720 stream processors; in other words, over a teraflop of computational power. How many Core i7 Extremes would you have to buy to get a teraflop of computational power?

Buy a Core i7 for around $250, put it into a $250 mobo like the ASRock X58 WS Supercomputer, buy 12 GB of RAM for $130 (for the Corsair 1333MHz DDR3 triple-channel memory), get a couple of 1 TB disks for a total of around $150, and then buy four GTX 295s for $2000. You've only spent $2780 so far. Spend another $300 for a couple of power supplies (the cheapest way for big rigs, like the one at http://estoniadonates.wordpress.com/ ), and in round numbers you've spent around $3100 and you have 1920 stream processors and about four and a half teraflops of computational power.

If you really have $10,000 to spend and you don't mind writing the code to cluster out, heck, you could configure three such systems with money left over for a cool rackmount case and some bucks to pay your local electrician to add a few extra 20A circuits to your house wiring. That would be 5760 stream processors and over 13 teraflops of computational power.
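A worked tally of the parts list above (prices and stream-processor counts as quoted in the post; all figures are the poster's 2009 estimates):

```python
# Tally of the hypothetical CUDA build quoted above (2009 prices).
parts = {
    "Core i7 920":            250,
    "ASRock X58 WS board":    250,
    "12 GB DDR3":             130,
    "2 x 1 TB disks":         150,
    "4 x GTX 295":           2000,
    "2 power supplies":       300,
}
total = sum(parts.values())
assert total == 3080              # the post rounds this to "around $3100"

# Four GTX 295 cards, each with 480 stream processors:
sp_per_system = 4 * 480
assert sp_per_system == 1920      # matches the per-system figure
assert 3 * sp_per_system == 5760  # three clustered systems, as suggested
```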

If you are serious about personal supercomputing, go CUDA, my son. :-)
April 14, 2009 8:22:23 PM

SuperCruise, I am also a CUDA enthusiast (I bought an 8-series video card only to see how CUDA works in practice), but keep in mind that what you are talking about is theoretical computational power. Not only does his software need to be ported to CUDA, but the algorithms also have to be redesigned in order to use the large number of cores effectively. You also need to understand the CUDA constraints around synchronization, memory access, etc. The incentive for this sustained effort is the tremendous speed improvement to be gained.
bgood44, what you have right now is a very mobile computer. Its performance is measured in pounds and idle battery life; it was clearly not designed for extensive computational work. If you purchase a decently priced i7 processor with 4-8 GB of RAM and a regular 1 TB hard drive, you will be on a budget of less than $800 and you will observe a significant improvement in performance, I would say 5x in computations and 2x in data storage. Later on, when you have a very clear idea of what algorithm you have to implement, how many independent sessions you have to support, and how long a session is supposed to last, you can decide between a large computer with 2 or even 4 processors, a CUDA approach based on NVIDIA hardware, a cluster of inexpensive one-processor nodes, or a mixture of these three.
July 7, 2009 12:41:21 AM

Well, I have tried using CUDA (9800GTX) for volumetric ray tracing and could barely match the single-core i7 (single Nehalem core at 3.0GHz) ray-tracer counterpart. The software ray tracer I used is definitely a top performer, and the CUDA version demonstrated around the same performance as probably today's best GPU volumetric ray tracer, ImageVis3D 1.1.1 (I'm not sure, but it's unlikely they used CUDA).

Anyway, from my experience with CUDA it is clear that the GPU's SIMD architecture makes it effectively usable only for a specific SIMD-friendly class of algorithms: texture mapping, back-projection, many tasks of linear algebra, etc. Once threads run asynchronously, speed goes down dramatically; for example, a volumetric ray tracer running on dual E5540s outperforms ImageVis3D 1.1.1 on a 9800GTX by a factor of 4-6 for the majority of rendering scenarios relevant to medical applications (CT data).

The hysterical marketing brainwash about the GPU as a general computing device is really just to make you invest your time in this GPU "crap", so rationales like the above are understandable. That does not mean there is no area where the GPU is really great, but it is definitely not a universal computational device. Besides, upcoming GPUs (Larrabee is one of them) are going to be MIMD, so all the SIMD limitations, and the skills to work around those limitations, are becoming irrelevant. By the way, a dual E5540 machine is an example of a perfect implementation of the MIMD architecture; I would just increase the number of cores by a factor of 100 to satisfy my appetite.

Stefan
Anonymous
August 11, 2009 9:15:57 AM

NVIDIA Tesla HPC cards do 1 teraflop of single precision floating point operations per second and 280 GFLOPS of double-precision operations per second. They are completely scalable to many teraflops just by using additional cards. The cards are about $1399 at Tiger Direct and fit in any PCIe x16 slot. They do, however, require lots of system RAM. They use CUDA and are well suited to fluid dynamics. They are not a video card, strictly a massive number-crunching machine.
August 26, 2009 5:33:04 PM


>NVIDIA Tesla HPC cards do 1 teraflop of single precision floating point operations per second

That is very true for the SIMD computational model. If an algorithm can be formulated as a single instruction stream crunching a wide array of numbers, you may get 1 teraflop from a Tesla. If you need to run thousands of totally independent threads processing independent data (the MIMD model), you will probably see performance slower than what an i7 can provide.

Stefan
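Stefan's SIMD point corresponds to a standard transform: a data-dependent branch is rewritten as predicated (masked) arithmetic so every lane executes the same instruction stream. A toy sketch of the transform itself (plain Python, not CUDA):

```python
# A data-dependent branch (divergent on SIMD hardware)...
def branchy(xs):
    return [x * 2 if x > 0 else x * -3 for x in xs]

# ...rewritten in SIMD-friendly predicated form: every element goes
# through the same arithmetic, with a 0/1 mask selecting the result.
def predicated(xs):
    out = []
    for x in xs:
        m = 1 if x > 0 else 0                     # per-lane mask
        out.append(m * (x * 2) + (1 - m) * (x * -3))
    return out

data = [3, -1, 0, 7, -5]
assert branchy(data) == predicated(data) == [6, 3, 0, 14, 15]
```

When an algorithm cannot be flattened this way, the MIMD-vs-SIMD gap Stefan describes is exactly what you hit.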
October 8, 2009 10:11:53 PM

Like others earlier, I strongly recommend looking at CUDA.

Start here: http://www.nvidia.com/object/computational_fluid_dynami...

Then check your software to see what is supported.

If you care about your results, consider waiting for the next gen from nVidia, codenamed "Fermi". It actually has error checking, so you won't need to re-run your code a few times to see if you got the right results or if a sunspot flipped a bit on you. (RAS in current accelerators is mostly missing because they are built from graphics cards that don't care about data integrity.)
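The RAS concern is about detection: without ECC-style checks, a flipped bit silently corrupts a result. A minimal checksum sketch of the detection idea (illustrative only; real ECC DRAM uses Hamming-style codes that also correct single-bit errors):

```python
# Toy bit-flip detection: store data alongside a checksum, verify on read.
import zlib

def store(data: bytes):
    """Return the payload paired with its CRC-32 checksum."""
    return data, zlib.crc32(data)

def verify(data: bytes, checksum: int) -> bool:
    """Recompute the checksum and compare: False means corruption."""
    return zlib.crc32(data) == checksum

payload, crc = store(b"simulation result block")
assert verify(payload, crc)

corrupted = bytes([payload[0] ^ 0x01]) + payload[1:]   # flip one bit
assert not verify(corrupted, crc)
```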

Suggest: post to the CUDA forums describing your application needs in CFD. Ask for config help, verify the software works as advertised, and check benchmarks.

Suggest: start with a 'gamer' PC of your choice. Specify a high-end nVidia graphics card. Cost will be $1K to $2K total from Dell or some specialty house. Get the software you need working with CUDA. If the performance is not there, then you can consider higher-end solutions. Don't throw $20K at this problem until you understand the ups and downs of the $1-2K system.



October 16, 2009 11:44:38 AM

I have some dual-Nehalem computers listed at eBay... engineering samples.

Remember, MATLAB's software is CUDA- and OpenCL-ready.

I have 6Gbit SAS on a board coming in with the Istanbul twin from Supermicro,

or Tyan 8-way 84XX-series chassis ready to ship also.

omces.com

Call Mark at 314-845-0695 for any specs.
October 16, 2009 11:46:15 AM

REMEMBER, I can build a 10-TFLOP machine with the new Supermicro board. Screamers here...

LEAN GREEN COMPUTING POWER... Mark
January 3, 2010 3:07:37 PM

bgood44 said:
Well right now, I am a graduate aerospace engineering student at Arizona State. [...] I very much appreciate all of the help and insight you are providing. You are helping me learn a lot.


I haven't bothered to read all of the replies to this message, so it's quite likely this has been addressed already. However, I would strongly suggest that you look into cloud computing with Amazon EC2 or Microsoft Azure. Cloud computing would likely save you a great deal of money, as you could run the applications as needed (assuming you are conducting simulations or some other quantitative calculations). There will be an initial learning curve but, IMO, it would be worth it.

March 9, 2010 7:49:50 AM

For a real personal supercomputer under $10k USD, try the NVIDIA Tesla personal supercomputer; it's really amazing.

Get your own supercomputer. Experience cluster-level computing performance, up to 250 times faster than standard PCs and workstations, right on your desk. The NVIDIA® Tesla™ Personal Supercomputer is powered by up to 960 parallel processing cores and 16 GB of dedicated compute memory to solve large datasets. Based on the revolutionary NVIDIA® CUDA™ parallel computing architecture, the Tesla Personal Supercomputer puts nearly 4 teraflops of computing muscle at your fingertips.

- Up to 960-core parallel supercomputer on your desk, 250 times faster than a PC.
- Up to 4 teraflops of compute capability for under $10,000.
- Computation comparable to a large server cluster that fills a room.
- 250 times faster than a multi-CPU-core PC or workstation using traditional serial computing.
- GPU parallelism designed to execute tens of thousands of concurrent threads.
- Up to 16 GB of dedicated compute memory meets the computing requirements of large datasets.
- Single- and double-precision floating point.
- Scalable architecture meets the computational demands of complex scientific applications.
- Based on the revolutionary NVIDIA® CUDA® parallel computing architecture.
- Unlocks the power of GPU parallel computing.
- Over 25,000 application developers worldwide using CUDA.
- Over 100,000,000 CUDA-enabled processors shipped.
- Accessible to everyone: supercomputing available to anyone, anytime.
- Iterate faster; discover more.
- Computing whenever you need it; no need to ask permission from an administrator.
- Buy it anywhere, put it anywhere.
- Plugs into a standard power strip.

http://www.connoiseur.com/NVIDIA_Tesla_Personal_Superco...

March 22, 2010 7:14:38 PM

I'm a graduate student at Caltech; I know about CUDA, GPUs, and parallel computing, and I do CFD research. The type of computer you need depends on the type of flow you are simulating. If you are simulating incompressible flow, then I would say learn about GPUs, write your own code from scratch, and use an FFT solver library that runs on a GPU. If you are modeling compressible flow (hyperbolic equations), then I would say you need to buy a parallel computer made up of i7 chips. If you don't want to build it from scratch, buy a Cray personal supercomputer, $25,000 to $64,000. This will allow you to use up to 64 cores, I think. If you want to start smaller and just have a desktop, use an Intel i7 configuration.

You need double precision to do compressible flow. GPUs are slower for double precision... they are getting better, though. There also exists no commercial compressible flow software that runs on GPUs; you'll have to make your own. Wait a couple of years and check up on that, though. By that time there will probably also be 32-core processors on the market, and those might be better for CFD than GPUs.

Basically, right now, for programs based on the MPI model, getting access to a cluster (64 processors or more) is ideal. I'm assuming Fluent runs on clusters here. Would you hardware experts out there agree with this? I'm also interested in your opinions.
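The "FFT solver" suggestion refers to spectral pressure solves for incompressible flow: transform the right-hand side, divide by minus the squared wavenumber, transform back. A minimal 1-D periodic sketch (a naive O(n²) DFT in plain Python stands in for a real GPU FFT library; all details are illustrative):

```python
import cmath, math

def dft(xs, sign=-1):
    """Naive discrete Fourier transform (O(n^2)); a real code uses an FFT."""
    n = len(xs)
    return [sum(x * cmath.exp(sign * 2j * math.pi * k * j / n)
                for j, x in enumerate(xs)) for k in range(n)]

def poisson_periodic(f):
    """Solve u'' = f with periodic BCs by dividing by -k^2 in Fourier space."""
    n = len(f)
    fhat = dft(f, sign=-1)
    uhat = [0j] * n                         # k = 0 mode: mean of u is arbitrary
    for k in range(1, n):
        kk = k if k <= n // 2 else k - n    # signed wavenumber
        uhat[k] = fhat[k] / (-kk * kk)
    return [x.real / n for x in dft(uhat, sign=+1)]

n = 32
xs = [2 * math.pi * j / n for j in range(n)]
f = [-math.sin(x) for x in xs]              # u'' = -sin(x)  =>  u = sin(x)
u = poisson_periodic(f)
assert max(abs(ui - math.sin(x)) for ui, x in zip(u, xs)) < 1e-9
```

The same divide-by-wavenumber step is what a GPU FFT library would accelerate in a real incompressible solver.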
July 9, 2011 1:19:02 PM

My honest and sincere advice is that if you are a recent graduate and want a computer to do CFD at home, just buy a good PC; don't worry about clusters, GPUs, and servers... I suppose you are not intending to model the airflow around a commercial plane with detailed geometry and under project time pressures, right? Everybody waits for CFD results.
October 15, 2012 5:27:22 PM

MRFS said:
I would start with a careful inventory of all production software
you intend to run on a regular basis.

Let me give you a good example from our daily workload:
after we installed COPERNIC desktop search software
on a quad-core Q6600 workstation, we compared it
head-to-head with a dual-core D 945 in a similar workstation.

The Q6600 finishes just about twice as fast as the D 945
updating the very same 5GB database.

That can only be the result of parallel programming in COPERNIC.

If your software is not coded to exploit multiple cores,
you're wasting your time bulking up on multi-socket systems.
Those are intended chiefly for busy servers with lots of
multi-tasking to process.

After you've completed your analysis of your production software,
the real choice for you is between a single-socket Core i7 machine,
or a twin-socket Xeon with Core i7 architecture.

A single-socket Core i7 machine has 4 CPU cores with hyperthreading:
each socket is THUS capable of running 8 threads simultaneously.

That should be enough for any software you want to throw at it,
particularly if you are only running one copy of your fluid dynamics model
at any given point in time.

Next decision is the clock speed of the Core i7: go for the fastest
because you've got plenty of budget for it, and it was just reported
on the Internet that all Intel Core i7 CPUs are "unlocked" --
meaning they can all be overclocked.

So, buy a motherboard that makes overclocking easy, e.g. ASUS P6T.

As for RAM, X58 chipsets now support either 12 or 24GB of DDR3 RAM:
look into Corsair's high-end Dominator series, with the memory module cooler.
Kingston makes engineering samples of 4GB DIMMs (6 x 4GB = 24GB total), e.g.:

http://www.hexus.net/content/item.php?item=17187


The Corsair power supplies w/ 850 Watts and up should be enough:
they are very highly regarded for high-end workstations like the
one you want to build:

http://www.newegg.com/Product/Product.aspx?Item=N82E168...

I'll leave your graphics hardware choices up to your own good research.

If you are really serious about writing very large data files,
then be sure to give serious consideration to motherboards
and/or RAID controllers that can pump data quickly to and from
SAS (Serial Attached SCSI) hard drives: these come in both
10,000 and 15,000 rpm e.g. Seagate.

Despite what lots of amateurs claim, in ignorance,
a RAID 0 with multiple (4 x or 8 x) HDDs can really move a lot of raw data
very fast: 250-300MB/second is quite easy, and 500MB/second
is within reach, if you know what to buy and how to configure
your storage subsystem. The key is choosing a RAID controller
that does parity computations in hardware, e.g. Areca, 3Ware
or Highpoint's "enterprise" class controllers (there are others).

Out on the "bleeding" edge you will find Fusion-io's ioDrive Duo
(Google that one), or OCZ's new Z-drive (actually just a
Highpoint RocketRAID with 4 x OCZ MLC SSDs inside the
plastic shell).

Lastly, we've had enormous success with ramdisks
by installing RamDisk Plus from SuperSpeed, LLC
in Sudbury, Massachusetts: www.superspeed.com

A little bit of intelligent memory management, tailored
to your application software, can pay enormous dividends:
Core i7 memory bandwidth BEGINS at 25,000 MB/second
(25GB/second), and goes up from there when triple-channel
DDR3 is overclocked.


This is THE hottest computer hardware available
at the present time for workstation-class machines.
AMD still can't even come close.


hope this helps


MRFS


Hi there,

I found this thread and really liked your reply. My question is: if I asked the same question now (meaning in 2012), what would your reply be? What would you change, and what would you keep, from the setup above?

I'm in a similar situation and wondering where to start building a "supercomputer" for CFD on a reasonable budget ($5,000).

Thank you
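The RAID 0 throughput figures MRFS quotes above follow from simple arithmetic: striping spreads sequential I/O across all member drives, so bandwidth scales roughly linearly until the controller or bus saturates. A back-of-the-envelope sketch (the per-drive and controller numbers below are illustrative assumptions, not measurements):

```python
def raid0_throughput(n_drives, per_drive_mb_s, bus_limit_mb_s):
    """Estimate sequential RAID 0 throughput in MB/s: linear scaling
    with drive count, capped by the controller/bus bottleneck."""
    return min(n_drives * per_drive_mb_s, bus_limit_mb_s)

# Assumed figures: ~120 MB/s per 15k rpm SAS drive, ~1000 MB/s controller.
four_drives  = raid0_throughput(4, 120, 1000)   # 480 MB/s
eight_drives = raid0_throughput(8, 120, 1000)   # 960 MB/s
```

With those assumptions, a 4-drive stripe lands right in the 250-500 MB/s range claimed above, and an 8-drive stripe approaches the controller limit, which is why the choice of RAID controller matters as much as the drives.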




October 15, 2012 5:46:41 PM

Holy thread resurrection...

The thing with supercomputers is that they're a series of machines, not all one chassis. So a home setup would likely be a series of mid-tower boxes plus some software to distribute work across the cluster. SSDs would push throughput to really good levels, especially in striped arrays.
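Distributing work across a cluster like that usually starts with a static domain decomposition: the grid is split into near-equal blocks, one per machine. A minimal sketch of that splitting logic (Python for brevity; the rank/cell naming is illustrative, not taken from any particular solver):

```python
def partition(n_cells, n_nodes):
    """Split n_cells grid cells as evenly as possible across n_nodes
    machines; returns one (start, count) block per node, the way an
    MPI-style CFD code hands each rank its piece of the domain."""
    base, extra = divmod(n_cells, n_nodes)
    blocks, start = [], 0
    for rank in range(n_nodes):
        # the first `extra` ranks each absorb one leftover cell
        count = base + (1 if rank < extra else 0)
        blocks.append((start, count))
        start += count
    return blocks

# 10 cells over 3 nodes: the first node gets the leftover cell.
print(partition(10, 3))  # [(0, 4), (4, 3), (7, 3)]
```

Keeping the blocks within one cell of each other in size is what keeps all the nodes busy; a lopsided split leaves most of the cluster idle waiting on the largest block.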