Hey everyone, I've been researching a new machine for the last day, as my present Core 2 Quad (2.4 GHz, 2.5 GB RAM) is barely usable. I've decided to go the high-performance workstation route, but would rather wait until Intel releases its 8-core Xeon E5 later this year (or early next) before investing a considerable amount in processors, RAM, and a motherboard. At that time, I'll buy two of those + 24 GB ECC memory. Until then, I've decided to make do with a modest desktop i5 + mobo + 16 GB of DDR3 1600 MHz memory. Here are the current specs of the machine:
Additionally, I'll be purchasing the fifth monitor from Amazon, a Dell UltraSharp U2711.
Now for how I plan to use the machine:
--- General Usage Notes ---
I don't play games (other than StarCraft 2 occasionally), but I do watch movies/TV shows, which is why I'd like a decent IPS monitor. I do a lot of programming and solid modeling in SolidWorks 2011, which is why I'm considering the Quadro FX 3800. The Quadro NVS 420 has four DisplayPorts, to which I'd connect the four NEC monitors.
I tend to run more than 100 processes at a time; I'm a serial multi-tasker, and rarely close windows. I consider "closing a window" the final step in something, and will only do so when I'm convinced I no longer need the application. That's why I tend to have >20 Firefox processes with ~10 tabs each running at one time. It's just how my brain works.
Additionally, the type of programming that I do is research-oriented and involves a lot of machine learning and artificial intelligence algorithms, which notoriously require many FLOPS. At some point, I know the trade-off between a high-end workstation and renting space on an EC2 cluster weighs in favor of the latter, but for everyday data processing and code compilation, I need something that's relatively future-proof.
My biggest concern is maximizing the efficiency of this build; the i5/mobo/RAM combination is temporary, but I need to be able to justify the expense vis-à-vis the future expected expense of dual Xeon E5s. That's part of the reason I chose the 2500K over the 2600K: I just don't want to shell out another $100 for a slight improvement. I had actually chosen the 2500K over the 2400, which I would've preferred, but the -K + Z68 mobo combination offered enough of an improvement over the 2400 + cheap ASUS to justify the added (albeit slight) expense. Moreover, I need to be able to salvage EVERYTHING from this build--graphics cards, the dual Samsung 470 SSDs, monitors, PSU, case, etc.--for the subsequent upgrade.
--- Machine Learning Usage Details ---
I do a lot of statistical analyses on text and sound, and right now I'm working with one text corpus that's about 28 TB in size. I will never need to store the actual data on my external hard drives--I'd probably only need to store a couple gigabytes in the end, but it's the processing of this data that matters.
Mostly, I use support vector machines for classification, k-means for clustering (along with a couple other methods, but k-means is the crudest and simplest for now) and then do a latent semantic analysis. A few of the algorithms take advantage of sparse matrices and involve eigendecomposition thereof. Running these in multiple threads concurrently evinces the need for the processor upgrade, since hyperthreading is a waste. That's also where the increased memory requirements (eventually 24 GB ECC DDR3) are most apparent.
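To make the sparse eigendecomposition workload concrete, here is a minimal sketch of power iteration, which finds only the dominant eigenpair of a sparse symmetric matrix. It is an illustration rather than the method actually used on the corpus: the function names `sparse_matvec` and `power_iteration`, the dictionary-based matrix layout, and the toy 3x3 matrix are all assumptions made up for this example. Real work at this scale would more likely rely on a library routine such as SciPy's `scipy.sparse.linalg.eigsh`.

```python
import math


def sparse_matvec(matrix, vector):
    """Multiply a sparse matrix, stored as {(row, col): value}, by a dense vector."""
    result = [0.0] * len(vector)
    for (row, col), value in matrix.items():
        result[row] += value * vector[col]
    return result


def power_iteration(matrix, size, iterations=200):
    """Estimate the dominant eigenvalue and eigenvector of a symmetric matrix."""
    # Start from a normalized vector with equal components.
    vector = [1.0 / math.sqrt(size)] * size
    eigenvalue = 0.0
    for _ in range(iterations):
        product = sparse_matvec(matrix, vector)
        norm = math.sqrt(sum(x * x for x in product))
        if norm == 0.0:
            break  # the vector was mapped to zero; nothing more to refine
        vector = [x / norm for x in product]
        # Rayleigh quotient of the current unit vector.
        eigenvalue = sum(v * w for v, w in zip(vector, sparse_matvec(matrix, vector)))
    return eigenvalue, vector


# Hypothetical toy matrix [[2, 1, 0], [1, 2, 0], [0, 0, 1]]; only nonzeros are stored.
toy = {(0, 0): 2.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 2.0, (2, 2): 1.0}
eigenvalue, eigenvector = power_iteration(toy, 3)
print(eigenvalue, eigenvector)
```

Only the nonzero entries are touched in each multiplication, which is why memory capacity and per-core floating-point throughput matter more for this kind of job than raw storage.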
--- Solid Modeling Usage Details ---
In addition to the machine learning requirements, I do a lot of solid modeling and am presently working on an automated book scanner to help with data aggregation and input. That's more of a side project, but this device requires performing some complex simulations (animations) before I actually commit to manufacturing the device. I also do PSPICE simulations for electrical devices, and will need something that will remain useful when I move on to do my Ph.D. in electrical engineering next fall.
--- Audio Usage Details ---
I briefly mentioned audio processing in the above machine learning section, but one side hobby is music composition, and I'd love a system that's able to support Sonar and Gigastudio. I haven't begun searching for a GSIF2 sound card yet, but will eventually move into that arena once I have some more cash.
--- Virtualization Usage Details ---
Because I develop software professionally, I have 10 "clean" environments in which I test things out. These run Windows XP, Vista, and 7 with a variety of software configurations, as well as Ubuntu Linux and another instance of Windows 7 for RealVNC remote collaboration with co-workers.
--- Final Notes ---
Any advice on ways to cut my current expenses would be appreciated. I did my best with the monitors, but they're still pretty expensive. Obviously I don't need more than one IPS monitor, but even that one is questionable. I figure that having one for color-correction stuff (when I do web design, GUI development, play the occasional game, and watch movies) is enough justification for the Dell U2711. The NECs are just relatively cheap, decently performing monitors for my text-based applications.
I'm also spending a little more than I'd like on the SSDs. I intend to connect the Samsungs in a RAID0 array to improve the transfer rates (they're 250 MB/s read and 220 MB/s write presently, and in RAID0 that nearly doubles). However, because of my software consumption, Visual Studio usage, and virtualization setup, I could see myself using 512 GB pretty quickly.
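The RAID0 arithmetic can be sketched as follows. This is a back-of-the-envelope upper bound that assumes striping scales linearly with the number of drives; the helper `raid0_throughput` and its `efficiency` parameter are hypothetical names for this illustration, and real arrays land somewhat below the ideal figure (hence "nearly doubles").

```python
def raid0_throughput(per_drive_mb_s, drives, efficiency=1.0):
    """Ideal striped throughput in MB/s; efficiency < 1.0 models controller overhead."""
    return per_drive_mb_s * drives * efficiency


# Per-drive figures quoted above: 250 MB/s read, 220 MB/s write, two drives.
ideal_read = raid0_throughput(250, 2)
ideal_write = raid0_throughput(220, 2)
print(ideal_read, ideal_write)
```

Note that RAID0 adds the two drives' capacities together but provides no redundancy, so losing either drive loses the whole array.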
Thanks for your time, everyone!