Nvidia Does Accelerated Programming
Santa Clara (CA) - Back when we first saw the capabilities of Nvidia’s CUDA technology and Tesla acceleration cards, it was clear to us that the company had all the tools necessary to change the way we use computers today: the enormous computing horsepower of graphics cards opens up possibilities we have talked about for some time, but didn’t think were possible in the foreseeable future. Now, for the first time, the company is challenging developers to exploit the hidden potential of graphics cards in a mainstream application.
Nvidia was the first to come up with a development framework that offered a relatively easy-to-learn way to accelerate traditional CPU-centric applications through a graphics processor. But while CUDA, which is based on C with GPU-specific extensions, is generally available, Nvidia has pitched the technology mainly to universities, scientists and industries with a need for floating-point-heavy applications - such as financial institutions and the oil and gas sector.
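To give a sense of what those extensions look like, here is a minimal, generic CUDA sketch (not Nvidia's contest code; the kernel and variable names are our own) that scales an array on the GPU, one thread per element:

```cuda
#include <cuda_runtime.h>
#include <stdio.h>
#include <stdlib.h>

// __global__ marks a function that runs on the GPU. Each thread
// computes its own index and handles one array element -- the core
// data-parallel idea behind CUDA's C extensions.
__global__ void scale(float *data, float factor, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        data[i] *= factor;
}

int main(void) {
    const int n = 1 << 20;
    size_t bytes = n * sizeof(float);
    float *host = (float *)malloc(bytes);
    for (int i = 0; i < n; i++) host[i] = 1.0f;

    // Copy input to GPU memory, launch one thread per element
    // (256 threads per block), then copy the result back.
    float *dev;
    cudaMalloc(&dev, bytes);
    cudaMemcpy(dev, host, bytes, cudaMemcpyHostToDevice);
    scale<<<(n + 255) / 256, 256>>>(dev, 2.0f, n);
    cudaMemcpy(host, dev, bytes, cudaMemcpyDeviceToHost);

    printf("%f\n", host[0]);
    cudaFree(dev);
    free(host);
    return 0;
}
```

The `<<<blocks, threads>>>` launch syntax and the `__global__` qualifier are the main departures from plain C; everything else is familiar, which is what makes the framework relatively easy to learn.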
Both Nvidia and ATI have shown mainstream applications based on GPGPU technologies, but neither has targeted the mainstream application segment yet. When we asked Nvidia CEO Jen-Hsun Huang when CUDA would go into the mainstream market, he told us that such a move would depend on Microsoft and its efforts to provide a Windows interface for GPGPUs.
It appears that Nvidia is shifting its enterprise-only strategy and turning its focus to a mainstream opportunity as well. In a contest announced today, the company is looking for the "most talented CUDA programmers in the world". Nvidia will provide a "partially GPU-optimized version of an MP3 LAME encoder" and asks developers to "optimize [the software] to run as fast as possible on a CUDA-enabled GPU." The encoder has to be built in the CUDA programming environment and must achieve a speed-up in run time.
So, the challenge in this contest is not to port a mainstream application to CUDA, but rather to optimize it to squeeze as many gigaflops out of the GPU as possible. That challenge may sound easier than it really is: researchers at the University of Illinois’ Beckman Institute and the National Center for Supercomputing Applications previously told us that getting an application to run on a GPGPU is the simple task, while accelerating it takes up most of the time - and knowledge.
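A hedged illustration of that gap between "running" and "accelerated": the two kernels below (our own generic example, unrelated to the LAME contest code) both sum an array. The first works but funnels every thread through global memory; the second restructures the same computation around fast on-chip shared memory, which is the kind of rework that eats the researchers' time.

```cuda
#include <cuda_runtime.h>

// Naive sum: every thread issues an atomicAdd on one global-memory
// counter. Correct, and trivial to port -- but memory traffic
// serializes the whole kernel.
__global__ void sum_naive(const float *in, float *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        atomicAdd(out, in[i]);
}

// Optimized sum: each block of 256 threads first reduces its slice
// in shared memory, then contributes a single atomicAdd per block.
// Same answer, far fewer global-memory operations.
__global__ void sum_shared(const float *in, float *out, int n) {
    __shared__ float partial[256];
    int tid = threadIdx.x;
    int i = blockIdx.x * blockDim.x + tid;
    partial[tid] = (i < n) ? in[i] : 0.0f;
    __syncthreads();
    // Tree reduction within the block: halve the active threads
    // each step until one partial sum remains.
    for (int stride = blockDim.x / 2; stride > 0; stride >>= 1) {
        if (tid < stride)
            partial[tid] += partial[tid + stride];
        __syncthreads();
    }
    if (tid == 0)
        atomicAdd(out, partial[0]);
}
```

Neither version changes the algorithm; the speed-up comes entirely from matching the code to the GPU's memory hierarchy - which is why tuning, not porting, is where the knowledge lives.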
Those scientific GPGPU applications simulating fluid dynamics or biological processes are impressive to watch, but of course we are interested to see what these processors are capable of in mainstream applications. AMD previously demonstrated its stream processors in an application that rendered a user’s hand, captured by a webcam, in near real time and replaced the mouse for moving objects around on the screen.
Optimizing an MP3 encoder is far from the sophistication of such an application, but it is a first step.
jk
while it would be great to take advantage of the gpu horse power, especially in fpu intensive processing, i dont see the gpu completely replacing the processor anytime soon...i am an artist that does a lot of music and video, and it would be great to offload a lot of the processing, but when i am running word or surfing the internet i dont need my computer eating quite as many watts as playing cod4...
-c
We're past the days when we can just raise the clock speed. New programming models are necessary. Homogeneous multi-core designs (e.g. Larrabee) will fall short. Heterogeneous multi-core (many different types of cores) will dominate in the future. Although the bandwidth of the PCIe 2.0 bus is very capable, the latency of this bus will be an issue. The best designs will have all the different types of cores on the same chip. So while NVidia has a great development tool with CUDA, hardware designs along the lines of AMD's Fusion may be the way of the future.
They can't do a damn thing about it if nVidia's programming model gets adopted and becomes a de facto standard before Intel has a chance to unveil its own model with Larrabee. Just like what happened with AMD64: by the time Intel wanted to implement its own 64-bit instruction set, the AMD one was already supported by Windows, Linux, Unix, Solaris and many more, and none of the software companies wanted to support yet another standard that is different but basically offers the same thing.
nVidia is wise on this; it knows that it must push GPU computing into the mainstream before Intel has a chance to do it with Larrabee. Unfortunately, to succeed it will need support from software giants like Microsoft, Sun, Oracle, the Linux crowd and the like. I don't think that just providing a CUDA development environment will be enough; they might need OS support at the core (something which Intel will likely manage to obtain shortly after it releases Larrabee).
Some tasks require more number-crunching capability than those CPUs can muster, and it would take tens or even hundreds of them to even begin to match the capability of a few GPUs... imagine the power consumption, not to mention the footprint.
I don't think all can, and should, be ported - it's too complex and in some cases completely needless. I think you'll still have powerful CPUs, just that they'll act as bridges/interfaces rather than as the sole number-crunching device. The closest I ever saw to this 'transputer' type hardware was the Amiga range of computers, which had multi-tasking built into the hardware, and those systems were a joy to use. I'd like to see a similar thing happen on the PC, and would not mind buying a GPGPU chip (or several) to speed up my applications, but I don't think we'll see it just yet, not in mainstream use anyway. Too many conflicting interests here, most of which are of a commercial nature...