The video card has evolved to the point where it is now called the GPU, thanks to the growing capability of the hardware. Now the GPU is about to take its next big leap and become a specialized GPGPU (of course, we realize that the terms "specialized" and "general purpose" are somewhat contradictory).
Nvidia is betting heavily on GPGPUs filling a large need in the computing market. We'll still need our GPUs to push pixels for our 3D games, but Nvidia has just revealed its next-generation CUDA architecture, codenamed "Fermi," with GPU computing firmly in mind.
Nvidia bills Fermi as an entirely new ground-up design that will finally realize the potential of GPU computing. Nvidia made big steps with its G80 and later the GT200, but the graphics maker says it has made Fermi a much more pleasant and useful tool for programmers.
“The first two generations of the CUDA GPU architecture enabled Nvidia to make real in-roads into the scientific computing space, delivering dramatic performance increases across a broad spectrum of applications,” said Bill Dally, chief scientist at Nvidia.
“It is completely clear that GPUs are now general purpose parallel computing processors with amazing graphics, and not just graphics chips anymore,” said Jen-Hsun Huang, co-founder and CEO of Nvidia. “The Fermi architecture, the integrated tools, libraries and engines are the direct results of the insights we have gained from working with thousands of CUDA developers around the world. We will look back in the coming years and see that Fermi started the new GPU industry.”
At the unveiling, Nvidia did not give anything away in terms of clock speeds or any of the other specifications that hardcore 3D gamers focus on. Instead, it talked about technical features that lend themselves specifically to GPU computing. These include:
- C++ support, complementing existing support for C, Fortran, Java, Python, OpenCL and DirectCompute
- ECC, a critical requirement for datacenters and supercomputing centers deploying GPUs on a large scale
- 512 CUDA Cores implementing the new IEEE 754-2008 floating-point standard, surpassing even the most advanced CPUs
- 8x the peak double precision arithmetic performance over Nvidia’s last generation GPU. Double precision is critical for high-performance computing (HPC) applications such as linear algebra, numerical simulation, and quantum chemistry
- Nvidia Parallel DataCache - the world’s first true cache hierarchy in a GPU that speeds up algorithms such as physics solvers, raytracing, and sparse matrix multiplication where data addresses are not known beforehand
- Nvidia GigaThread Engine with support for concurrent kernel execution, where different kernels of the same application context can execute on the GPU at the same time (e.g., PhysX fluid and rigid body solvers)
- Nexus – the world’s first fully integrated heterogeneous computing application development environment within Microsoft Visual Studio
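To see why the double precision item on that list matters for HPC work, here is a minimal sketch of how rounding error builds up in a long running sum. It runs on the CPU in plain Python and only emulates single precision by rounding through the standard `struct` module; it is not Fermi or CUDA code, and the helper names `to_float32` and `accumulate` are our own, made up for illustration.

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float (an IEEE 754 double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def accumulate(step: float, count: int, single: bool) -> float:
    """Add `step` to a running total `count` times.

    With single=True, the step and every intermediate total are rounded to
    single precision, emulating a float accumulator. With single=False the
    sum is carried in double precision throughout.
    """
    if single:
        step = to_float32(step)
    total = 0.0
    for _ in range(count):
        total += step
        if single:
            total = to_float32(total)
    return total


if __name__ == "__main__":
    count = 1_000_000
    exact = 100_000.0  # 0.1 * 1,000,000 in exact arithmetic
    for label, single in (("single", True), ("double", False)):
        result = accumulate(0.1, count, single)
        print(f"{label}: sum = {result!r}, error = {abs(result - exact)!r}")
```

Run as a script, this prints the two sums next to their distance from the exact answer. The single-precision total drifts visibly while the double-precision total stays close, which is the kind of gap that long-running numerical simulations and linear algebra codes cannot afford.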
Oak Ridge National Laboratory (ORNL) has already announced plans for a new supercomputer that will use Fermi for research in areas such as energy and climate change. ORNL’s supercomputer is expected to be 10 times more powerful than today’s fastest supercomputer.
“This would be the first co-processing architecture that Oak Ridge has deployed for open science, and we are extremely excited about the opportunities it creates to solve huge scientific challenges,” said Jeff Nichols, ORNL associate lab director for Computing and Computational Sciences. “With the help of Nvidia technology, Oak Ridge proposes to create a computing platform that will deliver exascale computing within ten years.”
Nvidia did reveal that its upcoming Fermi GPU will pack 3 billion transistors, making it one mammoth chip, bigger than anything from ATI. Of course, the aspirations of Nvidia in the GPU space are far more ambitious than those of AMD. It'll be interesting to see if and how the two head-to-head rivals diverge as the focus shifts from 3D gaming technologies toward broader GPGPU applications.