Nvidia has just released the CUDA 4.1 Toolkit, which integrates, for the first time, the company's LLVM-based (Low Level Virtual Machine) compiler.
CUDA 4.1 also includes more than 1,000 new imaging and signal processing functions in the Nvidia Performance Primitives (NPP) library, which now covers more than 3,200 functions in total. Nvidia claims that NPP delivers 40 percent greater performance than Intel's IPP.
The Visual Profiler has been redesigned and now offers an automated expert system that provides step-by-step instructions for fine-tuning CUDA code. Additionally, the new CUDA toolkit integrates version 2.1 of Parallel Nsight, a collection of GPU developer tools for Visual Studio.
CUDA 4.1 can be downloaded from Nvidia's website.