I've been hearing about this for a while now, and I heard ATI Stream is a form of GPGPU. But I wanted to know: is it even available yet? If so, how do you get it, and if not, when is it coming? Also, what is Nvidia's GPGPU, and will it work with my setup if it's out or coming?
What? Did you visit the CUDA Zone and look at the apps there?
Apps which are far more geared towards showing off than actually doing something. The vast majority of consumer uses are completely unaffected by GPGPU at this point, and I don't see that changing anytime soon.
Look here http://techreport.com/discussions.x/17736
"The companies will work together to accelerate many computationally intensive tasks in CyberLink applications, such as video transcoding, automated facial recognition and tagging, video editing and processing applications with ATI Stream technology using the full specification capabilities of the new Windows 7 DirectCompute API. "
GPGPU computations are primarily for highly parallel applications. Unlike a CPU, where you may be able to run a few threads at a time, CUDA/OpenCL (I'll use these as the example since you have a 295) are set up to run many threads executing the same programming kernel. Every thread in a set launched on the GPU runs the exact same kernel, each on its own bit of the data passed in. Once all those threads are done executing, you can start a new kernel and accompanying threads.
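A minimal sketch of that model in CUDA: one kernel, launched as many threads, each running the exact same code on a different element of the data. The kernel name `scale` and the sizes here are just illustrative, not from any real app.

```cuda
#include <cuda_runtime.h>

// Every launched thread executes this same kernel; only its computed
// index differs, so each thread touches a different array element.
__global__ void scale(float *data, float factor, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)              // guard: the last block may be partially full
        data[i] *= factor;
}

int main()
{
    const int n = 1 << 20;          // ~1M elements
    float *d_data;
    cudaMalloc(&d_data, n * sizeof(float));
    // ... fill d_data with input via cudaMemcpy ...

    // Launch roughly one thread per element, 256 threads per block.
    int blocks = (n + 255) / 256;
    scale<<<blocks, 256>>>(d_data, 2.0f, n);
    cudaDeviceSynchronize();        // wait for all threads to finish

    // Only after the whole grid completes could you launch the next kernel.
    cudaFree(d_data);
    return 0;
}
```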
Data transfers to and from the card are typically the main bottleneck, so ideally you want all the needed data (both input and output) to stay on the card as long as possible. I could be wrong, but I believe you can switch kernels without changing the memory, so you can start a new execution using data from the previous step without a new memory transfer, as long as you don't need different data.
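That's exactly how it works in CUDA: device allocations persist across kernel launches, so you can chain kernels on the same buffer and only cross the bus once in each direction. A sketch, with made-up kernel names (`step1`, `step2`) standing in for whatever passes your algorithm needs:

```cuda
#include <cuda_runtime.h>

__global__ void step1(float *d, int n) { /* first pass over d */ }
__global__ void step2(float *d, int n) { /* second pass, reuses d in place */ }

void pipeline(float *host, int n)
{
    float *dev;
    size_t bytes = n * sizeof(float);
    cudaMalloc(&dev, bytes);

    cudaMemcpy(dev, host, bytes, cudaMemcpyHostToDevice);  // one upload

    int blocks = (n + 255) / 256;
    step1<<<blocks, 256>>>(dev, n);
    step2<<<blocks, 256>>>(dev, n);  // same device memory, no re-transfer

    cudaMemcpy(host, dev, bytes, cudaMemcpyDeviceToHost);  // one download
    cudaFree(dev);
}
```

The two memcpys are the expensive part; everything between them stays on the card.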
For highly parallel applications, the theoretical increase in computation power over a CPU is at least ten-fold (likely a lot more on the beast you have). Real-world gains are usually significantly less, depending on how well you can adapt your methods to the GPU model. nVidia's sample programs will give you an idea of the real-world increases for some sample programs; they range anywhere from 2x to 1500x faster than the originals.
In summary, unless your C++ applications are highly parallelizable, it's unlikely you'll see much of a boost switching to CUDA/OpenCL. As WR2 pointed out, the CUDA Zone is a great resource for learning the general idea and program structures. It takes some getting used to, but it is a nice step up from old GPGPU computing.