During GTC 2013 in San Jose, Neil Trevett, VP of Mobile Content at Nvidia, spent around thirty minutes explaining why mobile devices can benefit from GPU computing. He started off with an example of an augmented reality app that simply adds information in an overlay in real time. What it can't do is create objects in 3D – and in real time – within the physical space.
But a Tegra 4-powered notebook can, thanks to CUDA. In the next example, he showed a refractive virtual glass sitting on an actual table. It looked real enough to hold – a reprojection method produced a correct refraction of the hand in the glass. Thus GPU compute can bring high-quality reflections, refractions and caustics to augmented reality apps offered on Tegra-based devices.
"The application is using the inherent processing power of the GPU so that it's anti-aliasing to really up the quality of the scene," he said. "The caustic reflections are all being generated in real-time using a quite sophisticated CUDA program."
He then went on to recap what was revealed in the keynote: Tegra 4 devices are shipping this year, while silicon samples of the Kepler-packed "Logan" are being distributed to device makers. He said the full-quality, robust drivers Nvidia has been working on for many years on the desktop will be made available for Tegra as well, including CUDA 5.0 and OpenGL 4.3.
"So all the graphics and compute functionality we've been using through the desktop APIs will be made available on Tegra platforms," he said. Trevett added that after Logan claws into mobile devices, Parker will swing into action, based on a 64-bit Denver CPU and a Maxwell GPU which provides shared virtual memory between CPUs and CPUs.
Obviously on the mobile front, cramming all this power into an SoC is great for the mobile gamer. But you also have to consider power consumption. He said the "process fairy" is giving device makers more wonderful transistors to provide performance, but not in a low-power way.
"The key problem is that the leakage voltage has reached a point where you can't reduce the threshold voltage any further," he said. "So in the good old days, you halved the geometry, you get four times the number of transistors and the frequency would double, but the voltage would half so your power would stay the same. Now we can't reduce the voltage because of leakage, and so we can have the four times number of transistors with a half process shrink, but we have to suffer four times the power."
So how do you build chips to deliver the level of performance developers need in a mobile environment within the existing power envelope? Form factors will always be locked into strict power envelopes no matter how battery technology improves – go over that limit and devices get too hot to use. A device with a 4- to 5-inch screen has a thermal envelope of between 2W and 4W, whereas a 10-inch tablet ranges between 6W and 10W (the screen itself takes between 1W and 2W).
To deliver GPU Compute within these power budgets, you need to reduce data movement across the silicon. "We've reached a point with the next two generations of silicon where we can put down more transistors on a die than we can afford to turn on at the same time. If we turned on all possible transistors on an SoC, we will exceed the thermal envelope. So now we have this concept of dark silicon where there is silicon that's not being utilized all the time. But you can put down gates – dedicated hardware blocks – that will be used when they need it and at no other time."
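As a back-of-the-envelope illustration of dark silicon, the sketch below estimates what share of a chip can be switched on at once under a fixed thermal envelope. The 4W budget is the upper end of the phone range quoted above and the fourfold power increase comes from Trevett's quote; the starting figure of 4W for a fully lit chip and the function name are hypothetical, chosen only to make the numbers easy to follow.

```python
def active_fraction(thermal_budget_w, full_chip_power_w):
    """Share of the chip that can be powered at once without exceeding the budget."""
    return min(1.0, thermal_budget_w / full_chip_power_w)


PHONE_BUDGET_W = 4.0       # upper end of the 2W-4W phone envelope
full_power_today_w = 4.0   # hypothetical: a chip that just fits when fully lit
full_power_next_w = full_power_today_w * 4  # one shrink at constant voltage

print(active_fraction(PHONE_BUDGET_W, full_power_today_w))  # 1.0
print(active_fraction(PHONE_BUDGET_W, full_power_next_w))   # 0.25
```

With these made-up inputs, three quarters of the next-generation die would have to stay dark at any given moment, which is the space the dedicated hardware blocks are meant to occupy.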
Thus dedicated hardware units can increase locality and parallelism of computation, thereby reducing power consumption while boosting performance. He said mobile SoCs will now begin to mix and match CPUs, GPUs and dedicated hardware, meaning in certain applications, the software will be taken off the CPU and dumped on the GPU, and in certain specialized situations, dumped onto the previously unused hardware as well.
All of this talk of SoC design lays the foundation that makes GPU Compute possible. On a mobile device, GPU Compute will allow for computational photography (real-time HDR processing and more), face, body and gesture tracking, 3D scene and object reconstruction, and richer augmented reality experiences. Naturally, Nvidia is using the conference to gather additional ideas from developers.