GTC 2013: Nvidia CUDA Enables Richer Experiences on Mobile
Neil Trevett, VP of Mobile Content at Nvidia, talks about GPU Compute on Tegra and how silicon needed to change to enable it.
During GTC 2013 in San Jose, Neil Trevett, VP of Mobile Content at Nvidia, spent around thirty minutes explaining why mobile devices can benefit from GPU computing. He started off with an example of an augmented reality app that simply adds information in real time as an overlay. What it can't do is create objects in 3D – and in real time – within the physical space.
But a Tegra 4-powered notebook can, thanks to CUDA. In the next example, he showed a refractive virtual glass sitting on an actual table. It looked real enough to hold – a correct refraction of the hand in the glass was obtained using a reprojection method. Thus GPU compute can bring high-quality reflections, refractions and caustics to augmented reality apps offered on Tegra-based devices.
"The application is using the inherent processing power of the GPU so that it's anti-aliasing to really up the quality of the scene," he said. "The caustic reflections are all being generated in real-time using a quite sophisticated CUDA program."
He then went on to recap what was revealed in the keynote: Tegra 4 devices are shipping this year while silicon samples of the Kepler-packed "Logan" are being distributed to device makers. He said the full, robust drivers Nvidia has been refining for many years on the desktop will be made available for Tegra as well, including CUDA 5.0 and OpenGL 4.3.
"So all the graphics and compute functionality we've been using through the desktop APIs will be made available on Tegra platforms," he said. Trevett added that after Logan claws into mobile devices, Parker will swing into action, based on a 64-bit Denver CPU and a Maxwell GPU which provides shared virtual memory between CPUs and CPUs.
Obviously on the mobile front, cramming all this power into an SoC is great for the mobile gamer. But you also have to consider power consumption. He said the "process fairy" keeps giving device makers more wonderful transistors to provide performance, but not in a low-power way.
"The key problem is that the leakage voltage has reached a point where you can't reduce the threshold voltage any further," he said. "So in the good old days, you halved the geometry, you get four times the number of transistors and the frequency would double, but the voltage would half so your power would stay the same. Now we can't reduce the voltage because of leakage, and so we can have the four times number of transistors with a half process shrink, but we have to suffer four times the power."
So how do you build chips that deliver the level of performance developers need in a mobile environment within the existing power envelope? Form factors will always be locked into their strict power envelopes no matter how battery technology improves – go over that limit and devices get too hot to use. A 4- to 5-inch screen has a thermal envelope ranging between 2W and 4W, whereas a 10-inch tablet ranges between 6W and 10W (the screen itself takes between 1W and 2W).
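For concreteness, here is the arithmetic with illustrative numbers from inside those ranges (these specific figures are mine, not from the talk):

\[
P_{\text{SoC+rest}} \approx P_{\text{envelope}} - P_{\text{display}} = 8\,\mathrm{W} - 1.5\,\mathrm{W} = 6.5\,\mathrm{W} ,
\]

so on a 10-inch tablet the SoC, memory and radios together must fit in roughly 6.5W.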
To deliver GPU Compute within these power budgets, you need to reduce data movement across the silicon. "We've reached a point with the next two generations of silicon where we can put down more transistors on a die than we can afford to turn on at the same time. If we turned on all possible transistors on an SoC, we would exceed the thermal envelope. So now we have this concept of dark silicon where there is silicon that's not being utilized all the time. But you can put down gates – dedicated hardware blocks – that will be used when they're needed and at no other time."
Dedicated hardware units can thus increase the locality and parallelism of computation, reducing power consumption while boosting performance. He said mobile SoCs will now begin to mix and match CPUs, GPUs and dedicated hardware: in certain applications, work will be taken off the CPU and moved onto the GPU, and in certain specialized situations, onto the otherwise-dark dedicated hardware as well.
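To make "taking work off the CPU and moving it onto the GPU" concrete, here is a minimal CUDA sketch of the kind of per-pixel operation these imaging workloads are built from (an illustrative example of the programming model only; the gammaAdjust kernel, gamma value and frame size are my own, not code from Nvidia's demos):

#include <cuda_runtime.h>
#include <stdio.h>
#include <stdlib.h>

// Each GPU thread adjusts one pixel, replacing a serial CPU loop
// with a massively parallel launch across the whole frame.
__global__ void gammaAdjust(const float *in, float *out, int n, float gamma)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = powf(in[i], gamma);   // per-pixel gamma correction
}

int main(void)
{
    const int n = 1920 * 1080;                 // one greyscale frame
    const size_t bytes = n * sizeof(float);

    // Host-side buffers with dummy image data.
    float *hIn  = (float *)malloc(bytes);
    float *hOut = (float *)malloc(bytes);
    for (int i = 0; i < n; ++i)
        hIn[i] = (float)i / n;

    // Move the data to the GPU, run the kernel, copy the result back.
    float *dIn, *dOut;
    cudaMalloc(&dIn, bytes);
    cudaMalloc(&dOut, bytes);
    cudaMemcpy(dIn, hIn, bytes, cudaMemcpyHostToDevice);

    const int threads = 256;
    const int blocks  = (n + threads - 1) / threads;
    gammaAdjust<<<blocks, threads>>>(dIn, dOut, n, 2.2f);

    cudaMemcpy(hOut, dOut, bytes, cudaMemcpyDeviceToHost);
    printf("pixel 1000: %f -> %f\n", hIn[1000], hOut[1000]);

    cudaFree(dIn); cudaFree(dOut);
    free(hIn); free(hOut);
    return 0;
}

The same CUDA C runs on desktop GPUs today; Trevett's point is that with the desktop driver stack coming to Tegra, code like this is meant to port to mobile unchanged.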
All of this talk of SoC design lays the foundation that makes GPU Compute possible. On a mobile device, GPU Compute will allow for computational photography (real-time HDR processing and more), face, body and gesture tracking, 3D scene and object reconstruction, and richer augmented reality experiences. Naturally, Nvidia is using the conference to pool additional ideas from developers.

I wouldn't be so sure. Tegra 4 and what's coming after make it look like Nvidia is really getting serious.
That's fine, they can be very serious; the question is whether the market will take them seriously.
It might be too late for Nvidia to really get anywhere after how bad Tegra and Tegra 2 were, along with how mediocre Tegra 3 was. Tegra 4 doesn't seem to be getting much of the promised adoption yet, but its competitors seem to be getting plenty, especially Qualcomm's Snapdragon 600. That may change once things get closer to and beyond the launch date, though. We'll have to wait and see.
http://semiaccurate.com/2013/02/18/nvidias-telegraphs-tegras-woes-at-ces/#.UUpO9l9zZaQ
Which probably means that Logan is still on 28nm.
That said, Tegra 4 has addressed previous Tegra products' main bottleneck, the single-channel memory interface. Here's hoping for more good things to come, which is always a win for consumers.
They looked just as serious when presenting the previous Tegras. Time will tell if Tegra 4 can finally supersede the likes of Snapdragon's Adreno.
I have big doubts that anyone in the OEM industry is looking at Tegra 4 as a serious offering.
IDK about the graphics portion, but the CPU in Tegra 4 is a quad-core Cortex-A15 at a high frequency. It should beat everything but the upcoming Snapdragon 800, and even then it should roughly match it. I'll wait for official benchmarks before passing judgment either way.
Also, for all those saying Tegra has been a failure: Nvidia sold roughly $740 million worth of them last year.
In the January article here on Tom's reviewing Tegra optimizations, a quarter of the games crashed on the Nexus 4/Qualcomm, while on Tegra hardware everything looked the best, or equal to the others at worst. That's no fluke. Qualcomm struggled and only Apple kept up. Guess what's in Apple's chips? An old GPU maker's design (PowerVR). See the point? The problem is that only Apple makes money off this. PowerVR's owner is struggling and had to borrow just to pay $100 million for MIPS's CPU tech. OUCH.
http://www.tomshardware.com/reviews/nvidia-tegra-android,3371.html
You can't judge Tegra 4 adoption until OEMs can actually get chips in volume (June) and announce product launches. I expect a Nexus 10 announcement in the next three months. The Snapdragon 800 has a slower GPU, and A15s run faster in tablets too. If the rumors are true, Google is afraid of any more Samsung domination (you don't want ONE big OEM pushing you around the way Apple does), so picking Exynos Octa (good chip or not) would be stupid if you fear domination by one OEM. On performance, Tegra becomes a no-brainer in tablets if you rule out Samsung and want gaming to matter on your device.

In a tablet there isn't the throttling a phone forces on the A15 (whether that holds remains to be proven; the Octa with A15 is in the international Galaxy S4). Unless Qualcomm is lying about the performance of the Adreno 330 vs. the 320, it will LOSE to Tegra 4 and the Mali-T658, and likely the PowerVR Rogue 6 as well. The Adreno 330 won't be much faster than a Mali-T604 and will be slower than the SGX554MP4 in the iPad 4. I will be truly shocked if they manage to beat that chip based on their own claims so far.
Top that off with Ouya (which will upgrade to yearly Tegra revisions), Splashtop (PC games on your tablet, Tegra-only so far), Wikipad, Shield, etc., and you get a fairly large ecosystem aimed squarely at gamers' experiences on these devices. That's a lot of things pushing Tegra optimizations that automatically filter down to your device. Where is Qualcomm in that race? Modems made Qualcomm what they are today, but gaming will make you who you are tomorrow.
http://www.hardocp.com/article/2013/03/04/2012_amd_nvidia_driver_performance_summary_review/
This is who the competition is dealing with in gaming. It took AMD roughly 9-10 months and the Never Settle drivers to finally get their hardware optimized; NV was already there. I'd argue NV could have gotten more from their hardware, but there was just no need then, as AMD was behind all year until November when they caught up. That isn't the case now: NV has upped performance in most AAA titles over the last 3-4 driver revisions to answer it (if you're not pushing them, they'll just sit, as everyone does when in the lead these days). NV didn't start to optimize until Never Settle:
http://www.geforce.com/whats-new/articles/nvidia-geforce-310-70-whql-drivers-released
http://www.geforce.com/whats-new/articles/nvidia-geforce-313-96-beta-drivers-released
"Nvidia says users should expect a 27 percent gain in graphics performance while playing Assassin’s Creed III, 19 percent in Civilization V, and 14 percent for both Call of Duty: Black Ops II and DiRT 3. Just Cause 2 improves by 11 percent, and Deus Ex: Human Revolution, F1 2012, and Far Cry 3 all improve by 10 percent."
http://www.pcgamer.com/2013/01/28/nvidia-313-95-beta-drivers-improve-performance-for-crysis-3-multiplayer-beta-assassins-creed-iii-far-cry-3/
http://www.geforce.com/whats-new/articles/nvidia-geforce-314-07-whql-drivers-released
http://www.geforce.com/whats-new/articles/nvidia-geforce-314-14-beta-drivers-released
You can dig back further and see them do this basically monthly since Never Settle, across pretty much every popular game over the last two years. That's December, January, February and March. They just announced the 314.21 beta for Tomb Raider, and 314.14 was for SimCity, StarCraft II: Heart of the Swarm, Resident Evil 6 and Hawken (among other titles enhanced, Black Ops 2, etc.). Note we had nothing until Never Settle... LOL. NV had no need to release more power from their cards until AMD caught up (which wasn't just Never Settle; they also clocked the cards up). Note NV usually optimizes BEFORE your game ships. AMD takes ages, and I say that as a fairly unhappy Radeon 5850 owner! These days I just wait six months before I fire up the latest game I want to play, to avoid the wait and the issues. HardOCP read the data wrong: NV wasn't fully optimized last year, they just had no reason to do more when what they had was already beating AMD, as shown by four months of "oops, we had some stuff left after all."
I don't see Qualcomm or Samsung winning the coming gaming war. Even if they compete this year, it's a totally different game with Kepler tech and performance already tuned on the desktop before the SoC even gets near silicon production with T5. Nobody else has PC gaming on mobile devices either (yet?). I doubt the Steam Box will be anything but NV, with them now also tops in Linux drivers. Check all the recent releases:
http://www.nvidia.com/object/linux_amd64_display_archive.html
http://linux.slashdot.org/story/12/11/06/204238/nvidia-doubles-linux-driver-performance-slips-steam-release-date
Note they worked with Valve for over a year on the drivers and Steam. Hmm... NV in a Steam Box, or what was the point of all this? Well, Steam on Shield, but with Linux support starting to get big now it has to be the Steam Box too.
I can see NV doing a deal to get PC gaming on a Nexus 10 Tegra device just to seal up the contract. Shield would have 4-5 months of sales under its belt by then, so it wouldn't hurt much, and it would further solidify GPU sales tie-ins. People need to look at the big picture, as this isn't the same race going forward. NV (and even AMD) long ago pushed 3dfx, PowerVR, Trident, Intel, S3, SiS, XGI, Cirrus Logic, Oak Technology, 3Dlabs, Number Nine, WD (yep, WD used to make GPUs/cards; I owned a Paradise card... LOL) and others into the PC GPU dustbin.
https://en.wikipedia.org/wiki/List_of_defunct_graphics_chips_and_card_companies
Check out the list of companies that died while AMD/Nvidia went on a rampage.
Game optimizations and better drivers did this. Everything NV has done so far with Tegra was just a holding action until the modem (T4/T4i will finally make money by entering phones) and, finally, Kepler tech and drivers on the SoC (with Maxwell after that). If you ask who the gaming gods have been for the last ten years, you won't hear anything but AMD/ATI or NV. That's it.
Anybody who thinks Qualcomm or Samsung suddenly takes this situation and reverses it needs to explain how, in detail, please. I don't see it. NV entered their world and has survived. Let me know when Samsung/Qualcomm enter the discrete GPU world of AMD/NV and do the same.
On a humorous note: does anyone realize Jen-Hsun Huang used to design CPUs at AMD?... LOL. I'm guessing AMD wishes they'd given him a raise to keep him.
http://thenextweb.com/facebook/2013/03/15/glassdoor-employees-rank-facebooks-mark-zuckerberg-as-ceo-of-the-year-apples-tim-cook-loses-top-spot/
Intel still seems to be managing to cut power and voltage. And it seems like I was right about ARM hitting a wall with efficiency.
Not quite sure what you mean yet; they're upping frequency with every new chip revision: 2.3GHz, up from 1.9GHz, up from ~1.5GHz, etc. They may throttle some, but you can throttle my chip all day as long as I win the benchmark doing it.