
GPGPU Programming, Where Is It?

Talking Heads: Motherboard Manager Edition, Q4'10, Part 1
By Chris Angelini

Question: The success of hybrid CPU/GPU designs like Sandy Bridge and Llano is closely tied to GPGPU programming. In the last major tech cycle, system integrators and consumers successfully adopted x86-64 processors and operating systems. Yet, potential benefits have been delayed because programmers, even today, are slow to adopt 64-bit programming. Do you think Intel and AMD can cause a major shift towards general purpose GPU programming within a year of their product launches?

  • AMD will introduce its integrated graphics-equipped CPU in 2011. Intel will do so even earlier than AMD. But users still need more time to be educated about the GPU; only then can they really demand it. Consumers still think discrete graphics provide more performance and functionality.
  • The 64-bit transition has been very slow and gradual. Software is always behind hardware, so we don't believe GPGPU will see any quantum leaps in the next year.
  • Honestly, we have no clue. We are at the mercy of the big three: Intel, AMD, and Nvidia.
  • Typically, our collaboration focuses mainly on implementing compatible hardware designs. We help drive demand through marketing, but driving the direction of demand is not within our scope.
  • Hardware is always faster than software. I think that is what we are seeing with GPGPU [programming].
  • I'm not really sure that it is necessarily slow. We are seeing more 64-bit programming about two years after full x86-64 adoption. If GPGPU [programming] follows suit, we should see more in 2012 or perhaps 2013.

The rise of hybrid processors brings new possibilities. Even on a system armed with integrated graphics, it is possible to see enhanced performance through the addition of some GPGPU programming. Specific tasks can be offloaded to the graphics core, and even though the systems with the most to gain will be those with powerful discrete graphics solutions, the additional processing power can be a boon in environments that benefit most pointedly from parallelism.

By design, our question was meant to solicit opinions on the pace of GPGPU programming adoption. Lately, progress seems to have ground to a halt (or at least, we're not hearing as much about momentum behind apps optimized for CUDA and DirectCompute). Frankly, it is frustrating to see this happen. Reading through the comments from our last survey, readers seem to be in agreement. We are at a point where we have a lot of computing power, but much of the time, we aren't using it.

We also mentioned in the last survey how frustrating it was to see the slow pick-up of 64-bit programming. If you recall the emergence of 64-bit as a feature, both Intel and AMD were actively leveraging that capability as a differentiating feature. Fast forward to today. We are still lacking a concerted effort by the software development community to adopt 64-bit programming--perhaps due to a perceived lack of benefit. We still don't have a 64-bit version of Firefox, and there is no ETA on a 64-bit Flash plug-in. While the benefits of 64-bit in these two scenarios may in fact be negligible, it shows how slow the software community has been in contrast to what today’s hardware provides. Only recently did Adobe update its suite of apps to support a 64-bit architecture, and we’ve already shown the effect of that decision to be massive.

One of the key problems has been the lack of a standardized programming layer. Nvidia went with Compute Unified Device Architecture (CUDA). AMD went with Stream. And Microsoft is in the middle with DirectCompute--an attempt to standardize general-purpose GPU computing across dissimilar architectures. As with the 64-bit extension war, this fragmentation has delayed GPGPU programming adoption. CUDA was a fairly robust interface from the get-go. If you wanted to do any sort of scientific computational work, Nvidia's CUDA was the library to use. It set the standard. Unfortunately, as with many technologies in the PC industry kept proprietary, this has also limited CUDA's appeal beyond specialized scientific applications, where the software is so niche that it can demand a certain piece of hardware. That's not the case with a transcoding app or a playback utility. In that light, even Adobe made a brave move by limiting its Mercury Playback Engine to a handful of CUDA-based GeForce and Quadro cards.
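To make that concrete, here is a minimal sketch of what device code looks like under CUDA. The kernel name (vecAdd) and the one-element-per-thread layout are ours, chosen for illustration rather than taken from any of the applications mentioned above.

    // A CUDA kernel that adds two float arrays element by element.
    // Each GPU thread computes one output element; the hardware schedules
    // thousands of such threads in parallel across the chip.
    __global__ void vecAdd(const float *a, const float *b, float *c, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
        if (i < n)                                      // guard the final partial block
            c[i] = a[i] + b[i];
    }

DirectCompute and OpenCL express the same idea through compute shaders and kernels of their own, which is exactly why a vendor-neutral layer matters: the concepts carry over, but the code does not.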

Nvidia no doubt wants to keep stressing the GPGPU capabilities rolled up into its Fermi architecture. It even hired the guy (Dr. Mark Harris) who coined the term GPGPU, which stands for "General Purpose computation on Graphics Processing Units." Unfortunately, mainstream adoption isn't going to happen without support from Intel and AMD, who are probably best positioned to bolster support for DirectCompute and OpenCL through their large development budgets.

We have been playing with some of the CUDA framework and would love to see more mainstream adoption, but we understand the lack of progress. Looking at the big picture, a software developer would have to justify months (maybe even years) of extra programming in CUDA to get some of the GPGPU enhancements. And even then, the gains are going to depend on the application.
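Part of that extra programming is plain plumbing. The sketch below shows, under the same illustrative vecAdd example as above, the host-side work that surrounds even a trivial kernel; the function name addOnGpu is ours.

    #include <cuda_runtime.h>

    __global__ void vecAdd(const float *a, const float *b, float *c, int n);  // kernel from the sketch above

    // Host-side routine: allocate GPU memory, copy the inputs across the bus,
    // launch the kernel, copy the result back, and release the buffers.
    void addOnGpu(const float *a, const float *b, float *c, int n)
    {
        size_t bytes = n * sizeof(float);
        float *dA, *dB, *dC;
        cudaMalloc(&dA, bytes);
        cudaMalloc(&dB, bytes);
        cudaMalloc(&dC, bytes);

        cudaMemcpy(dA, a, bytes, cudaMemcpyHostToDevice);
        cudaMemcpy(dB, b, bytes, cudaMemcpyHostToDevice);

        int threads = 256;
        int blocks  = (n + threads - 1) / threads;       // enough blocks to cover all n elements
        vecAdd<<<blocks, threads>>>(dA, dB, dC, n);

        cudaMemcpy(c, dC, bytes, cudaMemcpyDeviceToHost);
        cudaFree(dA);
        cudaFree(dB);
        cudaFree(dC);
    }

None of this is conceptually difficult, but multiply it across real data structures, error handling, and a CPU fallback path for machines without a capable GPU, and the months-of-work estimate starts to look conservative.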

A single GPGPU coding framework would do a lot for adoption, since it would allow developers to target any properly enabled graphics card, not just one from Nvidia. Again, this makes much more sense in the context of broad adoption. For the moment, CUDA remains the best solution if you have a lot of money, a very specific task able to benefit from parallelism, and the resources to develop with GPGPU in mind. Personally, we are enjoying Jacket for MATLAB. OpenCL and DirectCompute come close, but both give up lower-level hooks into the architecture in favor of compatibility.
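As a sketch of the kind of architecture-level control that hand-tuned CUDA work leans on, the kernel below stages data in the GPU's fast on-chip shared memory and synchronizes threads explicitly to produce one partial sum per block. It assumes a launch with 256 threads per block, and the names are again ours.

    // Partial sums, one per thread block, staged in on-chip __shared__ memory.
    // Assumes the kernel is launched with 256 threads per block.
    __global__ void blockSum(const float *in, float *partial, int n)
    {
        __shared__ float cache[256];                     // scratchpad shared by the block
        int tid = threadIdx.x;
        int i   = blockIdx.x * blockDim.x + tid;

        cache[tid] = (i < n) ? in[i] : 0.0f;
        __syncthreads();                                 // wait until every thread has loaded its value

        // Tree reduction: halve the number of active threads each step.
        for (int stride = blockDim.x / 2; stride > 0; stride >>= 1) {
            if (tid < stride)
                cache[tid] += cache[tid + stride];
            __syncthreads();
        }

        if (tid == 0)
            partial[blockIdx.x] = cache[0];              // thread 0 writes this block's sum
    }

Getting this sort of code right, and then tuning it to a particular generation of hardware, is where the specialized scientific work mentioned above spends much of its effort.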

Intel and AMD both need to get with the program--particularly AMD. Its much-hyped APUs are right around the corner, and it unquestionably has the advantage with regard to graphics. Intel's solution, at first blush, looks more like an evolutionary afterthought than anything that'll be capable of augmenting its processors. And to be frank, Intel's CPUs are its first priority.

  • dannyboy3210, October 20, 2010 7:31 AM
    I seem to have this nagging feeling that discrete graphics options will probably be around for another 10-15 years, at the least.
    If you factor in that a CPU/GPU fusion will cost a bit more than a simple CPU, then if you plan on doing any gaming at all, why not invest an extra $30 or so (over the cost of the CPU/GPU fusion, not just the CPU) and get something that will game maybe twice as well and likely support more monitors to boot?

    Edit: Although after the slow release of Fermi, I bet everyone's wondering what exactly is in store for Nvidia in the near future; like this article says, there seems to be a lot of ambivalence on the subject.
  • sudeshc, October 20, 2010 8:38 AM
    I would rather see improvements in chipsets than in the CPU and GPU; they are already doing a wow job, but we need chipsets with fewer limitations and bottlenecks.
  • ta152h, October 20, 2010 9:49 AM
    I'm kind of confused why you guys are jumping on 64-bit code not being common. There's no point for most applications, unless you like taking more memory and running slower. 32-bit code is denser, and therefore improves cache hit rates, and helps other apps have higher cache hit rates.

    Unless you need more memory, or are adding numbers over 2 billion, there's absolutely no point in it. 8-bit to 16-bit was huge, since adding over 128 is pretty common. 16-bit to 32-bit was huge, because segments were a pain in the neck, and 32-bit mode essentially removed that. Plus, adding over 32K isn't that uncommon. 64-bit mode adds some registers and things like that, but even with that, it is often slower than 32-bit code.

    SSE and SSE2 would be better comparisons. Four years after they were introduced, they had pretty good support.

    It's hard to imagine discrete graphics cards lasting indefinitely. They will more likely go the way of the math co-processor, but not in the near future. Low latency should make a big difference, but I would guess it might not happen unless Intel introduces a uniform instruction set for graphics cards, or basically adds it to the processor/GPU complex, which would allow for greater compiler efficiency and stronger integration. I'm a little surprised they haven't attempted to, but that would leave NVIDIA out in the cold, and maybe there are non-technical reasons they haven't done that yet.
  • sohaib_96, October 20, 2010 10:43 AM
    Can't we get an integrated GPU as powerful as a discrete one??
  • Draven35, October 20, 2010 10:45 AM
    Quote:
    CUDA was a fairly robust interface from the get-go. If you wanted to do any sort of scientific computational work, Nvidia's CUDA was the library to use. It set the standard. Unfortunately, as with many technologies in the PC industry kept proprietary, this has also limited CUDA's appeal beyond specialized scientific applications, where the software is so niche that it can demand a certain piece of hardware.


    A lot of scientific software vendors I have communicated with about this sort of thing have actually been hesitant to code for CUDA because, until the release of the Fermi cards, CUDA's floating-point support was single-precision only. They were *very* excited about the hardware releases at SIGGRAPH...
  • enzo matrix, October 20, 2010 12:23 PM
    Odd how everyone ignored workstation graphics, even when asked about them in the last question.
  • K2N hater, October 20, 2010 1:10 PM
    That will only replace discrete video cards once motherboards ship with dedicated RAM for video and the CPU allows a dedicated bus for that.

    Until then, the performance of processors with an integrated GPU will be pretty much the same as that of platforms with motherboard-integrated graphics, as the bottleneck will still be RAM latency and bandwidth.
  • elbert, October 20, 2010 1:10 PM
    The death of discrete will never occur because the hybrids are limited, like consoles. Even if the CPU makers could devote large amounts of resources to the hybrid GPU, those would be stripped away by refreshes. And the margin of error in these estimates (think how many people thought motherboard-integrated graphics would kill discrete) kind of kills the percentages.

    From what I have read, AMD's Llano hybrid GPU is about equal to a 5570. Llano by next year has no chance of killing sales of $50+ discrete solutions. I think the hybrids will have little effect on discrete solutions, and your $150+ figure is off. The only thing hybrid means is potentially more CPU performance when a discrete card is used. Another difference: unlike motherboard-integrated GPUs going to waste, the hybrids will use the integrated GPU for other tasks.
  • Onus, October 20, 2010 1:52 PM
    Quote (sohaib_96): cant we get an integrated gpu as powerful as a discrete one??

    No. There are [at least] two reasons that come to my mind. The first is heat. It is hard to dissipate that much heat in such a small area. Look at how huge both graphics card and CPU coolers already are, even the stock ones.
    The second is defect rate in manufacturing. As the die gets bigger, the chances of a defect grow, and it's either a geometric or exponential growth. The yields would be so low as to make the "good" dies prohibitively expensive.
    If you scale either of those down enough to overcome these problems, you end up with something too weak to be useful.
  • Onus, October 20, 2010 2:22 PM
    Quote (elbert): ...From what I have read AMD's Llano hybrid gpu is about the equal to a 5570. Llano by next year has no chance of killing sales of $50+ discrete solutions...

    Although the reasoning around this is mostly sound, I'd say your price point is off. Make that $100+ discrete solutions. A typical home user will be quite satisfied with HD5570-level performance, even able to play many games using lowered settings and/or resolution. As economic realities cause people to choose to do more with less, they will realize that this level of performance will do quite nicely for them. A $50 discrete card doesn't add a whole lot, but $100 very definitely does, and might be the jump that becomes worth taking.
  • kelemvor4, October 20, 2010 3:10 PM
    I'm surprised about the one comment that the integrated graphics won't even be powerful enough for HD video playback. In my mind, HD video these days is a "basic" functionality.

    Moreover, my concern about integrated graphics is this: given that ALL cpu's will have it, and it won't match the performance of high end GPU's - it's going to drive up costs for everyone buying the new generation of cpu's. And afaik, there's not going to be any alternative.
  • arges86, October 20, 2010 3:26 PM
    The only way integrated graphics works for gamers, is if the Motherboard is able to switch between integrated graphics and discrete seamlessly.
    I've seen the feature touted before, but it doesn't appear to have caught on.
  • theoutbound, October 20, 2010 3:30 PM
    Call me crazy, naive or just plain stupid, but I don't think discrete GPUs will ever die. While processors are becoming more powerful on the graphics side all the time, everyone seems to dismiss that graphics cards will follow the same trend. The biggest problem is that a CPU's primary focus will always be on processing power, leaving limited thermal headroom for graphics. Meanwhile, die shrinks on cards will continue to allow more processing power and memory, for an increase in performance that I don't think IGPs will ever be able to match. Even if IGPs get to the point where they can play games at acceptable framerates at lower resolutions, there will always be companies that push the envelope to develop better-looking games that need more processing power than IGPs are capable of. Do we really expect Crysis 5 to run as well on a fourth-generation Sandy Bridge or Llano as on a GTX 980 or Radeon 12000-series card? Graphics programmers will continue to push effects well past what IGPs will ever be able to do. I don't see integrated graphics ever surpassing their current market. They will be great for notebooks and cheap solutions for anyone who doesn't play games. Anyone who does will always want a discrete solution to push the latest and greatest graphical effects, just as they do now.
  • redbluur, October 20, 2010 4:55 PM
    I think it would make more sense to have the CPU become more like the GPU than vice versa. Imagine a CPU that you could put on any motherboard with the correct slot type. All motherboards would have the same standards, and a motherboard wouldn't be exclusive to Intel or AMD or Nvidia (if they got into the CPU market). You would not be limited to one CPU, much like SLI and CrossFire with GPUs. Platforms need to be more flexible in the future. A platform like this would force each company to concentrate on its specialty instead of generalizing and trying to do everything.
  • snowonweb, October 20, 2010 6:19 PM
    The way I see it: we have had onboard graphics and dedicated or discrete cards since the beginning of computing. The balance between them is driven by customer demand, and the rest are just variables. Certain technologies can make a better case for themselves than others. We already had a trend for CPU power; now it's for graphics power. The hard drive capacity trend is over, and the solid-state drive trend is just beginning. If only one solid-state drive company advertised on TV it would create demand right on the spot, but they don't, because hard drive TECHNOLOGY is not proprietary like CUDA/Nvidia or Intel is.
    I am sure there is a consortium and standard for solid-state drives, but it's not-for-profit, unlike Nvidia's architecture or Intel's design.
  • youssef 2010, October 20, 2010 7:34 PM
    Quote (from the article): nor will it ever be able to


    I think this resembles "no one will need more than 6xxKB of memory"
  • insightdriver, October 20, 2010 8:50 PM
    The future beyond a decade is notoriously difficult to predict; one only has to look at predictions, given ten years ago, of what the present might be like. Looking at the current state of the art in game graphics, it doesn't take a sharp eye to compare it against the current CGI in movies and realize that is where the overall goal is aimed. Eventually there will be CGI on a personal computer in real time. How long it takes, and what architecture is involved, is a roll of the dice for anyone at present. Imagine a day when a game generates what looks like a real 3-D image in front of us. I could see standing up and walking to the side to look behind that tree and check whether a sniper is hidden there.
  • reprotected, October 20, 2010 11:51 PM
    Tesla.
  • Anonymous, October 21, 2010 12:18 AM
    YOU GUYS AT TOMS BETTER DO SOME 2D PERFORMANCE TESTS ON THOSE IGPs!!!
    (Talking about the upcoming Core i and Atom IGPs.)
    They may provide a significant increase in app responsiveness!
  • dEAne, October 21, 2010 2:29 AM
    Thank you so much, Tom's, for this article.