Can OpenGL And OpenCL Overhaul Your Photo Editing Experience?

Q&A: Under The Hood With Adobe, Cont.

Tom's Hardware: Are we anywhere close to saturating 16 lanes of second-gen PCIe for image editing operations?

Russell Williams: I don't have numbers off the top of my head, but think of a 16-megapixel DSLR image. Say you want to do something like modifying the tilt of the blur plane in the blur gallery, and you want to get feedback in real time, at 30 to 60 FPS. Then you have to composite the result with 50 other layers, and that compositing needs to be done back on the CPU, because not all of the compositing engine runs on the GPU. So, copying data back at 60 FPS, you're copying the full image that's being processed two or three times per frame. Suddenly, that PCIe link doesn't look as fast as you originally thought.
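
As a rough back-of-the-envelope check of that scenario, here is a short Python sketch. The PCIe 2.0 figure (500 MB/s per lane per direction, so 8 GB/s for 16 lanes) is the theoretical peak. The bytes-per-pixel values and the function names are assumptions chosen for illustration, not figures from Adobe.

```python
# Theoretical peak of a PCIe 2.0 x16 link in one direction:
# 5 GT/s per lane with 8b/10b encoding gives 500 MB/s per lane.
PCIE2_X16_BYTES_PER_S = 16 * 500_000_000  # 8 GB/s


def required_bytes_per_s(megapixels, bytes_per_pixel, copies_per_frame, fps):
    """Bytes per second that must cross the bus to sustain the frame rate."""
    image_bytes = megapixels * 1_000_000 * bytes_per_pixel
    return image_bytes * copies_per_frame * fps


def bus_utilization(megapixels, bytes_per_pixel, copies_per_frame, fps):
    """Fraction of the theoretical PCIe 2.0 x16 peak consumed by the copies."""
    needed = required_bytes_per_s(megapixels, bytes_per_pixel, copies_per_frame, fps)
    return needed / PCIE2_X16_BYTES_PER_S


if __name__ == "__main__":
    # Assumed pixel formats: 4 bytes (8-bit RGBA) and 8 bytes (16-bit RGBA).
    for bpp in (4, 8):
        for copies, fps in ((2, 30), (3, 60)):
            share = bus_utilization(16, bpp, copies, fps)
            print(f"{bpp} B/pixel, {copies} copies/frame, {fps} FPS: "
                  f"{share:.0%} of peak")
```

Under these assumptions, 8-bit RGBA at two copies per frame and 30 FPS already uses roughly half of the theoretical peak, and three copies per frame at 60 FPS exceeds it before any protocol overhead is counted.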

Or look at it from a different point of view. Regardless of whether PCIe is fast enough, what matters is how fast it is compared to how fast the computation out on the card is. If the on-card computation takes half as long as before, the trip across the bus can mean that you only sped up the entire thing by 10% or so. I have a pithy metaphor: it's like driving to New York to make a sandwich.
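
The arithmetic behind that "10% or so" can be sketched in a few lines of Python. The model (total time = bus transfer + on-card compute) and the millisecond values are assumptions picked to reproduce the effect Williams describes, not measurements.

```python
def overall_speedup(transfer_ms, compute_ms, compute_speedup):
    """Speedup of transfer + compute when only the compute part gets faster."""
    before = transfer_ms + compute_ms
    after = transfer_ms + compute_ms / compute_speedup
    return before / after


if __name__ == "__main__":
    # Hypothetical frame: 9 ms moving data over the bus, 2 ms computing on the card.
    print(f"{overall_speedup(9.0, 2.0, 2.0):.2f}x")
    # The same computation with no transfer cost at all.
    print(f"{overall_speedup(0.0, 2.0, 2.0):.2f}x")
```

With a hypothetical 9 ms of transfer and 2 ms of compute, halving the compute time takes the frame from 11 ms to 10 ms, a 1.10x overall gain. Without the transfer cost, the same change would be a full 2x.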

Tom's Hardware: Say what?

Russell Williams: If you want to make a sandwich, and you invent a machine that can make your sandwich in two seconds, it still doesn't make sense to drive to New York to use the machine when you live in California. The shorter latency of the APU empowers us to use the GPU in all sorts of ways that don't make sense for discrete graphics. Really, the APU is a new kind of compute device. In the future, it's likely our code will have quite a few cases where it says "if discrete GPU, use discrete" but quite a few more that say "if APU, use APU."
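
The kind of per-operation choice Williams describes might look something like the Python sketch below. This is not Adobe's code. The `Device` type, the cost model (copy time plus compute time), and every number in it are hypothetical, chosen only to show how a cheap data path can beat raw compute throughput for light operations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    name: str
    transfer_bytes_per_s: float  # assumed effective host-to-device copy rate
    compute_ops_per_s: float     # assumed sustained throughput


def estimated_seconds(device, bytes_moved, ops):
    """Estimated cost: time to move the data plus time to process it."""
    return (bytes_moved / device.transfer_bytes_per_s
            + ops / device.compute_ops_per_s)


def choose_device(devices, bytes_moved, ops):
    """Pick the device with the lowest estimated total time."""
    return min(devices, key=lambda d: estimated_seconds(d, bytes_moved, ops))


# Hypothetical devices: the discrete card computes faster, but data reaches
# the APU's GPU far more cheaply because it shares system memory.
DISCRETE = Device("discrete", transfer_bytes_per_s=8e9, compute_ops_per_s=2e12)
APU = Device("apu", transfer_bytes_per_s=50e9, compute_ops_per_s=5e11)
DEVICES = [DISCRETE, APU]

if __name__ == "__main__":
    image_bytes = 128e6  # one 16-megapixel image at an assumed 8 bytes per pixel
    light = choose_device(DEVICES, image_bytes, ops=1e9)
    heavy = choose_device(DEVICES, image_bytes, ops=1e11)
    print("light filter ->", light.name)
    print("heavy filter ->", heavy.name)
```

In this toy model the light filter goes to the APU, because the copy dominates, and the heavy filter goes to the discrete card, because the computation dominates.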

Tom's Hardware: What about the future of shaders in a time of OpenCL and similar APIs? Adobe has taken a proprietary approach with Pixel Bender, but do you see this continuing as the market shifts to open standards?

Russell Williams: Shaders have a very solid future. Graphics APIs like OpenGL and DirectX are not going anywhere. OpenGL with custom shaders still provides the best solution for problems that are similar to 3D rendering, like 3D rendering in Photoshop or the Liquify filter. Now, I can’t speak for Adobe on this, but my own opinion is that GPGPU programming has come a long way since Adobe started Pixel Bender, and now that there's an industry standard—OpenCL—that addresses this area, we're adding more emphasis to that. We're members of Khronos, and we'll be contributing the experience we gained designing and building Pixel Bender to help improve future versions of OpenCL.

Tom's Hardware: My own impression is that many people still view CPUs with integrated graphics (APUs) as a budget solution. Maybe it’s just a habit from so many years of suffering with graphics-equipped Intel northbridges, I don’t know. But today, has the market shifted? Are APUs and heterogeneous architectures really a game-changer?

Russell Williams: There are different sources of compute power in the box. It used to be there was just one, the CPU, and you wrote in C to use that resource. Now, a great deal of power is in the GPU, but it’s only suited for some problems. And a great deal of the CPU's power is in multiple cores and compute units, like vector units, which are only good at certain problems. To get the full performance of the machine, you have to use all the different kinds of units and resources in it. You have to "light up" all these things at the same time, with the CPU, GPU, vector units, and so on all doing the things they're best at. We're trying to use them all at once to give the user the most responsive experience. We're trying to move away from “fill out a dialog box, click OK, and watch the progress bar” to a more game-like, cinematic FPS experience, where you modify the image directly and get immediate feedback. The only way to do that is to utilize all the compute resources.

The significance of having integrated performance plus highly capable graphics is that it moves this capability into more platforms, including many that don't have the space, cost, or power budget for discrete graphics. The APU-based solutions give you a tremendous potential performance boost in those environments. The other critical impact of the APU is performance. We have a fixed power budget, and we don't know how to make a CPU go significantly faster on that power budget. We've seen the last of the 50% per year performance boosts on the CPU side. And we're not going to just keep scaling cores; it’s too difficult to make use of them. The number of programs that could really take advantage of a 24-core single-socket CPU is near zero. So the GPU is essentially the path to bring that transistor budget to users in a way that can be used.

I think that GPGPU and APUs are just beginning to deliver on the promise that many people have seen in them for many years. We'll see a lot more advantage taken of that, not just in Photoshop, but in other Adobe apps over the next couple of versions.

  • ilysaml
    Now Adobe uses both CUDA and OpenCL that's superb.
  • alphaalphaalpha1
    Tahiti is pretty darned fast for compute, especially for the price of the 7900 cards, and if too many applications get proper OpenCL support, then Nvidia might be left behind for a lot of professional GPGPU work if they don't offer similar performance at a similar price point or some other incentive.

    With the 7970 meeting or beating much of the far more expensive Quadro line, Nvidia will have to step up. Maybe a GK114 or a cut-down GK110 will be put into use to counter 7900. I've already seen several forum threads talking about the 7970 being incredible in Maya and some other programs, but since I'm not a GPGPU compute expert, I guess I'm not in the best position to consider this topic on a very advanced level. Would anyone care to comment (or correct me if I made a mistake) about this?
  • A Bad Day
    How many CPUs would it take to match the tested GPUs?
  • blazorthon
    A Bad Day wrote: "How many CPUs would it take to match the tested GPUs?"
    That would depend on the CPU.
  • esrever
    Would be interesting to compare the i7 ivybridge against trinity in openCL
  • mayankleoboy1
    why no nvidia cards here?
  • mayankleoboy1
    any CUDA vs OpenCL benchmarks?
  • de5_Roy
    can you test like these combos:
    core i5 + 7970
    core i5 hd4000
    trinity + 7970
    trinity apu
    core i7 + 7970
    and core i7 hd 4000, and compare against fx8150 (or piledriver) + 7970.
    it seemed to me as if the apu bottlenecks the 7970 and the 7970 could work better with an intel i5/i7 cpu on the graphical processing workloads.
  • vitornob
    Nvidia cards test please. People needs to know if it's better/faster to go OpenCL or CUDA.
  • bgaimur
    vitornob wrote: "Nvidia cards test please. People needs to know if it's better/faster to go OpenCL or CUDA."
    http://www.streamcomputing.eu/blog/2011-06-22/opencl-vs-cuda-misconceptions/

    CUDA is a dying breed.