Adobe Flash: Hardware Acceleration, GPU, Drivers, And Details
What about hardware acceleration? There are actually a number of areas in the video playback workflow where hardware acceleration can be beneficial: video decoding, compositing, “presentation” (color conversion, scaling, blitting), and vector rendering. This makes the whole Flash debate a bit more complicated. Start with video decoding. If you have read the Flash 10.1 Release Notes, then you know it's a technology with a few quirks.
In general, there are two ways to decode video.
- Use the GPU's dedicated video decoding hardware
- Use the CPU via a software decoder
These are listed in order of efficiency with regard to power consumption and battery life. Flash Player employs dedicated video hardware, when it's available, for processing the video data. However, the hardware-based acceleration that modern GPUs employ is generally limited to H.264, MPEG-2, and VC-1. There is no spec for accelerating H.263 and VP6. This is the universal constant between Intel, AMD, and Nvidia, and it actually makes a lot of sense, simply because it is rare to see high-bit-rate VP6 and H.263. And since all three Flash codecs need to be licensed, it really isn't an issue of cost. Moreover, MPEG-LA (the custodian of H.264) has extended the royalty-free use of H.264 for free Internet broadcast video until December 31, 2016.
The issue really comes down to codec efficiency. As multimedia consumers, we generally try to stay neutral on codec debates, but this isn't as simple as you might think. VP6 encoding quality is generally more consistent across the various encoders used. However, H.264 quality can vary widely depending on the encoder because there are just so many encoders available. Even when you try to make an apples-to-apples comparison, you find there are different encoding parameters that can skew it. Once you throw audio into the mix, legitimate comparisons are hard to make unless you look at an entire library of videos. With that in mind, Hulu's CTO Eric Feng previously stated that H.264 has a 2:1 compression advantage over VP6. There is no reason to doubt this margin, given that Hulu programmers will have directly handled more video in a week than we would in a month.
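To put the 2:1 figure in concrete terms, here is a rough back-of-the-envelope sketch in Python. It assumes the ratio means H.264 needs about half the bit rate of VP6 at comparable quality, and the 1000 kb/s bit rate and 22-minute running time are hypothetical numbers chosen for illustration, not figures from Hulu.

```python
# Rough illustration of a 2:1 compression advantage.
# Assumption: "2:1" means VP6 needs about twice the bit rate of H.264
# for comparable quality. All stream numbers below are hypothetical.

H264_ADVANTAGE = 2.0  # ratio cited in the article


def stream_size_mb(bitrate_kbps: float, duration_s: float) -> float:
    """Size in megabytes (10^6 bytes) of a constant-bit-rate stream."""
    bits = bitrate_kbps * 1000 * duration_s
    return bits / 8 / 1_000_000


def equivalent_vp6_kbps(h264_kbps: float) -> float:
    """VP6 bit rate needed to match an H.264 stream under the 2:1 assumption."""
    return h264_kbps * H264_ADVANTAGE


if __name__ == "__main__":
    h264_kbps = 1000.0          # hypothetical H.264 video bit rate
    duration_s = 22 * 60        # hypothetical 22-minute episode
    vp6_kbps = equivalent_vp6_kbps(h264_kbps)
    print(f"H.264: {stream_size_mb(h264_kbps, duration_s):.0f} MB")
    print(f"VP6:   {stream_size_mb(vp6_kbps, duration_s):.0f} MB")
```

Under those assumptions, the same episode takes twice the bandwidth and storage in VP6, which is the kind of margin that matters to a site streaming an entire library.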
But does this mean that moving up to H.264 automatically incurs a higher processing burden? No. Flash Player 10.1 can take advantage of hardware-accelerated H.264 decoding, offloading the task from the CPU to the GPU, when that functionality is made available.
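The decision described so far can be summarized in a small Python sketch. This is only an illustration of the logic in the text, not Adobe's implementation; the function name and the `gpu_decoder_available` flag are hypothetical.

```python
# Illustrative decode-path selection, following the rules in the article:
# GPUs offer fixed-function decode for H.264, MPEG-2, and VC-1, while
# Flash video uses H.264, VP6, or H.263. Names here are hypothetical.

GPU_DECODABLE = {"h264", "mpeg2", "vc1"}
FLASH_CODECS = {"h264", "vp6", "h263"}


def pick_decode_path(codec: str, gpu_decoder_available: bool) -> str:
    """Return "gpu" or "cpu" for a Flash video stream."""
    codec = codec.lower()
    if codec not in FLASH_CODECS:
        raise ValueError(f"not a Flash video codec: {codec}")
    # Only H.264 overlaps both sets, so it is the only Flash codec
    # that can be offloaded, and only when the hardware and driver
    # actually expose the decoder.
    if codec in GPU_DECODABLE and gpu_decoder_available:
        return "gpu"
    return "cpu"


if __name__ == "__main__":
    for codec in ("h264", "vp6", "h263"):
        print(codec, pick_decode_path(codec, gpu_decoder_available=True))
```

In other words, VP6 and H.263 always land on the CPU, and H.264 falls back to the CPU whenever the hardware or driver requirements are not met.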
What about GPU compositing of video with other graphical elements? According to an Adobe developer post and a recent one-on-one discussion with Adobe, GPU hardware compositing does not occur on Windows PCs in Flash Player 10.1. The hardware compositing that was available in Flash Player 10 was really related more to animations where you have multiple bitmaps and vectors generated in real-time. For instance, imagine an animation where multiple objects and layers (such as grass, trees, and clouds) are all generated as bitmaps that need to be composed as a single image after all the additional vectors undergo rasterization. This could occur via the GPU using hardware compositing, similar to what we might see in a 3D video game. But this was not carried over into Flash Player 10.1 because differences in content and fragmented graphics hardware/driver implementations often negated the gains of doing this in the GPU.
Tinic Uro summed up the end effect best. "Just because Flash Player is using the video card for rendering does not mean it will be faster. In the majority of cases, your content will become slower." To address the hardware fragmentation problem, Adobe is now relying more on native platform libraries. Thus, consistent acceleration with hardware compositing is now available in Flash Player 10.1 on Mac OS X, which takes advantage of Core Animation in browsers that support Core Animation (Safari).
At the end of the day, other operations are still done in software in Flash Player 10.1. After the H.264 video data is decoded, there is no real GPU assistance in rendering interactive graphics elements, doing color space conversion, or performing scaling on the video itself. All of those still remain CPU-oriented operations.
We should make this completely clear: there is no general-purpose GPU computing in Flash Player. There is no DirectCompute, no APP, no CUDA. And any requests to add that support are nutty, because it is self-defeating. GPGPU is for processing raw data in a highly parallelized manner. But there is more to video than just raw data. There is a lot of image processing that occurs. Modern GPUs already have a portion of their design specifically dedicated to decoding and processing video data. This is called the "fixed-function decoder." It lives to decode and it does nothing else. Shifting that burden to more general-purpose compute resources would be one step away from moving it back onto the CPU itself, since in both cases you'd be working with a software-based decoder. This is why Flash Player (like other hardware-accelerated video players that handle H.264) relies on the GPU’s fixed-function decoding capability for video decode acceleration.
Remember, there are specific hardware requirements that need to be satisfied to realize the decoding benefits of H.264.
Requirements for Hardware H.264 Flash Decoding
|Vendor|Hardware|Starting Driver Support|
|---|---|---|
|Intel|Intel 4-Series chipset family (like the GMA 4500MHD); Core i3/i5/i7 processor family with Intel HD Graphics|22.214.171.1241 (126.96.36.1991)|
|AMD|Radeon HD 4000 or higher; Mobility Radeon HD 4000 or higher; Radeon HD 3000 (integrated) or higher; FirePro V3750, V7750, V8700, V8750 or later|ATI Radeon: Catalyst 9.11; ATI FirePro: driver 8.68|
|Nvidia|See Nvidia's latest list of supported GPUs|Starting support unknown; use the latest driver|
|Apple|Hardware that supports the Mac OS X Video Decode Acceleration Framework (such as GeForce 9400M, 320M, GT 330M)|Mac OS X 10.6.4 or later|
The requirements are pretty straightforward. Note that on the Mac side, Flash Player must rely on the Mac OS Video Decode Acceleration Framework to access hardware acceleration (included in Mac OS X 10.6.4 and later). This Mac OS framework does not support Intel GPUs, such as the GMA 950 and the new HD Graphics. The new MacBook Pros lean on the discrete Nvidia hardware for Flash-based H.264 decoding, which is supported by the Video Decode Acceleration Framework.
Things aren't as clear with regard to Nvidia as they are with AMD's graphics solutions. AMD only requires its second-generation Unified Video Decoder (UVD2 for discrete and UVD+ for integrated). It is as simple as that. If you have an older Nvidia graphics card, remember that the third generation of PureVideo (VDPAU decoder) has H.264 decoding restrictions. It cannot decode source video with the following horizontal resolutions: 769–784, 849–864, 929–944, 1009–1024, 1793–1808, 1873–1888, 1953–1968, and 2033–2048 pixels. This applies to products like the first-gen Ion, 8400 GS, 8200, 8300, 9300M GS, 9300 GS, and 9300 GE.
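If you want to check whether a particular video falls into one of those gaps, the ranges above translate directly into a lookup. The following Python sketch uses only the widths listed in this article; the function name is our own, and it says nothing about other limits the hardware may have.

```python
# Horizontal resolutions (inclusive, in pixels) that the affected
# PureVideo generation cannot decode in H.264, as listed in the article.
RESTRICTED_WIDTHS = [
    (769, 784), (849, 864), (929, 944), (1009, 1024),
    (1793, 1808), (1873, 1888), (1953, 1968), (2033, 2048),
]


def is_restricted_width(width: int) -> bool:
    """True if the source video width falls in a restricted range."""
    return any(low <= width <= high for low, high in RESTRICTED_WIDTHS)


if __name__ == "__main__":
    for width in (854, 1024, 1280, 1920):
        status = "software fallback" if is_restricted_width(width) else "ok"
        print(f"{width} px wide: {status}")
```

Note that common widths such as 1280 and 1920 are unaffected, while 854 (widescreen 480p) and 1024 do land inside the restricted ranges.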