Adobe CS5: 64-bit, CUDA-Accelerated, And Threaded Performance

After Effects CS4

Our first step in this article was to pick up where Chris Angelini left off in his July look at the Intel Xeon 5600-series. Chris started with 12 threads on a Gulftown chip and worked his way up to 24 threads on a pair of Xeon X5680s. Counter-intuitively, he found that workload completion performance decreased as processing capability increased.

“After Effects CS4 only has access to 4 GB of system memory—a third of what these Xeon boxes bring to bear,” he wrote at the time. “As you add execution resources to AE’s pool, less and less memory is available to each processor, be it logical or physical. The result is a lot more swapping to solid state storage, which is fast, but nowhere near as quick as three channels of DDR3.”
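To put rough numbers on Chris's point, here is a minimal sketch of how thin that 4 GB gets as thread count rises. It assumes the memory is split evenly across threads, which is a simplification for illustration and not a description of how After Effects actually allocates memory; the function name is hypothetical.

```python
# Assumption: AE CS4's 4 GB is divided evenly among all active threads.
# This is an illustrative simplification, not Adobe's allocation scheme.
AE_CS4_MEMORY_GB = 4.0


def memory_per_thread_mb(threads, total_gb=AE_CS4_MEMORY_GB):
    """Return the even share of memory, in MB, available to each thread."""
    return total_gb * 1024 / threads


# Thread counts taken from the configurations discussed in the text.
for threads in (2, 12, 24):
    print(f"{threads:>2} threads: {memory_per_thread_mb(threads):.0f} MB each")
```

Under that even-split assumption, going from 12 threads to 24 halves each thread's share, which fits the extra swapping Chris described.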

Rather than scale up the CPU chain into workstation configs, we scaled down from the consumer-class flagship, Intel’s Core i7-980X with all features enabled, to only two threads—two 980X cores with no Hyper-Threading. This lowest-end arrangement should more closely resemble some of AMD’s Athlon II processors.

In his story, Chris noted that he kept the multiprocessing option in After Effects enabled, as this gave the fastest results in AE CS5. With multiprocessing, AE assigns different frames to different cores. Without multiprocessing, every available core works on a single frame until it’s finished. We decided to run the tests both with and without multiprocessing to better see how much impact adding cores/threads would have.
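The difference between the two modes can be sketched as a toy scheduling model. This is only an illustration of the idea described above: the function names are hypothetical, and the round-robin assignment is an assumption, not After Effects' actual scheduler.

```python
def frame_parallel_schedule(frames, workers):
    """Multiprocessing on: each worker renders whole frames on its own.

    Frames are dealt out round-robin (an assumption for illustration).
    Returns a mapping of worker index -> list of frames it renders.
    """
    return {w: frames[w::workers] for w in range(workers)}


def single_frame_schedule(frames, workers):
    """Multiprocessing off: all workers cooperate on one frame at a time.

    Returns a list of (frame, workers_on_that_frame) pairs in render order.
    """
    return [(frame, list(range(workers))) for frame in frames]


frames = list(range(6))
print(frame_parallel_schedule(frames, 3))
print(single_frame_schedule(frames[:2], 3))
```

In the first mode each worker needs its own working set for a full frame, which is why memory per thread matters so much once thread counts climb.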

As you’ll see in a moment, After Effects CS4 clearly dislikes Hyper-Threading. In all of our tests with HT enabled and multiprocessing disabled, AE’s overall CPU utilization hovered in the teens and 20s, but the even-numbered threads (the logical cores created through HT) were barely touched. With only two active cores (four threads), there was a bit more activity on the even threads, but still nothing like the utilization seen with multiprocessing enabled in the application.

How does this processor utilization translate into real performance? The data is clear: After Effects CS4 performs much better with Hyper-Threading disabled, sometimes by a factor of 2-to-1. In everyday usage, it would be silly not to run the app with HT off and multiprocessing enabled. The one exception is multitasking: running with both multiprocessing and HT enabled saves about 20% to 30% in CPU utilization, leaving enough room to run something else concurrently.

Interestingly, Chris noted that “in CS4, we got our best results having all cores working on each frame,” meaning that having multiprocessing disabled yielded faster performance. That was not the case here. In all instances, using multiprocessing yielded much faster results, and the more threads we used, the wider that performance gap became.

So keeping multiprocessing enabled is a foregone conclusion. That decided, what can we observe about thread scaling? Without HT, we see only a moderate improvement as threads increase. (In fact, there is effectively no difference between four cores and six.) From two physical cores to six, we gain only 29 percent. The punch line here is that two physical cores actually outperform 12 logical threads by 7.5 percent. Hyper-Threading is just that bad under AE CS4.

  • reprotected
    Fermi exceeds at something finally!
  • MAGPC
    What if I am an ATI user?.
    And Iam an ATI user !!!.
  • IzzyCraft
    magpc: "What if I am an ATI user?. And Iam an ATI user !!!."
    You still get GPU acceleration, just not as much =p, and it would have to be an ATI card listed on their site. Just like Nvidia, it's a limited pool.
  • bunnyblaster
    Please increase the size of the legend. It is easy to figure out in this review since it's only two colors, however, if it is more than 2, it is hard to figure out which bar is referring to which score.

    Please consider changing the page drop-down menu to the old school drop-down menus like the other tech blogs like Anandtech and Arstech, etc.

    The interface is a little clumsy and seems to be poorly timed when I try to scroll down the drop-down menu. It often closes when I am trying to scroll to another page. Sometimes, when the page loads, it is hidden by a pop-up word ad.

    However, the article content was strong.
  • dEAne
    I have an ATI card and still I have no problem using photoshop CS4 and premiere CS4. The thing with CS5 is that if you can't wait at all, but it is not that really long.
  • adiomari
    why cuda and not open-cl?!!
  • shaun_shaun
    amazing performance increase !!!!!
  • Scott2010au
    Surely they mean the 2GB memory limit (for Win32 processes)?

    Which is one reason why the Apple Mac version is so popular (Unix/BSD can handle more per process).
  • Why CUDA? Simply 'cause it's a mature technology.
  • amdfangirl
    adiomari: "why cuda and not open-cl?!!"
    CUDA preceded OpenCL. Dev cycles are long and tedious. If you're going to implement something, it'll take time to show up. I honestly hope more developers decide to code for OpenCL.