Ambient Occlusion, Continued

What Does DirectCompute Really Mean For Gamers?

Ambient occlusion can also be performed via pixel shaders. Developers can choose either method and, going into this article, we were a bit in the dark about why DirectCompute might be preferable. After all, we’d seen enough early benchmarks showing that DirectCompute-enabled effects could significantly hurt graphics performance. Using compute resources to achieve a feature that couldn’t be done otherwise was one thing, but why pick DirectCompute when shaders were already getting the job done? Well, for starters, DirectCompute has no more of an impact on performance than pixel shaders.

“For each pixel the occlusion term is calculated for, multiple reads of the depth texture are required,” says Codemasters’ Thomas. “In a pixel shader, each texture read costs cycles. In a compute shader, the LDS (local data share) is filled with the nearby depth information from the depth texture, and subsequent reads are significantly cheaper compared to a texture fetch.”
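
To make the difference concrete, here is a rough counting sketch in Python. It is not Codemasters' implementation: the 16x16 thread-group size, the 24 samples per pixel, and the 8-texel kernel radius are all assumptions picked for illustration. It simply compares how many depth-texture fetches one tile of pixels issues under each approach.

```python
# Rough fetch-count model, not a real shader. Every parameter is an assumption.

def pixel_shader_fetches(tile_w, tile_h, samples_per_pixel):
    """Each pixel reads every one of its depth samples from the texture."""
    return tile_w * tile_h * samples_per_pixel


def compute_shader_fetches(tile_w, tile_h, radius):
    """The thread group loads its tile plus a border of `radius` texels into
    the LDS once; later depth reads are served from the LDS instead."""
    return (tile_w + 2 * radius) * (tile_h + 2 * radius)


tile_w, tile_h = 16, 16  # hypothetical thread-group size
samples = 24             # hypothetical depth samples per pixel
radius = 8               # hypothetical kernel radius, in texels

ps = pixel_shader_fetches(tile_w, tile_h, samples)
cs = compute_shader_fetches(tile_w, tile_h, radius)
print(f"pixel shader:   {ps} texture fetches per tile")
print(f"compute shader: {cs} texture fetches per tile")
print(f"ratio:          {ps / cs:.1f}x")
```

With these made-up numbers the compute path issues about one sixth as many texture fetches. The model ignores the texture cache, which absorbs part of the pixel shader's cost on real hardware, so treat the ratio as an upper bound on the saving rather than a measurement.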

by tbubb at umbcretrievers.wordpress.com

In this series, we want to keep returning to the question of heterogeneous computing and how adept today's hardware is at handling the tasks discussed. How do APUs compare to discrete graphics and host processors operating over PCI Express? If texture fetches are coming from memory, and APUs rely on a shared system RAM architecture, does this inherently handicap an APU's ability to execute this task efficiently, or is its proximity to the host processing resources a boon instead?

Source: www.geforce.com

“HDAO only requires the depth of the scene as an input,” says Thomas. “This has to be rendered first, but in practice most games already have this information hanging around from either the g-buffer or a depth pre-pass. The depth buffer is a video memory resource and the implementation of HDAO would be no different on an APU compared to a GPU. The technique is very memory efficient since the only extra memory requirement is for the output texture. This is another reason why the technique is becoming increasingly popular.”
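
As a back-of-the-envelope illustration of that last point, the sketch below estimates the size of the extra output texture. The 1920x1080 resolution and the one-byte-per-texel format are assumptions made for the example; Thomas does not say which format or resolution any particular game uses.

```python
def ao_output_bytes(width, height, bytes_per_texel=1):
    """Size of a single-channel occlusion output texture, in bytes.

    bytes_per_texel=1 assumes an 8-bit single-channel format (an assumption,
    not something stated in the article).
    """
    return width * height * bytes_per_texel


size = ao_output_bytes(1920, 1080)  # hypothetical 1080p render target
print(f"{size} bytes, about {size / (1024 * 1024):.2f} MiB")
```

Under those assumptions the extra allocation comes to roughly 2 MiB, which is small next to the depth buffer and g-buffer the game already keeps in video memory.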

This is borne out in our test results, and it’s an important point to make up front. You're going to look at our upcoming Battlefield 3 results and see that the APU only manages an average of 14 FPS with horizon-based ambient occlusion (HBAO) enabled, a clearly unplayable rate. With the Radeon HD 7970 card pulling in results 8.5x greater, it'd be natural to assume that the APU simply can’t handle the DirectCompute load. But don’t let the article’s context mislead you. Even with ambient occlusion disabled, the APU system only averages 16.6 FPS.

Battlefield 3's load is such that the APU's graphics muscle is unable to keep up. It's not the chip's heterogeneous architecture killing performance. We simply need hardware with more horsepower.
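
The same conclusion falls out of the averages quoted above. This short Python sketch (the function and variable names are ours, chosen for illustration) converts the two APU figures into frame times to show how little of the frame HBAO actually accounts for. The Radeon HD 7970 estimate assumes the 8.5x figure applies to the HBAO-enabled average.

```python
def frame_time_ms(fps):
    """Average time spent on one frame, in milliseconds."""
    return 1000.0 / fps


fps_ao_off = 16.6  # APU average with ambient occlusion disabled
fps_ao_on = 14.0   # APU average with HBAO enabled

# Extra time per frame attributable to HBAO, and the relative FPS drop.
hbao_cost_ms = frame_time_ms(fps_ao_on) - frame_time_ms(fps_ao_off)
fps_drop = (fps_ao_off - fps_ao_on) / fps_ao_off

# Assumption: the 8.5x multiplier is relative to the HBAO-enabled APU result.
fps_7970_estimate = fps_ao_on * 8.5

print(f"HBAO cost on the APU: {hbao_cost_ms:.1f} ms per frame")
print(f"FPS drop from HBAO:   {fps_drop:.1%}")
print(f"Estimated HD 7970:    {fps_7970_estimate:.0f} FPS")
```

HBAO adds roughly 11 ms to a frame that already takes about 60 ms without it, a drop of about 16 percent. The bulk of the frame time is spent on everything else the game renders.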

Other Comments
  • 12 Hide
    Anonymous , March 12, 2012 4:18 AM
    Ha. Are those HL2 screenshots on page 3 lol?
  • -6 Hide
    Khimera2000 , March 12, 2012 5:17 AM
    so... how fast is AMD's next chip??? :)  a clue??? anything?
  • 1 Hide
    de5_Roy , March 12, 2012 6:13 AM
    would pcie 3.0 and 2x pcie 3.0 cards in cfx/sli improve direct compute performance for gaming?
  • 7 Hide
    hunshiki , March 12, 2012 8:22 AM
    hotsacoman wrote: Ha. Are those HL2 screenshots on page 3 lol?


    THAT. F.... FENCE. :D 

    Every, single, time. With every, single Source game. HL2, CSS, MODS, CSGO. It's everywhere.
  • -3 Hide
    Anonymous , March 12, 2012 8:49 AM
    hunshiki wrote: THAT. F.... FENCE. Every, single, time. With every, single Source game. HL2, CSS, MODS, CSGO. It's everywhere.


    Ha. Seriously! The Source engine is what I like to call a polished turd. Somehow, even though it's ugly as f%$#, they still make it look acceptable...except for the fence XD
  • 1 Hide
    theuniquegamer , March 12, 2012 10:50 AM
    Developers need to improve the compatibility of the API for the GPUs. Consoles use very low-power, outdated GPUs yet can play the latest games at good FPS. Our PCs have top-notch hardware, but the games play at almost the same quality as on the consoles. The GPUs in our PCs have a lot of horsepower, but we can't utilize even half of it (I don't know what our PC GPUs are capable of).
  • 9 Hide
    marraco , March 12, 2012 10:52 AM
    I hate depth of field. Really hate it. I hate Metro 2033 with its DirectCompute-based depth of field filter.

    It’s unnecessary for games to emulate camera flaws, and depth of field is a limitation of cameras. The human eye is able to focus everywhere, and free to do that. Depth of field does not let the user focus where they want to, so it is just an annoyance, and worse, it costs FPS.

    This chart is great. Thanks for showing it.



    It shows something left out of many video card reviews: the 7970 frequently falls under 50, 40, and even 20 FPS. That ruins the user experience. While it is hard to tell the difference between 70 and 80 FPS, it is easy to spot the moments when the card falls under 20 FPS. It’s a show stopper, and an utter annoyance to spend a lot of money on the most expensive cards and then see those 20 FPS moments.

    That’s why I prefer TechPowerup.com reviews. They show frame-by-frame benchmarks, and not just a meaningless FPS average. TechPowerup.com is a level above TomsHardware because of this.

    Yet that way of showing GPU performance is hard for humans to understand, so the data needs to be sorted to make it easily understandable, like this figure shows:





    Both charts show the same data, but the lower has the data sorted.

    Here we see that Card B has both higher lags and higher FPS, while Card A is more consistent even though it has lower FPS.
    It shows on how many frames Card B is worse than Card A, and is more intuitive and readable than bar charts, which lose a lot of information.

    Unfortunately, no web site offers this kind of analysis for GPUs, so there is a way to get an advantage over competition.
  • 2 Hide
    hunshiki , March 12, 2012 10:54 AM
    I don't think you have owned a modern console, Theuniquegamer. Games that run fast there would run fast on PCs (if not blazing fast), since PCs are faster. Consoles are quite limited by hardware. Games that are demanding and slow... or that just have awesome graphics (BF3 for example), are slow on consoles too. They can rarely squeeze out more than 20-25 FPS. This happened with Crysis too. On PC? We benchmark FullHD graphics, and go for 91 fps. NINETY-ONE. Not 20. Not 25. Not even 30. And FullHD. Not 1280x720 like XBOX. (Also, on PC you have tons of other visual improvements that you can turn on/off. Unlike consoles.)

    So .. in short: Consoles are cheap and easy to use. You pop in the CD, you play your game. You won't be a professional FPS gamer (hence the stick), or it won't amaze you, hence the graphics. But it's easy and simple.
  • 1 Hide
    kettu , March 12, 2012 11:30 AM
    marraco wrote: I hate depth of field. Really hate it. I hate Metro 2033 with its DirectCompute-based depth of field filter. It’s unnecessary for games to emulate camera flaws, and depth of field is a limitation of cameras. The human eye is able to focus everywhere, and free to do that. Depth of field does not let the user focus where they want to, so it is just an annoyance, and worse, it costs FPS.


    'Hate' is a bit of a strong word but you do have a point there. It's much more natural to focus my eyes on a certain game object than to use my hand (i.e. turn the camera with my mouse). And you're right that it's unnecessary because I already get the depth of field effect for free with my eyes when they're focused on a point on the screen.
  • 0 Hide
    npyrhone , March 12, 2012 1:08 PM
    Somehow I don't find it plausible that Tom's Hardware has *literally* been bugging AMD for years - to any end (no pun intended). Figuratively, perhaps?
  • 3 Hide
    xenol , March 12, 2012 2:32 PM
    There's one thing I hate about current implementations of AO: it's too coarse. An object that's no more than say two feet behind something manages to receive some AO treatment. I want to say it's a shadow, but it's clearly not in the direction of the light source.
  • 4 Hide
    gtguy257 , March 12, 2012 2:50 PM
    The human eye cannot focus everywhere at once. In fact it has a very limited depth of field. But, the human eye can focus so quickly that you rarely notice unless you focus from up close to far away. The effect isn't a flaw in camera systems it is a function of how any optic works. Whether that is your eye or a camera.
  • 2 Hide
    TeraMedia , March 12, 2012 2:53 PM
    @xenol:

    Looking at the cartoonish pic above (last page), I would have to agree with you. It looks like they turned up the effect too strongly because they wanted to make it easily visible. The real world doesn't look like that at all. Look at the corner of your room, and you can see a faint darkening in the very corner that gradually brightens as you move away a few inches. But in the above pic, the darkening is strong enough to almost black out the pixels. To be more realistic, I think it needs to be more subtle.
  • -1 Hide
    TeraMedia , March 12, 2012 2:55 PM
    It would have been great to see a lower-end GCN card such as a 7750 so that we could see the frame rate impact of this feature when the card is already stretching - but is still capable. The inclusion of the 5870 kind of approximates this, I suppose, but the 7750 would have been a decent add-on match for an A8-equipped computer.
  • 2 Hide
    gamerk316 , March 12, 2012 5:00 PM
    Rather than continuing to throw resources at approximating the Rendering equation, can we PLEASE move to Ray Tracing already? All these little problems that are hard to implement in terms of the rendering equation are a natural outcome of Ray Tracing.
  • -2 Hide
    bloc97 , March 12, 2012 5:09 PM
    Why is Vsync enabled in the Dirt 3 Benchmark Screenshot?
  • 2 Hide
    bloc97 , March 12, 2012 5:13 PM
    gtguy257 wrote: The human eye cannot focus everywhere at once. In fact it has a very limited depth of field. But, the human eye can focus so quickly that you rarely notice unless you focus from up close to far away. The effect isn't a flaw in camera systems it is a function of how any optic works. Whether that is your eye or a camera.


    But it is an annoyance since when something isn't focused on the screen, you cannot see it until you turn your camera to it...
    Just like when you look something at your side, you don't need to turn your head...
  • 0 Hide
    bloc97 , March 12, 2012 5:15 PM
    gamerk316 wrote: Rather than continuing to throw resources at approximating the Rendering equation, can we PLEASE move to Ray Tracing already? All these little problems that are hard to implement in terms of the rendering equation are a natural outcome of Ray Tracing.

    Really? Ray Tracing would make the game 0.3 FPS unless you have a quad Crossfire of HD 7990's, and even then you will only get 4 FPS...
  • 5 Hide
    shin0bi272 , March 12, 2012 5:38 PM
    I don't know if it's because I just woke up or if I'm right, but those 3 BF3 screens showing no AO, SSAO, and HBAO look identical. I don't see any difference whatsoever save for some small shadows around the edges of the boxes and pallets.

    I would have also liked to have seen them include an nvidia card in their benchmarks to see the fps you got doing the same test with the competitors products.
  • 3 Hide
    nebun , March 12, 2012 6:43 PM
    Ambient Occlusion is nothing new....nVidia has had it for a very long time, a lot of people just don't know about it