
Virtualizing The GPU: How It Works

Can Lucidlogix Right Sandy Bridge’s Wrongs? Virtu, Previewed
Interestingly, some of the same technologies that went into Lucidlogix’s Hydra also play into its Virtu software.

Normally, when you fire up a game—let’s say it’s a modern DirectX 11 title like Metro 2033—it loads certain DLLs based on the hardware installed. If you’re using a Radeon HD 4870, for example, the game will only run through the DirectX 10 or DirectX 9 code path. The same would hold true in a machine limited to Intel’s HD Graphics engine.

Virtu inserts an abstraction layer between the application and operating system, though. Depending on the piece of software requesting resources, Virtu’s rendering assignment manager sends the workload to either Intel’s HD Graphics or the discrete card. Both the abstraction layer and rendering assignment manager are borrowed from Hydra.

Applications that don’t require the discrete card’s performance, or that conversely run best on HD Graphics, are handled by Intel’s integrated component. Web content, video playback, and the Aero interface all fall under that umbrella, as do apps optimized for Quick Sync. There’s really nothing fancy involved.

Games better rendered on the discrete card, however, are redirected by the assignment manager and processed by the GPU. From there, Lucid’s InterOp engine maps the discrete card’s frame buffer to the HD Graphics’ memory—necessary, since the display output is connected to that device.
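In rough terms, the rendering assignment manager amounts to a dispatch table keyed on the requesting application. Here is a minimal illustrative sketch; the process names and whitelist below are hypothetical stand-ins, not Lucid's actual application profiles:

```python
# Hypothetical sketch of a whitelist-based rendering assignment manager.
INTEGRATED = "Intel HD Graphics"
DISCRETE = "discrete GPU"

# Assumed whitelist of 3D-heavy titles; everything else stays on the IGP.
GAME_WHITELIST = {"metro2033.exe", "codmw.exe"}

def assign_renderer(process_name: str) -> str:
    """Route a process's rendering workload to one of the two GPUs."""
    if process_name.lower() in GAME_WHITELIST:
        return DISCRETE
    return INTEGRATED

print(assign_renderer("metro2033.exe"))  # discrete GPU
print(assign_renderer("chrome.exe"))     # Intel HD Graphics
```

The whitelist model also explains why unlisted games fall back to HD Graphics until a profile is added, a point readers raise in the comments below.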

Overcoming Overhead

Naturally, the process of mapping one adapter’s frame buffer to the other’s over PCI Express is not free. You’re generally looking at a 1 to 1.2 ms process.

So, say you’re running Call of Duty at 100 frames per second. That means each frame is being rendered in 10 ms. Factor in the time it takes to move that frame from the discrete GPU to the other GPU for output, and you’re looking at 11.2 ms or slightly more than 89 frames per second.

Now take that number to the other extreme. Let’s say you’re running Metro 2033 at 20 frames per second. Each frame gets rendered in 50 ms. Add 1.2 ms for the frame buffer transfer and you’re looking at 51.2 ms, or 19.53 frames per second. Clearly, the concern about overhead is more pronounced at higher frame rates; at lower frame rates, performance theoretically isn’t affected as severely.
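The arithmetic above generalizes to any frame rate; a quick sketch, using the 1.2 ms worst-case transfer figure quoted earlier:

```python
def effective_fps(render_fps: float, transfer_ms: float = 1.2) -> float:
    """Frame rate after adding the frame-buffer transfer to each frame's render time."""
    render_ms = 1000.0 / render_fps       # time to render one frame, in ms
    return 1000.0 / (render_ms + transfer_ms)

print(round(effective_fps(100), 2))  # Call of Duty case: 89.29
print(round(effective_fps(20), 2))   # Metro 2033 case: 19.53
```

The absolute frame-rate loss shrinks as render time grows, which is exactly the pattern in the two worked examples.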

Even though those frame rate drops aren’t too impactful, there is a way to work around them—at least to some extent. In fact, we’ve already seen the strategy from Nvidia with its Optimus technology. From Nvidia’s Optimus white paper:

“To preserve coherency, the 3D engine is blocked from rendering until the mem2mem transfer completes. This time-consuming (synchronous) DMA operation can stall the 3D engine and have a negative impact upon performance. The new Optimus Copy Engine relies on the bidirectional bandwidth of the PCI Express bus to allow simultaneous 3D rendering and copying of display data from the GPU frame buffer to the main memory area used as the IGP frame buffer.”

Lucid similarly employs an asynchronous copy using multiple buffers to transfer data during the render process. In theory, that means you’ll still see 100 frames per second in Call of Duty, and performance is only affected by a small amount of latency in the game. This latency is masked by the fact that a DirectX game buffers up to three frames ahead. Of course, those are the nuts and bolts. In practice, we still see some performance loss.
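The idea of overlapping rendering and copying can be sketched with a producer/consumer pair. This is only a conceptual illustration (not Lucid's actual code): a render thread keeps filling a small ring of buffers while a copy thread drains them, so rendering never stalls waiting on the transfer:

```python
import queue
import threading
import time

# Conceptual sketch of an asynchronous frame-buffer copy: the render thread
# hands finished frames to a copy thread instead of waiting for each transfer.
frames = queue.Queue(maxsize=3)  # mirrors DirectX buffering up to three frames ahead
copied = []

def render(n_frames: int) -> None:
    for i in range(n_frames):
        time.sleep(0.010)        # ~10 ms render time (100 fps)
        frames.put(i)            # hand off; don't block on the copy finishing
    frames.put(None)             # sentinel: rendering finished

def copy_to_igp() -> None:
    while (frame := frames.get()) is not None:
        time.sleep(0.0012)       # ~1.2 ms transfer to the IGP's frame buffer
        copied.append(frame)

producer = threading.Thread(target=render, args=(10,))
consumer = threading.Thread(target=copy_to_igp)
producer.start(); consumer.start()
producer.join(); consumer.join()
print(len(copied))  # every frame reached the IGP without stalling the renderer
```

Because each 1.2 ms copy happens while the next 10 ms frame is still rendering, total wall time stays close to the render time alone rather than render time plus copy time.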

However, if access to Quick Sync is important to you—and if it isn’t now, there’s a good chance it will be at some point in the next year or two—then Virtu as it exists today presents an acceptable compromise. An upcoming version of Virtu will be even more attractive. More on that shortly.

  • rhino13, February 28, 2011 11:27 AM
    AMD's Fusion stuff integrates without needing software though right?
  • mister g, February 28, 2011 11:37 AM
    I'm pretty sure that Fusion only works with AMD parts, but the idea would be the same. Anybody else remember this company's ads on the side of some of Tom's articles?
  • jemm, February 28, 2011 11:40 AM
    I wonder how much the Z68 will cost.
  • Anonymous, February 28, 2011 11:52 AM
    I suppose a multi-monitor setup, with the main screen for gaming on the discrete card (assuming the game only uses that one screen) and a secondary screen on the Z68 output of Intel's HD Graphics, will have no need for this, and will just run perfectly.

    That's how I'll roll once Z68 gets out.
  • user 18, February 28, 2011 11:57 AM
    Sounds cool, although the whitelist could be a deal-breaker for a lot of people.
  • haplo602, February 28, 2011 12:07 PM
    Seems like we are heading toward what Voodoo Graphics and TV tuners were doing a long, long time ago, just now over the PCIe bus.

    I wonder why it's so difficult to map frame buffers and create virtual screens?
  • tommysch, February 28, 2011 12:41 PM
    I don't want a cheap graphics solution producing heat alongside my precious CPU...
  • RobinPanties, February 28, 2011 1:12 PM
    This sounds like software technology that should be built straight into OSes instead of added as separate layers... maybe OS manufacturers need to wake up (*cough* Microsoft).
  • truehighroller, February 28, 2011 1:28 PM
    I already sent back my Sandy Bridge setup, that's too bad. Guess it's Intel's loss, huh?
  • lradunovic77, February 28, 2011 1:45 PM
    This is another absolutely useless piece of crap. Why in the world would you deal with another stupid layer, and why would you use Intel's integrated graphics chip (or any integrated solution) along with your dedicated video card???

    The conclusion of this article is... don't go with such a nonsense solution.
  • hp79, February 28, 2011 1:48 PM
    (unrelated to article)
    Dear Tom's,
    Your pull-down menu for page navigation sucks. I mean it really, really sucks. I am so annoyed that it makes me want to stop reading the articles. It is the worst design of any webpage. I use IE, Firefox, and Chrome. It's very hard to jump through pages using the pull-down menu. Please fix the style of it.
  • lradunovic77, February 28, 2011 1:48 PM
    Again, Intel's and AMD's move to integrate a graphics chip into the CPU is good for mobile, useless for anything else. It is far from being a smart solution for desktops unless they can pack a GTX 580-capable card into... mmm, I don't think so.
  • wolfram23, February 28, 2011 1:51 PM
    I'm sorry, but Intel seriously didn't bother to allow you to do transcoding with the EPUs alongside a discrete card?? WTF are they thinking!?
  • lradunovic77, February 28, 2011 1:53 PM
    @hp79

    I agree with you. You guys need to implement partial rendering on this site. It is annoying how much it flickers on postback actions.
  • Travis Beane, February 28, 2011 1:53 PM
    I've been asking to use the integrated GPU for GPGPU purposes and the discrete for gaming for a year now.
    It seems like we're slowly getting there.
    I'd like to run PhysX, but why not on the HD 3000 instead of a second $200 card requiring a meatier power supply, better cooling, etc.?
  • sblantipodi, February 28, 2011 4:39 PM
    Why care about quick sync when you have a discrete GPU?
  • Anonymous, February 28, 2011 4:39 PM
    Who needs this? Mobile users can use Optimus as part of native Nvidia drivers. And for desktops, why do I need this at all?
  • ProDigit10, February 28, 2011 4:43 PM
    I wish the Intel graphics could be used for most desktop activities, and the discrete card as the main monitor connection for games, using a dual-monitor setup.
    It's a much easier and much better approach!

    Play games on the discrete, while your desktop is showing on the other monitor.
  • ProDigit10, February 28, 2011 4:44 PM
    ^^ *edit: while your desktop is showing through the Intel card*
  • cangelini, February 28, 2011 4:44 PM
    sblantipodi: Why care about quick sync when you have a discrete GPU?

    Because the discrete GPU can't do what Quick Sync does :)