Multi-GPU In Detail
SFR: Rendering All Over The Place With DirectX 12
When two cards take turns rendering frames, it's called alternate frame rendering (AFR), and the result is typically an increase in both the average and maximum frame rate. However, this mechanism is quite fragile and comes with a number of potential problems.
The graphics cards might become desynchronized if two frames follow each other too quickly or if their frame rendering times are very different. This results in subjectively perceptible stuttering that can be quite severe. Different synchronization techniques and frame pacing might provide some relief, but the underlying problem remains.
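To make the desynchronization problem concrete, here is a minimal sketch of two cards alternating frames. The function names and all timings are hypothetical and chosen for illustration; they are not measurements from our test system. Each GPU is assumed to need the same fixed time per frame, and the second GPU simply starts with an offset relative to the first.

```python
def afr_present_times(render_ms, offset_ms, frames):
    """Return present timestamps (ms) for two GPUs that alternate frames.

    Assumption: GPU 0 finishes frames at 0, render_ms, 2 * render_ms, ...
    and GPU 1 finishes on the same schedule, shifted by offset_ms.
    """
    times = []
    for i in range(frames):
        gpu = i % 2    # even frames go to GPU 0, odd frames to GPU 1
        slot = i // 2  # how many frames this GPU has already delivered
        times.append(slot * render_ms + (offset_ms if gpu else 0.0))
    return times


def frame_intervals(times):
    """Time between consecutive frames as seen on screen."""
    return [b - a for a, b in zip(times, times[1:])]


# Well paced: the second GPU is offset by exactly half a render time.
paced = frame_intervals(afr_present_times(20.0, 10.0, 7))

# Desynchronized: the second GPU delivers just 2 ms after the first.
skewed = frame_intervals(afr_present_times(20.0, 2.0, 7))

print(paced)   # [10.0, 10.0, 10.0, 10.0, 10.0, 10.0]
print(skewed)  # [2.0, 18.0, 2.0, 18.0, 2.0, 18.0]
```

Both sequences average 10 ms per frame, so a frame rate counter would report the same number for each. The second one alternates between very short and very long intervals, though, which is the kind of pattern that shows up as stuttering. Frame pacing tries to push the skewed case back toward the even one by delaying frames that arrive too early.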
Another caveat is that all resources need to be duplicated. In other words, each graphics card has to hold its own copy of the same data in its memory, so the memory of the individual cards doesn't add up.
DirectX 12 also allows for split frame rendering (SFR). This method divides the screen into tiles, similar to the way raytracers split up their work. Each card still uses its own resources, but it only needs the assets required for its part of the image and not for the entire image. This doesn’t just reduce the memory footprint, but also the bandwidth requirements and transfer times.
The wait until the finished frame can be output is shorter as well, since each graphics card just needs to render its tiles, and not the whole frame.
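The tile idea behind SFR can be sketched in a few lines. This is only an illustration under simple assumptions: the tile size, the round-robin distribution, and the function names are hypothetical, and a real engine would more likely balance tiles by expected rendering cost than hand them out in strict rotation.

```python
def split_into_tiles(width, height, tile_w, tile_h):
    """Cut the screen into rectangles of (x, y, w, h); edge tiles may be smaller."""
    tiles = []
    for y in range(0, height, tile_h):
        for x in range(0, width, tile_w):
            tiles.append((x, y, min(tile_w, width - x), min(tile_h, height - y)))
    return tiles


def assign_tiles(tiles, gpu_count):
    """Hand out tiles to GPUs in rotation (an assumed, naive policy)."""
    work = [[] for _ in range(gpu_count)]
    for i, tile in enumerate(tiles):
        work[i % gpu_count].append(tile)
    return work


tiles = split_into_tiles(1920, 1080, 480, 270)
work = assign_tiles(tiles, 2)

for gpu, assigned in enumerate(work):
    pixels = sum(w * h for _, _, w, h in assigned)
    print(f"GPU {gpu}: {len(assigned)} tiles, {pixels} pixels")
```

With these numbers, each of the two GPUs ends up with eight of the 16 tiles and half of the frame's pixels. Every card works on the same frame at the same time, which is the key difference from AFR, where each card renders a complete frame on its own.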
Frame Rates And Times
Since it wasn't possible to set up SFR in the game's current form, we ended up testing three graphics cards in various combinations using AFR. We simply didn’t have the time for any more than that. Curiously, we ran into CPU bottlenecks quite a bit anyway, even though we stuck to the Extreme preset and used an overclocked six-core CPU. If we had experimented with slower graphics cards, the results would have been even less representative.
Don't get too hung up on the similar frame rates. As usual, the devil’s in the details. For comparison, we’re also including the two individual graphics cards:
A look at the frame rates over time shows us significant differences, with Nvidia’s graphics cards enduring the lowest minimum frame rates. This also goes to show that the overall averages are usually worthless.
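A small example shows why an average on its own can hide so much. The frame times below are made up for illustration and are not taken from our benchmark logs; the function name is hypothetical as well.

```python
def fps_summary(frame_times_ms):
    """Return (average fps, minimum fps) for a list of frame times in ms."""
    average = 1000.0 * len(frame_times_ms) / sum(frame_times_ms)
    minimum = 1000.0 / max(frame_times_ms)  # the slowest frame sets the minimum
    return average, minimum


steady = [20.0] * 10           # every frame takes 20 ms
spiky = [15.0] * 9 + [65.0]    # mostly faster, with one long hitch

for name, times in (("steady", steady), ("spiky", spiky)):
    average, minimum = fps_summary(times)
    print(f"{name}: avg {average:.1f} fps, min {minimum:.1f} fps")
```

Both runs average 50 fps, but the second one drops to roughly 15 fps for a single frame. That is the kind of difference that only becomes visible once the frame rate is plotted over time.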
The frame render times don’t look much different. We don’t really see any outliers, which are otherwise almost customary for SLI and CrossFire setups. The individual graphics cards have some problems in this arena, though.
Things look the same in our smoothness graph, with the multi-GPU configurations doing well.
The number of batches per frame is close once again, which isn’t really a surprise, seeing what’s being rendered.
It’s plain to see that the CPU becomes a bottleneck in the multi-GPU configurations. There might only be a difference of seven percent between the fastest single GPU's frame rate and the slowest multi-GPU configuration, but the rate of bottlenecked frames jumps to 100 percent.
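The following sketch shows how adding a second GPU can push the share of CPU-limited frames to 100 percent. It assumes that a frame counts as bottlenecked when the CPU needs at least as long as the GPU; this is our simplification for illustration and not necessarily the benchmark's own definition. All timings are invented.

```python
def cpu_bound_share(frames):
    """Percentage of frames where the CPU takes at least as long as the GPU.

    frames is a list of (cpu_ms, gpu_ms) pairs.
    """
    bound = sum(1 for cpu_ms, gpu_ms in frames if cpu_ms >= gpu_ms)
    return 100.0 * bound / len(frames)


# Hypothetical single-GPU run: the GPU is the slower part in most frames.
single_gpu = [(12.0, 20.0), (13.0, 19.0), (12.0, 21.0), (14.0, 13.0)]

# Same CPU work, but GPU time halved by an (idealized) second card.
multi_gpu = [(cpu, gpu / 2) for cpu, gpu in single_gpu]

print(cpu_bound_share(single_gpu))  # 25.0
print(cpu_bound_share(multi_gpu))   # 100.0
```

Once every frame waits on the CPU, additional GPU performance no longer translates into a higher frame rate, which fits the small gap between the single cards and the multi-GPU configurations.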
Depending on the performance level, the portion of the total render time consumed by the CPU is massive.
The same goes for the portion of the total render time used up by the driver.
Present Time is dominated by AMD’s Radeon R9 Fury X once again. In spite of the similar frame rates, there are a lot of differences between the graphics cards, which are also reflected in the subjective user experience.
Multi-GPU Bottom Line
If nothing else, the boundaries between graphics cards fall with DirectX 12. This goes for both different graphics card models and graphics cards from different manufacturers. Almost all combinations are possible, even though they don’t necessarily make sense.