Date: 2019-11-19
Some people aren't content with just one graphics card. For enthusiasts with Nvidia graphics cards, that means relying on the company's Scalable Link Interface (SLI) to run multiple GPUs at the same time. But a recent post on the 3DCenter forum suggests that recent Nvidia drivers have quietly enabled a new checkerboard rendering mode that points to potential for upping graphics without SLI.
The post claimed that Nvidia enabled this new rendering mode starting with graphics drivers from the r435 development branch. Saying the feature was quietly introduced might actually be an understatement -- it apparently can only be enabled using tools like the Nvidia Profile Inspector while running a multi-GPU setup. This is all according to the 3DCenter post, as translated via Google Translate.
Once the mode is enabled, however, it's said to run in DirectX 10, 11 and 12. It's reportedly not perfect, with some games experiencing graphical issues and others failing to load entirely. This suggests that Nvidia's in the early stages of developing this technology and didn't intend for it to be discovered in recently released driver updates -- so does the company's "no comment" in response to our request for more information.
But now that it's out there, speculation has already started about what Nvidia intends to use this new checkerboard rendering mode for. Some Reddit users guessed that this is part of Nvidia's rumored plans to introduce multi-chip GPUs, a way for the company to make up for declining SLI support from game developers or just part of Nvidia's efforts to improve performance for people with multiple GPUs.
Nvidia isn't the only company that appears to be curious about multi-chip GPUs either, with Intel recently announcing that its Ponte Vecchio graphics cards will feature a multi-chip module (MCM) design that uses multiple compute units. This probably means recent Intel graphics drivers that added support for multiple GPUs were actually meant for the new multi-die architecture.
Right now this is all conjecture. Nvidia's revealed precious little about its next-generation graphics cards and how it plans to respond to increasing competition from Intel and AMD. It could be a while before this new checkerboard rendering mode debuts -- if it ever debuts at all. We won't know more until the company decides to acknowledge the fact that its recent driver updates were more than they seemed.
Tile and checkerboard, two completely different things.
Checkerboard-based rendering is when you divide a screen into squares and each square is assigned to a processing unit to render. Cards from a few generations back (like the Radeon 9700) would actually assign 16x16 tiles to a compute unit in the GPU, and each compute unit would take care of its own tile(s). This became very visible when trying to flash a 9500 with a 9700's BIOS: some cards had disabled compute units that this hack would enable, while others were actual cut-down chips where those checker squares would remain blank on screen.
What Nvidia is trying to do is have SLI cards process their own squares. Performance could be better than with the good old alternate line rendering (because most shaders work better on a tile than on a line) or alternate frame rendering (which causes lots of microstutter), since both cards work on the same frame. It also allows better balancing: not all tiles are equal, so each card could dispatch tiles as fast as it can, with the remaining tiles redispatched to whichever cards are done with theirs.
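To put rough numbers on the balancing argument, here is a minimal sketch comparing a fixed even/odd tile split with handing each tile to whichever GPU frees up first. Nothing here reflects how Nvidia's driver actually schedules work; the function names and the per-tile costs are made up for illustration.

```python
import heapq

def static_checkerboard(costs, cols, num_gpus=2):
    """Fixed split: tile (x, y) always goes to GPU (x + y) % num_gpus."""
    busy = [0.0] * num_gpus
    for index, cost in enumerate(costs):
        x, y = index % cols, index // cols
        busy[(x + y) % num_gpus] += cost
    # The frame is only finished when the slowest GPU is finished.
    return max(busy)

def dynamic_dispatch(costs, num_gpus=2):
    """Greedy split: the next tile goes to whichever GPU is free first."""
    finish_times = [0.0] * num_gpus
    heapq.heapify(finish_times)
    for cost in costs:
        earliest = heapq.heappop(finish_times)
        heapq.heappush(finish_times, earliest + cost)
    return max(finish_times)

# Hypothetical 4x2 grid of tile costs where all the expensive tiles
# happen to land on the same colour of the checkerboard.
costs = [8, 1, 8, 1,
         1, 8, 1, 8]
print(static_checkerboard(costs, cols=4))  # one GPU does almost all the work
print(dynamic_dispatch(costs))             # work is spread far more evenly
```

With this deliberately lopsided input the fixed split leaves one GPU nearly idle, while the greedy version gets close to an even share. Real scenes are rarely this extreme, so treat the gap as an upper bound on the idea rather than an expected gain.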
That's not what I'm referring to. PowerVR had true tile-based 3D rendering decades ago. More recently various mobile chips use tiled rendering. The naming convention is new; it's a way for Nvidia to specifically refer to multi-GPU tiled rendering. But even multi-GPU tile-based rendering itself isn't new. Anyone familiar with Sega arcade hardware is aware of the multi-GPU NAOMI variants (not the first NAOMI, which was single GPU), which were also PowerVR based. Although you had to fully replicate VRAM... it remains to be seen if anyone can do something about that.
Side note: Sega also added a hardware T&L engine (PowerVR Elan IIRC) to the NAOMI 2, which some had claimed wasn't possible for PVR chips. I was saddened that a version of Elan wasn't put together in time for Kyro II, even if only for a higher-end variant. Their eventual software-based solution worked pretty well but it was too little too late - it wasn't released until after the demise of the Kyro II SE project. WAY too late.
This is the first time something like tiled rendering is publicly used on stuff like SLI cards though - maybe it was used before with other systems. The name change hints at something a bit more intricate though; as I said, I wouldn't be surprised if it were using out-of-order rendering instead of the much more prevalent even/odd tile distribution.
Today we have things like early discard (culling) of triangles that fall outside the render tile to allow greater efficiency and lower resource use. Technically speaking a core can render an entire tile with the full scene. But it wouldn't be as efficient.
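For anyone wondering what that early discard looks like, here is a minimal sketch of one common way to do it: a conservative bounding-box test per tile. The 16-pixel tile size and all function names are assumptions for illustration, not anything taken from a real driver.

```python
def triangle_bbox(tri):
    """Axis-aligned bounding box of a triangle given as three (x, y) points."""
    xs = [p[0] for p in tri]
    ys = [p[1] for p in tri]
    return min(xs), min(ys), max(xs), max(ys)

def overlaps_tile(tri, tile_x, tile_y, tile_size=16):
    """Conservative test: True if the triangle's bounding box touches the tile.

    It can keep a triangle that only clips the tile's bounding area,
    but it never throws away one that is actually visible in the tile.
    """
    min_x, min_y, max_x, max_y = triangle_bbox(tri)
    left, top = tile_x * tile_size, tile_y * tile_size
    right, bottom = left + tile_size, top + tile_size
    return min_x < right and max_x >= left and min_y < bottom and max_y >= top

def cull_for_tile(triangles, tile_x, tile_y, tile_size=16):
    """Keep only the triangles a core rendering this tile needs to look at."""
    return [t for t in triangles if overlaps_tile(t, tile_x, tile_y, tile_size)]

# Hypothetical scene: one triangle per tile plus one straddling the corner.
small = ((1, 1), (5, 1), (1, 5))
far = ((20, 20), (30, 20), (20, 30))
straddling = ((10, 10), (20, 10), (10, 20))
scene = [small, far, straddling]

print(len(cull_for_tile(scene, 0, 0)))  # small + straddling
print(len(cull_for_tile(scene, 2, 0)))  # nothing reaches this tile
```

The point is that each core only walks the short per-tile list instead of the full scene, which is where the efficiency comes from.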
That said, the previous poster is correct, as it would lend itself to a more even distribution between GPU cores. And it is more efficient than alternate line rendering because compute units do better on squares.
But it presents many of the same issues as most SLI setups and in some cases makes them worse. For example, a depth-of-field blur takes each pixel's distance from the z-buffer and then applies a Gaussian blur. Objects outside the target z distance require bigger Gaussian blurs. But the larger the blur, the more likely it is that a data point it needs will sit in an adjoining GPU's memory block.
Memory coherency sync issues are what make SLI / CrossFire a pain. And I believe chiplets will address many of these issues, as well as add support for add-on capabilities like a ray trace intersection hit test chip for global illumination.
I believe this driver release is an indication that Nvidia is indeed researching chiplets aggressively, because this kind of rendering technique is exactly how chiplets will work.
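A rough sketch of why the blur case hurts: as the blur radius grows, the footprint around a pixel spills into neighbouring tiles, and on a checkerboard those neighbours belong to the other GPU. The linear blur model, the 16-pixel tiles and every name below are assumptions made up for the example.

```python
def blur_radius(depth, focus_depth, pixels_per_unit=2.0, max_radius=32):
    """Hypothetical linear model: blur grows with distance from the focal plane."""
    return min(max_radius, int(abs(depth - focus_depth) * pixels_per_unit))

def tiles_touched(x, y, radius, tile_size=16):
    """Tiles covered by a square blur footprint centred on pixel (x, y).

    Only the top/left screen edge is clamped; the far edge is ignored
    to keep the sketch short.
    """
    x0 = max(0, x - radius) // tile_size
    x1 = (x + radius) // tile_size
    y0 = max(0, y - radius) // tile_size
    y1 = (y + radius) // tile_size
    return {(tx, ty) for tx in range(x0, x1 + 1) for ty in range(y0, y1 + 1)}

def remote_tiles(x, y, radius, num_gpus=2, tile_size=16):
    """Tiles in the footprint owned by a different GPU than the pixel's own tile."""
    owner = (x // tile_size + y // tile_size) % num_gpus
    return {t for t in tiles_touched(x, y, radius, tile_size)
            if (t[0] + t[1]) % num_gpus != owner}

focus = 10
for depth in (10, 12, 15):
    r = blur_radius(depth, focus)
    # Pixel in the middle of tile (0, 0).
    print(depth, r, sorted(remote_tiles(8, 8, r)))
```

In focus, the pixel needs nothing from the other card. A few depth units out, the same pixel already needs data from two tiles that live in the other GPU's memory, and that is the sync traffic being described.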
I don't suspect there's anything special about their "checkerboard" rendering. One card gets odd, the other gets even. When you add more than two GPUs/chiplets, then you split it up more. It's not novel, but they did beat slow-arse AMD to it - despite AMD beating even Intel to chiplets on the CPU front. AMD has a couple of competitive GPUs now, but they're still slackers on graphics lately.
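For what it's worth, the odd/even split described here fits in a couple of lines. This is only a sketch of the simplest possible scheme, with made-up names and an assumed 16-pixel tile; whether the driver does anything like this is unknown.

```python
def gpu_for_pixel(px, py, num_gpus=2, tile_size=16):
    """Owner of the tile containing pixel (px, py) under a simple modulo split."""
    return (px // tile_size + py // tile_size) % num_gpus

def owner_map(cols, rows, num_gpus=2):
    """One string per tile row, each character being the owning GPU's index."""
    return ["".join(str((x + y) % num_gpus) for x in range(cols))
            for y in range(rows)]

for row in owner_map(4, 2, num_gpus=2):
    print(row)
for row in owner_map(4, 2, num_gpus=4):
    print(row)
```

With two GPUs this gives the classic checkerboard. With more, the same formula turns into diagonal stripes, which is one way of "splitting it up more" but certainly not the only one.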