Say Hello To A New Pascal-Based GPU
Two months after its debut, Nvidia’s Pascal architecture is slowly filling out the company’s desktop graphics card portfolio from top to bottom. First came the GeForce GTX 1080, serving up 30%+ more performance than a GeForce GTX 980 Ti for less money. Online vendors still can’t keep them in stock (Newegg doesn’t have any as of this writing). Then we were introduced to the GeForce GTX 1070, which also outperforms a 980 Ti for hundreds of dollars less.
Now we’re getting a third Pascal-based board in the GeForce GTX 1060. The card was announced earlier this month, so we already know that Nvidia’s partners will have versions starting at $250. The Founders Edition implementation will sell for $300 on nvidia.com and in Best Buy stores, so don’t be surprised when you don’t find it elsewhere online.
GeForce GTX 1060 is based on a brand new GPU called GP106 that exposes many of the same features as GP104, but in a more mainstream package. Don’t let that term dissuade you, though. The 1060 may be a mere 120W card, but Nvidia says it’s good for GeForce GTX 980-class frame rates. Two years ago, that level of performance sold for $550. We’ve come a long way, to be sure.
Meet GP106
Nvidia builds its flagship GeForce GTX 1080 using a complete GP104 processor with four Graphics Processing Clusters enabled. This yields a card with 2560 CUDA cores and 160 texture units. The GTX 1070 centers on the same GPU with three of its GPCs turned on, adding up to 1920 cores and 120 texture units.
GeForce GTX 1060 scales down similarly using the same architectural building blocks. From our GeForce GTX 1080 launch coverage:
“Each GPC includes five Thread/Texture Processing Clusters and a raster engine. Broken down further, a TPC combines one Streaming Multiprocessor and a PolyMorph engine. The SM combines 128 single-precision CUDA cores, 256KB register file capacity, 96KB of shared memory, 48KB of L1/texture cache and eight texture units. Meanwhile, the fourth-generation PolyMorph engine includes a new block of logic that sits at the end of the geometry pipeline and ahead of the raster unit for handling Nvidia’s Simultaneous Multi-Projection feature.”
GPU | GeForce GTX 1060 (GP106) | GeForce GTX 980 (GM204) |
SMs | 10 | 16 |
CUDA Cores | 1280 | 2048 |
Base Clock | 1506 MHz | 1126 MHz |
GPU Boost Clock | 1708 MHz | 1216 MHz |
GFLOPs (Base Clock) | 3855 | 4612 |
Texture Units | 80 | 128 |
Texel Fill Rate | 120.5 GT/s | 144.1 GT/s |
Memory Data Rate | 8 Gb/s | 7 Gb/s |
Memory Bandwidth | 192 GB/s | 224 GB/s |
ROPs | 48 | 64 |
L2 Cache | 1.5MB | 2MB |
TDP | 120W | 165W |
Transistors | 4.4 billion | 5.2 billion |
Die Size | 200 mm² | 398 mm² |
Process Node | 16 nm | 28 nm |
GP106 comes equipped with two GPCs, so you end up with a total of 1280 CUDA cores and 80 texture units. The chip benefits from the same optimized timings that let Nvidia crank the clock rates up on GP104, facilitating a base frequency of 1506 MHz and a typical GPU Boost rating of 1708 MHz.
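The derived rows in the table fall out of those building blocks. Here is a minimal sketch of the arithmetic; the function and key names are ours, the two-FLOPs-per-core-per-clock figure is the usual fused multiply-add convention for peak throughput rather than something stated above, and we assume the same per-SM resources apply to GM204 (which is consistent with the table).

```python
# Per-SM resources quoted from the GTX 1080 launch coverage.
CORES_PER_SM = 128
TEXTURE_UNITS_PER_SM = 8
# Assumption: one fused multiply-add (2 FLOPs) per CUDA core per clock.
FLOPS_PER_CORE_PER_CLOCK = 2


def derive_specs(sm_count, base_clock_mhz):
    """Return core count, texture units, peak GFLOPs and texel fill rate."""
    cores = sm_count * CORES_PER_SM
    texture_units = sm_count * TEXTURE_UNITS_PER_SM
    return {
        "cuda_cores": cores,
        "texture_units": texture_units,
        # MHz * operations gives MFLOPs; divide by 1000 for GFLOPs.
        "gflops": cores * FLOPS_PER_CORE_PER_CLOCK * base_clock_mhz / 1000,
        # One texel per texture unit per clock, in gigatexels per second.
        "texel_rate_gt_s": texture_units * base_clock_mhz / 1000,
    }


gtx_1060 = derive_specs(sm_count=10, base_clock_mhz=1506)
gtx_980 = derive_specs(sm_count=16, base_clock_mhz=1126)
print("GTX 1060:", gtx_1060)
print("GTX 980: ", gtx_980)
```

Rounded, the results line up with the table's GFLOPs and texel fill rate rows at base clock. Real-world throughput depends on GPU Boost behavior, which this sketch ignores.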
The processor’s back-end is trimmed down, too. Six 32-bit memory controllers provide an aggregate 192-bit data path. Like the larger GP104, each controller is associated with eight ROPs and 256KB of L2, adding up to 48 ROPs and 1.5MB of cache. Nvidia drops 6GB of 8 GT/s GDDR5 onto the board, serving up to 192 GB/s of peak throughput. Although that figure is lower than the GTX 980's 224 GB/s, remember also that Pascal employs new lossless techniques to extract savings in the memory subsystem, effectively increasing usable bandwidth. Adapted from our GTX 1080 coverage, "[GP106's] delta color compression tries to achieve 2:1 savings, and this mode is purportedly enhanced to be usable more often. There’s also a new 4:1 mode that covers cases when per-pixel differences are very small and compressible into even less space. Finally, Pascal has a new 8:1 mode that combines 4:1 constant compression to 2x2 blocks with 2:1 compression of the differences between them."
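The back-end numbers can be reproduced the same way. This is only a sketch of the raw arithmetic with names of our own choosing; it deliberately leaves out delta color compression, since how often each mode applies depends on the workload and isn't something we can put a number on here.

```python
# Per-controller resources as described for GP106 in the text above.
CONTROLLER_WIDTH_BITS = 32
ROPS_PER_CONTROLLER = 8
L2_KB_PER_CONTROLLER = 256


def memory_back_end(controllers, data_rate_gbps):
    """Aggregate the per-controller resources into card-level figures."""
    bus_width_bits = controllers * CONTROLLER_WIDTH_BITS
    return {
        "bus_width_bits": bus_width_bits,
        "rops": controllers * ROPS_PER_CONTROLLER,
        "l2_mb": controllers * L2_KB_PER_CONTROLLER / 1024,
        # Each pin moves data_rate_gbps gigabits per second; 8 bits per byte.
        "bandwidth_gb_s": bus_width_bits * data_rate_gbps / 8,
    }


gp106 = memory_back_end(controllers=6, data_rate_gbps=8)
print(gp106)

# Raw deficit versus the GTX 980's 224 GB/s, before any compression savings.
GTX_980_BANDWIDTH_GB_S = 224
deficit = 1 - gp106["bandwidth_gb_s"] / GTX_980_BANDWIDTH_GB_S
print(f"Raw bandwidth deficit vs. GTX 980: {deficit:.1%}")
```

On paper, then, the 1060 gives up roughly a seventh of the 980's peak throughput, which is the gap Pascal's compression improvements are meant to narrow.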
Of course, GP106 is manufactured using the same TSMC 16FF+ process as GP104. Whereas the larger GPU is composed of 7.2 billion transistors on a 314 mm² die, Nvidia packs 4.4 billion FinFET transistors into 200 mm² for GP106. The less-complex processor, coupled with less memory on a simpler PCA, results in a 120W TDP.
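To put the process shrink in perspective, we can divide transistor count by die area for each chip. This is a back-of-the-envelope sketch using only the figures quoted in this article (the GM204 numbers come from the table above); the helper name is ours.

```python
# (transistors, die area in mm²) as quoted in the article.
CHIPS = {
    "GP106 (16 nm)": (4.4e9, 200),
    "GP104 (16 nm)": (7.2e9, 314),
    "GM204 (28 nm)": (5.2e9, 398),
}


def density_millions_per_mm2(transistors, area_mm2):
    """Average transistor density in millions of transistors per mm²."""
    return transistors / area_mm2 / 1e6


for name, (transistors, area_mm2) in CHIPS.items():
    density = density_millions_per_mm2(transistors, area_mm2)
    print(f"{name}: {density:.1f} million transistors per mm²")
```

The two Pascal chips land close together, while GM204 sits well below both. That gap is the main reason GP106 can approach GTX 980-class specifications in roughly half the die area.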
A First: No SLI For Upper-Mainstream
Notice the lack of an SLI connector up top? Nvidia recommends a GeForce GTX 1070 or 1080 to gamers looking for more performance than a 1060 delivers (of course), and does not support SLI on the 1060. Generationally, this is the highest-end board we can recall without the technology. Sure, the GeForce GTX 750 Ti didn't have it, but the 760 did. So too did the GeForce GTX 950.
Officially, Nvidia internalizes the decision. There aren't many gamers who pair up mainstream GPUs, and the company doesn't want to spread resources thin, so it's focusing on optimizing SLI on faster Pascal-based cards. Beyond that explanation, though, game development is going in a different direction with post-processing and compute-oriented effects that aren't friendly to alternate-frame rendering. And with DirectX 12, more control is shifted to ISVs eager to get their content out as quickly as possible. That means much of the work Nvidia pours into its drivers is circumvented.
We do have one game in our suite that supports multiple GPUs through DirectX 12: Ashes of the Singularity. After adding a second GeForce GTX 1060 and clicking one checkbox, we see the following speed-up:
Although that's not the kind of scaling we're used to seeing from SLI, ~50% isn't bad. Unfortunately, we can't even experiment with DirectX 11 games, or with DX12 titles that lack built-in support for multiple adapters.
Given that this is a 1080p-focused card, Nvidia could retroactively enable SLI over PCI Express through a driver update, and we hope it does. Regardless of how few gamers might be interested in pairing up GTX 1060 cards, there are still plenty of DX11 titles that benefit from multi-GPU configurations. And any problem that GP106 has cutting through DX12-imposed scaling issues applies to GP104-based cards, too. Let performance benchmarks determine how attractive SLI'ed 1060s are or are not, we say.
A Closer Look At The GeForce GTX 1060 Founders Edition
Nvidia continues with its edgier 10-series Founders Edition design, though the GTX 1060 sports a presumably less expensive implementation compared to the 1070 and 1080.
That doesn't mean the new card is small, though. It's 25.4 cm long (measured from the slot cover to the end of the card), 10.7 cm tall (measured from the top of the motherboard slot to the top of the card) and 3.8 cm deep. In actuality, the card's depth is only 3.5 cm, but its slot cover sticks out by 0.3 cm.
At 845g, the GeForce GTX 1060 Founders Edition isn’t particularly light either.
Design, Feel & Connectors
Once again, Nvidia uses a mixture of aluminum and plastic for the card’s shroud. It’s a bit simpler this time around, though. The cover, including the fan, can be removed in one piece. Up top, we find the illuminated GeForce GTX logo, along with a six-pin power connector.
The GeForce GTX 1060’s back end is a bit of a departure from previous designs. Graphics cards with short PCAs often have air intakes where the cooler protrudes beyond the board, servicing the radial fan. Instead, the 1060 has a normal cover without an opening. Undoubtedly due to cost concerns, there's also no backplate.
The back side of the card presents us with a familiar sight.
The I/O panel is copied from Nvidia's GeForce GTX 1080 and 1070 without a single change. It’s dominated by three DisplayPort connectors, which are version 1.2-compatible. However, the company tells us they're also ready for versions 1.3 and 1.4, matching the GPU's display controller. In addition, there’s an HDMI 2.0 connector and a dual-link DVI connector; no analog output is available.
Cooler Design, Board & Power Supply
Turning our attention inward, we remove the shroud to expose the GeForce GTX 1060's cooling solution.
Up top, we immediately notice the power connector's strange position. It’s situated in a part of the cooler that protrudes beyond the actual PCA. This necessitates a number of cables to connect it to the board.
The implementation is anything but elegant, and it prevents Nvidia's partners from building shorter 1060s. Although the PCA is only 17.5cm long, it doesn't have any space to accommodate a power connector.
Remove the four screws securing the cooler's body and it comes right off. There’s a massive copper heat sink and metal frame underneath. The closed cooling fin design reminds us of the GeForce GTX 1070, and it should provide ample performance given the 1060's 120W TDP.
The massive retention and cooling frame serves double duty by keeping everything in place and cooling the voltage regulation circuitry/memory modules.
Once the frame is unfastened and taken off, it needs to be flipped up and over. This is due to the cables connecting the separate PCIe power connector, which are permanently soldered to the board. Doing this reveals the bare PCA in all of its glory.
As usual, the GPU sits front and center. GP106 is naturally quite a bit smaller than the GP104 GPU we found on Nvidia's GeForce GTX 1080 and 1070. The differences between boards don't end there, though.
Take the memory modules as an example. Only six of the 1060's emplacements are populated with Samsung K4G80325FB-HC25 GDDR5. They have a capacity of 8Gb (32 x 256Mb) each and run anywhere from 1.305V to 1.597V, depending on clock rate. All told, this is where we get the 1060's 6GB specification.
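The capacity math is simple enough to check. In this sketch, the variable names are ours, and the pairing of one 32-bit module with each 32-bit memory controller is our assumption based on the matching counts rather than something we traced on the board.

```python
# Module organization as given above: 32 x 256Mb per package.
IO_WIDTH_BITS = 32
MBIT_PER_IO = 256
POPULATED_MODULES = 6

module_gbit = IO_WIDTH_BITS * MBIT_PER_IO / 1024   # density per module
total_gbyte = POPULATED_MODULES * module_gbit / 8  # 8 bits per byte
# Assumption: each module attaches to one 32-bit memory controller.
bus_width_bits = POPULATED_MODULES * IO_WIDTH_BITS

print(f"Per module: {module_gbit:.0f} Gb")
print(f"Total:      {total_gbyte:.0f} GB")
print(f"Bus width:  {bus_width_bits}-bit")
```

Six populated modules also line up with the six 32-bit memory controllers that make up GP106's 192-bit data path.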
Unfortunately, the PWM controller isn't documented. It’s made by uPI Semiconductor and bears the model number uP9509, which means that it’s probably the uP9511P’s smaller sibling (the latter controller is what we found paired to the GP104 processor).
The memory modules and one of the GPU phases get their power through the motherboard’s PCIe slot. The two remaining GPU phases and the card’s accessories draw power from the six-pin power connector. We'll take a closer look at what this means in terms of load distribution across the rails on the next page.
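As a rough sanity check on the available budget, we can add up the nominal ceilings of the two supply paths. The 75W limits below are the commonly cited PCI Express figures for the slot and a six-pin auxiliary connector, not values we measured, and the sketch says nothing about how the load actually splits between them.

```python
# Assumed nominal ceilings from the PCI Express specification, not measurements.
PCIE_SLOT_LIMIT_W = 75
SIX_PIN_LIMIT_W = 75
TDP_W = 120  # Nvidia's rating for the GeForce GTX 1060

available_w = PCIE_SLOT_LIMIT_W + SIX_PIN_LIMIT_W
headroom_w = available_w - TDP_W

print(f"Nominal supply budget: {available_w} W")
print(f"Headroom over TDP:     {headroom_w} W")
```

Total headroom only matters if neither rail is pushed past its own limit, which is why the per-rail distribution deserves a closer look.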
When it comes to voltage regulation, Nvidia uses only one Dual N-Channel MOSFET, the E6930, per phase for both the high and low side; separate gate drivers aren’t needed. This highly integrated component explains the empty spaces on the board.
The GPU’s three phases are completely sufficient, and their distribution makes more sense here than on AMD's Radeon RX 480.
Apart from the six-pin power connector, which appears to have taken a wrong turn somewhere, Nvidia's reference GeForce GTX 1060 actually looks pretty good. And given a relatively low amount of waste heat, its radial fan isn't a bad choice either.