Tom's Hardware Graphics Charts: Performance In 2014
1. Introducing Our Reference System And Methodology For 2014

Our Mission: The Best Data Possible

Over the past year or so, you've seen us make some significant changes to many of our graphics card reviews. For example, whenever possible, we employ the Frame Capture Analysis Tool, dubbed FCAT and covered in depth in Challenging FPS: Testing SLI And CrossFire Using Video Capture. Our performance results are consequently more accurate than they've ever been. 

Recently, we decided to dig deeper into power consumption as well, approaching it with the same precision, and you've likely seen the product of that in our more recent launch coverage. We don't want to just give you a rough estimate of what a given graphics card draws. No, the goal here is to set a new standard for power measurement. This is particularly important to us at a time when GPU vendors aren't just talking raw performance any more. They're putting an emphasis on efficiency, valuing cards able to achieve high frame rates without dissipating a ton of heat. 

A new methodology, developed in concert with an industry partner, allows us to ask (and answer) questions that couldn't be easily addressed in the past. Just days ago, in Radeon R9 295X2 8 GB Review: Project Hydra Gets Liquid Cooling, we were able to compare the Radeon R9 295X2, HD 7990, and HD 6990, telling you how much power each card pulled over its PCI Express slot and eight-pin connectors. In the days to come, we'll publish a follow-up explaining why all of this matters so much, allowing you to interpret the outcome of our testing more easily.

The other purpose of today's story is to introduce our 2014 Graphics Card Charts. To account for advances in display technology, we're testing synthetics and real-world games at two resolutions now: FHD (1920x1080) and UHD (3840x2160). You'll see a second set of resolutions and results a little later; those will account for entry-level boards, APUs, and Intel's integrated graphics solutions.

Benchmarking On A New Reference System

Over the course of the last year, we've seen plenty of cases where an Intel Core i7-3770K, even one overclocked to 4.5 GHz, can become a bottleneck. Fortunately, at the resolutions we picked and the settings we're running at, limitations should be few. To further assure this, we also built up a new reference machine based on Intel's Core i7-4930K operating at 4 GHz, a very fast quad-channel memory kit, and Asus' Rampage IV Black Edition motherboard.

Freezing the Current State to Ensure Fair Comparisons

The evolution of gaming confounds us every year. It's easy enough to adopt new titles and retire old ones in our reviews. But it's a lot harder to collect a ton of data for these persistent charts and watch it age over the course of 12 months. We know it'd be impossible to use every single popular game for this project, particularly since we're doing multiple benchmark runs for each one. So, our team picked 10 AAA titles with fairly long sequences and available settings to test. The suite we ended up with purposely strikes a balance between graphics card vendors, too.

We’re using both modern and older games, which accordingly present a range of challenges for the graphics hardware. We deactivated automatic updates, effectively freezing their current state as of early 2014. And the copy of Windows we're using is also kept from phoning home for patches. This configuration is saved as an image, allowing us to reuse it over and over. Whenever it's necessary, we also use new drivers. Sometimes that necessitates applying them retroactively and re-testing cards, particularly when the performance of a card is affected in a big way.

2. The Components In Our Reference Build

The following table includes the components in our 2014 Reference System:

Bench Table
  • Microcool Banchetto 101
System
  • Intel Core i7-4930K (Ivy Bridge-E), Overclocked to 4 GHz
  • Asus Rampage IV Black Edition, X79 Express
  • 32 / 64 GB Corsair Dominator Platinum DDR3-2133
  • Enermax ELC240 Closed-Loop Liquid Cooler
  • 1 x 512 GB Samsung 840 Pro
  • 1 x 256 GB Samsung 840 Pro
Power Supply
  • Corsair AX860i with Modified Rail (for Power Consumption Measurement)

CPU: Intel Core i7-4930K

Last year, we discovered that our overclocked Core i7-3770K occasionally interfered with our graphics benchmarks. Since we're planning to use this platform for workstation-oriented applications as well, it's time to toss the mainstream stuff and go with an LGA 2011-based Core i7-4930K sporting six cores and able to schedule 12 threads. In the gaming charts, it's overclocked to 4 GHz. When we switch over to the professional hardware, I drop it back to 3.4 GHz.

Motherboard: Asus Rampage IV Black Edition

The foundation of any system is a solid motherboard able to facilitate stable, consistent results.

It's no coincidence that I went with Asus' Rampage IV Black Edition, then. Not only is it easy to get running the way I need, but the board also comes with a number of measurement points for taking readings. X79 Express is an old chipset by modern standards, necessitating third-party controllers to enable many of its features. Asus deploys these intelligently.

The Asus Rampage IV Black Edition’s four 16-lane PCI Express slots run in any number of different link configurations; we'll only need them to run in two-way CrossFire or SLI at full x16 bandwidth.

Cooling

We got lucky with our CPU. After making our way through the Rampage IV Black Edition's seemingly endless list of settings, the Core i7-4930K settled in with all six cores stable at 4 GHz without increasing voltage. In fact, there was even some headroom available to lower it a bit. 

That good behavior is reflected in the thermals. Using Enermax's ELC240 closed-loop liquid cooler, CPU temperatures didn't exceed 67 degrees Celsius after an hour of running LinX.

The highest overclock that makes sense on this system is 4.5 GHz, giving us some room to grow in case we're able to identify a performance bottleneck at some point in the future.

Thermal Paste for Our Benchmark System and Graphics Cards

We're cooling a fairly high-power CPU, plus taking apart and reconfiguring a lot of graphics cards. That means we need an effective thermal paste. After a lot of testing over the past two years, our German team has settled on Gelid's GC-Extreme, since it's easy to apply and doesn't require any special burn-in period.

For more information on thermal paste testing and the research we've put into it, check out Thermal Paste Comparison, Part One: Applying Grease And More and Thermal Paste Comparison, Part Two: 39 Products Get Tested.

3. How We Measure Power Consumption

Measurement Equipment and Methodology

Our power consumption test setup was planned in cooperation with HAMEG (Rohde & Schwarz) to yield accurate measurements at small sampling intervals, and we've improved the gear continuously over the past few months.

AMD’s PowerTune and Nvidia’s GPU Boost technologies introduce significant changes to loading, requiring professional measurement and testing technology if you want accurate results. With this in mind, we're complementing our regular numbers with a series of benchmarks captured across an extraordinarily short window of 100 μs, sampled at 1 μs intervals.

We get this accuracy from a 500 MHz digital storage oscilloscope (HAMEG HMO 3054), which also lets us measure currents and voltages conveniently by remote control.

The measurements are captured by three high-resolution current probes (HAMEG HZO50). They take readings not only through a riser card for the 3.3 and 12 V rails (custom-built to fit our needs, with PCIe 3.0 support and short signal paths), but also directly from specially modified auxiliary power cables.

Voltages are measured from a power supply with a single +12 V rail. We're using a 2 ms resolution for the standard readings, which is granular enough to reflect changes from PowerTune and GPU Boost. Because this yields so much raw data, though, we keep the range limited to two minutes per chart.
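As a quick sanity check on how much raw data that means, the number of samples in a capture is simply the window length divided by the sampling interval. The sketch below plugs in the windows and intervals quoted above; the helper name `sample_count` is our own and purely illustrative, and the counts are per measured channel.

```python
def sample_count(window_us: int, interval_us: int) -> int:
    """Number of samples taken every interval_us across window_us (both in microseconds)."""
    if interval_us <= 0 or window_us < 0:
        raise ValueError("window must be non-negative and interval positive")
    return window_us // interval_us


# High-resolution capture: 100 us window at a 1 us interval.
high_res = sample_count(100, 1)

# Standard capture: two minutes at a 2 ms interval.
standard = sample_count(2 * 60 * 1_000_000, 2_000)

print(high_res, standard)
```

Working in whole microseconds keeps the arithmetic in integers, which avoids rounding surprises when dividing very small intervals.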

Measurement Procedure
  • Contact-free DC measurement at PCIe slot (using a riser card)
  • Contact-free DC measurement at external auxiliary power supply cable
  • Voltage measurement at power supply
Measurement Equipment
  • 1 x HAMEG HMO 3054, 500 MHz digital multi-channel oscilloscope
  • 3 x HAMEG HZO50 current probes (1 mA - 30 A, 100 kHz, DC)
  • 4 x HAMEG HZ355 (10:1 probes, 500 MHz)
  • 1 x HAMEG HMC 8012 digital multimeter with storage function
Power Supply
  • Corsair AX860i with modified outputs (taps)
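To turn current and voltage readings like these into watts, each rail's current is multiplied by its voltage sample by sample, and the rails are then added together. The following is a minimal sketch of that arithmetic, assuming the readings have already been exported as paired values. The `RailSample` type, the rail names, and the numbers are hypothetical and are not taken from our measurements or from the oscilloscope's software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RailSample:
    volts: float
    amps: float

    @property
    def watts(self) -> float:
        return self.volts * self.amps


def total_power(samples_by_rail: dict[str, list[RailSample]]) -> list[float]:
    """Sum instantaneous power across all rails, sample by sample."""
    lengths = {len(samples) for samples in samples_by_rail.values()}
    if len(lengths) != 1:
        raise ValueError("all rails must have the same number of samples")
    return [sum(s.watts for s in column) for column in zip(*samples_by_rail.values())]


# Made-up readings for illustration only (two time steps per rail).
example = {
    "slot_3v3": [RailSample(3.3, 1.0), RailSample(3.3, 1.0)],
    "slot_12v": [RailSample(12.0, 2.0), RailSample(12.0, 3.0)],
    "aux_12v": [RailSample(12.0, 10.0), RailSample(12.0, 20.0)],
}
print(total_power(example))
```

Keeping the rails separate until the final sum makes it easy to look at slot and auxiliary-connector draw individually.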

A Lot Can Happen in 100 Milliseconds...

...and we mean a lot! Let’s take a look at an analysis of all three voltage rails using a 2 ms sampling interval across 100 ms (giving us 50 data points). Just looking at those results makes us pity the power supply. Jumps between 91 and 355 W over the auxiliary power connectors are pretty harsh. The fluctuations aren't as crazy on the other rails.

On the bright side, neither AMD nor Nvidia graphics cards with auxiliary power connectors fully utilize the 75 W made available through a PCI Express x16 slot. That hasn't always been the case. Additionally, there's far less variance over the PCI Express interface, no doubt benefiting stability.
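A window like this can be boiled down to a handful of numbers: the minimum, the maximum, the mean, and the swing between the extremes, plus a check of the slot draw against the 75 W budget. The sketch below does that on synthetic data. The alternating trace only reuses the two extremes mentioned above and does not reproduce the shape of our real capture; the slot values and function names are likewise invented for illustration.

```python
from statistics import mean

PCIE_SLOT_LIMIT_W = 75.0  # slot budget quoted in the text


def summarize(watts: list[float]) -> dict[str, float]:
    """Minimum, maximum, mean, and peak-to-peak swing of a power trace in watts."""
    if not watts:
        raise ValueError("need at least one sample")
    lo, hi = min(watts), max(watts)
    return {"min": lo, "max": hi, "mean": mean(watts), "swing": hi - lo}


def exceeds_slot_limit(slot_watts: list[float], limit: float = PCIE_SLOT_LIMIT_W) -> bool:
    """True if any sample on the slot rails goes above the limit."""
    return any(w > limit for w in slot_watts)


# Synthetic auxiliary-connector trace: 50 samples, i.e. 100 ms at 2 ms per sample.
aux = [91.0, 355.0] * 25
stats = summarize(aux)
print(stats["swing"], exceeds_slot_limit([40.0, 55.0, 62.0]))
```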

High-Resolution Measurements

We wrap this part of our introduction up with illustrations of power consumption at idle and under a gaming workload. Again, all of this will get explained in more detail in an upcoming article.

Here's what's interesting: AMD's Radeon R9 290X demonstrates an idle power figure under 14 W. However, the many peaks up to 32 W skew that figure upward if you're sampling more slowly. With on-board memory factored out, what's left of the power use comes in under 12 W.
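To illustrate why the sampling interval matters, here is a small and entirely synthetic experiment: a flat baseline with short periodic spikes, averaged once at full resolution and then again using only every 50th sample. The baseline, the spike period, and the spike width are assumptions picked for illustration, not our measured data. A real slow instrument may also average or peak-hold rather than point-sample, so treat this as a sketch of the effect rather than a model of any particular meter.

```python
def synthetic_trace(n: int, period: int, width: int, base: float, peak: float) -> list[float]:
    """Flat baseline with a spike of `width` samples at the start of every `period`."""
    return [peak if i % period < width else base for i in range(n)]


def decimated_mean(trace: list[float], step: int, phase: int = 0) -> float:
    """Mean of every `step`-th sample, starting at index `phase`."""
    picked = trace[phase::step]
    return sum(picked) / len(picked)


trace = synthetic_trace(n=1000, period=50, width=5, base=12.0, peak=32.0)

full = decimated_mean(trace, step=1)               # every sample
unlucky = decimated_mean(trace, step=50, phase=0)  # always lands on a spike
lucky = decimated_mean(trace, step=50, phase=10)   # always misses the spikes

print(full, unlucky, lucky)
```

With the slow sampler, the result depends entirely on where the samples happen to land relative to the spikes, which is why a coarse reading can sit well above the figure obtained at full resolution.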

The differences aren't just apparent at idle, either. Power consumption under the effects of a gaming workload also turns out to be lower than what older/slower equipment would have us believe. Those massive disparities between our gear and slower equipment only showed up in the last two generations of AMD's hardware, so it's a fairly recent phenomenon. But it does mean the company gets beaten up more than it should in most reviews.

4. How We Measure Noise

We measure each graphics card's noise levels with a calibrated high-quality studio microphone (supercardioid) 50 cm away from a position perpendicular to the middle of the board. This distance, together with the microphone's strongly directional pickup pattern, represents a compromise between avoiding noise generated by the fan’s airflow and ambient noise that can never be completely eliminated. Our noise-dampening efforts certainly help minimize the latter, but they'll never be 100-percent successful.

This year, we also had to decide (yet again) if we should use sone, dB, or dB(A) for our charts.

Decibel or Sone?

The definition of perceived loudness expressed in sone is based on sound pressure. One sone equals 40 phon, which in turn is defined as the loudness level of a pure 1 kHz tone at 40 decibels (dB). Sone scales with perceived loudness (that is, a sound perceived to be twice as loud as 1 sone measures 2 sone, and a sound perceived to be half as loud as 1 sone measures 0.5 sone). At first glance, this appears to be a logical, practical, and easy way to express noise level. Unfortunately, a closer look at how it works in practice reveals some irritating problems.

An increase in loudness by 10 phon from a starting point of 40 phon, totaling 50 phon, results in a perceived doubling, which makes it 2 sone. However, the situation isn't straightforward under 40 phon. In that range, a reduction of less than 10 phon is enough to halve perceived loudness. And typically, the sound pressure produced by graphics cards at idle (along with quiet products under partial load) is almost exclusively below the 40 dB (40 phon, 1 sone) limit. So, recording an idle board's noise level in sone is difficult and potentially confusing. Overall, sone is better-suited to expressing higher sound levels.
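The doubling rule above 40 phon can be written as sone = 2^((phon - 40) / 10). Here is a minimal sketch of that conversion. The function name is our own, and it deliberately refuses inputs below 40 phon, because the simple rule no longer applies there and we don't want to suggest a precise relationship for that range.

```python
def phon_to_sone(phon: float) -> float:
    """Loudness in sone for a loudness level in phon, valid from 40 phon upward."""
    if phon < 40:
        raise ValueError("the doubling-per-10-phon rule only holds at or above 40 phon")
    return 2 ** ((phon - 40) / 10)


for level in (40, 50, 60):
    print(level, "phon ->", phon_to_sone(level), "sone")
```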

Complex Noise Instead of Pure Tones

Another problem with sone is that it’s based on and scales with the perception of loudness of a pure 1 kHz tone. As we know, a graphics card's fan doesn't generate a pure tone at all. Rather, it produces complex noise covering a spectrum of frequencies.

The debate gets even more complicated when you try to compare noise from a centrifugal fan to that of an axial fan, and crazier still when you take different diameters and speeds into account. A sone value is strongly dependent on a graphics card cooling solution's specific sound profile, making the loudness rating it provides hard to interpret, even though that's theoretically the most exact way of expressing it.

The Human Ear

Coming back the other way, acoustic measurements use weighted sound pressure levels to reasonably model the human ear’s sound perception, simplifying our conundrum a bit. This is achieved through the use of filters, which are based on weighting curves defined in the DIN EN 61672-1/-2 standards. These filters are designed to provide a similar frequency response to that of the human ear for loudness measurements.
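The A curve from that family of standards (published internationally as IEC 61672-1) is commonly written in closed form, which makes it easy to see how strongly low frequencies are discounted. The sketch below evaluates that commonly cited formula. It only computes the gain at a single frequency and is not a full filter; we also aren't claiming it mirrors what our measurement software does internally.

```python
import math


def a_weighting_db(freq_hz: float) -> float:
    """A-weighting gain in dB at freq_hz, using the commonly cited closed-form curve."""
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    f2 = freq_hz ** 2
    numerator = (12194.0 ** 2) * f2 ** 2
    denominator = (
        (f2 + 20.6 ** 2)
        * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
        * (f2 + 12194.0 ** 2)
    )
    # The +2.00 dB offset normalizes the curve to roughly 0 dB at 1 kHz.
    return 20.0 * math.log10(numerator / denominator) + 2.00


for f in (100, 1000, 10000):
    print(f, "Hz:", round(a_weighting_db(f), 1), "dB")
```

A fan's low-frequency hum is therefore weighted far less heavily than content around 1 kHz, which is where the ear is more sensitive.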

When you get right down to it, these are still only estimates. But, depending on the quality of your measurement device, they should be more representative for the range below 40 dB than sone values. Of course, providing dB(A) values only makes sense if the distance to the source of the sound is given as well (we make sure to do this).

With all of this considered, we'll keep using dB(A) for our noise measurements. Based on your feedback, though, I also want to give you the following table as a frame of reference. Hopefully, dB(A) reference ranges with my own commentary add meaning to the quantitative data that comes from our readings.

Audio Comparison Table
< 31 dB(A)
  • Very good cooling solutions at idle
  • Passively-cooled graphics cards
31 - 33.9 dB(A)
  • Mediocre cooling solutions at idle
  • Graphics cards with idle fan speed set too high (~40%)
  • Entry-level graphics cards with very good cooling solutions under full load
34 - 35.9 dB(A)
  • Entry-level graphics cards with average cooling solutions
  • Mid-level graphics cards with very good cooling solutions under full load
36 - 39.9 dB(A)
  • Mid-level graphics cards with good cooling solutions under full load
  • High-end graphics cards with very good cooling solutions
40 - 44.9 dB(A)
  • Mid-level graphics cards with below-par cooling solutions
  • High-end graphics cards with average cooling solutions
45 - 49.9 dB(A)
  • Generally loud graphics cards with below-par cooling solutions
> 50 dB(A)
  • Unbearably loud, usually graphics cards with reference coolers
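If you want to sort your own readings into these ranges, a simple lookup is enough. The sketch below is one way to do it. The band edges come from the table above, while the function name and the handling of borderline values are our assumptions: a reading of exactly 50 dB(A) is counted in the loudest band, and a reading that falls between two listed ranges, such as 33.95 dB(A), is assigned to the lower one.

```python
from bisect import bisect_right

# Lower edges of every band after the first, in dB(A), taken from the table above.
BAND_EDGES = [31.0, 34.0, 36.0, 40.0, 45.0, 50.0]
BAND_LABELS = [
    "< 31 dB(A)",
    "31 - 33.9 dB(A)",
    "34 - 35.9 dB(A)",
    "36 - 39.9 dB(A)",
    "40 - 44.9 dB(A)",
    "45 - 49.9 dB(A)",
    "> 50 dB(A)",
]


def noise_band(dba: float) -> str:
    """Return the table band that a dB(A) reading falls into."""
    # Assumption: a reading of exactly 50 dB(A) counts toward the loudest band.
    return BAND_LABELS[bisect_right(BAND_EDGES, dba)]


print(noise_band(30.2), "|", noise_band(38.5), "|", noise_band(51.0))
```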

5. 3DMark Fire Strike And Unigine Heaven

3DMark Fire Strike

We’re including two synthetic benchmarks in our charts, even if their correlation to real-world performance is often questioned. They do serve a valid purpose, making it easier for you to reproduce our findings without trying to match the in-game sequences we run. For all of the criticism synthetics receive, they often do provide a good overview of graphics performance, independent of the platform.

The video shows the entire benchmark run for both parts.

[Video: 3DMark Fire Strike, Graphics Test 1]

[Video: 3DMark Fire Strike, Graphics Test 2]

We skip the CPU and combined metrics, since our benchmark system has an overclocked six-core CPU that's not representative of many gaming systems. The following table includes the settings we use:

3DMark
Resolution
1920x1080 (1080p)
Scene
Fire Strike
Settings
Default
Benchmarks
Graphics Test 1 and 2

Unigine Heaven 4.0

This synthetic benchmark's use of tessellation enables scalable analysis of geometry performance. In addition, it can be configured to tax shader hardware intensively.

We picked settings at 3840x2160 that allow high-end graphics cards to deliver a reasonably smooth experience. In general, all benchmarks run in full-screen mode.

3DMark, Unigine, and the rest of our benchmarks are run with each graphics card conditioned to operate at a steady load temperature.

In Heaven, we run the loop until temperature stops increasing. For the gaming benchmarks, we run the sequence (sometimes multiple times) before recording results. Furthermore, we’ve frozen all benchmarks the way they are right now; we won't be updating them. When new drivers add optimizations, we’ll re-run the numbers for as many cards as possible.
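The "run until temperature stops increasing" rule can be expressed as a simple plateau check. The sketch below is one possible formulation, not a description of our actual tooling. The tolerance, the number of consecutive stable readings, and the example temperatures are all hypothetical values chosen for illustration.

```python
from typing import Iterable


def warm_up_until_stable(
    readings: Iterable[float],
    tolerance_c: float = 0.5,
    stable_count: int = 3,
) -> int:
    """Consume temperature readings until they stop rising; return how many were used.

    The card counts as warmed up once `stable_count` consecutive readings each rise
    by no more than `tolerance_c` over the previous one.
    """
    previous = None
    streak = 0
    for used, temp in enumerate(readings, start=1):
        if previous is not None and temp - previous <= tolerance_c:
            streak += 1
            if streak >= stable_count:
                return used
        else:
            streak = 0
        previous = temp
    raise RuntimeError("temperature never stabilized within the supplied readings")


# Hypothetical per-loop temperatures in degrees Celsius.
loops_needed = warm_up_until_stable([45.0, 58.0, 66.0, 70.0, 70.4, 70.5, 70.5])
print(loops_needed)
```

Requiring several stable readings in a row guards against declaring the card warmed up during a brief pause in the climb.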

[Video: Unigine Heaven 4.0]

Again, here are the settings in a table:

Unigine Heaven
Run 1
1920x1080 (1080p)
API: DirectX 11
Quality: Ultra
Tessellation: Normal
Anti-aliasing: 2x
Run 2
3840x2160 (2160p)
API: DirectX 11
Quality: Medium
Tessellation: Moderate
Anti-aliasing: Off
Loops
1

6. Metro: Last Light And Thief

Metro: Last Light

Is it better to create your own benchmark, or use a game's built-in test designed to deliberately push graphics hardware? In this case, Metro: Last Light includes a tool for dialing in settings and producing repeatable runs. It's a worst-case example of what your GPU will have to endure when you play. And, if you already own Metro, it's easy to replicate the options we picked and compare your machine's performance.

The following video shows one of the four benchmark runs we execute. The first loop heats the GPU being tested, while results from the last three are averaged together.

[Video: Metro: Last Light benchmark sequence]

The 4A engine pushes almost every graphics card to its limit, so its inclusion is intended to represent some of the lowest performance you'll see from any given board.

Metro: Last Light
Run 1
1920x1080 (1080p)
API: DirectX 11
Quality: Very High
AF: 16x
Motion Blur: Normal
Tessellation: Normal
SSAA: No
Run 2
3840x2160 (2160p)
API: DirectX 11
Quality: High
AF: 16x
Motion Blur: Low
Tessellation: Normal
SSAA: No
Loops
Four per resolution; three used for evaluation
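The scoring rule used here, and in most of the game tests, is to throw away the warm-up loop and average the rest. A minimal sketch follows. The function name and the frame rates are hypothetical; only the "first loop heats the GPU, the remaining loops are averaged" logic comes from the text.

```python
from statistics import mean


def averaged_result(fps_runs: list[float], warmup_runs: int = 1) -> float:
    """Average the runs that remain after discarding the leading warm-up runs."""
    if warmup_runs < 0:
        raise ValueError("warmup_runs cannot be negative")
    scored = fps_runs[warmup_runs:]
    if not scored:
        raise ValueError("no runs left to average after discarding warm-ups")
    return mean(scored)


# Hypothetical frame rates for four loops; the first one only heats the GPU.
metro_runs = [48.0, 45.0, 44.0, 46.0]
print(averaged_result(metro_runs))
```

Setting `warmup_runs=0` covers tests where every recorded run counts toward the result.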

Thief

Thief is demanding in its own right. It also includes a built-in benchmark, which gives you an open invitation to do some comparative testing at home. That metric is quite memory-heavy and it'll punish any graphics card without enough on-board RAM to handle the resolution and settings you pick.

The test is short enough that we're able to run it three times back-to-back. Again, the first iteration heats each GPU, while the last two are averaged. The video shows the benchmark sequence we use for our chart results.

[Video: Thief benchmark sequence]

And here are the settings in a table:

Thief
Run 1
1920x1080 (1080p)
Full-screen Mode (Exclusive)
V-sync: Off
Engine: 64 Bit
Preset: Very High
Run 2
3840x2160 (2160p)
Full-screen Mode (Exclusive)
V-sync: Off
Engine: 64 Bit
Preset: Normal
Loops
Three per resolution; two used for evaluation

7. DiRT 3 And BioShock Infinite

DiRT 3

The following benchmarks aren't as taxing as those on the previous page. That means dealing with much higher frame rates, even at the most demanding quality settings. High-end graphics cards can get CPU-limited pretty easily in titles like these, which is why we went with an overclocked Ivy Bridge-E-based processor.

The following video shows one of the three loops we run per resolution. Once again, we use an average of the last two runs, while the first run gets each GPU up to its operating temperature.

[Video: DiRT 3 benchmark sequence]

We made a conscious choice to use the most demanding detail settings possible in this game, since even a GeForce GTX 750 Ti cuts right through it at 1920x1080. Higher resolutions are needed to push mid-range cards.

DiRT 3 - Michigan - Route 0 - 1 Car
Run 1
1920x1080 (1080p)
API: DirectX 11
Quality: Ultra
Anti-aliasing: 8x
Run 2
3840x2160 (2160p)
API: DirectX 11
Quality: Ultra
Anti-aliasing: 8x
Loops
Three per resolution; two used for evaluation

BioShock Infinite

BioShock Infinite is a good title for lower-end graphics cards as well, so we decided to include it. Our performance charts aren’t just meant for the high end, but also for regular users, after all.

We perform three benchmark runs per resolution, the first of which heats each GPU. The other two are averaged together, serving as the result in our charts section. The video shows our benchmark sequence.

[Video: BioShock Infinite benchmark sequence]

And the obligatory settings:

BioShock Infinite
Run 1
1920x1080 (1080p)
Texture Detail: Ultra
AF: 16x
Dynamic Shadows: Ultra
Post-processing: Alternate
Light Shafts: On
Ambient Occlusion: Ultra
LoD: Ultra
Run 2
3840x2160 (2160p)
Texture Detail: Medium
AF: 16x
Dynamic Shadows: High
Post-processing: Normal
Light Shafts: On
Ambient Occlusion: Medium
LoD: High
Loops
Three per resolution; two used for evaluation

8. Tomb Raider And Hitman: Absolution

Tomb Raider

We’re going relatively easy on our test group with Tomb Raider. Typically, this game is made more demanding by enabling its compute-heavy TressFX feature. We disable the AMD-biased capability, though. Likewise, we aren't using PhysX in the other benchmarks that support it. Fair is fair.

The benchmark runs three times, though our video only depicts one iteration. Of course, that first time through is discarded, and the last two are averaged together.

[Video: Tomb Raider benchmark sequence]

We adjusted the settings once again to let us test a wide and balanced range of boards rendering at smooth frame rates.

Tomb Raider
Run 1
1920x1080 (1080p)
API: DirectX 11
Quality: Ultra
Anti-aliasing: FXAA
Texture Quality: Ultra
AF: 16x
Hair Quality: Normal
Shadows: Normal
Shadow Resolution: High
SSAO: Ultra
DoF: Ultra
Reflection Quality: High
LOD Scale: Ultra
Post-processing: On
High Precision RT: On
Tessellation: On
Run 2
3840x2160 (2160p)
API: DirectX 11
Quality: Ultra
Anti-aliasing: Off
Texture Quality: High
AF: 8x
Hair Quality: Normal
Shadows: Normal
Shadow Resolution: High
SSAO: Normal
DoF: Normal
Reflection Quality: High
LOD Scale: Normal
Post-processing: On
High Precision RT: On
Tessellation: Off
Loops
Three per resolution; two used for evaluation

Hitman: Absolution

Hitman is also lightweight enough that it can be played on almost any graphics card (Ed.: In fact, poor scaling was why I pulled it from our graphics card launches). It might not be the most recent game, but we still like to include it for this reason.

Another three benchmark runs per resolution give us one warm-up and two results to average. The video showcases the sequence used for our test.

[Video: Hitman: Absolution benchmark sequence]

Once again, here are the settings we use:

Hitman: Absolution
Run 1
1920x1080 (1080p)
MSAA: 2x
Texture Quality: High
AF: 16x
Shadows: Ultra
SSAO: Normal
Global Illumination: On
Reflections: High
FXAA: Off
LoD: Ultra
DoF: High
Tessellation: On
Bloom: Normal
Run 2
3840x2160 (2160p)
MSAA: Off
Texture Quality: High
AF: 16x
Shadows: High
SSAO: Off
Global Illumination: On
Reflections: High
FXAA: Off
LoD: High
DoF: High
Tessellation: On
Bloom: Normal
Loops
Three per resolution; two used for evaluation

9. Battlefield 4 And Far Cry 3

Battlefield 4

Our graphics cards are presented with more of a challenge by Battlefield 4. Unfortunately, the only way to get repeatable results is through the single-player campaign. Playing this game online will almost always yield lower performance, typically resulting from a platform limitation. To our credit, we did search for a fast-paced, scripted sequence that'd serve as a suitable benchmark.

The video contains one of the two benchmark runs per resolution. We’re only using the second run for our results; the score is very reliable, since the sequence is scripted.

[Video: Battlefield 4 benchmark sequence]

You'll find our test settings in the table below:

Battlefield 4
Run 1
1920x1080 (1080p)
API: DirectX 11
Quality: Ultra
Run 2
3840x2160 (2160p)
API: DirectX 11
Quality: Normal
Loops
Two per resolution; one used for evaluation

Far Cry 3

By no means is Far Cry 3 a recent release, but it's still graphically challenging. And it's pretty safe to assume that any driver optimizations for this one are already rolled in to the software we're testing. This makes it a safe and stable performance benchmark, providing reliable control measurements for a long time to come.

There are three benchmark runs per resolution here as well, but none of them is discarded as a warm-up. Instead, the graphics card is warmed up by running around the test's starting point after loading our saved game. The average of all three manual runs is then used for the performance results. We’re including panning to provide a nice panoramic view, a scene that includes plenty of distance, and a change from land to water. Check out the sequence for yourself.

[Video: Far Cry 3 benchmark sequence]

One last time, the settings are in the table below:

Far Cry 3
Run 1
1920x1080 (1080p)
Quality: Ultra (Preset)
Run 2
3840x2160 (2160p)
Quality: Normal (Preset)
Loops
Three per resolution

10. Covering The Bases

This last page includes some of the details that we don't want falling through the cracks.

Measuring Temperatures

For example, we’re measuring temperatures in a room with an ambient temperature of 22 degrees Celsius; it has a very good air conditioner that keeps up with this requirement easily. Twenty-two degrees strikes a good balance between an air-conditioned room in the summer and a heated space in the winter. Also, the room is large enough that even high-end configurations don't heat it up and affect the results.

Emulating Reference Graphics Cards

Vendors don’t always send us reference samples to review when a model launches. If the "new" board is old technology, rebranded, such as AMD's Radeon R7 265, we simply use the original card and adjust clock rates as necessary.

On the other hand, some products are never made available as a reference design, in which case we have to use a board partner's interpretation of the card set to reference frequencies. In those cases, we mark the chart entry with a (*) and leave out noise and temperature measurements.

Lower-End Graphics Cards

Some entry-level cards just cannot keep up with our benchmark settings, particularly at higher resolutions. We skip them when this happens. Again, I want to mention that we'll have a separate collection of charts for lower-end discrete boards and integrated graphics engines. The quality options will be re-calibrated with a lot less performance in mind.

Nevertheless, mid-range cards are still important in the marketplace. So we fully intend to run them through our complete benchmark suite, while maintaining the quality of our measurements, refilling our charts section with precise benchmark results.

Charts like ours always represent a compromise between effort and depth, which is why we've forgone some blockbuster titles in favor of a more balanced mix of benchmarks. We think that we’ve succeeded in providing a good combination of game complexity, resolutions, and detail settings. Hopefully, this, in conjunction with our elaborate power consumption and noise data, allows us to paint an objective picture of today's (and yesterday's) most popular graphics cards for you.

In The Weeks To Come...

Since we’d like to first and foremost provide a good overview of the current state of affairs, we’re starting with AMD’s and Nvidia’s current-gen reference card models. The next step is to include the prior generation still used by so many gamers. Finally, we’ll add board partner offerings from the past two years.