AMD FirePro W9100 Review: Hawaii Puts On Its Suit And Tie
1. Hawaii Goes Professional

For the first time since 2007, AMD has a FirePro-branded card based on a really big GPU. At 6.2 billion transistors, the Hawaii processor boasts 44-percent more logic than the FirePro W9000’s Tahiti chip. They're both manufactured at 28 nm, though.

How else are the previous flagship and AMD's more recent introduction similar? Glad you asked!

First, let's compare the technical specs of two Nvidia Quadro cards to AMD's FirePro W9100. Hopefully that'll give us some basis for a performance expectation. While the AMD flagship's $4000 suggested retail price is higher than the Quadro K5000's, it's still shy of the Quadro K6000's. So, we'll put the FirePro in the middle of the following chart.


| | Nvidia Quadro K6000 | AMD FirePro W9100 | Nvidia Quadro K5000 |
|---|---|---|---|
| Shaders | 2880 CUDA cores | 2816 Stream processors | 1536 CUDA cores |
| FP32 Performance (SP) | 5.2 TFLOPS | 5.24 TFLOPS | 2.2 TFLOPS |
| FP64 Performance (DP) | 1.73 TFLOPS | 2.62 TFLOPS | 0.09 TFLOPS |
| Memory Size | 12 GB | 16 GB | 4 GB |
| Memory Bus | 384-bit | 512-bit | 256-bit |
| Memory Bandwidth | 288 GB/s | 320 GB/s | 173 GB/s |
| ECC | Yes | Yes | No |
| PCI Express Bandwidth | 32 GB/s | 32 GB/s | 16 GB/s |
| 4K2K Displays @ 30 Hz | 2 | 6 | 2 |
| 4K2K Displays @ 60 Hz | 2 | 3 | 2 |
| Power Consumption (measured) | 187 W (3D load), 202 W (GPGPU) | 245 W (3D load), 260 W (GPGPU) | 126 W (3D load), 145 W (GPGPU) |

When you have performance to offer, new opportunities present themselves. AMD identifies CAD and engineering, media and entertainment, medicine, and finance as some of its more traditional strong points. But with its big Hawaii GPU and the GCN architecture's alacrity in compute-intensive tasks, the company wants to lock down its share of the virtualization, cloud gaming, and signage segments as well.

The ambition makes sense. Workstation-oriented apps benefit more and more from the performance of modern GPUs, after all. Nowadays you can even run multiple CAD and CAE workflows at the same time. Cranking along on the next version of a drawing while rendering the previous one isn't a pipe dream. This stuff is actually doable. And the sky's the limit with a design equally adept in 3D- and general-purpose tasks.

AMD is already a seasoned vet when it comes to 3D. Now GPGPU is where it's trying to lead development. In order to better facilitate that initiative, the company is throwing its support behind the OpenCL standard as an alternative to Stream and CUDA. As we've seen in several different applications already, when there's a computationally difficult job that can be parallelized, the potential performance gains are well worth optimizing for.

There's also a notable trend toward the adoption of 4K (3840x2160) in the workplace. Those higher resolutions give engineers and artists a lot more room to work with. And while more detail obviously benefits 3D applications, even 2D tasks like programming are greatly enhanced by the extra screen space and pixel density of a 4K display.

Similarly, professional media-oriented titles see a lot of benefit as it becomes possible to edit high-res video in real time at full resolution. A workstation board like the W9100 should speed up the processing of video and photo filters, along with accelerating encoding/decoding.

The workstation graphics card market is clearly changing, and the lines between various segments are getting blurrier, even as the workloads and data sets are more specific than ever. CAD, CAE, M&E, oil and gas...the FirePro W9100 is AMD’s most recent effort to grab a larger share of all of them. But enough background. Let's put this card through its paces.

2. The Differences Between Hawaii And Tahiti GPUs

While Hawaii's 438 mm² die is still smaller than the GK110 on Nvidia's Quadro K6000, it's the largest GPU AMD has ever manufactured. The legendary R600 was a "mere" 420 mm².

In most respects, the implementation of AMD's Graphics Core Next architecture on Hawaii is almost identical to the FirePro W9000’s Tahiti GPU. Specifically, the Compute Unit building block is the same. Each CU's 64 IEEE 754-2008-compliant shaders are organized into four vector units, accompanied by 16 texture fetch load/store units.

There are, of course, improvements over the Tahiti GPU on AMD's FirePro W9000, such as device flat addressing to support standard calling conventions, precision improvements to the native LOG and EXP operations, and optimizations to the Masked Quad Sum of Absolute Difference (MQSAD) function, speeding up algorithms for motion estimation.

And with the introduction of DirectX 11.2, programmable LOD clamping and the ability to tell a shader if a surface is resident were added. Both are tier-two features associated with tiled resources.

The main departure from the W9000's GPU is the arrangement of Compute Units. Whereas Tahiti employs 32 CUs, totaling 2048 shaders and 128 texture units, Hawaii wields 44 CUs organized into four of what AMD calls Shader Engines. The math adds up to 2816 aggregate shaders and 176 texture units.
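The totals above follow directly from the per-CU resources. As a quick sanity check, here is a minimal sketch that assumes 64 shaders per CU (as stated earlier) and four texture units per CU (our inference from 128 units across 32 CUs); the function name is purely illustrative.

```python
# Per-CU resources for GCN as described in the text.
SHADERS_PER_CU = 64
# Assumption: 4 texture units per CU, inferred from 128 units / 32 CUs.
TEXTURE_UNITS_PER_CU = 4


def gpu_totals(compute_units):
    """Return (shaders, texture_units) for a GPU with the given CU count."""
    return (compute_units * SHADERS_PER_CU,
            compute_units * TEXTURE_UNITS_PER_CU)


tahiti = gpu_totals(32)  # FirePro W9000
hawaii = gpu_totals(44)  # FirePro W9100

print("Tahiti:", tahiti)
print("Hawaii:", hawaii)
print("Additional shaders on Hawaii:", hawaii[0] - tahiti[0])
```

Running it reproduces the 2048/128 and 2816/176 figures, along with the 768-shader difference between the two chips.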

The new GPU employs eight revamped Asynchronous Compute Engines, responsible for scheduling real-time and background tasks to the CUs. The W9000 has only two. Each ACE manages up to eight queues, totaling 64, and has access to L2 cache and shared memory.

It makes a lot of sense to dedicate more hardware to arbitrating GPU resources between computation and graphics; this improves overall efficiency.

The W9000’s front-end fed vertex data to the shaders through a pair of geometry processors. Given its quad-Shader Engine layout, the FirePro W9100 doubles that number, facilitating four primitives per clock cycle instead of two. There’s also more inter-stage storage between the front- and back-end to hide latencies and realize as much of that peak primitive throughput as possible.

In addition to a dedicated geometry engine (and 11 CUs), Shader Engines also have their own rasterizer and four render back-ends capable of 16 pixels per clock. That’s 64 pixels per clock across the GPU—twice what the W9000’s GPU could do. The W9100’s Hawaii chip enables up to 256 depth and stencil operations per cycle, again doubling Tahiti’s 128.
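To make the back-end arithmetic concrete, the sketch below rebuilds the per-clock figures from the numbers in the paragraph above. The fill-rate helper takes a clock speed as a parameter; the 1000 MHz used in the example is a round placeholder for illustration only, not the W9100's actual clock, which isn't quoted here.

```python
SHADER_ENGINES = 4
# Four render back-ends per Shader Engine, 16 pixels per clock combined.
PIXELS_PER_CLOCK_PER_ENGINE = 16

hawaii_pixels_per_clock = SHADER_ENGINES * PIXELS_PER_CLOCK_PER_ENGINE
# The text says Hawaii doubles Tahiti's pixel rate.
tahiti_pixels_per_clock = hawaii_pixels_per_clock // 2

# Depth/stencil operations per clock, as quoted in the text.
hawaii_depth_stencil = 256
tahiti_depth_stencil = 128


def fill_rate_gpixels(pixels_per_clock, clock_mhz):
    """Peak pixel fill rate in Gpixel/s for a given core clock in MHz."""
    return pixels_per_clock * clock_mhz / 1000


print("Hawaii pixels/clock:", hawaii_pixels_per_clock)
print("Tahiti pixels/clock:", tahiti_pixels_per_clock)
print("Depth/stencil ops per pixel (Hawaii):",
      hawaii_depth_stencil // hawaii_pixels_per_clock)
# Placeholder clock of 1000 MHz, purely for illustration.
print("Fill rate at 1000 MHz:", fill_rate_gpixels(hawaii_pixels_per_clock, 1000))
```

Both chips end up with the same four depth/stencil operations per pixel; Hawaii simply has twice as many of each per clock.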

On a graphics card designed for high resolutions, a big pixel fill rate comes in handy, and, according to AMD, in many cases, this shifts the chip’s performance bottleneck from fill to memory bandwidth.

The shared L2 read/write cache grows from 768 KB in Tahiti to 1 MB, divided into sixteen 64 KB partitions. This 33% increase in capacity comes with a corresponding 33% increase in bandwidth between the L1 and L2 structures, topping out at 1 TB/s.

It makes sense, then, that increasing geometry throughput, adding 768 shaders, and doubling the back-end’s peak pixel fill would put additional demands on Hawaii’s memory subsystem. AMD addresses this with a redesigned controller.

The new GPU features a 512-bit aggregate interface that the company says occupies about 20% less area than Tahiti’s 384-bit design and enables 50% more bandwidth per mm².

How is this possible? It actually costs die space to support very fast data rates. So, hitting 6 Gb/s at higher voltage made Tahiti less efficient than Hawaii’s bus, which targets lower frequencies at lower voltage, and can consequently be smaller. Operating at 5 Gb/s in the case of the FirePro W9100, the 512-bit bus pushes up to 320 GB/s. In comparison, Tahiti maxed out at 288 GB/s.
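The bandwidth figures are easy to verify: peak memory bandwidth is the bus width in bytes multiplied by the per-pin data rate. The helper below is a small sketch of that calculation (the function name is ours), using the bus widths and data rates quoted above.

```python
def memory_bandwidth_gbs(bus_width_bits, data_rate_gbps):
    """Peak memory bandwidth in GB/s.

    bus_width_bits: aggregate memory interface width in bits
    data_rate_gbps: per-pin data rate in Gb/s
    """
    return bus_width_bits / 8 * data_rate_gbps


hawaii_bw = memory_bandwidth_gbs(512, 5.0)  # FirePro W9100
tahiti_bw = memory_bandwidth_gbs(384, 6.0)  # Tahiti at its 6 Gb/s maximum

print(f"Hawaii: {hawaii_bw:.0f} GB/s")
print(f"Tahiti: {tahiti_bw:.0f} GB/s")
print(f"Gain:   {(hawaii_bw / tahiti_bw - 1) * 100:.1f}%")
```

The wider but slower bus comes out roughly 11% ahead of Tahiti's best case, despite the lower data rate.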

3. FirePro W9100: Dimensions, Weight, And Features

Let’s take a quick look at the mechanical specs of the card:

Dimensions and Weight

| Dimension | Measurement |
|---|---|
| Length | 11.1” (more than 12” including power connectors; remember, the PCIe power connectors are at the rear!) |
| Depth | 1.34” from PCB to top of fans; 0.2” from back of PCB to top of back plate |
| Height | 4.06” from the top of the PCIe slot |
| Weight | 2.42 lbs |

The card looks quite inconspicuous. Its plain black plastic cover reminds us of the Radeon HD 6970. Even the reference cooler seems to have stayed the same, which is somewhat sobering compared to Nvidia's redesigned Quadro cards.

The W9100’s thermal solution employs the same vapor chamber cooler we know from AMD's FirePro W9000. The prominent red fan forces air through the cooler; the hot air is expelled through the left side of the card, out of its I/O slot panel. As we already know from our Radeon R9 290X and 290 coverage, there is no way for this configuration to be quiet. But we are certain that it does its job.

The back of the card is dominated by a metal plate, which adds rigidity and does double duty cooling the memory packages mounted on that side of the PCB.

There's not much to see on the bottom except for this card's shroud.

The top of the card doesn’t sport any CrossFire connectors; the Hawaii GPU employs a DMA engine that enables CrossFire support through the PCI Express bus. There is one header up there though, which is also present on the FirePro W9000, and it's used for connecting the FirePro S400 synchronization module.

Six- and eight-pin auxiliary power connectors are found at the back of the card. We'll revisit their purpose when we dive deeper into the FirePro's power consumption.

Slot Panel Connectors

Apart from the six mini-DP connectors, which support up to six 4K monitors at 30 Hz, or up to three 4K monitors at 60 Hz, there is also a three-pin mini-DIN connector for 3D displays.

4. How We Test AMD's FirePro W9100

Test Systems and Environment

For this story, we don't overclock Intel's Core i7-4930K, since the workstation world is very stability-sensitive. As a result, our processor runs at a base clock rate of 3.5 GHz. But the test system does sport three SSDs now. We keep the operating system separate from the benchmark suite binaries and data logs.

Normally, we would only test with drivers approved by each ISV. However, this isn’t possible for a brand-new card, so we had to use the latest driver available for AMD's FirePro W9100 (refer to table).

The power draw measurements deserve a section of their own, and we're eager to find out how the W9100 differs from AMD's desktop boards. Right out of the gate, we know the FirePro isn’t factory overclocked like so many of those gaming products.

Here's the list of hardware we're using for benchmarking:

| Component | Details |
|---|---|
| Lab Bench | Microcool Banchetto 101 |
| PC System | Intel Core i7-4930K (Ivy Bridge-E), 3.5 GHz, 6C/12T |
| | Asus Rampage IV Black Edition |
| | 64 GB Corsair Dominator Platinum DDR3-2133 at 1600 MT/s |
| | Enermax TLC 240 Closed-Loop Liquid Cooler |
| | Samsung 840 Pro 256 GB (System and applications) |
| Video Editing Workloads | 480 GB Corsair Neutron GX (Input) |
| | Samsung 840 EVO 500 GB (Output) |
| Power Supply | Corsair AX860i (slightly modified for probing the voltage) |
| Operating System | Microsoft Windows 7 Ultimate x64 (Pro apps and compute) |
| | Microsoft Windows 8 Professional x64 (Gaming) |
| Drivers | Catalyst Pro 13.35 |
| | Nvidia Quadro Desktop Driver 334.95 |
| Environment | 22°C room temperature, air-conditioned |

5. OpenCL: Compute, Cryptography, And Bandwidth

Shader Performance: FP32 vs. FP64

Let’s start with an OpenCL benchmark, which should push the theoretical ceiling of 32- and 64-bit precision compute performance.

Although this benchmark (along with the cryptography test) is synthetic, it still illustrates Nvidia’s half-hearted support of OpenCL.

Yes, Nvidia offers its proprietary CUDA API, and there are plenty of applications that support it. Increasingly, though, ISVs don't want to support two compute languages, and OpenCL is gaining traction as a result. Even long-time bastions of CUDA support, like Adobe, are adopting OpenCL.

Folding It Up: Folding@Home

Let’s run the Folding@Home benchmark on this card, even though few people would use a $4000 workstation board for folding or Bitcoin mining.

Memory Bandwidth

In the memory bandwidth test, Nvidia’s sub-par OpenCL implementation almost catches AMD's latest. But switching over to DirectX allows GK110's Kepler architecture to beat the FirePro W9100 by 50%.

As we move on to application benchmarks, keep these synthetic benchmarks in mind. They help decipher the performance results of real-world benchmarks, which are subject to influence from other platform subsystems.

At least for now, we have to question whether Nvidia's lackluster support for OpenCL and emphasis on CUDA is the best strategy. Only time will tell.

6. OpenCL: Financial Mathematics And Scientific Computations

OpenCL: Financial Mathematics

Option pricing is a compute-intensive task that gives graphics processors an opportunity to shine. AMD's FirePro W9100 wins in both the single- and double-precision versions of this benchmark.

However, we did notice that the performance difference between the two metrics is a factor of three on the AMD cards, but less than two on Nvidia's Quadro K6000. The K5000 is completely unsuitable across the board.

Scientific Computations

The same ratio applies to AMD's outcome in the GEMM benchmark, and there's a massive drop from single- to double-precision throughput.

Whereas the Quadro K6000 holds onto much of its performance going from single- to double-precision, AMD's flagship sheds a ton of its speed, ending up behind Nvidia's quickest board.
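For reference, the theoretical single- to double-precision ratios can be computed from the peak figures in the spec table on the first page. This is only a sketch of the paper numbers; measured ratios depend on the benchmark kernel and the driver, and, as the results above show, they don't have to track the theoretical peaks.

```python
# Peak (FP32, FP64) throughput in TFLOPS, taken from the spec table.
peak_tflops = {
    "Quadro K6000": (5.2, 1.73),
    "FirePro W9100": (5.24, 2.62),
    "Quadro K5000": (2.2, 0.09),
}

# Ratio of single- to double-precision peak throughput per card.
ratios = {name: fp32 / fp64 for name, (fp32, fp64) in peak_tflops.items()}

for name, ratio in ratios.items():
    print(f"{name}: FP32 peak is {ratio:.1f}x the FP64 peak")
```

On paper, the W9100 runs double precision at half its single-precision rate, the K6000 at about a third, and the K5000 at roughly 1/24, which is consistent with the K5000's poor showing in every double-precision test.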

7. 2D Performance: GDI And GDI+

Why Are We Still Looking At GDI and GDI+?

Even in 2014, many applications use GDI and GDI+ for drawing, even if only for their GUIs. Older productivity applications and specific business titles still leverage GDI/GDI+ predominantly. These applications range from simple 2D CAD programs and viewers to pre-print stage WYSIWYG layout programs and file import/export programs.

As modern graphics cards with unified shaders don’t feature dedicated 2D units anymore, and modern operating systems no longer access graphics cards directly, device drivers play a crucial role in facilitating fast 2D functions.

Text

Displaying text is a crucial task and, needless to say, both manufacturers make sure that their high-end graphics cards render large amounts of text almost instantly.

Lines

Another basic 2D element is lines (the lines in a menu, for example). Again, none of the cards encounter problems with this task, though we notice that AMD's products are around 20 percent faster than Nvidia's.

Splines / Bezier Curves

Curvy lines require some computational power, and it's only natural that they take longer to draw. Again, AMD wins by double-digit percentage margins.

Polygons

This benchmark draws filled and unfilled polygons with three to eight vertices, and AMD's hardware knows how to handle it. You can't say the same for Nvidia's Quadro K5000, which is just better than half as fast as the older FirePro W9000.

Rectangles

Yes, rectangles are polygons, too. But GDI exports a separate, simpler API for drawing rectangles. Needless to say, we expected the cards to draw rectangles faster than polygons.

But it seems like Nvidia optimized for this more thoroughly than AMD, as both Quadros outperform the FirePros we're testing.

Circles, Circle Segments and Ellipses

All four cards demonstrate comparable performance in the Arcs and Ellipses benchmark.

Bit Blitting

Bit blitting, which is copying a block from system to graphics memory, is becoming less important. After all, it's the graphics card itself that's supposed to fill its RAM with pixels, not the CPU. Not surprisingly, the performance of this operation hasn't increased much over the past few years. In fact, it actually went the other direction.

Nvidia seems to address the operation a bit better, though in truth every card posts somewhat disappointing results. A DIB (device-independent bitmap) helps; perform all drawing operations into a temporary bitmap residing in the computer’s RAM, and as the final step, push that bitmap to the graphics card.

Stretching

Stretching is even worse, since the CPU has to help out.

Summary

Neither Nvidia nor AMD earns a definitive win when it comes to GDI performance. Disappointingly, 2D alacrity seems to be at a standstill, and has been since 2010. At least we've seen AMD alleviate some bottlenecks since then.

In general, applications achieve better performance if they render everything into a temporary DIB (device independent bitmap) and copy the final result to the graphics card. But at higher screen resolutions, the amount of data that needs to be copied across the PCIe interface can be quite substantial, and it's appalling that, in the age of PCIe 3.0, copying data to the graphics card is still an order of magnitude slower than copy operations within the workstation’s RAM.
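To get a feel for how much data a DIB copy involves, the sketch below estimates the size of one full-screen bitmap. It assumes 32-bit pixels and a full-frame copy on every update, and the 60 updates per second is an illustrative worst case; none of these values are measurements from our benchmarks.

```python
def frame_bytes(width, height, bytes_per_pixel=4):
    """Size of one full-screen bitmap in bytes (assumes 32-bit pixels)."""
    return width * height * bytes_per_pixel


full_hd = frame_bytes(1920, 1080)
ultra_hd = frame_bytes(3840, 2160)

# Assumption: the whole bitmap is pushed to the card 60 times per second.
UPDATES_PER_SECOND = 60

for label, size in (("1920x1080", full_hd), ("3840x2160", ultra_hd)):
    per_second_gb = size * UPDATES_PER_SECOND / 1e9
    print(f"{label}: {size / 2**20:.1f} MiB per frame, "
          f"{per_second_gb:.2f} GB/s at {UPDATES_PER_SECOND} updates/s")
```

A 4K bitmap is four times the size of a Full HD one, so an application that redraws the full screen through a DIB moves roughly 2 GB/s at that update rate.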

With that said, AMD's cards perform better than Nvidia's, mostly due to the faster line, spline, and polygon functions. Perhaps our scathing criticism of AMD’s drivers is partly responsible for this improvement.

The new FirePro W9100 renders complex 2D drawings via GDI almost twice as fast as Nvidia's Quadro K5000. Or only half as slowly. It depends on your point of view.

8. SPECviewperf 12: CATIA, Creo, And Maya 2013

Introduction to SPECviewperf 12

SPECviewperf 11, introduced back in 2010, has been showing its age for a while. It wasn't really giving us a realistic-looking picture of modern workstation graphics hardware and driver performance anymore. The applications composing it were just too old. Moreover, AMD and Nvidia were thoroughly optimizing for the specific workloads, throwing off the suite's value.

So, the Standard Performance Evaluation Corporation (SPEC) chose to step up its game with a much-needed update. After all, SPEC’s mission is to create relevant benchmarks that closely adhere to current industry standards.

AMD and Nvidia are both members of SPEC, allowing them to exert some influence over the new collection of tests. The idea is that no company gets an unfair advantage. We'll see how that works out in practice, though.

CATIA V6 R2012

The FirePro W9100 surpasses its predecessor, as well as the Quadro K5000. But Nvidia's Quadro K6000 is in a class of its own. Even so, the W9100 still pulls off an impressive debut.

Creo 2

While the new AMD card tops the previous flagship, both Quadro cards outperform it. Clearly, AMD’s driver team still has unresolved action items.

Maya 2013

The opposite is true in Maya 2013, where AMD's close collaboration with Autodesk pays off. Its FirePro W9100 beats Nvidia's Quadro K6000, if barely.

9. SPECviewperf 12: Showcase, Siemens NX, And SolidWorks

Showcase 2013

The FirePro W9100 places a close second after the Quadro K6000. Both cards dominate the rest of the field. Interestingly, Showcase 2013 is one of the very few professional apps completely based on DirectX.

Siemens NX 8.0

Again, Nvidia's flagship and AMD's FirePro W9100 dominate with massive leads over the rest of the field (though the Quadro K6000 has an advantage).

SolidWorks 2013 SP1

The Quadro K6000 offers massive performance at a similarly hefty price. AMD's FirePro W9100 barely manages to surpass its older W9000, but fails to beat Nvidia's Quadro K5000. As you can imagine, then, the performance difference between the FirePro W9100 and Quadro K6000 is humbling, yielding another situation where AMD’s driver team has its work cut out for it.

10. SPECviewperf 12: Synthetic Simulations

Synthetic Tests: Energy

This benchmark simulates a typical volume rendering application, which is used for geophysical surveys (think seismology, along with oil and natural gas exploration) and medical imaging. During the surveys, 2D images are combined to form volumetric representations, creating 2D and 3D views that can be further analyzed and evaluated.

The energy-01 viewset takes advantage of hardware support for 3D textures and the associated trilinear interpolation, which in turn depends on a lot of fast graphics memory. But despite its copious memory bandwidth, AMD's FirePro W9100 finishes far behind the Quadro K6000. In fact, the Hawaii-based board is barely any faster than the W9000. At least the two AMD cards beat Nvidia's Quadro K5000 by quite a bit.

Synthetic Tests: Medical

As with the Energy viewset, which covered geophysical surveys and imaging, SPECviewperf 12 uses a synthetic suite to represent the medical field, making use of functionality that is often used for this kind of texture-based volume rendering. Two-dimensional images, created through the use of computed tomography (CT) or magnetic resonance imaging (MRI), are combined into a 3D representation.

The direct volume rendering is achieved by lining up the image slices in parallel. This is done based on texture coordinates, which are specified at every single vertex. They consist of the location in the 3D space (x, y, and z) and also define the alignment and scaling of the texture on the polygon via an object. Next, the values needed for the actual display are calculated based on the texture coordinates. This is called compositing. The entire volume can be thought of as a large number of voxels, or volume pixels, which contain opacity and color on top of the texture information.

Volume ray casting is used to calculate the actual image from the voxels. The present benchmark has two parts. The “4D Heart Data Set” contains several 3D objects, and the “Stag Beetle” places large demands on memory. AMD's FirePro W9100 is a perfect match for both tests and wins hands-down.
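The compositing step can be illustrated with a few lines of code. The sketch below shows textbook front-to-back alpha compositing along a single ray, using grayscale voxels with a color and an opacity. It is a CPU-side illustration under our own assumptions, not the benchmark's actual implementation, and the function and variable names are hypothetical.

```python
def composite_ray(samples, opacity_cutoff=0.99):
    """Front-to-back compositing of (color, opacity) voxel samples along one ray.

    color and opacity are floats in the range 0..1; samples are ordered
    from the viewer into the volume.
    """
    color_acc = 0.0
    alpha_acc = 0.0
    for color, opacity in samples:
        # Each voxel only contributes through what is still transparent.
        weight = (1.0 - alpha_acc) * opacity
        color_acc += weight * color
        alpha_acc += weight
        # Early ray termination: stop once the ray is effectively opaque.
        if alpha_acc >= opacity_cutoff:
            break
    return color_acc, alpha_acc


# Three voxels along a ray: bright and half-transparent, gray and
# half-transparent, then a fully opaque black voxel.
ray = [(1.0, 0.5), (0.5, 0.5), (0.0, 1.0)]
print(composite_ray(ray))  # (0.625, 1.0)
```

A real renderer evaluates one such ray per screen pixel and samples the 3D texture at every step, which is why fast, plentiful graphics memory matters so much in this test.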

11. OpenCL: 4K Video Post-Processing

Video Editing and Encoding

For multimedia and entertainment applications, professionals want smooth and efficient processing of high-resolution content. OpenCL and CUDA are well-suited for speeding up such complex calculations.

Because 4K (3840x2160) is becoming more and more common in the professional and desktop spaces, we picked two applications that employ OpenCL to accelerate processing (filtering) and encoding of this up-and-coming format.

We modified our test setup slightly by adding a third SSD, Samsung's 500 GB 840 EVO. It receives the output data, which are large H.264-encoded video files. The input files (several 4K TIFF files and a 4K video) reside on a 480 GB Corsair Neutron GX. We wanted to make sure that storage wasn't introducing any performance-altering bottlenecks.

Adobe Premiere Pro CC

Our two tests include a sequence of TIFF-based images affected by OpenCL-accelerated filters and a high-res video run through another series of filters.

In the first test, AMD's FirePro W9100 is just slightly behind Nvidia's Quadro K6000. After dialing back the number of OpenCL-accelerated filters to a more realistic number, this small performance gap shrinks even more to just a few seconds.

Sony Vegas Pro

The FirePro W9100 flexes its muscles in Vegas Pro, leading the Quadro K6000 by more than it trailed in our Adobe Premiere Pro CC test.

Overall, AMD's FirePro W9100 holds its own. And we can see that, in general, the more multimedia content you work on, and the more complex your filters become, GPU acceleration provides greater performance benefit.

12. OpenCL: Rendering Performance

LuxMark vs. RatGPU

Meet two different rendering engines that take different approaches. First, there's the popular LuxRender, on which LuxMark is based. This one finally attracted Nvidia's attention after showing up time and again as a weak spot for the company's GeForce and Quadro cards. RatGPU, on the other hand, didn't need that special attention; Nvidia's offerings did well in it right out of the gate.

LuxRender demonstrates that Nvidia's cards do support OpenCL fairly well, if there's no CUDA option. AMD once enjoyed a significant performance advantage in this test, though the magnitude of its wins is shrinking. Still, the brand-new FirePro W9100 enjoys a significant lead.

The following charts represent LuxMark at three difficulty settings:

AMD doesn't do as well in the ratGPU benchmark. As with LuxMark, we run this test at three difficulty settings:

13. DirectX 11 Gaming: Full HD Versus Ultra HD

Gaming Performance at Different Resolutions

For the sake of fairness, we picked four DirectX 11-based games. At 1080p (1920x1080 pixels), the FirePro W9100 wins two of our tests, while the Quadro K6000 rises to the top in the others. But at higher resolutions, AMD's flagship workstation board outperforms Nvidia's in all four titles, even though one benchmark ends in a near-tie.

The FirePro W9100’s advantage comes from its wider and higher-bandwidth memory interface, not its big 16 GB capacity. Of course, there may be other games out there (like Assassin's Creed IV, for example) where the Quadro K6000 is faster. But the outcome generally isn't surprising; on the desktop, Hawaii-based boards tend to outperform their direct competition from Nvidia.

14. How We Test Power Consumption

Test Equipment and Test Procedure

Our power consumption test setup was planned in cooperation with HAMEG (Rohde & Schwarz) to yield accurate measurements at small sampling intervals, and we've improved the gear continuously over the past few months.

AMD’s PowerTune and Nvidia’s GPU Boost technologies introduce significant changes to loading, requiring professional measurement and testing technology if you want accurate results. With this in mind, we're complementing our regular numbers with a series of benchmarks covering an extraordinarily short window of 100 μs, sampled at 1 μs intervals.

We get this accuracy from a 500 MHz digital storage oscilloscope (HAMEG HMO 3054), while measuring currents and voltages with the convenience of a remote control.

The measurements are captured by three high-resolution current probes (HAMEG HZO50), not only through a riser card for the 3.3 and 12 V rails (which was custom-built to fit our needs, supports PCIe 3.0, and offers short signal paths), but also directly from specially-modified auxiliary power cables.

Voltages are measured from a power supply with a single +12 V rail. We're using a 2 ms resolution for the standard readings, which is granular enough to reflect changes from PowerTune and GPU Boost. Because this yields so much raw data, though, we keep the range limited to two minutes per chart.
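To show why the chart range has to be limited, this small sketch counts the data points each measurement setup produces per channel. The window lengths and sampling intervals come from the text (including the 100 ms close-up discussed further down); the helper name is ours, and integer microseconds are used to avoid floating-point rounding.

```python
def sample_count(window_us, interval_us):
    """Number of samples per channel for a window and interval in microseconds."""
    return window_us // interval_us


# High-resolution run: 100 us window at 1 us intervals.
high_res = sample_count(100, 1)

# 100 ms close-up at the standard 2 ms resolution.
close_up = sample_count(100_000, 2_000)

# Standard chart: two minutes at 2 ms resolution.
standard_chart = sample_count(120_000_000, 2_000)

print("High-resolution run:", high_res)
print("100 ms close-up:    ", close_up)
print("Two-minute chart:   ", standard_chart)
```

Even at the comparatively coarse 2 ms resolution, a two-minute chart already contains 60,000 readings per channel.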

| Category | Details |
|---|---|
| Measurement Procedure | Contact-free DC measurement at PCIe slot (using a riser card) |
| | Contact-free DC measurement at external auxiliary power supply cable |
| | Voltage measurement at power supply |
| Measurement Equipment | 1 x HAMEG HMO 3054, 500 MHz digital multi-channel oscilloscope |
| | 3 x HAMEG HZO50 current probes (1 mA - 30 A, 100 kHz, DC) |
| | 4 x HAMEG HZ355 (10:1 probes, 500 MHz) |
| | 1 x HAMEG HMC 8012 digital multimeter with storage function |
| Power Supply | Corsair AX860i with modified outputs (taps) |

A Lot Can Happen in 100 Milliseconds...

...and we mean a lot! Let’s take a look at an analysis of all three voltage rails using a 2 ms sample across 100 ms (giving us 50 data points). Just looking at those results makes us pity the power supply. Draw over the auxiliary connectors jumps from 94 to 356 W within a few milliseconds. Fortunately, the test points on the PCIe riser don't have to endure the same drastic load changes.

In contrast to most consumer cards, there is no coil chirping. Of course, given the FirePro's price point, there had better not be...

We like that neither AMD nor Nvidia tops out the PCIe slot's 75 W ceiling. Instead, the auxiliary connectors shoulder most of the load. There aren't any drastic transients on the motherboard connector, which helps ensure system stability and benefits multi-GPU setups.

15. Power Draw: Detailed Test Results

Power Draw at Idle

Without enabling AMD’s Zero Core power-saving mode, the card dissipates 15.4 W at (almost) idle as it drives a monitor at 60 Hz. We'd like to see lower numbers, though that's still an acceptable result for such a high-end card. Despite the 16 GB of fast GDDR5 memory, the on-board RAM only draws 1 W at idle.

Maximum Power Draw: 3D (OpenGL)

Peaking at 245 W, the FirePro W9100 draws slightly less than the 250 W TDP of AMD's top desktop cards. Approximately 51 W of that comes from the motherboard, while the balance of 194 W is supplied by the auxiliary power cables.
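The split between slot and auxiliary power can be checked against the connector limits. The sketch below uses the measured figures from this page; the 75 W six-pin and 150 W eight-pin limits are the standard PCI Express connector ratings rather than values from our measurements, and the variable names are ours.

```python
# Connector limits: the 75 W slot ceiling is mentioned in the text; the
# six- and eight-pin ratings are the standard PCI Express values.
PCIE_SLOT_LIMIT_W = 75
SIX_PIN_LIMIT_W = 75
EIGHT_PIN_LIMIT_W = 150

# Measured peak under 3D load.
peak_total_w = 245
slot_w = 51
aux_w = peak_total_w - slot_w

aux_limit_w = SIX_PIN_LIMIT_W + EIGHT_PIN_LIMIT_W
board_limit_w = PCIE_SLOT_LIMIT_W + aux_limit_w

print(f"Auxiliary draw: {aux_w} W of {aux_limit_w} W available")
print(f"Slot draw:      {slot_w} W of {PCIE_SLOT_LIMIT_W} W available")
print(f"Slot share:     {slot_w / peak_total_w:.0%} of total")
print(f"Headroom:       {board_limit_w - peak_total_w} W below the combined limit")
```

Under these assumptions, roughly a fifth of the peak draw comes through the slot, and both supply paths stay comfortably within their ratings.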

Maximum Power Draw: Compute (OpenCL)

Even at a 100-percent workload, we couldn’t get the card up to its specified 275 W TDP. The pre-heated card bumps against its thermal limit before maxing out power consumption.

16. Temperature And Sound Level

Temperature Transients

We measure each card's thermal behavior at a constant 22 °C (72 °F) ambient temperature, at normal humidity.

To put the following diagram into perspective, almost every card we benchmark bumps up against its factory-set temperature limit.

| Model | Idle | 3D Workload |
|---|---|---|
| Quadro K5000 | 30 °C | 76 °C |
| Quadro K6000 | 32 °C | 80-82 °C |
| FirePro W9100 | 40 °C | 92-93 °C |
| FirePro W9000 | 34 °C | 78 °C |

Measuring the Sound Level

We measure each graphics card's noise levels with a calibrated high-quality studio microphone (supercardioid) 50 cm away from a position perpendicular to the middle of the board. This distance, as well as the strong cardioid microphone characteristic, represents a compromise between avoiding noise generated by the fan’s airflow and ambient noise that can never be completely eliminated. Our noise-dampening efforts certainly help minimize the latter, but they'll never be 100-percent successful.

As we've seen many times before, reference-class cards typically achieve their cooling performance at the cost of higher sound levels. High-end workstation cards, in particular, exhaust waste heat from their I/O panels to avoid affecting other platform components. However, this requires a radial fan, and our results show that such fans are quite noisy.

Here are the detailed sound level readings:

| Model | Idle | 3D Workload, Open Lab Bench | 3D Workload, Closed Case |
|---|---|---|---|
| Quadro K5000 | 30.8 dB(A) | 37.7 dB(A) | 37.1 dB(A) |
| Quadro K6000 | 30.8 dB(A) | 42.7 dB(A) | 41.2 dB(A) |
| FirePro W9100 | 33.5 dB(A) | 51.3 dB(A) | 49.8 dB(A) |
| FirePro W9000 | 33.2 dB(A) | 55.4 dB(A) | 52.7 dB(A) |

17. Does FirePro W9100 Take The Workstation Graphics Crown?

If the purpose of your professional graphics quest is efficiency, AMD's FirePro W7000 is tough to beat. Really, the FirePro W9100's purpose is high performance in 3D and compute-heavy tasks. In those disciplines, the card does not definitively grab the crown from Nvidia's best effort. AMD's latest does, however, appear to be the most cost-effective way of getting close to the pinnacle of what's possible in the workstation world. Take our CAD and CAE, or multimedia and entertainment benchmarks, for example. In those applications, the FirePro W9100 is a perfect match.

As long as AMD enjoys continued success promoting OpenCL acceleration in professional applications, the company's FirePro family should continue claiming market share. Adding the Mac as a supported platform is a step in the right direction, even though the volume of Mac-compatible cards is still low.

4K Resolution and Connectivity Galore

The FirePro W9100 is the first (and currently only) card that can drive up to six 4K monitors at full resolution, even if that means stepping down to 30 Hz when more than three are connected. A massive 16 GB of fast GDDR5 memory is more than enough for anything that you can throw at it.

Cooling and Power Draw

One opportunity for improvement is an underwhelming thermal solution, which we've seen previously on AMD's desktop-oriented reference cards. By redesigning the cooler, some of the company's board partners have already demonstrated that Hawaii can be made to run at much lower temperatures than 92 °C. The challenge, of course, is that those gaming products start exhausting heat inside your chassis, and that just doesn't fly in the workstation world.

Instead, professional cards need to push thermal energy out from their I/O brackets. Nvidia does this successfully with its Quadro cards, and AMD should start following suit. The FirePro W9100's heat sink and fan undoubtedly sacrifice some of the board's performance potential, since Hawaii is known to perform best under optimal cooling.

Suggested List Price

AMD is showing some confidence in pricing its FirePro W9100 at $4000. Compared to the slower Quadro K5000 at $1800 and the faster Quadro K6000 at $5000, AMD isn't far off the mark, though. And in the end, the FirePro W9100 surfaces as a strong candidate for high-end workstation duty, particularly when your workload is well-suited to the GPU's strengths (and the driver team's priorities). 

How does the FirePro W9100 fare in our final analysis, then? The $4000 card's price tag is justified by excellent performance, versatility across mature professional segments and the latest workloads, and unmatched connectivity. You get a mix of speed in 3D tasks and general-purpose compute-intensive apps, or both at the same time.

It'll be interesting to see how many professionals dig deep for no-compromise speed in their performance-sensitive software. If the audience is out there, AMD's FirePro W9100 should help reclaim some of the company's workstation market share.