Dell T3600 Workstation + R9 Nano – power delivery issue

MinisterOfEtc

Commendable
Apr 10, 2016
10
0
1,510
Hi everyone,

I’m in the process of migrating from a Dell Precision T3500 to a T3600, and I bought an XFX R9 Nano to upgrade my old Radeon HD 7850. The problem is, I can’t get the R9 Nano to handle anything more demanding than web browsing: if I try Furmark, games like Fallout 4, or even the Windows Experience Index tool, the PC crashes to a black screen and then restarts.

It’s obvious that I have a power delivery issue (or a defective card) – the bitch is determining WHAT is causing it. Before you say ‘Dell PSUs suck, replace it,’ I’m pretty sure I’ve got enough power.

Dell T3600 PSU: 635W, graphics power rated to 300W (for 2 cards), with 2 6-pin PCIe cables (which get combined to an 8-pin lead to the Nano).

Additionally, the T3600 is approved by Dell to run 225W cards like the Quadro K6000 and Tesla C2075. Here’s an example I found on the forums of someone running a card with higher power draw, using a lesser-rated PSU without incident.

The R9 Nano has a TDP of just 175W, powered by 1 8-pin PCIe connector. It SHOULD work, but doesn’t so far – just black screens when I stress it at all.
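The power-budget reasoning above can be sketched as a quick calculation. This is only a rough check using the nominal PCIe limits (75W from the slot, 75W per 6-pin lead, 150W per 8-pin lead) and the 175W TDP quoted in the post; the function name is made up for illustration, and real cards can spike above their TDP.

```python
# Nominal PCIe power limits in watts (spec figures, not measured values).
SLOT_W = 75
SIX_PIN_W = 75
EIGHT_PIN_W = 150

# TDP quoted in the post for the R9 Nano.
NANO_TDP_W = 175


def available_gpu_power(six_pin_leads: int, use_slot: bool = True) -> int:
    """Nominal power for one card fed by N 6-pin leads plus the slot."""
    return six_pin_leads * SIX_PIN_W + (SLOT_W if use_slot else 0)


# Two 6-pin leads combined into one 8-pin plug, plus the slot itself.
budget = available_gpu_power(2)
headroom = budget - NANO_TDP_W
print(f"nominal budget: {budget} W, headroom over TDP: {headroom} W")
```

On paper the two combined 6-pin leads match what a single 8-pin lead is rated for, which is why the card "should" work.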

Here’s what I’ve tried so far:

1. Reinstalled old/new drivers. No difference in results.
2. Tested the Nano in the T3500 (which has an 875W PSU from a T5500). Same black screen happens.
3. Replaced the Nano with the HD 7850 in the T3600 and ran Furmark for ~20 minutes, once on each of the 6-pin cables. Both tests worked fine, which leads me to think that each PCIe line is working and capable of delivering enough juice to run the Nano.

As far as I can tell, here are the possible problems.

A. The 6-pin/8-pin PCIe combiner cable that came with the card might be the culprit. Replacing it is an option.

B. The R9 Nano is bad, and needs replacement.

C. The PSU or power distribution board in the T3600 is bad, but I’m not convinced because the same thing happened when I tested the card in my older workstation.

D. Something I’m not smart enough to diagnose.

SMRT people of Tom’s Hardware, what should I do?
 
Solution
While I would normally blame the PSU in this case, it seems that it is most likely the Nano. I would RMA it, and if that doesn't fix it, get a new Power Supply.

Kraszmyl

Distinguished
Apr 7, 2011
196
0
18,760
Responding to your PM here so others might see it.

Some of this isn't relevant but might be helpful in the future to you.

So the rails on the t3600/t3610/t5810 are split into 18 amp chunks over however many rails. I found this to be an issue with a gtx690 until I realized what a dumbass I was. I had assumed they were single rail like the previous generation.

The rails on the t3500/5500 if I recall correctly should be single rail.

But that should all be fine with the nano.
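As a rough sanity check on the "should be fine" claim, here is a minimal sketch of the rail arithmetic. It assumes a nominal 12V rail and, as a deliberately pessimistic case, that the card's entire 175W TDP is pulled through a single 18A rail (in practice part of it comes through the slot). The helper names are hypothetical.

```python
RAIL_VOLTS = 12.0   # nominal rail voltage (assumption)
RAIL_AMPS = 18.0    # per-rail limit described in the post
NANO_TDP_W = 175.0  # TDP quoted earlier in the thread


def rail_capacity_watts(amps: float, volts: float = RAIL_VOLTS) -> float:
    """Power one rail can deliver at its current limit."""
    return amps * volts


def amps_drawn(watts: float, volts: float = RAIL_VOLTS) -> float:
    """Current needed to deliver the given power at the rail voltage."""
    return watts / volts


capacity = rail_capacity_watts(RAIL_AMPS)
worst_case_amps = amps_drawn(NANO_TDP_W)
fits = worst_case_amps < RAIL_AMPS
print(f"rail capacity: {capacity:.0f} W")
print(f"worst-case Nano draw: {worst_case_amps:.2f} A (fits: {fits})")
```

Even with everything loaded onto one rail, the Nano's rated draw stays under the 18A limit, which matches the conclusion in the post.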

So since you've tested it in both the t3600 and the t3500 with the same results, I have two possibilities for you. The 6x2 to 8pin adapter is screwing up or the Nano itself is defective.

Cause I've also had two gtx 680s in a t5500 and t7500 without issue and those pull considerably more than your Nano too.

 


MinisterOfEtc,

You've been methodical in looking for answers in the context of the T3600 specification, which was designed for large, hot-running workstation GPUs, and you've eliminated the R9 itself as defective, as well as the drivers, so the only comments not already offered are:

1. The R9 Nano is overheating and having a thermal shutdown.

This is an unusual design, as it is a double-height 4GB card with 4096 shaders in a case with the footprint of a GTX 750 Ti 2GB. Because of potential heat problems in the small format, the Nano was slightly downclocked from the Fury, which is twice as long and has two fans. And, while it seems to have a good arrangement of heatpipes and dissipation, with the card upside down (meaning the fan is on the underside, pulling the heat, which rises, out the bottom) the cooling may be marginal, or the fan speed may not be ramping up. Because the card continues to work at browsing loads but shuts down in games, this is a possibility.

You might run the T3600 with an always-on-top temperature monitor. The Nano will shut down at 95C. If it shuts off at a lower temperature, make a note of the temperature at the time. Then run the T3600 with the case side off, set the most powerful floor or table fan in the house blowing directly on the Nano, and try again. This will test for thermal shutdown, and running with the case off might reveal whether the firmware is shutting the card off at the wrong temperature.

If the problem persists, and the card is new and can be returned, consider buying the full-length Fury, which is less likely to have these problems because its cooling system is not so compact.

2. The R9 Nano is shutting down due to a glitch in power saving function:

"AMD PowerPlay Technology
AMD PowerPlay technology is designed to enable power saving profiles that help reduce power consumption when the GPU is idle or in minimal use in comparison to previous AMD products. This dynamic power management enables the GPU to automatically adjust power between low, medium and high states for a tremendous power efficiency advantage. For example, when receiving and composing emails little demand is on the GPU and it runs in a low state, whereas when gaming, there is high demand on the graphics engine and the GPU runs in a high state."


In this scenario, the power saving function is turning it off.

Can the firmware be updated or are there settings that can turn off that function?

Let us know how you progress.

Cheers,

BambiBoom


 

MinisterOfEtc



BambiBoom,

Thanks for the 2 suggestions to investigate. I just fired up the Nano from cold/off overnight, and used GPU-Z to monitor its sensors while I ran the Windows Experience Index test. Here're the stats reported the last instant before the PC predictably rebooted midway through:

GPU Core Clock [MHz] 991.7
GPU Memory Clock [MHz] 500.0
GPU Temperature [°C] 58.0
Fan Speed (%) [%] 32
Fan Speed (RPM) [RPM] 1580
GPU Load [%] 64
Memory Controller Load [%] 26
Memory Usage (Dedicated) [MB] 221
Memory Usage (Dynamic) [MB] 85
VDDC [V] 1.2375

58 degrees is nowhere near the thermal limit of 75 C for this card (which is when the Nano starts thermal throttling, according to reviews), so I'm unconvinced that it's a heat issue -- especially since I can crash the card by pressing the "start" button in Furmark. No rendering ever appears on screen, we just go from Windows desktop to instant reboot.
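For anyone repeating this check, a small sketch of how the pasted readings could be parsed and compared against the two temperature limits mentioned in this thread (75 C throttle per reviews, 95 C shutdown per BambiBoom). The line format is simply what was pasted above, not an official GPU-Z log format, and the parsing helper is hypothetical.

```python
import re

# A subset of the readings pasted in the post, in "name [unit] value" form.
GPU_Z_DUMP = """\
GPU Core Clock [MHz] 991.7
GPU Temperature [°C] 58.0
Fan Speed (%) [%] 32
Fan Speed (RPM) [RPM] 1580
GPU Load [%] 64
VDDC [V] 1.2375
"""

# Sensor name, then the unit in square brackets, then a numeric value.
LINE_RE = re.compile(r"^(.+?)\s+\[([^\]]*)\]\s+(-?[\d.]+)\s*$")

THROTTLE_C = 75.0  # throttle point cited in the post
SHUTDOWN_C = 95.0  # shutdown point cited earlier in the thread


def parse_readings(text: str) -> dict:
    """Map each sensor name to its numeric value."""
    readings = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            name, _unit, value = match.groups()
            readings[name] = float(value)
    return readings


readings = parse_readings(GPU_Z_DUMP)
temp = readings["GPU Temperature"]
throttle_margin = THROTTLE_C - temp
shutdown_margin = SHUTDOWN_C - temp
print(f"{temp} C: {throttle_margin} C below throttle, "
      f"{shutdown_margin} C below shutdown")
```

With the last reading before the reboot this far below both limits, a thermal cause looks unlikely, as the post argues.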

Now, how could I test PowerPlay? Would PowerPlay interrupt things while the card is under 60%+ load?

Thanks again for the suggestions!

MoE

p.s. Just to clarify, I've been able to replicate this exact behavior in 2 different boxes: my new-to-me T3600, and a T3500 that I dropped a T5500 PSU and harness into. The only common variables are the PCIe cable (splicing 2 6-pin leads into 1 8-pin) and the Nano itself.
 

MinisterOfEtc



Kraszmyl,

Thank you for responding to my PM and weighing in here. Given that you've written in the past about knowing how to configure Dell workstations to max out their graphics power draw, I knew you'd have something to contribute.

I think you're right that it's the card or the 6/8-pin adapter; I just have to find another adapter that I can test before I RMA the card (Newegg's mad at me because I bought a 390x that didn't fit in my T3500 last year).

I saw in the thread I linked earlier that you're capable of running 2 GTX 980s in a 3610 on the 685W PSU. How are you delivering enough power to them? Is all the power coming from the 8-pin connector on the PSU, or are you splicing in from the SATA power connectors as well?

Thanks again for the help!

MoE
 
MofE,

Interesting results- and good to eliminate thermal throttling as a possibility.

1. It would be inexpensive to buy a 4-pin Molex to 8-pin adapter, and that could rule out a power continuity problem. In the old days of 225W GPUs, I've had GPUs requiring multiple power connections and used Molex to 6 and 8 pin adapters, as the Quadro FX4800 used in a Precision T5400 and T5500 requires one of each.

2. Have a look at Control Panel > Power Options and ensure it is set to "Performance".

3. While the adapter is on its way, the symptoms of the problem sound increasingly like a driver conflict. Consider starting the system in Safe Mode and loading only the graphics driver.

4. Since the system is new to you, update the BIOS to the latest from dell.com.

There can't be too many more options other than having a different GPU, so here's hoping it's the simplest thing!

Cheers,

BambiBoom


 

MinisterOfEtc



BambiBoom,

Thanks for the additional suggestions! Here's where I stand on each.

1. I'm going to Best Buy after I post this to see if they have a SATA/6-pin PCIe adapter handy (no Molex in these Dells, sadly), as well as a replacement 8/6-pin combiner cable, to rule out the wiring as the root cause.

2. Set Power Options to Performance on both boxen. Still crashes Windows Experience Index.

3. I've tried both AMD Catalyst drivers (15.x.x, from the driver CD that shipped from XFX), as well as the latest Crimson drivers (16.x.x, downloaded via AMD's auto-detect utility), both with the same result. Furthermore, my Radeon HD 7850 ran through its stress-tests on the T3600 using the latest Crimson driver without incident (don't worry, I uninstalled/reinstalled the driver between card pulls). So, I'm not inclined to keep blaming software, especially since it's recurring on 2 different machines, with slightly different OS builds (T3600 = Win7Ult x64, T3500 = Win7Pro x64).

But, if you're still skeptical, perhaps you could point me to a how-to for Safe Mode + GPU drivers, and I'll give it a whirl!

4. I updated to the A14 BIOS from Dell, which is the latest, when I initially set up the machine last week, which should rule that possibility out.
 

MinisterOfEtc

UPDATE

After checking Best Buy and 4 other stores, only to find that I can't get a replacement cable today, I said "screw it" and returned the Nano for RMA.

At the same time, I'm getting a new 8-pin cable made, so I can plug directly into the PSU with the new card, and rule out cabling as a potential cause.

Once the replacement card arrives, I'll revisit this with either more headaches or my discovered solution.

Thanks everyone for your help!
 

Kraszmyl

685w with the 6pins split into two more 6pins on each. Unfortunately I then learned that the t3610 is not certified for SLI and had to do some workarounds there I wasn't happy with. Which is why I incorrectly assumed it was still single rail. I then realized my mistake when I put a gtx690 in the t3600 and it wouldn't even post, so splitting 6x2 to 8x2 didn't work, but oddly enough 6x2 to 6x4 was fine. Might also be that the t3600 distributes power differently too. I unfortunately just threw an r9-285 in it for him and called it a day, so I no longer have that machine.

Regardless, the Nano isn't pulling anywhere near enough juice to touch even the normal limits you could buy and configure on the dell website. So power is very likely not your issue and I stand by either cable issues or the card.

Bambiboom also brings up a good point about heat and airflow.

I had a windforce 680 that kept fighting with my t3600. The dells are designed to be rackable and do front to back air flow. They are also designed to keep fans as low as possible and let stuff run hot. The nano, if I recall correctly, throws air into the front of the case along with the back, so that might be an issue as well. The symptoms with my windforce 680 were that the system would remain stable but the gpu and system fans would max out, and the 680 would then start to throttle badly since it could overpower the intake fans.

Edit

I don't have a picture of the finished product with the splitter cables. But you can see the ghetto quick and dirty test.

http://imgur.com/014fSmv

Edit 2

Bonus pictures of the poor windforce 680 I butchered before realizing there would be a cooling issue. Then, after buying a 780/titan cooler, I also learned gigabyte doesn't use standardized mount points... http://imgur.com/a/NsXcE
 

MinisterOfEtc



Kraszmyl

Wow, you've officially made everyone who knee-jerk reacts that Dell PSUs suck eat their words! Sure, their consumer-grade stuff is largely pot metal, but the Precision and server lines get it done!

So when you ran the dual 980s, you were drawing ~450W of power for the cards (150W from the SATA rails, 150W from the PCIe rails, and 150W from the mobo), leaving the rest of the system with 235W, of which an E5 Xeon needs ~130W, which leaves the system with around 100W of headroom. What workarounds did you do to make it all work?
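The estimate in this paragraph can be written out step by step. This only restates the figures from the post (a 685W PSU, three 150W sources for the cards, and ~130W for an E5 Xeon) and makes no claim about what the machine actually drew.

```python
PSU_W = 685  # T3610 PSU rating mentioned in the thread

# Estimated sources of GPU power for the two cards, per the post.
gpu_sources_w = {
    "SATA rails": 150,
    "PCIe leads": 150,
    "motherboard slots": 150,
}
CPU_W = 130  # approximate E5 Xeon figure from the post

gpu_total = sum(gpu_sources_w.values())
rest_of_system = PSU_W - gpu_total
headroom = rest_of_system - CPU_W

print(f"GPUs: {gpu_total} W")
print(f"left for the rest of the system: {rest_of_system} W")
print(f"headroom after the CPU: {headroom} W")
```

The result lands at 105W, in line with the "around 100W" quoted above.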

Also thanks a ton for the advice on the airflow. That's part of why I splurged for the Nano in this rebuild -- it's small, efficient, and doesn't throw off much heat. Luckily, I get really good ventilation where I situate my workstation, with a gentle front-to-back draft. It's no data center, but it kept my T3500 happy.

The dumbest thing that I see in the T3600 is that they flipped the chassis, so that PCIe cards hang upside down, and their fans blow hot air into the bottom of the case. That worries me most.

Well, I had a custom 8-pin cable made (for $7), and the RMA is underway. I'll be back to you when I have new clay to work with!
 

Kraszmyl



The Xeons rarely pull that much; same deal with the cards.

Good example is my current setup with a e5-1650v2 + 980. PSU reports about 270w pulled total even when playing ARK, MWO, or Division at 3440x1440. http://imgur.com/4D0Y9em. This is well under what the theoretical max is and ignores all the other random goodies I have running too as my resting pull is ~70-80.

My current guess is two of the 18a rails are directed to the 8 pin on the power splitter board, with an extra 18 going into the mobo to drive the random pcie devices. Like you'd see on the x79/x99 gaming boards that have an extra 6 pin that goes into the mobo.

After all, you could order the thing with two w7100s, which are effectively a 280x if I recall correctly, and the newer 5810 can come with two m5000s, which are effectively gtx980s. The power distribution board looks the same for both systems, but I don't see a part number handy sadly to confirm.

So no real workaround for power stuff. Just the stupid workarounds to trick SLI into working....seriously screw you NVidia and your SLI certifications.

I did convert the sata to two extra 6pins to see if it worked, but in the final product I just split each of the 6s off the mobo which worked fine for the dual 980s.

The 780 was 6+8 and converting off the two 6pins again worked fine.

The 690 which was two 8pins did not work converting off the 6pins and I frankly don't know why as it should have been fine. I did not attempt to convert off the sata and split it.

Ya dell consumer products are god awful. Tho the XPS line has seen some neat stuff recently. Alienware/Precision/Latitude are some good shit tho for parts quality.

As for noise that's variable. After my t7500 + ssds I'm pretty much used to a completely silent system. Like my current motherboard is super unhappy about the fact the fans spin at around 600rpm and kept throwing up warnings that they might be dying until I told it to stfu.

Edit -

One thing that does piss me off is that Dell does limit connectors in the 3/5/7 series machines to make you go up one tier. Like the t7xxx has all the damn 8/6 pin connectors in the world and the t5xxx has slightly fewer. Then the t3xxx get screwed hardcore. Like the t3500 has a single six pin connector on the 525w psu harness. Like the t5810 comes with a 800w psu option but iirc is still limited to 2x6 officially.
 

MinisterOfEtc



Kraszmyl

Wow, thanks for all this detail -- I think this is the first time this is being put up on the forums, because I did a pretty thorough search before I posted my question/problem (which of course directed me to you and BambiBoom as my Dell Workstation gurus...).

So it sounds like the 8-pin on the Power Distribution Board has a limit of between 225W and 300W, but that your twin 980s perhaps didn't trip it because they never went above the max load for the port, while the 690 not posting remains a mystery. I wish they'd just publish the spec!

Out of curiosity, I looked at the 7600 PDB, and it's too wide to back-fit into the 36xx/56xx chassis, and would hit the case edges (which doesn't matter in the 76xx because they moved the PSU location). And if the 56xx/58xx had come with a PDB that had two 8-pin connectors, it'd be fine for running almost any GPU, because at that point you'd have anywhere from 450W to 600W of available juice (if our suspicions are right about the rails and rated load).
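The 450W to 600W range follows directly from doubling the suspected per-connector limit. A tiny sketch, with the caveat from the post that the 225W to 300W figure is itself a guess about the board, not a published Dell spec:

```python
# Suspected limits for one 8-pin connector on the distribution board (watts).
# These come from the guesswork in the thread, not from a Dell datasheet.
PER_CONNECTOR_W = (225, 300)


def combined_range(connectors: int, per_connector=PER_CONNECTOR_W):
    """Low and high estimate for N identical connectors."""
    low, high = per_connector
    return (connectors * low, connectors * high)


low, high = combined_range(2)
print(f"two 8-pin connectors: {low} W to {high} W")
```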

How can you trick SLI to work on these motherboards (like I'll care, since I'm going AMD, unless this'd apply to CrossFire too)?

MofE
2. Ever give any thought to
 

Kraszmyl

Different SLI and SLIPatch are two I recall off hand; however, there are several programs that do it. What they do is either modify the drivers to accept any motherboard as sli certified, or modify how your motherboard reports itself so it shows up as a certified one. However, I do not believe they are compatible with Maxwell cards, as updates are spotty for all of them. The google terms you generally would use are SLI emulation, sli on non certified, sli on non sli board, and so on.

Plus side most games have ignored sli and dx12 might make it pointless in general. Dumped my extra 980s a few weeks ago.

Sorry for the slow responses.

Yep, if you look at the older precisions, so the 500s and lower, they actually have very detailed power distribution diagrams. But for Sandy-E and higher it's pretty vague sadly.

I swear the big guys fit? I could be mistaken I guess; the 3 and 5 definitely share however, and the 800 should be able to carry anyone. The distribution boards look like they have the same mounts. That being said, the extra 4pin is labeled cpu 2. That being said, on every modular psu I've dealt with, the 6+2 gpu connectors shared the same sockets as the 8 pin cpu power and were interchangeable, so it might be doable. Still a risk tho, and the pin sockets are different shapes on the dell product, but dunno. Also the power boards look pretty dumb, so everything might be in the psu and the 800 might shunt more through the 5600? I've only directly dealt with the 7s and 3s. Honestly don't see the point of 5s.

http://www.ebay.com/itm/Dell-Precision-T3600-T3610-Power-Distribution-Board-Backplane-599RD-0599RD/252382442279?_trksid=p2047675.c100005.m1851&_trkparms=aid%3D222007%26algo%3DSIC.MBE%26ao%3D1%26asc%3D36142%26meid%3Df2e4af355009480ea9c1449ebc437233%26pid%3D100005%26rk%3D3%26rkt%3D6%26sd%3D152010774026

http://www.ebay.com/itm/Dell-Precision-T5600-CVHT6-Power-Distribution-Board-T5610-/231821550129

Also for shits and giggles, this is how it looks currently. http://imgur.com/PpryddQ
 

MinisterOfEtc



Kraszmyl,

Great timing on your part. After a month-long RMA process, I just got my replacement Nano, and a new 8-pin PCIe cable.

I get the same result. :(

So, my options to diagnose the issue are:

1. Buy a $5 SATA-to-PCIe 6-pin adapter, and see if the card works when pulling from another rail on the PSU

2. Buy a $15 replacement power distribution board, and see if the power supply is indeed bad.

3. Buy a $65 replacement PSU, see if that fixes it.

What would your plan be?
 

MinisterOfEtc



Shame on me for trusting Dell's Hardware diagnosis tools.

Another problem was that Dell's ProSupport wasn't an option (they refused to help me), because I wasn't running an "approved GPU," and the original owner of the workstation didn't send their GPU along with it when I bought it.

What I'd like more than anything is the month-plus of my time back. :(

What I also don't get is why the first Nano I had demonstrated the same behavior in both the Dell Precision 3500 (old box) and the 3600 (new box), and yet I thought I had a solid PSU in the old box (I specifically upgraded it to an ~800W unit).

Oh well, from now on I replace the PSU first.
 

gffermari

Commendable
Oct 13, 2016
2
0
1,510
I'm facing the same problem too.

My PC shuts down every time I enter a game, or after playing for a few seconds. But some games are playable if I turn VSync on. I assume that VSync limits the GPU by reducing GPU usage, so these games run normally.
Some of them are ARMA 3, CS:GO, PES 2016, FIFA 17, Battlefield 4.
On the other hand, Euro Truck Simulator 2, Ryse: Son of Rome, and some other simulators (mining, etc.) push the GPU unreasonably hard even with VSync turned on, making the PC shut down and restart.

1. It's weird, but the PC manages to pass the stability test (AIDA64) and the benchmarks (3DMark/SteamVR) that I run on it.

2. System temperatures are quite good. The CPU barely reaches 60°C at full load, while the GPU hits 67-70°C at medium-low fan rpm and 60-62°C at high.

3. I have upgraded the mobo's BIOS and the GPU's drivers to the latest available.

My PC consists of an i7 4770K (not overclocked), 16GB HyperX Beast, XFX R9 Nano (just bought, replacing a Gigabyte 7970), some SSDs etc., all powered by an old but wholly respectable Enermax Infiniti 720W.

I don't know what else to do, and I do not have another socket 1150 mobo to test the GPU in.

@MinisterOfEtc have you solved your problem?