GTX 680 Throttling and temperature issue

owlovethbl

Honorable
Jun 22, 2014
16
0
10,510
I recently added a second GTX 680 to my rig, in SLI with the first one.

The top card is a Palit GTX 680, and the second is an MSI Twin Frozr II GTX 680. Neither card is overclocked.

According to EVGA PrecisionX, the Palit card is idling at a GPU clock of 810 MHz with an idle temperature of 43°C, while the MSI card is idling at 342 MHz with an idle temperature of 30°C.

I know the MSI card has better stock cooling, but why is the Palit card not clocking down to the same speed as the MSI card when idling? I've set the global power setting in NVIDIA Control Panel to Adaptive, but the problem persists. It's annoying that NVIDIA Control Panel doesn't tell you whether it's applying changes to one card or both.

So what can I do to make the Palit card lower its idle GPU clock?
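For a second opinion on the per-card idle clocks, independent of PrecisionX, the readings can also be pulled from NVIDIA's nvidia-smi command-line tool. This is only a rough sketch: it assumes nvidia-smi is on the PATH and that the driver reports these fields for GeForce cards (it may print N/A instead). The 400 MHz idle threshold and the function names are made up for illustration.

```python
import csv
import io
import subprocess

# Fields requested from nvidia-smi: GPU index, name, graphics clock, temperature.
QUERY = "index,name,clocks.gr,temperature.gpu"


def to_int(field):
    """Return the field as an int, or None for values such as [N/A]."""
    try:
        return int(field)
    except ValueError:
        return None


def parse_gpu_csv(text):
    """Parse `--format=csv,noheader,nounits` output into one dict per GPU."""
    rows = []
    for rec in csv.reader(io.StringIO(text), skipinitialspace=True):
        if len(rec) != 4:
            continue  # skip blank or malformed lines
        index, name, clock, temp = (f.strip() for f in rec)
        rows.append({
            "index": int(index),
            "name": name,
            "clock_mhz": to_int(clock),
            "temp_c": to_int(temp),
        })
    return rows


def read_gpus():
    """Run nvidia-smi once and return the parsed readings."""
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_gpu_csv(out)


def not_idling(rows, idle_mhz=400):
    """Cards whose graphics clock is above the (assumed) idle threshold."""
    return [r for r in rows
            if r["clock_mhz"] is not None and r["clock_mhz"] > idle_mhz]
```

Calling `not_idling(read_gpus())` a few minutes after closing every 3D application would list whichever card is still holding a high clock.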
 

owlovethbl

The temperatures on the Palit card get very high after a while, and after a couple of hours of gaming I get a blue screen with a "dxgmms1.sys" message, which I think may be temperature related?

I'm running the GeForce 337.88 driver from Nvidia by the way.
 


This is a different issue.

What's a "high" temperature?
 


75°C isn't hot for a GPU, so it's not a heat issue. They throttle at 90°C and are usually rated up to 105°C. GPUs tend to run hot.

Can you run Bluescreenview or get the dump files for the bluescreen error?
 

owlovethbl

I've only been having the bluescreen issue since I put the second card in, so maybe something's not fully connected?

Also, I think the idle speed may be related to temperature. I just went out for a bit, and now that I've come back the idle temperature on the Palit card has dropped to 34°C and the idle GPU clock has dropped to 324 MHz, in line with the MSI card. So is idle speed dictated by temperature, or the other way round?

EDIT I found the dumpfile, how should I post it?
 


What's the make/model?
Are you plugging it directly into the card or using a type of cable converter?
 
This is a little messy

*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************

Use !analyze -v to get detailed debugging information.

BugCheck 50, {fffffa8002effad0, 0, fffff880107a8e1d, 2}


Could not read faulting driver name
Probably caused by : dxgmms1.sys ( dxgmms1!VIDMM_GLOBAL::ReferenceAllocationForPreparation+19 )

Followup: MachineOwner
---------

0: kd> !analyze -v
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************

PAGE_FAULT_IN_NONPAGED_AREA (50)
Invalid system memory was referenced. This cannot be protected by try-except,
it must be protected by a Probe. Typically the address is just plain bad or it
is pointing at freed memory.
Arguments:
Arg1: fffffa8002effad0, memory referenced.
Arg2: 0000000000000000, value 0 = read operation, 1 = write operation.
Arg3: fffff880107a8e1d, If non-zero, the instruction address which referenced the bad memory
address.
Arg4: 0000000000000002, (reserved)

Debugging Details:
------------------


Could not read faulting driver name

READ_ADDRESS: GetPointerFromAddress: unable to read from fffff80003b03100
fffffa8002effad0

FAULTING_IP:
dxgmms1!VIDMM_GLOBAL::ReferenceAllocationForPreparation+19
fffff880`107a8e1d 488b02 mov rax,qword ptr [rdx]

MM_INTERNAL_CODE: 2

CUSTOMER_CRASH_COUNT: 1

DEFAULT_BUCKET_ID: VISTA_DRIVER_FAULT

BUGCHECK_STR: 0x50

PROCESS_NAME: System

CURRENT_IRQL: 0

TRAP_FRAME: fffff880079df610 -- (.trap 0xfffff880079df610)
NOTE: The trap frame does not contain all registers.
Some register values may be zeroed or incorrect.
rax=fffff880079df7c8 rbx=0000000000000000 rcx=fffffa80115d4000
rdx=fffffa8002effad0 rsi=0000000000000000 rdi=0000000000000000
rip=fffff880107a8e1d rsp=fffff880079df7a0 rbp=fffffa8012ede290
r8=fffffa8014650900 r9=0000000000000000 r10=0000000000000000
r11=0000000000000039 r12=0000000000000000 r13=0000000000000000
r14=0000000000000000 r15=0000000000000000
iopl=0 nv up ei ng nz na po nc
dxgmms1!VIDMM_GLOBAL::ReferenceAllocationForPreparation+0x19:
fffff880`107a8e1d 488b02 mov rax,qword ptr [rdx] ds:92fc:fffffa80`02effad0=????????????????
Resetting default scope

LAST_CONTROL_TRANSFER: from fffff800039485e4 to fffff800038cbbc0

STACK_TEXT:
fffff880`079df4a8 fffff800`039485e4 : 00000000`00000050 fffffa80`02effad0 00000000`00000000 fffff880`079df610 : nt!KeBugCheckEx
fffff880`079df4b0 fffff800`038c9cee : 00000000`00000000 fffffa80`02effad0 00000000`00000000 00000000`00000000 : nt! ?? ::FNODOBFM::`string'+0x43836
fffff880`079df610 fffff880`107a8e1d : 00000000`ffffd94a 00000000`00000003 fffffa80`115c1000 fffffa80`115c2a90 : nt!KiPageFault+0x16e
fffff880`079df7a0 fffff880`107a5ff7 : 00000000`00000000 fffffa80`0cec80d8 00000000`0000001e 00000000`00000000 : dxgmms1!VIDMM_GLOBAL::ReferenceAllocationForPreparation+0x19
fffff880`079df7d0 fffff880`107c07d9 : 00000000`00000000 fffff8a0`0e276910 fffffa80`00000000 fffffa80`14650990 : dxgmms1!VIDMM_GLOBAL::prepareDmaBuffer+0x43f
fffff880`079df9a0 fffff880`107c0514 : fffff800`00b96080 fffff880`107bff00 fffffa80`00000000 fffffa80`00000000 : dxgmms1!VidSchiSubmitRenderCommand+0x241
fffff880`079dfb90 fffff880`107c0012 : 00000000`00000000 fffffa80`147b19b0 00000000`00000080 fffffa80`115b2410 : dxgmms1!VidSchiSubmitQueueCommand+0x50
fffff880`079dfbc0 fffff800`03b6773a : 00000000`01d47098 fffffa80`0d647b50 fffffa80`0c9e4890 fffffa80`0d647b50 : dxgmms1!VidSchiWorkerThread+0xd6
fffff880`079dfc00 fffff800`038bc8e6 : fffff800`03a46e80 fffffa80`0d647b50 fffff800`03a54cc0 00000000`00000000 : nt!PspSystemThreadStartup+0x5a
fffff880`079dfc40 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiStartSystemThread+0x16


STACK_COMMAND: kb

FOLLOWUP_IP:
dxgmms1!VIDMM_GLOBAL::ReferenceAllocationForPreparation+19
fffff880`107a8e1d 488b02 mov rax,qword ptr [rdx]

SYMBOL_STACK_INDEX: 3

SYMBOL_NAME: dxgmms1!VIDMM_GLOBAL::ReferenceAllocationForPreparation+19

FOLLOWUP_NAME: MachineOwner

MODULE_NAME: dxgmms1

IMAGE_NAME: dxgmms1.sys

DEBUG_FLR_IMAGE_TIMESTAMP: 5164dc13

FAILURE_BUCKET_ID: X64_0x50_dxgmms1!VIDMM_GLOBAL::ReferenceAllocationForPreparation+19

BUCKET_ID: X64_0x50_dxgmms1!VIDMM_GLOBAL::ReferenceAllocationForPreparation+19

Followup: MachineOwner
---------

0: kd> lmvm dxgmms1
start end module name
fffff880`10788000 fffff880`107ce000 dxgmms1 (pdb symbols) C:\Program Files\Debugging Tools for Windows (x64)\sym\dxgmms1.pdb\93AB163000A844D5BF442A979FB6FF991\dxgmms1.pdb
Loaded symbol image file: dxgmms1.sys
Mapped memory image file: C:\Program Files\Debugging Tools for Windows (x64)\sym\dxgmms1.sys\5164DC1346000\dxgmms1.sys
Image path: \SystemRoot\System32\drivers\dxgmms1.sys
Image name: dxgmms1.sys
Timestamp: Tue Apr 09 23:27:15 2013 (5164DC13)
CheckSum: 00043C9B
ImageSize: 00046000
File version: 6.1.7601.18126
Product version: 6.1.7601.18126
File flags: 0 (Mask 3F)
File OS: 40004 NT Win32
File type: 3.7 Driver
File date: 00000000.00000000
Translations: 0409.04b0
CompanyName: Microsoft Corporation
ProductName: Microsoft® Windows® Operating System
InternalName: dxgmms1.sys
OriginalFilename: dxgmms1.sys
ProductVersion: 6.1.7601.18126
FileVersion: 6.1.7601.18126 (win7sp1_gdr.130409-1534)
FileDescription: DirectX Graphics MMS
LegalCopyright: © Microsoft Corporation. All rights reserved.
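
For anyone skimming the dump above, the key facts are all on the "BugCheck" summary line. A minimal Python sketch of pulling them out is below; the argument meanings are taken from the debugger text in this same dump, while the function names and the one-entry lookup table are purely illustrative.

```python
import re

# Matches lines such as: BugCheck 50, {fffffa8002effad0, 0, fffff880107a8e1d, 2}
BUGCHECK_RE = re.compile(r"BugCheck\s+([0-9A-Fa-f]+),\s*\{([^}]*)\}")

# Only the code that appears in the dump above; extend as needed.
KNOWN_CODES = {0x50: "PAGE_FAULT_IN_NONPAGED_AREA"}


def parse_bugcheck(line):
    """Return (code, [arg1..arg4]) as integers, or None if the line doesn't match."""
    m = BUGCHECK_RE.search(line)
    if m is None:
        return None
    code = int(m.group(1), 16)
    args = [int(a.strip(), 16) for a in m.group(2).split(",")]
    return code, args


def describe(code, args):
    """One-line summary; for 0x50 the dump says Arg2 is 0 for read, 1 for write."""
    name = KNOWN_CODES.get(code, "unknown bugcheck")
    if code == 0x50:
        access = "read" if args[1] == 0 else "write"
        return (f"{name} (0x{code:X}): {access} of 0x{args[0]:016x} "
                f"by instruction at 0x{args[2]:016x}")
    return f"{name} (0x{code:X})"
```

Feeding it the summary line from this dump reports a read of the bad address by the instruction inside dxgmms1.sys, which matches the FAULTING_IP entry.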
 

owlovethbl

As for the power question: it's an EVGA 750W Supernova modular PSU. Both cards have two PCIe power connections running directly to the PSU, with no cable bodging.

I will try the method you suggested for the video card and see if it fixes the problem. I'll post back here with results - thanks :)
 

owlovethbl

So I tried the method outlined and I'm still getting the dxgmms1.sys BSOD.

Could it be RAM related? I recently installed 2 more 4GB RAM sticks at the same time as the second GPU, raising my total system RAM from 8 to 16GB.
 


Did you install the same RAM?
Or did you just mismatch DDR3?

Don't mismatch DDR3. Buy either an 8 GB kit or a 16 GB kit. Don't mix kits. DDR3 is very fast, and very prone to errors if you don't use a kit.
 

owlovethbl


No I didn't mismatch - it's all Corsair Vengeance 1600 RAM.

I also got a different blue screen today, saying "The video scheduler has encountered an unexpected fatal error."
http://www.filedropper.com/071614-10218-01 here is the crash dump.
 


They should be installed; check Programs and Features and see whether "Windows Debug Tools" are listed.

Uninstall them if they are; they seem to be popping up in all the BSODs for some reason. The file it's pointing to is DirectX related.


http://answers.microsoft.com/en-us/windows/forum/windows_7-system/blue-screen-in-dxgmms1sys/db3f0576-976f-4401-8793-40f01942fb9d

This guy ultimately solved the issue by replacing the motherboard. I would go into the BIOS, reset everything to default, and then change back anything you need.
It could be a DOA card, or it could be a power-related issue with the motherboard. I would disconnect the PSU from everything and reconnect it all again, confirming each plug is seated properly. Make sure the GPU is connected to the PCIe connector on the PSU, and not through any type of converter from the 4-pin connectors.

If you have any type of 4-pin to SATA, or 4-pin to anything, converter in the system, remove it if possible. They can cause instability.
 

owlovethbl


There's no "Windows Debug Tools" listed in the programs and features, and they don't come up through a start bar search either. Pretty sure I don't have them installed.

Before when I was running with the Palit GTX 680, 8GB of Corsair Vengeance RAM and a 650W OCZ PSU, there were no issues.

When I was running with just the MSI GTX 680, 8GB of Corsair Vengeance RAM and a 650W OCZ PSU, there were no issues.

Now I'm running with both the MSI GTX 680 and Palit GTX 680 in SLI, 16GB of Corsair Vengeance RAM and a 750W EVGA PSU, suddenly I'm getting these BSODs.

Based on that, I don't see how it can be a motherboard issue, unless the second PCIe slot is faulty or something's not connected properly. I already checked all the connections between the PSU and the motherboard and video cards, and I don't know how to go about testing the second PCIe slot for faults. As I said before, there are no cable converters or any other such bodging going on.

I've checked all the RAM is seated properly, and ran Memtest 86+ for 4 complete passes and there were no faults with the RAM.

I've also completely removed all the video card drivers and completely removed both cards, then added them back in one at a time and installed the drivers again.

From what you're saying, both blue screen crash dumps are pointing to a DirectX issue, which sounds to me like there's still a software problem rather than a hardware one?

My only other thought is that maybe the PSU is not powerful enough to run everything. In addition to the two video cards, I also have one SSD, four HDDs and a Blu-Ray read-write optical drive. Could a lack of PSU power be causing the blue screens? I don't think so though, because the most recent BSOD happened while the system wasn't under heavy load.

I recently spent some time running stress tests on the machine using FurMark and Speccy to watch temps. The Palit card peaked at 92 degrees, which is pretty hot - the MSI around 74 degrees and the CPU at 57 degrees.

I recently changed my system's pagefile.sys size to an initial 8GB, with a max of 32GB. Could that be causing issues too? For now I've switched it back to "system managed".

Sorry to go over old ground but I wanted to be clear about what I've tried, and how my system is set up. Short of completely deconstructing the rig and building it again from scratch, I'm really out of ideas.
 
It could still be hardware; the error just means DirectX stopped responding.

I'm going to guess at this point that your PSU isn't outputting enough power for both GPUs, as they are both pretty power hungry. I hadn't noticed you were running SLI on a 750W unit, with 680s pulling almost 200W each. You might need a bigger, good-quality PSU.

If you really want to confirm, I'd test each GPU in each PCIe slot and see if there are any issues. If everything is flawless in that situation, you might be looking at power related issues. You have basically ruled out everything else.
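
To put rough numbers on the power question, here is a back-of-envelope budget in Python. Only the GPU figure is a published spec (NVIDIA lists the GTX 680 at 195W board power, in line with the "almost 200W" above); the CPU, drive, and motherboard wattages are guessed placeholders, and the function names are invented for this sketch. It also ignores short power spikes and how the load is split across the PSU's rails, so treat the result as a sanity check, not a verdict.

```python
# NVIDIA's published board power for a GTX 680.
GPU_TDP_W = 195

# Assumed per-device figures; replace with real values for your parts.
HDD_W = 10
SSD_W = 3
OPTICAL_W = 25


def estimate_load_w(gpus=2, cpu_w=95, hdds=4, ssds=1, opticals=1, base_w=50):
    """Sum a rough worst-case draw in watts.

    cpu_w and base_w (motherboard, RAM, fans) are assumptions, since the
    thread does not say which CPU or board is in use.
    """
    return (gpus * GPU_TDP_W + cpu_w + hdds * HDD_W
            + ssds * SSD_W + opticals * OPTICAL_W + base_w)


def headroom_w(psu_w, load_w):
    """Watts left over on the PSU label rating; negative means over budget."""
    return psu_w - load_w


def load_fraction(psu_w, load_w):
    """Estimated load as a fraction of the PSU rating."""
    return load_w / psu_w
```

For example, `headroom_w(750, estimate_load_w())` gives the margin for the drive count mentioned earlier in the thread under the assumed wattages; swap in real figures for the actual CPU and drives before drawing any conclusion.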
 
Solution

owlovethbl



How long should I test each card for, before I can say for sure that it's running with no problems?
 


As long as it would normally take to crash, 4-5 hours of gaming on Ultra detail I assume.
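
The card-by-slot test suggested above works out to four separate runs. A small Python sketch for laying out the schedule is below; the card and slot labels are just names for this example, and the 5-hour default takes the upper end of the 4-5 hour figure.

```python
from itertools import product

CARDS = ["Palit GTX 680", "MSI GTX 680"]
SLOTS = ["PCIe slot 1", "PCIe slot 2"]


def build_test_plan(cards=CARDS, slots=SLOTS, hours_per_run=5):
    """Every single-card/slot combination, each run for hours_per_run hours."""
    return [
        {"card": card, "slot": slot, "hours": hours_per_run, "crashed": None}
        for card, slot in product(cards, slots)
    ]


def total_hours(plan):
    """Total gaming time needed to finish the whole plan."""
    return sum(run["hours"] for run in plan)


def suspects(plan):
    """Card/slot pairs that crashed once results have been filled in."""
    return [(r["card"], r["slot"]) for r in plan if r["crashed"]]
```

Fill in `crashed` as True or False after each run. If `suspects(plan)` comes back empty after all four runs, that points toward the power explanation rather than a bad card or slot.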