New build freezing in games (occasionally windows) NOT the GPU

Status
Not open for further replies.

Robmeng

Prominent
Jun 26, 2017
14
0
510
Hi guys

I built a new setup for my partner. Specs are as follows:

ASRock AB350M Pro4 (latest BIOS)

MSI GTX 970 Gaming (factory OC)

Ryzen 5 1500X (stock)

2x4GB Corsair Vengeance LPX DDR4 2400 CMK8GX4M2A2400C14

2x 1TB WD Green drives (storage) and a 120GB Hynix Canvas SL308 (OS drive; the two games mentioned below are also installed on this drive)

PSU: 450W Seasonic G Series 80 Plus Gold, hybrid modular

Windows 10 x64, updated

The main issue is that when playing games (mostly Team Fortress 2 and L4D2) the game will lock up, mainly while loading into a server or just after loading in. But not always: she can sometimes play for hours without issue. We get a variety of errors in Event Viewer, the latest being:

The description for Event ID 13 from source nvlddmkm cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.

If the event originated on another computer, the display information had to be saved with the event.

The following information was included with the event:

\Device\Video3
Graphics Exception: ESR 0x404490=0x80000001

the message resource is present but the message is not found in the string/message table
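When freezes keep logging events like the one above, tallying the exception codes over time can show whether it is always the same fault or a mix. The following is a minimal sketch, assuming the event text has been copied or exported from Event Viewer as plain text. The function names are made up for illustration, the pattern is inferred only from the single sample line above, and no attempt is made to interpret what the register values mean.

```python
import re
from collections import Counter

# Matches lines such as "Graphics Exception: ESR 0x404490=0x80000001".
# The format is inferred from the one sample in this post (an assumption).
ESR_PATTERN = re.compile(
    r"Graphics Exception: ESR (0x[0-9A-Fa-f]+)=(0x[0-9A-Fa-f]+)"
)


def extract_esr_pairs(log_text):
    """Return a list of (register, value) string pairs found in the text."""
    return ESR_PATTERN.findall(log_text)


def tally_esr_pairs(log_text):
    """Count how often each (register, value) pair appears."""
    return Counter(extract_esr_pairs(log_text))


# Sample text taken from the event quoted above.
sample = """\\Device\\Video3
Graphics Exception: ESR 0x404490=0x80000001
"""

for (register, value), count in tally_esr_pairs(sample).most_common():
    print(f"{register} = {value}: seen {count} time(s)")
```

If the same pair shows up every time, that points at one repeatable fault; a spread of different codes is more in line with general instability (power, RAM, motherboard).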

Now as for what I've tried: a fresh clean install of Windows. Reinstalled the GPU drivers, running Display Driver Uninstaller beforehand. Moved the RAM from A2/B2 to A1/B1. I swapped my EVGA GTX 980 SC 2.0 into her machine and the freezes persist; using the GTX 970 in my machine I have no problems, so I can rule out the card itself being defective. It has passed the FurMark 1080p test OK multiple times.

I've run memtest for 3 passes with no errors. Used both the XMP profile and the mobo default RAM speeds. I'm using AMD's balanced power plan but with the PCI-E setting off. I've tried debug mode in the Nvidia Control Panel and Prefer Maximum Performance. Temps are fine according to all monitoring software.

I was thinking the 450W PSU could be an issue, but I believe it is a good PSU and the components shouldn't draw anywhere near its full capacity?
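The capacity question can be sanity-checked with a rough power budget. This is only a sketch: the 145 W (reference GTX 970) and 65 W (Ryzen 5 1500X) figures are the published TDPs, while the padding for the factory overclock and the allowances for the board, RAM, fans and drives are guesses rather than measurements. Treating the whole draw as 12V load overstates the current, so the amp figure is an upper bound to compare against the +12V rating printed on the PSU label.

```python
# Rough peak power budget. Every wattage below is an estimate (assumption).
ESTIMATED_PEAK_WATTS = {
    "GTX 970 (145 W reference TDP, padded for factory OC)": 170,
    "Ryzen 5 1500X (65 W TDP)": 65,
    "Motherboard, RAM, fans (guess)": 50,
    "2x HDD + 1x SSD (guess)": 20,
}

PSU_RATED_WATTS = 450  # Seasonic G Series 450 W, from the spec list


def power_budget(parts, psu_watts):
    """Sum the estimated draw and relate it to the PSU rating."""
    total = sum(parts.values())
    return {
        "total_watts": total,
        "load_fraction": total / psu_watts,
        # Upper bound: assumes the whole load sits on the 12V rail.
        "max_amps_on_12v": total / 12.0,
    }


budget = power_budget(ESTIMATED_PEAK_WATTS, PSU_RATED_WATTS)
print(f"Estimated peak draw: {budget['total_watts']} W")
print(f"Share of PSU rating: {budget['load_fraction']:.0%}")
print(f"12V current (upper bound): {budget['max_amps_on_12v']:.1f} A")
```

With these guesses the estimate comes to 305 W, roughly two thirds of the rating. That would suggest capacity alone is unlikely to be the problem, although it says nothing about whether this particular unit is faulty.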

Is there anything I could try to pinpoint the cause of these freezes?

Appreciated, Rob.

 

Sedivy

Estimable
https://forums.geforce.com/default/topic/389688/geforce-drivers/nvidia-statement-on-tdr-errors-display-driver-nvlddmkm-stopped-/

In particular this part:
Common issues that can cause a TDR:

- Incorrect memory timings or voltages
- Insufficient/problematic PSU
- Corrupt driver install
- Overheating
- Unstable overclocks (GPU or CPU)
- Incorrect MB voltages (generally NB/SB)
- Faulty graphics card
- A badly written driver or piece of software, but this is an unlikely cause in most cases
- Driver conflicts
- Another possibility that people tend not to like to hear, is that you are simply asking too much of your graphics card. What I mean by this, is that if you have your settings too high and the graphics card struggles and falls to very low FPS, then something graphically complex occurs, the GPU may not be able to respond and a TDR error may occur
- Some users have experienced TDR errors whilst browsing the web with the 280.xx, 285.xx and 290.xx drivers. Please head to this link to clarify if this is relevant to you - this is quite a specific issue which seems to predominantly affect web browsing as opposed to gaming. There are no categoric fixes but some users have found that changing the power management mode to 'Prefer Maximum Performance' has helped.

Examples of specific TDR causes:

- Conflict with Realtek drivers causing TDR errors
- Driver conflict with Logitech webcam drivers
- Unstable overclock on the graphics card
- Insufficient PSU
- RAM problems (faulty, badly seated or not configured correctly)
- Cleaning out dust resolved issue
- AMD/ATI cards also have TDR problems

Things to check or consider initially in your troubleshooting:

- Check for a newer driver version, or cleanly uninstall/re-install your drivers. Great description of how to do this here (full credit to DJNOOB for this).
- If you have multiple 'GPU tools' like EVGA Precision and MSI Afterburner installed, consider that it is only advisable to have one such tool installed at any one time.
- If the issue is only with a specific game, check for patches.
- If this is a new problem for you, have you just added any new hardware or updated/installed any new drivers? Consider rolling them back.
- Check temperatures. It's important you check these at load, which is generally when a TDR event will occur. Everest Ultimate Edition is a good tool for this, or OCCT's GPU stress test. If things are too hot, you can use tools such as EVGA Precision to increase GPU fan speeds on graphics cards. Cleaning your system of dust can help temperatures significantly. Common sense will normally tell you if something is too hot, but if you aren't sure, the information is generally available online.
- Check that your RAM is running at the correct settings as defined by the manufacturer.
- Remove any overclocks on your system and test with stock clocks. This includes memory, CPU and GPU (even factory OC'd cards). Best to try each separately so you can be sure if one solves the issue.
- Attempt a CMOS reset to return all BIOS settings to default. This is a good hardware troubleshooting step as it also resets the IRQ assignments - you can normally reset the CMOS either through a jumper on the motherboard (see manual), or by disconnecting the mains power and taking out the motherboard battery for 5 minutes. You will likely need to go in to the BIOS after this reset to check the memory timings/voltages are correct, as these will not always be set correctly automatically.

Additional steps:

- Run memtest (memtest.org). This should complete with NO errors.
- If you have just installed a new graphics card, check your PSU ratings. Is it providing enough power, and most importantly enough amps on the 12V rail?
- If you are using SLI, try each card separately to see if the fault lies with one.
- Try graphics card/cards in another computer if you can.

_________________________________________________________



As most people who end up reading this will have slightly custom computers in one way or the other, please try to remember that checking things like RAM timings & PSU voltage goes hand in hand with modifying or building a computer. A lot of people assume that any hardware they buy and plug in should just work, and any software they then install should be fine also... this is not entirely true. No hardware or software vendor can truly recreate all of the different possible combinations, so do expect some tinkering to be required every once in a while.



For those with laptops, I appreciate there are a lot of steps here you cannot complete. However, the confined space of a laptop plus dust and age can mean that overheating is a real possibility. Beyond this you need to look at reinstalling drivers and software, and then you should be looking at potential hardware issues and likely an RMA (assuming of course you have not been overclocking in software or making changes in the BIOS).



_________________________________________________________



Programs to use for stress testing CPU:

- Prime95 (would advise running for at least a few hours).

- IntelBurnTest (run at least a few passes)

- OCCT (good linpack test for CPU)



Programs to use for stress testing GPU:

- OCCT

- 3DMark Vantage

- 3DMark 11 (DX11 GPUs only)

- Any of the Crysis series



Programs to use for monitoring temperatures:

- EVGA Precision (GPU only)

- MSI Afterburner (GPU only)

- Everest Ultimate Edition (now known as AIDA 64)

- CoreTemp (CPU only)

- RealTemp (CPU only)

- OCCT (stress testing and temp monitoring)



I can highly recommend Everest/AIDA64 as this shows you ALL your temperatures, including other GPU components. It is however not free - you can download a trial but it has some functions limited (including some temperatures).



At the end of the day, try not to become too frustrated with the issue. Generally a solution can be found. There are a lot of topics you can look back on in relation to this issue, and a lot of good people around on this (and other) forums who are happy to help, assuming of course you are willing to take advice! People understand frustration, but they aren't going to help you if you are rude or abusive... an FYI for those who just want a slanging match!



If you post a topic regarding this issue, please state your system specs in as much detail as you can, plus anything you have tried so far. Feel free to PM me if you are having no luck and I will do my best to advise.



See the 'GeForce GTX & ION Drivers' forum section sticky for the official nVidia response on TDR errors.
 

Robmeng

Stress tested the PSU in OCCT for 15 minutes. Did the GPU and CPU tests too. Crashes are not 100% repeatable: sometimes she can play for hours, other times it will crash again and again.
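With crashes this intermittent, a short clean run after a change proves very little. As a rough sketch only, assuming crashes arrive at a constant average rate (a simple exponential model, which real hardware faults may not follow), the chance of surviving t hours by pure luck is exp(-t / mean), and that can be turned into a minimum soak time. The function name and the 5-hour figure in the usage line are illustrative.

```python
import math


def crash_free_hours_needed(mean_hours_between_crashes, confidence=0.95):
    """Crash-free hours needed before a clean run is unlikely to be luck.

    Assumes a constant average crash rate (exponential model), so the
    probability of seeing no crash in t hours by chance is exp(-t / mean).
    """
    if mean_hours_between_crashes <= 0:
        raise ValueError("mean_hours_between_crashes must be positive")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    return -mean_hours_between_crashes * math.log(1.0 - confidence)


# Illustrative: if it typically crashes about once every 5 hours of play.
hours = crash_free_hours_needed(5, confidence=0.95)
print(f"Run at least {hours:.1f} crash-free hours before trusting a fix")
```

As a rule of thumb this works out to about three times the usual gap between crashes, so 15 minutes of stress testing cannot rule anything in or out for a fault that shows up every few hours.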
 

Robmeng

She has just played Reign of Kings for about 5 hours, then it crashed. Occasionally, while watching a YouTube video, the whole screen will flash green for a split second.

The only stress test it has crashed in is FurMark, but only a couple of times; it has completed more FurMark runs than it has failed. I've had the GTX 970 in my PC for the last 24 hours with no issues. Her PC still crashes/freezes with my 980, so it's a software issue (hopefully).

My Seasonic 450W should be fine, right? It's not really a power-hungry system and I've tested the PSU in OCCT for 15+ minutes. Could it potentially be the RAM? That Reign of Kings freeze threw up no errors in Event Viewer.

I've just flashed to the very latest UEFI - 3.00.

Current Voltages
http://imgur.com/a/cLkq7
 

RCFProd

Expert
Ambassador
MERGED QUESTION
Question from Robmeng : "Seasonic G series 450W adequate?"





You are running an updated version of Windows? Can you detail this more? Was it an upgrade from Windows 8.1 to Windows 10?

When did the issues start?

Have you done a fresh install of Windows, then removed the graphics drivers with Display Driver Uninstaller (in safe mode), and then installed the latest graphics drivers?
 

Robmeng



Fresh Windows 10 install from USB. Issues since the start. Mobo updated, GPU drivers cleanly installed multiple times, different GPU used.

It mainly manifests in Team Fortress 2 (the game freezes completely, but I can easily get back to Windows via Task Manager). Event Viewer will sometimes log a GPU driver failure, sometimes it won't. Occasionally when viewing videos on YouTube the screen will flash green for a split second.

At this point I'm thinking power supply, CPU, mobo or RAM. Hopefully it's a software issue though.

I have more details in the OP

Thank you for replying.
 

Robmeng

Attempted to play TF2 about an hour ago. I loaded the XMP RAM profile in UEFI beforehand; the game played for a while, then BSOD (first time it's happened in TF2).

Currently playing Reign of Kings without issue (it has been running fine for days now tbh).

Some screens from HWiNFO while playing RoK:

http://imgur.com/a/v5xBb
 

mmmme

Prominent
Jul 24, 2017
1
0
510


I had the same issue with the ASRock A320M Pro4. 49 days later I was fed up and bought another mobo, an MSI B350M Mortar, and all my issues disappeared.
 

Sedivy

Estimable
You can test the memory with Memtest.
If it's the motherboard, you can try updating the BIOS (this may bork your Windows installation, so be ready with a Windows installation disc for repairs), but be careful when you do, as an unsuccessful BIOS update can brick your motherboard. If even that doesn't help, you can try Memtest on your memory, or install the memory in your other computer to test it there as well. If that still doesn't show anything, the GPU worked fine in another computer, and Prime95 runs OK and doesn't fail its tests some 10 minutes in, I'd say contact the mobo manufacturer and see if they can help you, or RMA.
 