Please help! Freezing while rendering - trying to identify problematic hardware!

Status
Not open for further replies.

marfin

Honorable
Dec 17, 2013
6
0
10,510
Hi guys :)

My PC often crashes while rendering frames in 3ds Max 2013 x64 with VRay 2.4... Normally, if I leave it to render overnight, it crashes a few hours in. It just freezes and locks up completely (the display stays frozen as it was at the time of the crash). Looking at the logs, this often happens at the start of a new frame - somewhere between the GI (light cache) pass and the main render. It seems to have no relation to how intense the scene is. Sometimes it crashes on a 1 min/frame job, sometimes on a 30 min/frame one.


I have tested with just one half of the RAM chips, and then with the other half - it still doesn't help, so I doubt it's that... (there could be a faulty chip in both batches, but I doubt it). I also ran memtest86, which found no errors. RAM is not maxing out in any case - right now it's rendering and using 5GB / 32GB.


CPU temps seem normal - around 50°C on full load - it has a be quiet! Dark Rock Pro CPU cooler, so temps shouldn't be the cause...


Could it be GPU based? I find that option hard to believe as I'm rendering on the CPU, although I do have visual frame buffers (display of what I'm rendering) enabled...
NB: When I first had rendering problems on this PC a few months ago, I got BSODs, but with no sys driver information in the BSOD dumps... Trying everything, I rolled back to an old version of the graphics driver (the one that shipped with the GPU) and it seemed to have stopped it. No way of telling, maybe I just got lucky and had no crashes while I tested. Since then I have updated to the latest drivers from Nvidia's site, so maybe these things are related?

Could it be SATA / AHCI / storage controller related, when it tries to read files? (Some textures used in the renderings are on an external USB 3.0 drive.)
I ask this because I sometimes get exactly the same system freezes when copying large files (2GB+) from another (different) external drive via eSATA, using the ASMedia 106 eSATA controller. Could it be related, perhaps?
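
One way to exercise the storage path on its own, without 3ds Max involved, is to read a large file from the external drive end to end and hash it. This is only a sketch: the script name and the idea of comparing two reads are illustrative assumptions, not anything from the thread, and the second read may be served from the OS file cache rather than the drive. If the machine freezes during the read, that points at the controller or drive rather than the renderer.

```python
import hashlib
import sys


def hash_file(path, chunk_size=8 * 1024 * 1024):
    """Read the whole file in chunks and return its SHA-256 hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            digest.update(chunk)
    return digest.hexdigest()


def read_twice(path):
    """Hash the file twice; differing digests suggest a flaky read path."""
    first = hash_file(path)
    second = hash_file(path)
    return first == second, first


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical script name): python read_test.py <large file>
    consistent, digest = read_twice(sys.argv[1])
    print("consistent" if consistent else "MISMATCH", digest)
```

Running it against a 2GB+ file on each external drive in turn would show whether plain reads alone can trigger the freeze.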


Specs: P9X79 Pro mobo, i7-4960X, 64GB RAM (now using 32GB), Palit Nvidia GeForce GTX 770 4GB, Corsair 650W TXM650 PSU

I never have any other problems with the PC - only this...

Please help! Any advice appreciated!

 

gaymer1984

Honorable
Oct 20, 2013
43
0
10,560
It seems like you're reasonably proficient, given the memory testing you have done and the theories you are forming as to why this is happening. The problem you describe seems to be caused by one specific application doing one specific task - that makes it much harder to pin down the exact cause, because it could (possibly) be anything in the CPU-GPU-driver area.

As far as specific suggestions go, the first thing I would try is an alternate renderer. I appreciate this software isn't exactly cheap, so maybe try a free trial of something like Maya with your imported scene. If you can render overnight with that, then it's a 3DS Max issue. Even though it's unlikely, that should be your first step before you think about buying new hardware (as it sounds like this driver issue is the probable cause).
 

marfin

@swifty_morgan
Via internal sata that drive works perfectly - no problems at all.

But that drive isn't used at all during rendering - it has no files that are accessed by 3ds max, or anything. (there are two external drives: one eSata, one USB 3. Only the USB drive has files used during the render) Do you still think it could be related? I only mentioned it because the symptoms I get with both the render crashes and eSATA transfer crashes are similar.


@gaymer1984
I had the same thing in Max 2012 (previous version). Regarding Maya - it's not so much a cost issue (that too), but more the problem of workflow and relearning software just to get around a hardware problem...

Regarding "CPU-GPU-driver area" - do you think it could realistically be caused by the GPU? That's what I'm confused about. I always thought that CPU rendering was in no way related to the GPU...

When I'm actively working in 3ds max, modelling, etc (GPU based work) I have no issues at all. It only crashes when I am rendering (CPU based work) - I am not even touching the PC and the monitors are switched off...
 

gaymer1984



My concern is how much of the rendering load has been offloaded onto the GPU via some kind of hardware acceleration. Is there an option in 3DS Max to disable this? If there is, and you disable it and then render without problems, you will have found the cause.

The problem is if that isn't the problem. Replacing the CPU is unrealistic, especially if it's working apart from this one task. (CPUs are devices that don't tend to soft-fail; they either work or they don't.)

When you are rendering are you using files that are being accessed via your external hard drive?
 

marfin

It uses GPU acceleration in the 3d viewport (when I'm navigating the 3d scene, during modelling, texturing, etc) - but not while rendering. When it's rendering, all interfaces and navigation in 3ds max are disabled. AFAIK, during render it is entirely CPU-based (other than the actual on-screen display of what it's rendering, of course).

Good to know about the CPU - thanks. Perhaps I will borrow a graphics card and run some tests.

I access files on the USB drive - just some texture libraries which are loaded by 3ds max. But not the eSATA drive - that one is completely unrelated...

As a guess, could the PSU not be providing enough power, perhaps?
 

gaymer1984

It's possible, but eSATA external drives don't draw that much power... To test, I'd disconnect everything except your main hard drive and your graphics card and try again. Give the PSU as much headroom as possible.

If you find you don't have issues, then you've found your problem. But to be honest, if it is the PSU not delivering enough power, it is something you need to rectify sharpish.
 

marfin

Tried without the drives last night and it still crashed, so I guess that is unrelated.

How can I know if I am getting enough power from my PSU? I have 650W.

Will try a reformat and complete stock drivers reinstall over the weekend...
 

gaymer1984



Well, with the eSATA and other external drives disconnected, your computer drew slightly less power, so it shouldn't still have crashed if this were a power supply issue.
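
A rough power budget is one way to sanity-check the 650W question. The sketch below is back-of-envelope arithmetic only: the CPU and GPU figures are the vendors' rated TDP / board power, while the motherboard, RAM, drive and fan numbers are ballpark assumptions rather than measurements.

```python
# Rough power-budget sketch. Only the CPU and GPU figures are vendor ratings;
# the other entries are assumed ballpark values, not measurements.
ESTIMATED_DRAW_WATTS = {
    "i7-4960X (rated TDP)": 130,
    "GTX 770 (rated board power)": 230,
    "motherboard + RAM (assumed)": 60,
    "drives, fans, USB (assumed)": 40,
}

PSU_RATED_WATTS = 650


def headroom(psu_watts, draws):
    """Return (total estimated draw, spare watts, load as a fraction)."""
    total = sum(draws.values())
    return total, psu_watts - total, total / psu_watts


total, spare, load = headroom(PSU_RATED_WATTS, ESTIMATED_DRAW_WATTS)
print(f"Estimated peak draw: {total} W")
print(f"Headroom on a {PSU_RATED_WATTS} W unit: {spare} W ({load:.0%} load)")
```

During a CPU-only render the GPU sits far below its rated figure, so the real draw should be lower than this worst case. A wall-socket power meter would give an actual reading instead of an estimate.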

I have occasionally come across specific pieces of hardware being incompatible for whatever reason (I'm thinking GPU and motherboard), but not recently enough to seriously believe that is what is happening to you. I think from this point forward, to help you, we should make a list of the possible causes and work out ways for you to test each of them independently.

1. CPU fault
2. Memory fault
3. North/Southbridge fault
4. GPU Fault
5. Hard disk access fault
6. External device fault (solved)
7. PSU fault
8. Driver fault
9. Operating system fault
10. Expansion bus fault
11. BIOS configuration fault

(If anyone else reads this thread and can suggest additional possible causes, let's add them to the list and get marfin's problem sorted.)

Ways to test:

1. Short of swapping another CPU into the same computer, this is unlikely to be confirmed conclusively. I'm guessing you don't have a spare CPU for the same socket that you could just drop in and test.

2. You've already done a memtest, but that is not the only potential problem with the memory. It's possible that its timings are set up incorrectly in the BIOS (RAS to CAS etc) and that may be causing data errors. Again, unlikely.

3. This may be down to drivers for the specific motherboard. Some motherboards are freakin' picky when it comes to the driver revision used to drive their components; others are less finicky. Something else to try.

4. Similar to CPU, you'll need another GPU to drop in to test to see whether this is the cause. Reinstalling drivers will only get you so far if we are talking about a slightly damaged capacitor that only causes errors after x amount of time of intensive use. Same goes for your motherboard.

5. Swapping out the hard drive will only get you so far here, there's nothing to say that the controller/southbridge hasn't soft-failed. Again, one of the only ways you'll know conclusively if this is the cause is to swap out the motherboard.

6. At least you've eliminated this.

7. Given you've run your PC drawing less power, I'm less convinced of the likelihood of this now, but it's worth testing with the GPU and one hard drive being the only devices connected.

8. It sounds like you have the majority of these already covered, but it may well be worth scouring the internet, making a list of every single device driver for your current setup, and searching to see whether other people have experienced similar issues with each specific driver.

9. One of the easiest to test. It's feasible that Windows is somehow mismanaging the pagefile and that is what is causing your problem, or it could be an obscure Windows service running in the background. Reinstalling should eliminate this as a cause.

10. You would be able to eliminate this if you only have a GPU connected to the slots. If you have a soundcard or anything else, remove it and try again.

11. There may be an obscure setting hidden somewhere in your BIOS that might affect this. Stranger things have happened to me.

This list is intended to be an exhaustive list of possible causes, not probable ones. It's important to make that distinction. Again, I invite others to expand the list and suggest alternate causes.
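
For anyone following along, the elimination process above can be tracked with a small checklist. This is just an illustrative sketch: the `Cause` class, the status labels and the helper names are made up for the example, and only the eleven cause names and the status of No. 6 come from the list.

```python
from dataclasses import dataclass


@dataclass
class Cause:
    number: int
    name: str
    status: str = "untested"  # one of: untested, suspect, eliminated
    note: str = ""


NAMES = [
    "CPU fault", "Memory fault", "North/Southbridge fault", "GPU fault",
    "Hard disk access fault", "External device fault", "PSU fault",
    "Driver fault", "Operating system fault", "Expansion bus fault",
    "BIOS configuration fault",
]
causes = [Cause(i, name) for i, name in enumerate(NAMES, start=1)]


def mark(number, status, note=""):
    """Record the outcome of a test for one numbered cause."""
    cause = causes[number - 1]
    cause.status, cause.note = status, note


def still_open():
    """Causes that have not been eliminated yet."""
    return [c for c in causes if c.status != "eliminated"]


# From the thread: it still crashed with the external drives unplugged.
mark(6, "eliminated", "crashed with external drives disconnected")

for cause in still_open():
    print(f"{cause.number:>2}. {cause.name}: {cause.status}")
```

Changing one thing per overnight render and recording the result keeps each test independent, which is the point of the list.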
 

marfin

Thanks a lot, really appreciated. super useful :)

I will double-check No. 2 and No. 11 in the BIOS configuration just to be sure, but I think I have reset it to default (unless it should be non-default?).
On Friday I'll reformat and reinstall all stock drivers from the CD that came with the motherboard. (At the moment I have the latest drivers for individual components from their respective websites...) I have a suspicion that it could be a chipset driver failure in either the north/southbridge or the storage controllers... Hopefully this will test Nos. 3, 8 and 9.


No. 1 - unrealistic for now, will leave CPU and motherboard swap until last...

No. 4 - I will borrow a GPU from my work, a really simple server one, just enough to connect my monitor.

No. 5 - same as 1.

No. 7 - Will test this too. (although that only means disconnecting a hard drive and a DVD drive, so I doubt it is that...)

No. 10 - already eliminated as I only have a GPU in my slots.




PS. This is my rendering log from 3ds Max Vray. The last line "Setting up 12 thread(s)" is true for every crash - whenever it fails, it always fails on this step. (Although it does it successfully for many frames before that)

[2013/D/18|02:59:15] Threads completed
[2013/D/18|02:59:15] Merging light cache passes.
[2013/D/18|02:59:15] Light cache contains 3188 samples.
[2013/D/18|02:59:15] Light cache samples collected; compiling lookup tree.
[2013/D/18|02:59:15] Prefiltering light cache.
[2013/D/18|02:59:15] Rendering image.
[2013/D/18|02:59:15] Setting up 12 thread(s)

Do you think this is relevant for diagnosing my problem? Does it mean that the problem is in the CPU or chipset? Or is it just coincidental with the render start time?
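
To check whether every crash really stops on the same step, the last timestamped line of each saved log can be pulled out and tallied. A minimal sketch, assuming the log lines keep the `[date|time] message` shape shown above; the function names are made up for the example, and the sample text is copied from the log excerpt.

```python
import re
from collections import Counter

# Matches lines such as "[2013/D/18|02:59:15] Rendering image."
LOG_LINE = re.compile(
    r"^\[(?P<date>[^|\]]+)\|(?P<time>[\d:]+)\]\s*(?P<message>.*)$"
)


def last_message(log_text):
    """Return (date, time, message) of the final timestamped line, or None."""
    last = None
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match:
            last = (match["date"], match["time"], match["message"])
    return last


def tally_last_steps(logs):
    """Count which message each log in a list of log texts ended on."""
    found = (last_message(text) for text in logs)
    return Counter(entry[2] for entry in found if entry is not None)


sample = """\
[2013/D/18|02:59:15] Prefiltering light cache.
[2013/D/18|02:59:15] Rendering image.
[2013/D/18|02:59:15] Setting up 12 thread(s)
"""

print(last_message(sample))
print(tally_last_steps([sample, sample]))
```

If the tally over many crash logs shows only the thread setup step, the timing is unlikely to be a coincidence, although it still doesn't say which component is at fault.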


Thanks again for your time and effort, it's much appreciated.
 

gaymer1984

Do I think it could feasibly be an issue with the CPU? It's possible, but task hand-off is accomplished automatically within the bounds of the CPU itself. That means that to the operating system it appears as a 12-core CPU (the i7-4960X has 6 physical cores and 12 threads) - and any issues to do with the threading are going to be down to the CPU mismanaging it (which simply doesn't happen unless it's failed somehow). So it's kind of a moot point.

I would be really interested to see whether you could load up prime95, or some other kind of testing program that would tax all 12 threads for, say, 12 hours, and see whether it produces errors.
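
As a very rough stand-in for that kind of test, the sketch below loads every logical CPU with the same deterministic job and checks that all workers return identical results. It is not a substitute for prime95 (it stresses far less of the CPU and memory), and the script name, function names and round counts are arbitrary choices made for illustration.

```python
import hashlib
import multiprocessing as mp
import os
import sys


def burn(rounds):
    """Deterministic CPU-bound work: repeatedly hash a fixed seed."""
    data = b"render-stress"
    for _ in range(rounds):
        data = hashlib.sha256(data).digest()
    return data.hex()


def run_pass(workers, rounds):
    """Run the same job on every worker; all results must be identical."""
    with mp.Pool(workers) as pool:
        results = pool.map(burn, [rounds] * workers)
    return len(set(results)) == 1


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical script name): python cpu_soak.py <number of passes>
    workers = os.cpu_count() or 1
    for pass_number in range(1, int(sys.argv[1]) + 1):
        ok = run_pass(workers, 2_000_000)
        print(f"pass {pass_number}: {'ok' if ok else 'MISMATCH'}")
```

A mismatch or a freeze here, with no renderer and no external drives involved, would shift suspicion towards the CPU, memory or power delivery.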
 

marfin

Ok, so I reformatted and installed only stock drivers from Asus's (mobo manufacturer's) website. I left it rendering overnight for 6 hours and it didn't crash... Need to do more testing but looking good so far... :)
 

gaymer1984

That's great, I hope this sorts it out. Let me know how the testing goes. If it ends up being chipset drivers, then it's obvious your northbridge is the issue and you'll need to stick with those specific drivers for as long as you have the computer.
 