16GB of RAM + 6950 xfire = BSOD? ...I have no idea.

dewittgarry

Distinguished
Dec 3, 2011
24
0
18,510
I have no idea what is causing this.

Story time.

I got my RAM RMA from PNY back today, stuck the two sticks in my computer, and went about my business. 2 minutes into playing Skyrim it froze.

reboot.

this time I load a different save in a city. 2 seconds in it freezes.

reboot.

boot into UBCD, run memtest for 30 mins
No errors

reboot

Try a few other games with no problems, Skyrim still gives trouble. Run FurMark, IntelBurnTest, and finally memtest again, nothing. Figure Bethesda messed up something with the latest patch and go play WoW. 2 hours in, BSOD, says the display driver didn't recover within the timeout period.

Take out the new RAM, reboot.

Skyrim freezes randomly, but then recovers after a bit. This has now stopped after a reboot.


Now for the really weird bit. I've isolated what causes it, but I have no idea what is causing the cause:
I discovered it only happens with crossfire enabled AND 16GB of RAM installed. Going back down to 12GB (2x2GB and 2x4GB) shows no issues, and neither does running with 16GB and crossfire disabled.
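
For anyone following along, the isolation above is basically a small test matrix. Here is a minimal Python sketch of how the combinations could be tracked. It is purely illustrative: the list of RAM sizes and the names `build_test_matrix` and `failing_configs` are made up for this example, and only the three results reported so far are filled in.

```python
from itertools import product

# Assumption: these are the RAM totals worth trying; only 12 and 16 are from the post.
RAM_CONFIGS_GB = [4, 8, 12, 16]
CROSSFIRE_STATES = [False, True]


def build_test_matrix(ram_configs, crossfire_states):
    """Return every (ram_gb, crossfire) combination with result None (untested)."""
    return {(ram, xf): None for ram, xf in product(ram_configs, crossfire_states)}


def failing_configs(matrix):
    """List the combinations that were tested and found unstable."""
    return sorted(cfg for cfg, stable in matrix.items() if stable is False)


matrix = build_test_matrix(RAM_CONFIGS_GB, CROSSFIRE_STATES)

# Results reported so far (True = stable, False = freezes/BSOD).
matrix[(16, True)] = False   # 16GB + crossfire: crashes
matrix[(16, False)] = True   # 16GB, crossfire disabled: fine
matrix[(12, True)] = True    # 12GB + crossfire: fine

print(failing_configs(matrix))
```

Untested cells stay `None`, which makes it easy to see which combinations still need a run before drawing conclusions.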


Specs:
AMD Phenom II x4 955 @ 4.0GHz
12GB 1333MHz
Two 6950s running @ stock settings, second one has 6970 shaders unlocked.
750W Corsair PSU
ASUS M4A79XTD EVO


Any ideas?
 

dewittgarry

Update: reset everything in BIOS to default values and booted up the second card with the second BIOS switch (with locked shaders).

everything is k. (by "k" I mean I ran FurMark and IntelBurnTest with crossfire enabled and the 16 gigs in, followed by 10 minutes of running around Skyrim)
unlocked the 2 extra cores on my CPU (it's actually a phenom 2 x2) for that 4 core goodness.
everything is k.
overclocked the CPU to 4.0GHz
everything is k in furmark and intelburntest. skyrim causes a driver crash (I have tried the 12.1 and 12.3 pre certified drivers) when tabbing back into game once. To be fair I left skyrim open longer than the other tries and opened up chrome to type this out during that time, so it could have happened under the other tests had I done the same perhaps. No other games show instability, but no other games stress the system like skyrim atm (was running tweaked LoD settings and HD textures, atm reverted to regular).
Going to continue turning things back up (next is 2800 CPU-NB) to see how it goes.

edit: sigh, never mind. Re-enabling the HD textures causes it to freeze rather quickly. In retrospect it makes sense: disabling the HD textures would bring the stress level down to around what WoW or a similar game would run, and it took nearly two hours for a problem to arise while playing those. Back to square one, it would seem.
 

dewittgarry

/sigh.

1.64 was fine for 12 hours but then wouldn't let me boot.
1.56 is working fine for now.
There's no problem with it normally...but there's one thing that still causes "the issue":
I'm using a program called RamDisk+ to create an 8GB RAMDISK and putting Skyrim in that (dem load times) for lols. This still causes the issue, although I suppose there's a chance it's unrelated and down to either the program itself or the idea of trying to copy and run the whole game out of the RAMDISK. Perhaps I should make a 4GB one and just do the texture files.
 

dewittgarry

I give up, I have no idea what's wrong and have done everything I can think of.

I've swapped the cards. Tried running only one card (did this with both cards). Undid every OC. Overvolted things at stock settings (really running out of ideas at this point) and finally measured the voltage coming off of the 12V rail under full load (11.95V if anyone cares). I've even installed Windows on a separate partition to verify that it wasn't something unique to this install.
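
As a quick sanity check on that 11.95 reading, here is a rough Python sketch that compares a measured rail voltage against a tolerance band. It assumes the usual ±5% ATX tolerance on the 12V rail; the function name `within_tolerance` is just for illustration.

```python
def within_tolerance(measured_v, nominal_v=12.0, tolerance=0.05):
    """Return True if measured_v lies within nominal_v +/- tolerance (as a fraction).

    Assumption: tolerance=0.05 reflects the usual +/-5% ATX band for the 12V rail.
    """
    low = nominal_v * (1 - tolerance)
    high = nominal_v * (1 + tolerance)
    return low <= measured_v <= high


# The full-load reading from the post above.
print(within_tolerance(11.95))
```

Under that assumption the band is roughly 11.4V to 12.6V, so 11.95 sits comfortably inside it. A multimeter reading like this is only an average, though, and would not show ripple or brief dips.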

Then there's the trouble with reproducing it, oh god. First I thought 3DMark06 could do it, but after trying all of the above and still getting errors, I looked up the particular message it was giving me (something about device lost, which would occur after the driver dropped out and reset) and apparently it's a well known issue and not indicative of any problem.

Currently back at what I had before this whole thing started, and so far it's working fine... Prime95 + FurMark for 30 minutes with no problems. But I know that sometime tomorrow when I try and play a game, it'll happen. I swear it's like my computer is toying with me...


Ah yes, it's also not just with the 16GB anymore, happens with 8 and 12 as well, hooray!
 

dewittgarry

Mhm.

The most agonizing part of this is that technically all my testing with 3DMark06 could mean nothing, as other people with no problems report having the device lost issue as well, so I guess I have no idea.


One more observation: it seems to be related to how long the system has been up. When I came back to check things after a good 9 hours of leaving the system on idle, it started showing some weird artifacts in skyrim briefly before the driver crashed, a restart later with the same settings and it's fine. Don't know if that helps any.
 

Tavo_Nova

Distinguished
Dec 31, 2011
1,159
0
19,360
well it might be heat issues like maybe your case doesn't have good air flow

but if it's not that disregard my post above

how about drivers? try updating to the latest if not try going down a notch and see if that works fine.

I'm thinking maybe you've got PSU problems, with your Corsair PSU not giving enough power in the long run. It's possible, since you did say that lots of this stuff happened after xx hours of testing, so it could be that too.


 

dewittgarry



Unlikely to be a heat problem, as no temp sensors report anything unusual. I doubt it's the PSU, as in these instances it's not like I left it testing for 8 hours, I left it on idle. BUT what I'll do is in 8 hours (just restarted it again) I'll run FurMark and P95 again while testing what I can with my little meter thing.

In addition I'll move over to my backup windows installation after tonight and let it sit for another 8 hours and see, although I don't see why a driver would become unstable over time, but I'm out of ideas so why not.
 

Tavo_Nova

If you say it's not a PSU problem then we'll take the PSU off the suspect list.

do you use a UPS?

I suspect it may be a program-related issue. I just remembered I sometimes get similar problems on my old Dell desktop (still working today): when I browse the net for some time and then open up some games like Red Alert 3, they won't work and show the same issues as yours. But moving on.

Maybe you've got bad RAM? I had a situation before with my Phenom II X6 1090T and 16GB of RAM; I just got my G.Skill RAM replaced and things work nicely.

Also maybe it's your board. If you've got dust collected in the slots, try blowing it away with an air blower or similar; some people have had that problem before with dust in the slots.
 

dewittgarry

1. No, I use USPS :p. I've heard of it but I don't know what it does nor do I have one.

2. If that's the issue then hopefully moving over to my fresh backup install of Win7 should help, we'll see.

3. On the RAM note, I've tried the new PNY RAM I bought while I was waiting for the RMA to fill, and the RMA kit that I received. They passed memtest for a couple of hours, but perhaps if I leave them testing for an extended period they'll fail? Although it seems unlikely that both kits would be defective in the exact same fail-after-a-certain-amount-of-time way. Right now I'm running on my older G.Skill 2x2 set, we'll see (I'm going to be saying that a lot, I think).

Will try making sure my board is dust free as well.


This'll all take probably a week but hopefully I'll be able to find the culprit and start an RMA (if necessary).

Thanks for the suggestions.
 

Tavo_Nova

Well I don't know what a USPS is, but since you're not using a UPS or AVR it couldn't be those, so I'll put them out of the picture for now.

If nothing else works, I suggest trying a fresh backup install of Win 7, of course making sure you've backed up your games and other programs.

We need to find the culprit soon so you can RMA anything that needs it, especially parts that are close to the end of their warranty. If they're all still mid-warranty, all the more reason to get it done.

Let me ask: is your PC connected directly to the wall, without any extenders or something along that line?
 

dewittgarry

Was a joke
UPS = United parcel service I think
USPS = United states postal service
mail and stuff and whatnot.

Hm, my PC is going through a very old (think 90's) uh... I don't know what it is, I think it's an overgrown surge protector, has buttons on the front for different components etc etc. Should I just plug it directly into the wall?
 

dewittgarry

Well, after some time (12 hours?) I shut down (no problems) and flipped the BIOS switch on my second 6950 to unlock those shaders again and "re"overclocked the CPU to 4GHz. Still running with the 2 x 2GB good ole G.Skill RAM. Roughly 8 hours later and still no problems. I feel like I'm jinxing myself here, but I guess this would narrow it down to:

A: both 8GB kits of PNY memory being bad (after another day of no problems with the G.Skill I will put both kits in and let memtest have its way with them for 12 hours or so)
B: my 790X AM3 motherboard starting to fail (thoughts/how plausible is this? It ran fine with 12GB for 6 months before this, all that changed was another 8GB kit of the same model)
C: my Windows 7 installation is somehow borked so that it gives problems with anything more than 4GB of memory...somehow.


Edit: something I did notice in the BIOS a while back and neglected to mention: with all four sticks in (doesn't matter if the G.Skill kit is in there or not), the memory timings option only shows "DIMM" 1 & 2 on the motherboard (I have four slots). Not sure if this is intentional or indicative of something awry. Running the latest BIOS from ASUS.