Symphony of Blue Screens (now in Stereo!)

docbadwrench

Distinguished
Apr 22, 2008
71
0
18,630
Greetings all,

I have previously posted this thread in the hopes of determining some kind of solution. Rather than dump a bunch of cryptic data on the community, I thought I would provide a short list of the most recent BSODs that I suffered. What follows is the single line from each minidump that purports to identify the likely culprit.

Probably caused by : processr.sys ( processr!AcpiC1Idle+12 )
Probably caused by : AmdK8.sys ( AmdK8+3552 )
Probably caused by : nv4_disp.dll ( nv4_disp!ReadRegistryD3D+884 )
Probably caused by : ntkrnlpa.exe ( nt!KiTimerExpiration+96 )
Probably caused by : ntkrnlpa.exe ( nt!KiTrap0E+243 )
Probably caused by : AmdPPM.sys ( AmdPPM+2832 )
Probably caused by : ntkrnlpa.exe ( nt!KiTimerExpiration+96 )
Probably caused by : AmdPPM.sys ( AmdPPM+2832 )
Probably caused by : ntkrnlpa.exe ( nt!KiTimerExpiration+96 )
Probably caused by : AmdPPM.sys ( AmdPPM+2832 )

I can't tell you how many things I have tried, from detailed HD integrity scans to overnight MemTests. I'm almost to the point of reinstalling Windows for about the third or fourth time in the past 12 months. That alone tells me that some piece of hardware is the problem. Of course, I don't have a few hundred dollars available to bring it to a techie for a detailed analysis. Any thoughts based on the above information?

My system specs are:

Computer Type: ACPI Uniprocessor PC
Operating System: Microsoft Windows XP Professional
OS Service Pack: Service Pack 3
Internet Explorer: 8.0.6001.18702
DirectX: 4.09.00.0904 (DirectX 9.0c)
Computer Name: BROMELIAD (Primary PC)
Date / Time: 2009-06-09 / 10:10

CPU Type: AMD Athlon 64 FX-55, 2600 MHz (13 x 200)
Motherboard Name: Biostar NF4 Ultra-A9A (4 PCI, 2 PCI-E x1, 1 PCI-E x16, 4 DDR DIMM, Audio, LAN)
Motherboard Chipset: nVIDIA nForce4 Ultra, AMD Hammer
System Memory: 2048 MB (PC3200 DDR SDRAM)
DIMM1-thru-4: Corsair XMS CMX512-3200XL 512 MB PC3200 DDR SDRAM (2.0-2-2-5 @ 200 MHz)
BIOS Type: Award (10/31/06)
Communication Port: Communications Port (COM1)
Communication Port: Printer Port (LPT1)

Video Adapter: NVIDIA GeForce 9600 GT (512 MB)
3D Accelerator: nVIDIA GeForce 9600 GT
Monitor: Samsung SyncMaster 2253BW/2253LW/MagicSyncMaster CX2253BW (Digital) [22" LCD] (HVLQ301334)

Audio Adapter: Realtek ALC655 @ nVIDIA nForce4 (CK8-04) - Audio Codec Interface

IDE Controller: NVIDIA nForce Serial ATA Controller
IDE Controller: NVIDIA nForce4 Parallel ATA Controller
 

docbadwrench

I gave up on playing techie, so I reinstalled Windows. During the process of bringing Windows up to date I got some missing DLL errors. Shortly after this, I installed a piece of software and then found that it would not run. When I went into Add or Remove Programs, I found that the software was installed, but that it took up 0.0 MB of space. Aha! I thought. The hard drive's bad - problem solved!

After a shutdown, I pulled the hard drive out and swapped it with another one. Over the course of the evening, I reinstalled Windows again, this time on the new drive. Everything went smoothly and I had hope that I'd get an install of XP that might last a few months.

A day later it blue screened on me. No game was being played - the computer was just sitting on the Windows desktop. So here I am again - stuck not knowing the cause.

There were remarkably few dustbunnies when I just looked at it, and none of those were located on any important parts. That said, I gave the insides a good air-pressure cleaning. I removed, sprayed, and replaced the RAM, ensured that there was no dust lingering about the heat sink, and pulled the dust off the bundled wires.

Potential causal agents

RAM
I don't have any RAM to swap in, but I have pretty much confirmed that it's not the problem. Previously, I removed two sticks (I have four 512MB sticks of Corsair RAM) and still got a blue screen. While I haven't yet tried the remaining sticks on their own to see whether I still get a blue screen, I have run Memtest86 four times over the past month. Each time, it ran through the whole battery of tests with no errors.

CPU
I ran a stress test of the CPU using "CPUStabTest.exe". It's old, but I got it from Major Geeks. I can't seem to find any cutting-edge CPU testing software, so if any of you have an idea, let me know.

Power Supply
I have a 510-watt power supply. A short time after the last time that I had problems, I purchased an Antec 550-watt True PSU (for a different computer). However, the mobo connectors on that Antec do not work with my Biostar mobo, so I'm stuck with what I have. In the following picture, you can see the input/output information. I haven't a clue what it all means, but maybe some of you do.

2868605759_e69da4b265.jpg


Installed hardware elements

■ A CD/DVD drive (which I seldom use)
■ GeForce 9600 GT video card
■ Three hard drives (Windows drive, Games drive, and Media drive)

I realize that three drives is a bit much, but I don't have anything more installed. All other elements (such as sound and LAN) are on the motherboard.

The big problem that I'm having is that I'm not a trained system builder. While I can mix and match parts to build a system, I have never used a multimeter and can't do any electrical testing whatsoever. I want to determine the problem, but money is a very large concern of mine right now (as it is for so many of us).
 

docbadwrench

I have just run a series of tests with Prime95. Since I have dual-channel RAM, I have to try two sticks (out of four) at a time. I have labeled them and have been swapping them out.

For the purposes of nailing down the problem, I have been running the Torture Test with the Blended setting. Each RAM stick has been given the name A, B, C or D.

A + B = Error after a minute and a half
C + D = No error after ten minutes
A + D = Error after a minute and a half
B + C = No error after ten minutes

Based on this series of tests, Stick A must be the bad RAM stick.
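
The elimination logic behind that conclusion can be sketched in a few lines of Python. This is only an illustration of the reasoning, assuming that exactly one stick is at fault and that a clean ten-minute run really clears both sticks in a pair (which is a weak assumption for such a short test). The names `results`, `cleared`, and `suspects` are made up for the example.

```python
# Pair test results from the Prime95 runs above: True means an error occurred.
results = {
    ("A", "B"): True,
    ("C", "D"): False,
    ("A", "D"): True,
    ("B", "C"): False,
}

# Assumption: a pair that passes clears both of its sticks.
cleared = set()
for pair, failed in results.items():
    if not failed:
        cleared.update(pair)

# Assumption: a single bad stick must be present in every failing pair.
failing_pairs = [set(pair) for pair, failed in results.items() if failed]
in_every_failure = set.intersection(*failing_pairs)

# Whatever appears in every failure and was never cleared is the suspect.
suspects = in_every_failure - cleared
print(sorted(suspects))
```

If the suspect set ever comes out empty or holds more than one stick, the single-bad-stick assumption does not hold and something other than one DIMM is likely involved.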

When I purchase replacement RAM, I assume I must get RAM that is the same speed as the rest of the RAM.

My type: Corsair XMS CMX512-3200XL 512 MB PC3200 DDR SDRAM (2.0-2-2-5 @ 200 MHz)

I am concerned that I won't be able to confirm that I'm getting the exact same RAM as what I have. Is the relevant identifier "CMX512-3200XL"? Basically, if I buy (two sticks of) RAM that have that code, am I safe?
 

docbadwrench

Just when I thought I had the problem nailed down, I came home to find that my computer had crashed no fewer than eight times today. Removing the (allegedly) bad RAM did nothing. After re-running the Prime95 tests, I have found that every single combination of two 512MB RAM sticks results in errors. However, I've now run more types of tests and found something interesting.

Test Type:

Torture Test: Small FFTs (maximum FPU stress, data fits in L2 cache, RAM not tested much)

Test Result:

Test 1, 800000 Lucas-Lehmer iterations of M172031 using FFT length 8K.
FATAL ERROR: Rounding was 0.5, expected less than 0.4
Hardware failure detected, consult stress.txt file.
Torture Test completed 0 tests in 2 minutes - 1 errors, 0 warnings.
Worker stopped.
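
For anyone collecting several of these logs, a small script can pull out the failing lines. This is a minimal sketch, assuming the output has been saved as text with the same wording as above. The `log` string is copied from this post, and the two patterns are my own guess at the format, not anything Prime95 documents.

```python
import re

log = """Test 1, 800000 Lucas-Lehmer iterations of M172031 using FFT length 8K.
FATAL ERROR: Rounding was 0.5, expected less than 0.4
Hardware failure detected, consult stress.txt file.
Torture Test completed 0 tests in 2 minutes - 1 errors, 0 warnings.
Worker stopped."""

# Hypothetical patterns matching the lines as they appear in this post.
rounding = re.compile(r"FATAL ERROR: Rounding was ([\d.]+), expected less than ([\d.]+)")
summary = re.compile(r"completed (\d+) tests in (\d+) minutes - (\d+) errors, (\d+) warnings")

# Collect (observed rounding, allowed limit) for every fatal rounding error.
rounding_errors = []
for line in log.splitlines():
    m = rounding.search(line)
    if m:
        rounding_errors.append((float(m.group(1)), float(m.group(2))))

# Pull the totals out of the summary line.
s = summary.search(log)
tests_done, minutes, errors, warns = (int(x) for x in s.groups())

print(rounding_errors)
print(tests_done, minutes, errors, warns)
```

Since this test is described as keeping its data in the L2 cache and not exercising RAM much, a failure here is at least a hint to look beyond the memory sticks.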

In searching for solutions, I came across the following post from the Unreal Tournament forums. It appears related to what I'm dealing with.

From this page:
http://utforums.epicgames.com/showthread.php?p=4929296
(Edited for a few spelling & grammar issues)

I finally tried out the Prime95 blended test today and got exactly [that error] in no time at all. Of course, this was after I tweaked, prodded, and probed everything in sight for the last two weeks. Mine would just crash whenever it felt like it while playing - otherwise it worked fine.

I fooled with some of the memory settings and got it to pass the Prime95 test for about an hour. Ten minutes into UT, it crashed.
Still keeping the memory problem in mind, I looked at Corsair's site (my memory is Corsair) for some clues. After rereading my mobo's manual, I tried raising the memory voltages to no effect. Then it finally dawned on me that I'm using PC3200 RAM and it says it's running at 400 MHz, but the board only officially supports 333 MHz tops. I went into the BIOS and dropped the memory speed down to 333 MHz and it runs UT like a charm now.

I went into my BIOS and could not find any incorrectly set speeds. The speed of my RAM was set to 200 MHz, but I'm not exactly a pro when it comes to navigating a BIOS. I've provided the relevant information regarding my motherboard and BIOS below:

Motherboard Properties:

ID: 10/31/2006-NF-CK804-6A61FB09C-00
Name: Biostar NF4 Ultra-A9A

Front Side Bus Properties:

Bus Type: AMD Hammer
Real Clock: 200 MHz
Effective Clock: 200 MHz
HyperTransport Clock: 1000 MHz

Memory Bus Properties:

Bus Type: Dual DDR SDRAM
Bus Width: 128-bit
DRAM-FSB Ratio: CPU/16
Real Clock: 163 MHz (DDR)
Effective Clock: 327 MHz
Bandwidth: 5227 MB/s
Ram Type: http://www.corsair.com/products/xms/default.aspx
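
The memory figures in that report can be cross-checked with some arithmetic. This is a rough sketch, assuming the nominal 2600 MHz CPU clock listed in the system specs, the CPU/16 divider shown above, and a 128-bit bus moving two transfers per clock (DDR). The variable names are just for illustration.

```python
cpu_clock_mhz = 2600      # nominal: 13 x 200 MHz, from the system specs
divider = 16              # "DRAM-FSB Ratio: CPU/16" from the report
bus_width_bits = 128      # dual-channel DDR
transfers_per_clock = 2   # DDR moves data on both clock edges

# Memory clock is the CPU clock divided down by the reported ratio.
real_clock_mhz = cpu_clock_mhz / divider
effective_mhz = real_clock_mhz * transfers_per_clock

# Bytes per transfer times transfers per second (in millions) gives MB/s.
bandwidth_mb_s = effective_mhz * bus_width_bits / 8

print(real_clock_mhz, effective_mhz, bandwidth_mb_s)
```

With these nominal inputs the memory clock works out to 162.5 MHz and the bandwidth to 5200 MB/s, close to the reported 163 MHz and 5227 MB/s (the small gap is presumably a measured clock slightly above nominal). Either way, a CPU/16 divider puts the RAM near 163 MHz rather than the 200 MHz shown in the BIOS.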

BIOS Properties:

BIOS Type: Award
BIOS Version: NF4 Ultra-A9A (NF4UAA31 4F)
Award BIOS Type: Phoenix - Award WorkstationBIOS v6.00PG
Award BIOS Message: NF4 Ultra-A9A (NF4UAA31 4F)
System BIOS Date: 10/31/06
Video BIOS Date: 02/14/08

BIOS Manufacturer:

Company Name: Phoenix Technologies Ltd.
Product Information: http://www.phoenix.com/en/products/default.htm
BIOS Upgrades: http://www.esupport.com/biosagent/index.cfm?refererid=40

According to the Biostar website, there is a BIOS update available. However, I haven't owned a working floppy drive in years. When I visited Phoenix's website, I was appalled to find that they charge you to update your BIOS (along with other drivers). If there is no way to update the BIOS via Windows (and I searched for quite a while), I'm having some difficulty knowing just what I can do.

While updating the BIOS might even be a good idea, the fact still remains that my PC was very recently given the whole new-Windows-on-a-new-hard-drive treatment, which should have had this thing working just like it did before these problems very suddenly materialized in the first place.

Is there any way to test motherboards the way I tested the CPU/RAM with Prime95? For that matter, how do I know it's not my CPU causing this? I have a hard time believing that all four sticks of RAM went bad on the same day.

As always, I appreciate any help you folks can offer. I only wish I knew more because I've very clearly hit the limit of what I already know.

 

docbadwrench

A helpful soul on another board has made recommendations. I'd like to include his thoughts and my responses to help give a clearer story of where I stand:

Thank you very much for your thoughts! To make things as clear as possible, I will respond to your recommendations in a line-by-line fashion:

Have you run msconfig and used selective startup?
No, I have not. This is a brand new install of Windows and I hadn't considered what I would even do. I don't know what I would avoid running. There was no crap running in the background to begin with (besides the usual OS-stuff, I'm sure).

How about rolling back drivers to previous versions?
This is not possible since I've just installed Windows from scratch. There are no past drivers to go back to.

Start with the most basic configuration (mobo, CPU, video card, 1 stick of RAM) and use selective startup to see if the machine crashes.
I'm already doing this. I don't have any other peripherals to use, anyhow. Also, I can't use one stick of RAM. The motherboard screams (via beeps) if I try to do that, so I've had to put at least two sticks in.

I would also just run the most necessary Microsoft critical updates and SP2 and not install SP3 yet.
The next time I reinstall Windows, I will attempt to do this. Should I stop just short of installing SP3 and see how the computer performs? My thought, though, is that I had SP3 installed on every iteration of Windows that I've had in this computer and (previously) it never did a thing that I can see.

Run your machine and if it doesn't crash, then start adding services and updated drivers to see what crashes it.
I'd love to know more about what this means. As I mentioned, there are no previous drivers to go back to. As for services, I'm not sure what that means, either. (Forgive my apparent ignorance)

If it still crashes with just the basics running, then you need to start looking at hardware. What about your temps?
My temps are within the specs. When I last stress tested my CPU via Everest (45 minutes of 100% capacity), the PSU was holding quite steady within the normal parameters and my CPU never topped 62 degrees centigrade.

Intended Actions Going Forward:
My work's IT department was kind enough to loan me a floppy drive, mobo cable (for the drive), and a blank disk. I'm going to update the BIOS with these tools. Assuming that doesn't help, I will reinstall Windows for a third time and do what I described above (stop short of SP3). Does this sound good?
 
I would break the system down to the CPU, MB, video card and 1 DIMM of RAM. In the BIOS, be sure to set the RAM to spec, including the proper voltage, timings and data transfer rate as stated by the manufacturer. Be sure to use new SATA/IDE cables on the HD/optical drive, then try to boot from the OS disk and load the OS. If the system fails, try moving the HD to a different SATA header. If the system still fails at this point, you know one thing: the cause is one of the seven things you have in use as your 'system': MB, CPU, RAM, video card, HD, optical drive or PSU. Spare parts would be needed to test the system any further in determining any hardware problem.
 

docbadwrench

Hi Badge. As I've previously mentioned, I'm not an uber-techie with a whole PC's worth of parts lying around. If I were, I'd already have done this. ;) But I do appreciate the attention. What's been frustrating me is precisely that I don't have said tools - that's what's limiting my prospective actions. However, I have confirmed that my RAM is running at the proper timing. I went into the BIOS and found that the RAM was set to auto and that it conformed to my RAM specs.

But get this: on Tuesday, I had eight BSODs over the course of the day. A few hit while my wife was playing Plants vs. Zombies, and at least one hit while the PC was just sitting there on a blank desktop. The others happened at times when I wasn't aware of what my PC was doing.

But Wednesday (yesterday) was the most taxing PC day in recent memory. My wife and son played PvZ for at least two hours. I had remote access running, with pulpTunes hooking me up with my iTunes collection remotely. Then I came home and ran Folding@home while I played Team Fortress 2. My computer had zero BSODs.

So my question is: where exactly on the motherboard do I go to find the gremlin, gut him, stuff him, and place him over my mantel? I mean, what the eff? :pt1cable:
 
The MB and system are old. I have several socket 939 systems running currently. I would swap out the SATA and IDE cables. You might swap the hard drive cable and move the hard drive, with a new SATA cable, to a different MB header. If it is an IDE HD, replace the IDE cable with a new one. If the system crashes and blue screens under stress, I would try some different RAM with proper voltage (not auto) and look into a new power supply.

Edit: PC3200 DDR defaults to 2.6v. Manually try 2.7-2.8v if using PC3200 400 MHz DDR. Memtest passed with auto settings, so that is a good sign. Bumping the memory voltage a bit couldn't hurt, and your memory may call for slightly more voltage than auto provides. Corsair XMS will require 2.75v, which the auto setting does not provide.

http://www.newegg.com/Product/Product.aspx?Item=N82E16820145575
 

docbadwrench

Hi Badge. Thanks so much for the thoughtful reply! I like the idea of swapping cables. It has two benefits: (1) I haven't yet tried something like that and (2) it doesn't cost any money because I actually have those lying around.

One thing I do remember is that I have three voltage options for my RAM (when on manual). They are 2, 2.5, and 3. Given those options, do you have an idea what I should choose? I believe that I am running my RAM at 200 MHz, but I'm operating from my (rapidly failing) brain-memory right now. My brain runs at a much lower frequency, I assure you. :)
 
2.5v is not enough to properly power Corsair XMS requiring 2.75v. I doubt 3v is an option, I have never seen that. JEDEC standards will cause the system BIOS to default to 2.6v when it detects PC3200. So, the memory voltage should be set manually to something above auto or 2.6v default. 2.7v-2.8v would be ideal. Not familiar with ECS board BIOS.

Edit: The socket 939 NF4 chipsets are very picky about RAM, as I'm sure others have told you. RAM is often the cause of system crashes and blue screens when running software (games).
 

docbadwrench

Among my limiting factors is this: I do not have access to alternate parts. That's the whole reason I'm poking at you techies out there. Had I the money, I'd go out and get parts. I can't swap RAM, try another power supply or install a new motherboard. Once my wife's done with school, I will buy a brand new computer, but I can't do that now. Imagine that $50 is a prohibitive expense right now.

That limits the data I can collect - for sure. Regardless of the outcome (of the PC and my sanity) I want to again thank all of you who have been helping me through this trial. What follows is my blue-screen timeline for the past few days, obtained by opening the minidump with WinDbg and pulling the basic bugcheck analysis.

Saturday (06/13/2009) 1 Crash
9:09 PM - Probably caused by Unknown_Image ( ANALYSIS_INCONCLUSIVE )

Sunday (06/14/2009) No Crashes
I avoided using the computer at all and it spent most of the day shut down.

Monday (06/15/2009) 1 Crash
5:23 PM - Probably caused by processr.sys ( processr!AcpiC1Idle+12 )

Tuesday (06/16/2009) 7 Crashes
8:56 AM - Probably caused by memory_corruption ( nt!MiPfPutPagesInTransition+a8 )
9:24 AM - Probably caused by processr.sys ( processr!AcpiC1Idle+12 )
10:43 AM - Probably caused by dxg.sys ( dxg!DdHmgLock+5e )
11:23 AM - Probably caused by processr.sys ( processr!AcpiC1Idle+12 )
2:01 PM - Probably caused by processr.sys ( processr!AcpiC1Idle+12 )
2:41 PM - Probably caused by ntoskrnl.exe ( nt!WmipQueryLogger+260 )
7:09 PM - Probably caused by processr.sys ( processr!AcpiC1Idle+12 )

Wednesday (06/17/2009) No Crashes
In spite of heavy game use and the aforementioned Folding@home with TF2 stress test, the PC ran beautifully.

Thursday (06/18/2009) 1 Crash
10:11 PM - Probably caused by memory_corruption ( nt!MiPfPutPagesInTransition+a8 )

Friday (06/19/2009) 6 Crashes
6:39 AM - Probably caused by processr.sys ( processr!AcpiC1Idle+12 )
7:11 AM - Probably caused by processr.sys ( processr!AcpiC1Idle+12 )
7:12 AM - Probably caused by nvata.sys ( nvata+12f63 )
12:44 PM - Probably caused by processr.sys ( processr!AcpiC1Idle+12 )
3:15 PM - Probably caused by processr.sys ( processr!AcpiC1Idle+12 )
4:29 PM - Probably caused by dxg.sys ( dxg!DdHmgLock+5e )
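
To see the pattern in that timeline at a glance, the entries can be tallied by module and by day. This is a small sketch with the lines above pasted in and a day abbreviation added to the front of each. The regular expression is an assumption about the wording of these summary lines, not an official WinDbg format.

```python
import re
from collections import Counter

lines = [
    "Sat 9:09 PM - Probably caused by Unknown_Image ( ANALYSIS_INCONCLUSIVE )",
    "Mon 5:23 PM - Probably caused by processr.sys ( processr!AcpiC1Idle+12 )",
    "Tue 8:56 AM - Probably caused by memory_corruption ( nt!MiPfPutPagesInTransition+a8 )",
    "Tue 9:24 AM - Probably caused by processr.sys ( processr!AcpiC1Idle+12 )",
    "Tue 10:43 AM - Probably caused by dxg.sys ( dxg!DdHmgLock+5e )",
    "Tue 11:23 AM - Probably caused by processr.sys ( processr!AcpiC1Idle+12 )",
    "Tue 2:01 PM - Probably caused by processr.sys ( processr!AcpiC1Idle+12 )",
    "Tue 2:41 PM - Probably caused by ntoskrnl.exe ( nt!WmipQueryLogger+260 )",
    "Tue 7:09 PM - Probably caused by processr.sys ( processr!AcpiC1Idle+12 )",
    "Thu 10:11 PM - Probably caused by memory_corruption ( nt!MiPfPutPagesInTransition+a8 )",
    "Fri 6:39 AM - Probably caused by processr.sys ( processr!AcpiC1Idle+12 )",
    "Fri 7:11 AM - Probably caused by processr.sys ( processr!AcpiC1Idle+12 )",
    "Fri 7:12 AM - Probably caused by nvata.sys ( nvata+12f63 )",
    "Fri 12:44 PM - Probably caused by processr.sys ( processr!AcpiC1Idle+12 )",
    "Fri 3:15 PM - Probably caused by processr.sys ( processr!AcpiC1Idle+12 )",
    "Fri 4:29 PM - Probably caused by dxg.sys ( dxg!DdHmgLock+5e )",
]

# Capture the leading day abbreviation and the first token after "Probably caused by".
pattern = re.compile(r"^(\w+) .*Probably caused by\s*:?\s*(\S+)")

by_module = Counter()
by_day = Counter()
for line in lines:
    m = pattern.match(line)
    if m:
        day, module = m.groups()
        by_day[day] += 1
        by_module[module] += 1

for module, count in by_module.most_common():
    print(f"{module}: {count}")
print(dict(by_day))
```

processr.sys accounts for 9 of the 16 crashes, with the rest spread across five other names. The blamed module is only the debugger's best guess, so blame scattered this widely reads more like a common underlying fault than one bad driver.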

A LOOK AT MY BIOS' RAM SETTINGS

Within the Phoenix BIOS, I have chosen Advanced Chipset Features, then DRAM Configuration. Within that, I have the following itemized options:

Timing Mode: Currently Auto (may choose: Manual or Auto)
When on Auto, I cannot change any other values. Once Manual is chosen, then I may alter the other values.

Memclock index value (MHz): 200 MHz (by default)
May choose: 100 MHz, 133 MHz, 166 MHz, or 200 MHz

CAS# latency (Tcl): 2.5 (by default)
May choose: 2, 2.5, or 3

Min RAS# active time (Tras): 8T (by default)
May choose: 5T, 6T, 7T, 8T, 9T, 10T, 11T, 12T, 13T, 14T or 15T

RAS# to CAS# delay (Trcd): 4T (by default)
May choose: 2T, 3T, 4T, 5T, 6T, or 7T

Row precharge Time (Trp): 2T (by default)
May choose: 2T, 3T, 4T, 5T, 6T, or 7T

Row to Row delay (Trrd): 2T (by default)
May choose: 2T, 3T, or 4T
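
To make the gap between these defaults and the sticks' rated timings easier to see, here is a quick comparison. It is a sketch that assumes the "2.0-2-2-5" rating from my specs is in the usual CL-tRCD-tRP-tRAS order and that the memory clock is 200 MHz. The dictionary names are just for illustration.

```python
# Rated timings, assuming "2.0-2-2-5" means CL-tRCD-tRP-tRAS.
rated = {"tCL": 2.0, "tRCD": 2, "tRP": 2, "tRAS": 5}

# BIOS default values as listed above.
bios_default = {"tCL": 2.5, "tRCD": 4, "tRP": 2, "tRAS": 8}

clock_mhz = 200
cycle_ns = 1000 / clock_mhz  # length of one memory clock cycle in nanoseconds

# Record every setting where the BIOS default differs from the rating.
differences = {}
for name, rated_value in rated.items():
    bios_value = bios_default[name]
    if bios_value != rated_value:
        differences[name] = (rated_value, bios_value)
        print(f"{name}: rated {rated_value} ({rated_value * cycle_ns:.1f} ns), "
              f"BIOS default {bios_value} ({bios_value * cycle_ns:.1f} ns)")
```

Under those assumptions, three of the four defaults (tCL, tRCD, and tRAS) are looser than the rating, which is normally the conservative direction, so the default timings by themselves seem an unlikely cause.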

A word about my BIOS version: Technically, I can update my BIOS to bring it up to date. However, it was working just fine for 10 months or so before any of this started, so I wonder if I should bother (BIOS updates make me edgy). While I have obtained (from my work's IT department) a floppy drive, mobo connector, and a diskette, I realized this evening that I don't have a compatible power connector to make it work. I will have to acquire an adapter if I decide to make the upgrade.

ACTIONS TAKEN TONIGHT

Before I make modifications to the RAM settings (based on feedback relating to the above information), I have taken some minor actions. I found replacement SATA cables for all three hard drives and swapped them in. These cables are unused and fresh out of the package. Also, I gave the interior another good spraying. At some point over the past few days, I think the front fan (which pushes air from the front toward the rear fan) got stuck. I have unstuck it and it's moving at its usual speed now.
 

docbadwrench

I put out a whole bunch of threads, but will have to post a final follow-up so that more people can benefit from whatever the heck can be learned from this thread.

Technical documentation seems to lack any reasonable way of assessing what the temperature-load of a system is. For instance: My video card is reasonably new (within the last year and a half), but my CPU is more like 3 years old. Right now, I only get intermittent blue screens when my wife plays Plants vs. Zombies (for god knows what reason).

While my card is pretty new, the fact is that my blue screen instances went way down after I took a compressed-air can to it and created a wind tunnel from the front to and through the back. My next step is to buy a card fan to draw excess heat from the video card and hopefully drop the temp a few more degrees.

Obviously, the one piece of advice you get from everyone is "swap every damned thing out of your PC and see when it stops failing". Of course, not all of us have an exact replacement part hanging around for this (my situation).

This is kind of rambling (apologies), but let me leave you with this observation, which I did not come across often enough - or fully appreciate - until this happened. As our hardware ages, it gets less reliable. We all know that. But it hadn't really sunk in that, in addition to that lowered reliability, it also has a lower tolerance for heat that may never have seemed to bother it just a year ago.

After what feels like an eternity, I have finally learned that lesson. I hope it's useful to any other person frantically googling and searching for answers to an interminable series of blue screens that crush the soul.

I only get about one blue screen a week as of this posting. I can live with that until such time as a new system becomes a reality. In the meantime, I keep the PC very clean, make sure it's not running 24/7 and look for ways to keep the ambient temperature down.
 

bogdangabriel30

Distinguished
Nov 11, 2009
1
0
18,510
Hi, I have the same model of mobo (BIOSTAR A 939 A - rev 1.2), and it is working fine with a 3700+ processor and 4 DIMMs from different manufacturers. I'm guessing that your mobo has the old original BIOS, which is incompatible with revision E processors. Also, try setting the memory to manual with the following values: tCL 3, tRCD 4, tRP 4, tRAS 8. If you have revision 1.2 of the mobo, I have the latest BIOS, and you don't need a floppy drive: you can use WINFLASH, the program provided on the mobo's software CD. Also, the mobo doesn't work with an extended power supply connector, or without the 12V+ connector.
 
