Ran memtest86+ last night and got errors, what are my next steps?

chainers

Honorable
May 10, 2013
160
0
10,710
Hey all,

I have been having memory issues this weekend. My previously stable RAM settings suddenly would not POST, and I have been experiencing crashes, even at stock settings (2133). My memory is rated for 3200, and it was running at 2997 just fine until this weekend.

Last night, I ran the memtest86 module, and it came up with 542 errors after pass 3 before crashing (side note: is a crash during memtest with errors normal, or should it have continued through?). So I know something is up, but I really don't know where to go from here.

I am thinking of testing each stick individually to see if I can narrow it down to a stick, but what else should I look at? Can Memtest show issues with the motherboard or the CPU, or is it safe to say that the memory is bunk and I need to RMA it?

I'm still learning with PCs, so any tips are appreciated.
 

chainers



I did buy them as a kit, thanks! Good to know that I don't need to test them individually. I'll probably test to see if one can work while I wait for the RMA to come through, so hopefully I can game somewhat in the next couple of days, but we will see if I have the energy for that!
 


Yes, most likely only one stick is failing and you can use the good one until the RMA comes through.
 

InvalidError

Titan
Moderator
If the problem is a specific DIMM, the address or data pattern should show that errors tend to affect specific addresses, specific data bits, or even specific data bits at specific addresses. In that last case, you would definitely be dealing with a bad/weak bit in a DRAM chip.

For error patterns that seem to affect an address or data bit on the bus, the problem could be the CPU, CPU socket, motherboard traces, DIMM slots or the DIMMs themselves. If you test DIMMs in each slot, get clean passes on one slot but errors using the same DIMM(s) in other slots, that narrows the problem down to the motherboard or CPU. You'll have to swap either of those to find out which is which.

If you are overclocking the CPU, reset your OC.
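
As a rough illustration of reading the data pattern, here is a minimal Python sketch that XORs the expected and actual values from each reported error and tallies which data bits flipped. The `(address, expected, actual)` tuple format, the `failing_bits` name and the sample values are assumptions made up for illustration. They are not memtest86's output format and would have to be transcribed from the error screen by hand.

```python
from collections import Counter


def failing_bits(errors):
    """Count how often each data bit differs between expected and actual.

    `errors` is an iterable of (address, expected, actual) integer tuples,
    a hypothetical format transcribed by hand from the error screen.
    """
    counts = Counter()
    for _address, expected, actual in errors:
        diff = expected ^ actual  # 1-bits mark the data bits that flipped
        bit = 0
        while diff:
            if diff & 1:
                counts[bit] += 1
            diff >>= 1
            bit += 1
    return counts


# Made-up sample records, not taken from this thread.
sample_errors = [
    (0x1A2B3C40, 0xFFFFFFFF, 0xFFFFFBFF),
    (0x1A2B3C80, 0x00000000, 0x00000400),
    (0x2C000010, 0xAAAAAAAA, 0xAAAAAEAA),
]
bit_counts = failing_bits(sample_errors)
print(bit_counts.most_common(3))
```

If nearly all errors land on the same one or two bits, that fits the "specific data bit" pattern described above. Errors spread across many unrelated bits say much less on their own.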
 

chainers



I was running the test with everything at stock. Is it safe to say that it is the memory, if this is the case, or is there something else I need to do with my CPU?

When I first started getting issues with my computer freezing, I did run things OC, but it was always the safe settings (game boost, XMP, etc).
 
If it's the RAM that's bad, there is no sense in testing the sticks individually, because they are factory tested as a "matched pair". Mixing an old stick with a new one is not guaranteed to work. You buy sticks as a matched pair and they should be replaced as a matched pair.

OTOH, it could be a socket ... to find out, you will have to test each stick on its own in each socket to determine which stick / socket is bad.

I have only had this happen a handful of times, and in each case I proceeded as follows:

1. Call tech support and describe what actions you have taken. They will likely want to make sure that you haven't changed any of the default settings (XMP if applicable).

2. If it's confirmed to be the RAM, they will issue an RMA.

3. Tell them you need to use your PC. Give them your credit card info and they will obtain approval from the CC company to charge your card.

4. Then they will ship the RAM.

5. When it arrives, put it in and make sure it passes memtest86+ overnight.

6. Put the old RAM, both sticks, in the same packaging and ship it back to the address provided.

7. Once rec'd, they will credit the charge and nothing will appear on your bill.
 

InvalidError


Try the DIMMs individually. If one DIMM produces errors while the other doesn't when you install them in the same slot, then you know that one DIMM is bad. If both are OK on one slot but have errors on another, then the RAM is most likely fine and the problem is either the CPU or motherboard.
 

InvalidError


If you test them individually and they both test good in one slot but both test bad in one of the other slots, you know that something has gone wrong with that other slot. If only one tests good, then you know the other one is bad. If both test bad, the test may be inconclusive. If you don't test them separately, then you may be wasting your time and the vendor/manufacturer's time and money with an unwarranted RMA.
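
The decision logic above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `diagnose` function, the stick labels and the slot names (`A2`, `B2`) are hypothetical, and each entry records whether one stick tested alone in one slot came through without errors.

```python
def diagnose(results):
    """Interpret single-DIMM test results.

    `results` maps (dimm, slot) -> True if that combination passed cleanly.
    All labels are hypothetical; use whatever matches your own board.
    """
    if all(results.values()):
        return "no errors reproduced"
    dimms = {dimm for dimm, _slot in results}
    slots = {slot for _dimm, slot in results}
    # A DIMM is suspect if it never passed in any slot it was tested in.
    bad_dimms = sorted(
        d for d in dimms
        if not any(ok for (dimm, _s), ok in results.items() if dimm == d)
    )
    # A slot is suspect if no DIMM ever passed in it.
    bad_slots = sorted(
        s for s in slots
        if not any(ok for (_d, slot), ok in results.items() if slot == s)
    )
    if bad_dimms and not bad_slots:
        return "suspect DIMM: " + ", ".join(bad_dimms)
    if bad_slots and not bad_dimms:
        return "suspect slot, motherboard or CPU: " + ", ".join(bad_slots)
    return "inconclusive"


example = {
    ("stick1", "A2"): True,
    ("stick2", "A2"): True,
    ("stick1", "B2"): False,
    ("stick2", "B2"): False,
}
verdict = diagnose(example)
print(verdict)
```

When every combination fails, or the failures do not follow one stick or one slot, the sketch returns "inconclusive" rather than guessing.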
 

chainers



I was under the impression that memtest86 only shows errors if there is a problem with the memory, not if there is a problem with the motherboard. So is this incorrect? Please let me know because they already approved the RMA :-(.
 

InvalidError

memtest86 will report errors if it encounters any, wherever they may originate from. It has no way to differentiate between an occasional arithmetic error in the CPU, an occasional glitch in the CPU's memory controller, some deviation on the motherboard or an issue with the DIMMs themselves.

The only time where memtest is 100% sufficient on its own without having to do any DIMM shuffling is when you consistently get errors on the same subset of address and data bits. In those cases, you know beyond reasonable doubt that those bits are weak and unreliable without any further testing.
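
One way to check for that kind of consistency is to compare several separate runs and keep only the failures that show up in all of them. The Python sketch below assumes each run has been written down by hand as a set of `(address, bit)` pairs. That format, the `consistent_failures` name and the sample values are all made up for illustration.

```python
def consistent_failures(runs):
    """Return the (address, bit) pairs that failed in every run.

    `runs` is a list of sets of (address, bit) pairs, one set per test run
    (a hypothetical hand-transcribed format).
    """
    if not runs:
        return set()
    common = set(runs[0])
    for run in runs[1:]:
        common &= set(run)  # keep only failures seen again in this run
    return common


# Made-up sample data, not taken from this thread.
runs = [
    {(0x3F001000, 10), (0x3F001040, 10), (0x10000000, 3)},
    {(0x3F001000, 10), (0x3F001040, 10)},
    {(0x3F001000, 10), (0x3F001040, 10), (0x22220000, 17)},
]
repeat_offenders = consistent_failures(runs)
print(sorted((hex(address), bit) for address, bit in repeat_offenders))
```

A non-empty result across several runs is the "same subset of address and data bits" situation. An empty result means the errors move around, so DIMM shuffling is still needed.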
 

chainers



Thank you, I appreciate all the help. Right now I am testing both sticks individually, and then I will test the other DIMM slot to see if I need to RMA the motherboard or the RAM.

It sounds like a failure definitely means a problem somewhere in the hardware, and now it is just a matter of narrowing down which component it is.

 

chainers



No other system to check. I tested one stick of RAM last night and it completed 8 passes with no errors. I have the second stick in now, and it is passing so far (4 passes) with no errors. If it passes, I will test the other DIMM slot to see if I can reproduce the errors.

If not, then I will be thoroughly confused and post on here...
 

InvalidError


If errors mysteriously go away after shuffling DIMMs between slots, then the original errors may have been caused by improperly seated DIMMs, dirty contacts or other similar reasons.
 

chainers



Even if the errors started showing up after 4+ months of use? It seems like the issues would have happened before then.

The only other thing I can think of would be the heat wave we got this past weekend (it was 20°F hotter than normal), but I can't imagine it having that big of an effect on the PC. Please correct me if I am wrong on this!
 

InvalidError

Heat + moisture + dirt + electricity + time = corrosion.

If you had a weak contact for whatever reason with some dirt in it, corrosion could have pushed it over the edge over time. By removing and re-installing the DIMMs, you scraped the corrosion and dirt off, so now it is good again.

This is just one possible explanation. Almost anything can happen given the right circumstances.
 

chainers



Thank you for all the advice, I appreciate it! I'll check when I get home to see if the memtest passed on the second stick, and if it did, I will try putting the memory in the second DIMM slot to see if I can replicate the error. If there is no error, I'll plug and play (and try to OC again). I hope it is a heat issue because I would hate to take apart my whole damn PC to get the motherboard out!
 

chainers



Hey,

So my second stick went through memtest86 just fine, and I did 2 passes in the other DIMM slot with no errors (will test more later tonight, but I kind of want to game for a bit).

Sadly, I think something is still wrong, because I can no longer clock my memory at higher than 2133, despite using it at 2997 for 4 months or so. I think the issue is with the motherboard, however, and will try resetting my CMOS later to see if it helps. Hopefully it will, otherwise I will go ahead and RMA (or just replace, I hate this thing) the motherboard. Let me know if that sounds good, or if there is something I could be missing.

Update: OK, wow, I spoke too soon. Something is definitely wrong. Right after posting this, my PC crashed again, after memtest86 showed nothing for each stick individually and in each DIMM slot. Any advice on what else it could be? I am thinking the motherboard is faulty at this point, but would appreciate a more knowledgeable opinion on it.

OK, last update. For some reason, my system can only use half of my RAM now! It shows 8 GB as Hardware Reserved and only lets me use 8 GB. Any idea what might be happening here?
 

chainers



I will run it again tonight. Any way to test the CPU to see if it is faulty, or is it just a guess between these two?
 

InvalidError

To test ONE thing at a time, you need to eliminate all other variables first. The easiest way to do that is swapping parts with a known-working system or some other source of known-good parts. If you don't have access to that, then once you have exhausted the tests you can run with what you have on hand without finding a smoking gun, all you have left is guessing.
 

chainers



Thank you for the help. I ordered a new motherboard, and hopefully this one will work. If not, then my last guess would be a faulty CPU.
 

chainers

Hey, just in case anyone finds this through a search: I ended up RMAing the RAM and they sent me 2 new sticks. After replacing the RAM, all issues went away. Looks like faulty RAM was the culprit on this after all!