To start, this is my build - nothing is OC'd. Everything is still set at stock values. If it is any consolation, I built this based on reading through these forums.
ASUS Sabertooth X79
16GB (4x4) Corsair Vengeance 1866
Corsair TX750M (PSU)
Corsair H110 (water cooler for CPU)
Soundblaster SBX Pro
Cooler Master Cosmos 2 (Case)
This rig was built two weeks ago and has been having random intermittent issues. This is far from my first rig, but this is the first time I have seen this behavior. Usually if the PSU is bad, the effect is more dramatic. Likewise with the other components.
So, I installed and used the following.
Prime95 - ran for a few hours and even continued using the box (surfing the web... etc) while the CPU was pegged. No errors or voltage drops.
MemTest - ran for 16 hours - no errors.
HWMonitor - All temps are fine. The processor gets up to 70c after running Prime95 for a few minutes.
FurMark - ran for a few hours. I even ran this with Prime95 for an hour. No voltage dips while both ran or while Furmark was running solo.
I have been getting multiple warnings of voltage drops occurring across different parts of the motherboard and video card from the ASUS monitor utility. So far, +3.3V, +12V and VCore voltages have randomly dropped to 0v or only slightly above it. I am assuming this is the PSU. However, I have read that the Intel Chipset will drop voltage if it is under stress. This does not seem likely in my case as the drops are random and not only when the system is under stress. Oddly enough, the system remains stable. So, if these tools were not enabled, I would not be seeing this issue... yet.
Regardless of stress levels, Speccy shows the same thing as the ASUS monitor utility in terms of voltages and when they drop.
Am I correct to assume it is the PSU, or could it be the motherboard instead?
Are there any other checks I can perform to confirm which it is?
The PSU is a quick replace - so no gripe there. However, if it is the motherboard, then not so much.
Can you lay your hands on another PSU to test with? And what PSU do you have?
I am currently using the Corsair TX750M as a PSU. I do have another PSU (in my previous rig).
I continued troubleshooting and changed a few things.
I updated the BIOS to 4302 (8/29/2013). I did this right after I built it, but I did not see this version before.
(Post BIOS update) In the BIOS, I turned on XMP and set RAM to the right speed. I had to do this twice. The first time, the BIOS reported the correct speed, but the tools in the OS (Speccy / ASUS AI) did not. I haven't seen the voltage flicker again so far. Of course, I haven't really used it yet.
If this turns out to fix it, then great! Perhaps others will search for the same issue and find this post.
If this did not fix it, I'll swap the PSU from my old box and see how well it does. I won't be able to push the video card that hard, though, as the GTX780 requires 42A on the +12 and the other PSU can only push 40A.
Regardless, I will post again with an update and hopefully close this issue out. Thank you for the response.
That might have been it. The best way to check DRAM frequency is CPU-Z under the Memory tab. It shows the true frequency (i.e. if you have 1600 sticks it will show 800, as it's DDR, Double Data Rate). A lot of programs like Speccy show the default frequency from the DRAM's SPD, which is often 1333.
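As a quick illustration of the CPU-Z point, here is a minimal Python sketch of the relationship between the rated DDR speed on the box and the real clock CPU-Z reports. The function name is made up for this example; the only fact it relies on is that DDR makes two transfers per clock.

```python
def true_clock_mhz(rated_speed: float) -> float:
    """Real DRAM clock for a DDR rated speed (two transfers per clock)."""
    return rated_speed / 2


# Rated speeds mentioned in this thread: 1600 (the example) and 1866 (the build).
for rated in (1600, 1866):
    print(f"DDR3-{rated}: CPU-Z Memory tab should show about {true_clock_mhz(rated):.0f} MHz")
```

So with the 1866 kit and XMP applied, a reading near 933 in CPU-Z would be the expected value, not a sign that the RAM is running slow.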
Just after I posted, I got another alert. This time for overvoltage (24v on 12v). Speccy and ASUS AI are still showing drops. In Speccy, the voltage on the following drops to 0.000v for a split second and then goes back to the norm. It happens too quickly to snapshot it.
Strangely, the system acts perfectly normal. If I didn't have this software, I would never know the voltages were doing this.
I am assuming the voltage sensors are located at the regulators, because if suddenly power were cut from all of those at the component level - if even for a split second - I should have had a BSoD.
I guess I'll shut it down for the night and replace the PSU tomorrow. That is the simplest. If it is the PSU, I just hope none of the onboard components were damaged.
Make sure you are not using any other polling tools with AI Suite monitoring active. Anything like CPU-Z, AIDA, HWinfo can cause the AI Suite polling to misreport voltages because of IO contention. It is very likely there is nothing wrong, and all you are seeing is polling errors.
That all said, ensure you are running the latest AI suite version as well (download latest from ASUS product page for your board).
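One rough way to tell a polling error from a real rail problem, sketched in Python. The assumption here is mine, not something from ASUS: a misread tends to show up as a single wild sample (0v, or double the nominal value) sitting between two normal ones, while the machine keeps running. The function name and the sample readings are invented for illustration.

```python
def isolated_glitches(samples, nominal, tolerance=0.05):
    """Return indices of single out-of-range samples whose neighbours are both in range."""
    low, high = nominal * (1 - tolerance), nominal * (1 + tolerance)
    in_range = [low <= v <= high for v in samples]
    return [
        i
        for i in range(1, len(samples) - 1)
        if not in_range[i] and in_range[i - 1] and in_range[i + 1]
    ]


# Made-up +12V readings with one 0v sample and one 24v sample.
readings = [12.10, 12.05, 0.0, 12.09, 12.10, 24.0, 12.08]
print(isolated_glitches(readings, nominal=12.0))
```

Readings that stay out of range for several samples in a row are not flagged by this, and those would be the ones worth worrying about.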
So, I swapped out the PSU and downloaded SpeedFan. It reported that my CPU was 143c, but seemed to report the temps of each core accurately. The +12 in SF also shows 6.6v constantly. Every other program I have used shows 12, so I am going to assume it is wrong. I also enabled logging and created a program to output to a CSV whenever voltage dropped below normal levels. I left it running while I was at work, but I forgot to disable AI Suite. That was about 10 hours of just monitoring the voltages through AI Suite (inadvertently) and SpeedFan.
I came home to a nice surprise. Speedfan and AI Suite both said my motherboard temperature was 127c. After a brief moment of panic, followed by the realization that the computer was, in fact, not on fire (and there was no smoke or smell of burning electronics / popped caps), I rebooted it and looked at the temp in the BIOS. It was 40c and stable. Whew - so there isn't a temp issue. AI Suite and SpeedFan must have had an argument over resources and decided the best method of resolution was to freak out the owner of the PC. I'll have to keep that in mind for future dev projects at work.
For the SpeedFan logs, none of the usual suspects dropped below tolerance. Unfortunately, SF only records every 3 seconds. So, it could have dropped several times and just wasn't caught by the logger. Also, the program that writes when they drop below had no entries, but I am unsure if the event handler on it runs more often than the logger.
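For anyone wanting to do the same kind of check on a log, here is a minimal Python sketch. It is not the program described above, and the column names and sample rows are assumptions for illustration rather than SpeedFan's actual log layout. It flags any reading more than 5% away from nominal, which is the tolerance usually quoted for the positive ATX rails.

```python
import csv
import io

NOMINAL = {"+3.3V": 3.3, "+5V": 5.0, "+12V": 12.0}
TOLERANCE = 0.05  # +/- 5% of nominal


def out_of_tolerance(rows):
    """Yield (time, rail, value) for every reading outside nominal +/- tolerance."""
    for row in rows:
        for rail, nominal in NOMINAL.items():
            value = float(row[rail])
            if abs(value - nominal) > nominal * TOLERANCE:
                yield row["time"], rail, value


# Invented sample log with one row every 3 seconds (hypothetical column names).
sample_log = io.StringIO(
    "time,+3.3V,+5V,+12V\n"
    "0,3.31,5.02,12.10\n"
    "3,3.30,0.00,12.08\n"
    "6,3.29,5.01,12.09\n"
)
events = list(out_of_tolerance(csv.DictReader(sample_log)))
for when, rail, value in events:
    print(f"t={when}s {rail} read {value:.2f} V")
```

A scan like this only sees what the logger wrote, so a dip that falls between two 3-second samples will still be missed.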
For AI Suite, I had an alert that 5.0v was 0v. Unfortunately, there was no context information about when this happened. Besides both tools saying my computer was apparently on fire, the voltages were normal when I got home.
Is there a better logging tool for these? Perhaps one that records every second? I have 2TB of storage, so I am not really concerned with how large the logs get. However, due to file IO and such, it would be preferable if each log was broken into manageable sizes so as not to hinder the logging process.
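If no existing tool fits, the "one sample per second, split into manageable files" idea can be sketched with Python's standard library. `read_voltages` is a hypothetical placeholder that returns fixed numbers; it would have to be replaced with whatever actually reads the sensors. The file size, backup count and logger name are arbitrary choices for the example.

```python
import logging
import logging.handlers
import os
import tempfile
import time


def read_voltages():
    """Hypothetical placeholder: swap in whatever actually reads the sensors."""
    return {"+3.3V": 3.30, "+5V": 5.00, "+12V": 12.00}


def make_logger(path, max_bytes=5_000_000, backups=20):
    """Logger that rolls over to a new file once the current one reaches max_bytes."""
    handler = logging.handlers.RotatingFileHandler(
        path, maxBytes=max_bytes, backupCount=backups
    )
    # Explicit date format so the timestamp itself contains no comma.
    handler.setFormatter(
        logging.Formatter("%(asctime)s,%(message)s", datefmt="%Y-%m-%d %H:%M:%S")
    )
    logger = logging.getLogger("voltage")
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    return logger


def log_samples(logger, count, interval=1.0):
    """Write `count` samples, one every `interval` seconds."""
    for _ in range(count):
        values = read_voltages()
        logger.info(",".join(f"{v:.3f}" for v in values.values()))
        time.sleep(interval)


# Short demo run: three samples into a temporary directory.
log_path = os.path.join(tempfile.mkdtemp(), "voltages.csv")
log_samples(make_logger(log_path), count=3, interval=0.01)
```

For a real run the interval would be 1.0 and the count much larger. Bear in mind that a script like this is one more thing polling the sensors, so it should not run alongside AI Suite.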
Edit: I just downloaded AI suite 2.00.01 (date stamp 2012/10/26). I will update if this resolves the random message issue. Thanks for the suggestion, Raja.
Is there a better logging tool for voltages that records at a shorter interval than SF?
Is there anything I can do for peace of mind to prove nothing is actually wrong?
The old rig had a Thermaltake toughpower 750w. I used to be big into Thermaltake, but with all of the new products Corsair is coming out with lately, I decided to try out the TX750M in this build.
This Thermaltake PSU can only handle 36A with 2x 12v on the video card (which requires 42A). So, it isn't ideal, but it was a good test.
Regardless of PSU, ASUS AI Suite still had a 0v on 5v warning when I got home. So, it is still happening. The latest version (from the site) was the same version I had installed, but I installed it anyway.
I'll update if I get another of those warnings. Only AI Suite is monitoring now. So, there is no excuse for resource contention.
I think AI Suite allows you to enable or disable some of the polling (maybe even adjust it). You can disable all the polling if you wish and use something like HWinfo instead. As I said earlier, just make sure two things are not polling the super IO at the same time.
I put the new PSU back in and put the Thermaltake back in my spare rig (aka media machine). I let it just sit all night (usual time of failure was random and stress levels didn't seem to matter). There were no warnings when I woke up this morning.
This was the weirdest experience I have ever had. The more I analyzed it, the worse it seemed to get. I do analysis daily. So digging in like this is second nature. Unfortunately, it wasn't the right thing to do - lesson learned.
So, the fix was...
- Reinstall AI Suite. The version on the site was the same as the one on my machine. However, the warnings popped up before I obtained all of the analysis tools. So, I think the install was either faulty, or the upgrade of the BIOS caused something to change.
- Quit over-analyzing
Thanks for the replies and suggestions! Stay awesome
I'm seeing +3.3v drop to 0.0v (very rarely - less than ~once per two hours)
I'm seeing VCore fluctuations - down to 0.0v (rarely, ~once per hour), but often fluctuating between 1.336 & 0.862 (baseline @ ~1.332)
This is at almost zero load on CPU, GPU, Network, and all Drives, all temperatures nominal.
I however AM having an issue where the system "stutters" randomly (keyboard input will repeat ~50 chars; audio will stutter/pop; mouse will stutter) - though I'm unable yet to determine whether the two issues are connected for sure.
Anyone else seeing similar issues? Any idea whether this is an actual problem (perhaps bad MB)?
I noticed similar behavior to what you mentioned. The voltage dip issue was alleviated by either shutting off AI tools and using only a single other tool to monitor (e.g. Speccy) or by only using AI tools. I had to update the AI tools install, and that is what seemed to fix the initial voltage drop issue.
For the random stutters, my computer would randomly act like it was locking up for 5-10 seconds every couple of minutes. When it would resume, it would catch up with the input I had given it.
I ran MemTest for a total of 32 hours, as I thought RAM/Paging was having an issue. I also ran Prime95 for a total of 16 hours, in case it was the processor.
I updated my GPU drivers to the latest from nVidia. The locks could have been video feed locks. I also uninstalled eVGA's tools. I like the fan curve feature, but it was causing other issues like randomly crashing for no specific reason. I tried updating it and it still crashed randomly.
It did not happen after that, but I did more (because I had to move the computer). I changed the USB slots that my input devices were using and made sure that they were not in the 3.0 slots (the blue ones). I have not had the issue since.
AI tools has a USB 3.0 boost tool. Perhaps it was causing issues with my input tools - maybe it caches data to enhance performance. Either way, between eVGA software uninstall and using only USB 2.0 slots, the issue seems to have disappeared.
Thx dracoaroch ... I've had suspicions of the USB 3.0 ports as well ... though I tried shutting off the ASMedia USB 3.0 in the BIOS and the stuttering still occurs. Sometimes it won't happen for a whole day, other times, it happens quite frequently.
After watching the voltages for a few days, I see that the stuttering and voltage fluctuations don't seem to be related (in time), so that matches with what you and others are saying re: the fluctuations simply being due to polling errors between monitoring tools, or a bad AI Suite installation.
I will check whether I have EVGA Tools installed and try removing them if so.
Thanks for the tips, it gives me a bit more to go on.
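The "not related in time" observation above can be checked a little more rigorously if both kinds of event are written down with timestamps. A minimal Python sketch follows; the function name, the 5-second window and all the timestamps are invented for illustration.

```python
def stutters_near_voltage_events(stutter_times, voltage_times, window=5.0):
    """Count stutters that have at least one voltage event within `window` seconds."""
    return sum(
        any(abs(s - v) <= window for v in voltage_times)
        for s in stutter_times
    )


# Made-up timestamps in seconds since the start of the observation.
stutters = [120.0, 900.0, 4000.0]
voltage_events = [2500.0, 7200.0]
hits = stutters_near_voltage_events(stutters, voltage_events)
print(f"{hits} of {len(stutters)} stutters fall within 5 s of a voltage event")
```

If the count stays at or near zero over a few days of notes, that supports treating the stutter and the voltage warnings as two separate problems.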