I'm very experienced building PCs, but they've all been for family and friends. Now, I've built one for myself, and I'm looking to get more than I paid for.
This system is primarily for editing digital photos, and I have some 233MB scans to work on.
I read the C2D Overclocking Guide (thanks), and I followed most of it. I didn't adjust the vMCH, vFSB or ICH until my most recent test.
With the FSB at 333, the system ran Orthos (blended) for 6 hours, averaging about 55C (measured with Everest Ultimate; up from 43ish when idle) with vCore <1.4 and memory voltage at 2.0. I decided to go for more, so I tried FSB @344, and upped vCore to 1.4 when the system rebooted after 10 minutes of Orthos.
The system seemed solid. The next day I ran Orthos, but then had to go out for longer than I expected. When I returned, Orthos had encountered a rounding error after 7hr 59m 58s. Being a software-type I figured that it was suspicious that it failed practically right on the 8 hour mark, especially since the system clock had been adjusted during the run. Core temp had been in the 52-53 range at the time of the failure.
Suspecting memory, I ran OCZ's memtest overnight (7 hours) without errors.
Anyway, I went through the C2D Overclocking Guide and changed all of the voltages to the recommended values. I upped the memory voltage to 2.1, figuring if the problem existed it was the RAM. I also corrected the RAM Write Recovery time to the value read from the SPD (so, 5-5-5-12 tWR=6, from tWR=4).
I customized Orthos to run fewer small FFTs so that it would get to the big ones earlier. It failed with a rounding error after just over 6 hours during a 20480K FFT. Core temp was fine--it seems to be lower when running large FFTs because it is waiting for memory reads/writes.
So, after this long and boring tale, my question is: should I consider my system to be stable if it can run Orthos for 6 hours before getting an error? I'm figuring that there isn't much I'm going to run that will heat-up the RAM like it does, based on the assumption that it is a memory problem. I guess from a purist point-of-view a rock solid system should be able to run a stress test indefinitely, even though the workload is tougher and more sustained than normal use.
Am I being retentive about stability, or do I have more tweaking and testing to do for FSB@344 to improve stability?
Increase your Vcore and that should solve the rounding problem. Since Orthos failed after 8 hours, you should be right at the borderline of rock solid stability. Try increasing Vcore by another 1-2 notches, and run a stress test on both cores for 24 hours ...
Once you've passed 24 hours of testing, I'm very sure that you won't have any problems during normal use. Imagine, 100% load on it for 24 hours, there isn't anything around that's designed to run 24 hours under 100% cpu load, at least not for any normal users ...
Thanks very much for the suggestions, guys, and for confirming that I should be able to expect a stable system to run a stress test on both cores for 12-24 hours.
I upped the vCore to 1.4375, and I started Orthos running. So far, it has been running fine for 1hr 45min. Core temps look good--perhaps slightly lower: 52-56C from Everest Ultimate. (For some reason the beta version of Everest reports temps 1-2C higher than TAT, which I am just using as a second temp reading.)
Is the vCore I'm now using still considered fairly moderate, or is it "getting up there"? My goal is to get a good performance boost without going too far along the risk curve, or upgrading my cooler.
I ended up having some trouble at 3.1 GHz. It ran fine for 9.5 hours, then the system blue-screened with a page fault in non-paged memory. While that sounds like it could be a software bug, my guess is that it was a symptom of the long stress test run.
I did various steps after that, including increasing vCore another notch. The system spontaneously rebooted after about 3.5 hours with vCore just below 1.5V. During all of these tests the core temps averaged about 55C, with occasional peaks up at 58-59C. The CPU temp sensor on the mainboard seems to read about 13-15C less than the internal core temps, on my system at least--so, low to mid-40s during stress testing. The mainboard temp sensor was high 30s.
Anyway, I was sort of seeing some odd behaviour on the system. For example, Everest Ultimate was reporting that the CPU clock multiplier was occasionally 6x (instead of 9x), and the CPU clock was correspondingly lower. So, I turned the power off, unplugged the box for a few minutes, then restored my 3.0GHz settings that I was using previously.
I'm going to run a long stress test at these settings. I think I have run 6-8 hours successfully at this speed (before I knew longer tests were better), and I've done a bunch of scanning and Photoshopping of 233MB images at this speed without problems.
When the multiplier drops back down to 6 it's because the processor is going into its idle state where it cuts down on speed and voltage to save energy. Turn off C1E and EIST in the bios if that bothers you.
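To put numbers on the clock readings in this thread: core clock is just FSB times the CPU multiplier. Here is a quick Python sketch using the FSB values (333, 344) and multipliers (9x, 6x) quoted above. The function name is only for illustration, and the results are nominal figures, not measured ones.

```python
# Nominal core clock = FSB (MHz) x CPU multiplier.
# FSB and multiplier values are the ones quoted in this thread;
# core_clock_mhz is a made-up helper name for illustration.

def core_clock_mhz(fsb_mhz: float, multiplier: float) -> float:
    """Return the nominal core clock in MHz."""
    return fsb_mhz * multiplier

# (FSB in MHz, multiplier) pairs mentioned in the posts above
settings = [(333, 9), (344, 9), (344, 6)]

for fsb, mult in settings:
    ghz = core_clock_mhz(fsb, mult) / 1000
    print(f"FSB {fsb} MHz x {mult} = {ghz:.2f} GHz")
```

So a 6x reading at FSB 344 works out to roughly 2.06 GHz, versus about 3.1 GHz at the full 9x.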
Interesting. EIST and C1E were disabled--at least I haven't turned them on since I disabled them a couple of weeks ago. On my board, I believe I cannot change vCore with C1E enabled, and it was definitely running at 1.45v. However, I was experiencing slightly flakey system behaviour, which is why I reloaded all of the BIOS settings from my 3 GHz overclocking profile.
Still, it's interesting that you point out that the behaviour I was seeing was processor power management. I wonder if something got corrupted in the BIOS.
Anyway, the system has been running Orthos "Blend" for 20.5 hours with vCore at 1.375v and memory at 2.0v with FSB @333MHz and memory ratio of 10:8. So, it seems like things are solid at those settings. The core temps are nice and low too: 48-53C.
I'm going to be a bit more methodical about my next attempt to get the system running faster. However, I'm going to take my time, since it's a nice fast system at 3 GHz!
Thanks man. Found your guide yesterday. It's amazing what happens when you look.
Managed to lower my vcore to 1.3 after making some adjustments you recommended in your guide. Couldn't get it as low as you did, kept getting rounding errors. Seems stable now though, but I haven't completed the torture tests for more than a few minutes. FEAR (multiplayer) has me hooked ATM.
I know I should have tested further before playing, but hey, it ran fine last night!