Reader How Tos: Building For Stability

Stability Testing

I cannot overstress how much pain and how many headaches can be avoided with some basic testing of a clean system. Once you have installed all your other hardware and software, the number of possible causes of a problem increases dramatically. Testing should be done on a base hardware system with a minimal software install: nothing beyond the OS, the drivers, and the OS updates. The first question to ask is, "Is the computer stable?" If testing has shown that the basic computer is stable, that knowledge will help if you encounter problems with the fully loaded system. In my experience, there are three main causes of computer instability:

  1. Hardware
    Examples of this are faulty hardware, poor or intermittent connections, and PSU overloading. On older OSes (predominantly Windows 95/98/ME), the PCI slot chosen and the interrupts assigned need consideration. The choice of RAM slot matters too: its position can have a significant effect on stability by changing the loading and timings. Read the manual and newsgroups for confirmation.
  2. Heat
    Heat has several effects. It increases susceptibility to the hardware problems mentioned above, because semiconductor properties, especially timings, change with temperature. Two unmatched components (e.g. motherboard and RAM) can therefore be brought to the limit of their timings by increased temperature. Heat can also shorten the lifetime of a component.
  3. Software
    Software is often blamed for faulty hardware and vice versa; look at the history of the "infinite loop" problem in Windows XP and the number of guesses at the solution. With a basic install, the only things you can change to improve stability are the drivers, the installation process (the order of installation has made a difference), and the BIOS settings.

Understanding the cause of instability is one thing. Testing for instability is another, and this is what we need to do. My first test always uses the 3DMark suite of tools (www.madonion.com), which is free to download and tests most of the components to their limits. I use the standard benchmark in looping mode in 3DMark 2001, or 3DMark 2000 for low-end graphics cards. I run a 12-hour continuous loop before I declare a system stable. I have never found a PC that completed this stability test to have a problem afterward. I have found many PCs that took considerable time to pass this test because of poor component selection or cooling. The only exception to this rule I have found is MPEG encoding, which is more processor intensive and can raise the processor temperature even higher.

Check the "Looping" tick box in the Options menu to run continuous 3DMark tests.
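
A looping stability run boils down to repeating a workload until a time limit passes and stopping at the first failure. The following is a rough Python sketch of that idea, not a description of how 3DMark works internally. The names `soak_test` and `busy_workload` and the `clock` parameter are made up for illustration, and the CPU loop is only a stand-in for a real benchmark pass.

```python
import time


def soak_test(workload, duration_s, clock=time.monotonic):
    """Run workload() repeatedly until duration_s elapses or a run fails.

    Returns (passed, completed_runs, elapsed_seconds).
    A run counts as failed if workload() returns a false value or raises.
    """
    start = clock()
    runs = 0
    while clock() - start < duration_s:
        try:
            ok = workload()
        except Exception:
            ok = False
        if not ok:
            return False, runs, clock() - start
        runs += 1
    return True, runs, clock() - start


def busy_workload():
    """Hypothetical stand-in for one benchmark pass.

    Sums squares in a loop and checks the result against the closed-form
    answer, so a miscalculation would show up as a failed run.
    """
    n = 100_000
    expected = (n - 1) * n * (2 * n - 1) // 6
    return sum(i * i for i in range(n)) == expected


if __name__ == "__main__":
    # Two seconds is only a demonstration length.
    passed, runs, elapsed = soak_test(busy_workload, duration_s=2)
    print(f"passed={passed} runs={runs} elapsed={elapsed:.1f}s")
```

For a 12-hour run, `duration_s` would be `12 * 60 * 60`. A CPU-only loop like this does not exercise the graphics card, so it can complement a looping benchmark but not replace it.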

The test time can be tailored to the application: for an office/Internet machine, 12 hours of intensive graphics testing seems unreasonable. In my opinion, a minimum of a four-hour loop would give a reasonable level of confidence. I should also mention that during the first hour I monitor the temperature of components (with my hand) to make sure nothing is getting too hot, which could be due to poor air circulation or component installation. Additional temperature monitoring can be done in the BIOS, with the motherboard manufacturer's application, or with generic applications such as "Motherboard Monitor". These methods normally use sensors on the motherboard to report the temperature.
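
If a monitoring tool can export its readings, a few lines of Python can summarize them after a test run. The sketch below rests on several assumptions: the reading format, the sensor names, the sample values, and the 60 degree limit are all invented for illustration and are not recommendations. Use the limits given in your components' documentation.

```python
# Hypothetical readings: (minutes elapsed, sensor name, temperature in deg C).
# These values are invented for illustration, not measured.
SAMPLE_READINGS = [
    (0, "cpu", 38.0),
    (0, "case", 27.0),
    (30, "cpu", 55.5),
    (30, "case", 33.0),
    (60, "cpu", 62.0),
    (60, "case", 35.5),
]

# Placeholder threshold, not a recommendation.
LIMIT_C = 60.0


def peak_by_sensor(readings):
    """Return a dict mapping each sensor name to its highest temperature."""
    peaks = {}
    for _minutes, sensor, temp_c in readings:
        if sensor not in peaks or temp_c > peaks[sensor]:
            peaks[sensor] = temp_c
    return peaks


def over_limit(readings, limit_c):
    """Return the readings whose temperature is above limit_c."""
    return [r for r in readings if r[2] > limit_c]


if __name__ == "__main__":
    for sensor, peak in sorted(peak_by_sensor(SAMPLE_READINGS).items()):
        print(f"{sensor}: peak {peak:.1f} C")
    for minutes, sensor, temp_c in over_limit(SAMPLE_READINGS, LIMIT_C):
        print(f"WARNING: {sensor} at {temp_c:.1f} C after {minutes} min")
```

Looking at the peak per sensor, rather than only the last reading, catches a component that ran hot partway through the test and cooled down again before the end.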