Tom’s Hardware and QUE Publishing are teaming up once more to bring you four more chapters from the latest edition of Scott Mueller’s Upgrading And Repairing PCs. And once again, we’ll be giving ten lucky Tom's Hardware forum members a copy of the book. Enter to win by completing this contest form.

Upgrading And Repairing PCs 21st Edition
- Chapter 3: Processor Specifications
- Chapter 3: Processor Features
- Chapter 5: BIOS
- Chapter 10: Flash And Removable Storage
- Chapter 20: PC Diagnostics, Testing, and Maintenance
Processor Features
As new processors are introduced, new features are continually added to their architectures to improve everything from performance in specific types of applications to the reliability of the CPU as a whole. The next few sections look at some of these technologies.
System Management Mode (SMM)
Spurred on initially by the need for more robust power-management capabilities in mobile computers, Intel and AMD began adding System Management Mode (SMM) to their processors during the early 1990s. SMM is a special-purpose operating mode provided for handling low-level system power management and hardware control functions. SMM offers an isolated software environment that is transparent to the OS and applications software and is intended for use by the system BIOS or low-level driver code.
SMM was introduced as part of the Intel 386SL mobile processor in October 1990. SMM later appeared as part of the 486SL processor in November 1992, and in the entire 486 line starting in June 1993. SMM was notably absent from the first Pentium processors when they were released in March 1993; however, SMM was included in all 75MHz and faster Pentium processors released on or after October 1994. AMD added SMM to its enhanced Am486 and K5 processors around that time as well. All other Intel and AMD x86-based processors introduced since that time also have incorporated SMM.
SMM is invoked by signaling a special interrupt pin on the processor, which generates a System Management Interrupt (SMI), the highest priority nonmaskable interrupt available. When SMM starts, the context or state of the processor and currently running programs are saved. Then the processor switches to a separate dedicated address space and executes the SMM code, which runs transparently to the interrupted program as well as any other software on the system. Once the SMM task is complete, a resume instruction restores the previously saved context or state of the processor and programs, and the processor resumes running exactly where it left off.
Although initially used mainly for power management, SMM was designed to be used by any low-level system functions that need to function independent of the OS and other software on the system. In modern systems, this includes the following:
- ACPI and APM power management functions
- Universal serial bus (USB) legacy (keyboard and mouse) support
- USB boot (drive emulation)
- Password and security functions
- Thermal monitoring
- Fan speed monitoring
- Reading/writing Complementary Metal Oxide Semiconductor (CMOS) RAM
- BIOS updating
- Logging memory error-correcting code (ECC) errors
- Logging hardware errors besides memory
- Wake and Alert functions such as Wake on LAN (WOL)
One example of SMM in operation occurs when the system tries to access a peripheral device that has previously been powered down to save energy. For example, say that a program requests to read a file on a hard drive, but the drive has spun down to save energy. Upon access, the host adapter generates an SMI to invoke SMM. The SMM software then issues commands to spin up the drive and make it ready. Finally, SMM returns control to the OS, and the file load continues as if the drive had been spinning all along.
Superscalar Execution
The fifth-generation Pentium and newer processors feature multiple internal instruction execution pipelines, which enable them to execute multiple instructions at the same time. The 486 and all preceding chips can perform only a single instruction at a time. Intel calls the capability to execute more than one instruction at a time superscalar technology.
Superscalar architecture was initially associated with high-output reduced instruction set computer (RISC) chips. A RISC chip has a less complicated instruction set with fewer and simpler instructions. Although each instruction accomplishes less, the overall clock speed can be higher, which usually increases performance. The Pentium is one of the first complex instruction set computer (CISC) chips to be considered superscalar. A CISC chip uses a richer, fuller-featured instruction set, which has more complicated instructions. As an example, say you wanted to instruct a robot to screw in a light bulb. Using CISC instructions, you would say the following:
- Pick up the bulb.
- Insert it into the socket.
- Rotate clockwise until tight.
Using RISC instructions, you would say something more along the lines of the following:
1. Lower hand.
2. Grasp bulb.
3. Raise hand.
4. Insert bulb into socket.
5. Rotate clockwise one turn.
6. Is bulb tight? If not, repeat step 5.
7. End.
Overall, many more RISC instructions are required to do the job because each instruction is simpler (reduced) and does less. The advantage is that there are fewer overall commands the robot (or processor) has to deal with, and it can execute the individual commands more quickly, and thus in many cases execute the complete task (or program) more quickly as well. The debate goes on whether RISC or CISC is really better, but in reality there is no such thing as a pure RISC or CISC chip—it is all just a matter of definition, and the lines are somewhat arbitrary.
Intel and compatible processors have generally been regarded as CISC chips, although the fifth- and later-generation versions have many RISC attributes and internally break down CISC instructions into RISC versions.
Note: The ARM processor used by Windows RT tablets is a RISC processor. Windows RT uses the same tile-based interface as Windows 8, but x86 software is not compatible with Windows RT.
MMX Technology
MMX technology was originally named for multimedia extensions, or matrix math extensions, depending on whom you ask. Intel officially states that it is actually not an abbreviation and stands for nothing other than the letters MMX (not being an abbreviation was apparently required so that the letters could be trademarked); however, the internal origins are probably one of the preceding. MMX technology was introduced in the later fifth-generation Pentium processors as a kind of add-on that improves video compression/decompression, image manipulation, encryption, and I/O processing—all of which are used in a variety of today’s software.
MMX consists of two main processor architectural improvements. The first is basic: All MMX chips have a larger internal L1 cache than their non-MMX counterparts. This improves the performance of any and all software running on the chip, regardless of whether it actually uses the MMX-specific instructions.
The other part of MMX is that it extends the processor instruction set with 57 new commands or instructions, as well as a new instruction capability called single instruction, multiple data (SIMD).
Modern multimedia and communication applications often use repetitive loops that, while occupying 10% or less of the overall application code, can account for up to 90% of the execution time. SIMD enables one instruction to perform the same function on multiple pieces of data, similar to a teacher telling an entire class to “sit down,” rather than addressing each student one at a time. SIMD enables the chip to reduce processor-intensive loops common with video, audio, graphics, and animation.
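The SIMD idea can be modeled in a few lines of Python. This is purely a conceptual sketch (real SIMD operates on hardware registers, not Python lists), and the function names here are invented for illustration:

```python
def scalar_add(a, b):
    # Scalar model: one add "instruction" per element pair.
    return [x + y for x, y in zip(a, b)]

def simd_add(a, b, lanes=4):
    # SIMD model: each packed add "instruction" covers `lanes`
    # elements at once, so a loop over N elements issues only
    # N / lanes instructions instead of N.
    out = []
    for i in range(0, len(a), lanes):
        out.extend(x + y for x, y in zip(a[i:i + lanes], b[i:i + lanes]))
    return out

a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
b = [10.0] * 8
assert scalar_add(a, b) == simd_add(a, b)   # same results, fewer "instructions"
```

The benefit is largest exactly where the text says: tight loops that apply one identical operation to long runs of data.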
These 57 new instructions are specifically designed to manipulate and process video, audio, and graphical data more efficiently. They are oriented to the highly parallel and often repetitive sequences frequently found in multimedia operations. Highly parallel refers to the fact that the same processing is done on many data points, such as when modifying a graphic image. The main drawbacks to MMX were that it worked only on integer values and used the floating-point unit for processing, so time was lost when a shift to floating-point operations was necessary. These drawbacks were corrected in the additions to MMX from Intel and AMD.
Intel licensed the MMX capabilities to competitors such as AMD and Cyrix (later absorbed by VIA), who were then able to upgrade their own Intel-compatible processors with MMX technology.
SSE
In February 1999, Intel introduced the Pentium III processor and included in that processor an update to MMX called Streaming SIMD Extensions (SSE). Prior to their debut, these were also called Katmai New Instructions (KNI) because they were originally included on the Katmai processor, the code name for the Pentium III. The Celeron 533A and faster Celeron processors based on the Pentium III core also support SSE instructions. The earlier Pentium II and Celeron 533 and lower (based on the Pentium II core) do not support SSE.
The Streaming SIMD Extensions consist of 70 new instructions, including SIMD floating point, additional SIMD integer, and cacheability control instructions. Some of the technologies that benefit from the Streaming SIMD Extensions include advanced imaging, 3D video, streaming audio and video (DVD playback), and speech-recognition applications.
The SSEx instructions are particularly useful with MPEG-2 decoding, which is the standard scheme used on DVD video discs. Therefore, SSE-equipped processors should be more capable of performing MPEG-2 decoding in software at full speed without requiring an additional hardware MPEG-2 decoder card. SSE-equipped processors are also much better and faster than previous processors when it comes to speech recognition.
One of the main benefits of SSE over plain MMX is that it supports single-precision floating-point SIMD operations, which had posed a bottleneck in 3D graphics processing. Just as with plain MMX, SIMD enables multiple operations to be performed per processor instruction. Specifically, SSE supports up to four floating-point operations per cycle; that is, a single instruction can operate on four pieces of data simultaneously. SSE floating-point instructions can be mixed with MMX instructions with no performance penalties. SSE also supports data prefetching, which is a mechanism for reading data into the cache before it is actually called for.
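To make the four-lane idea concrete, the following Python sketch packs four single-precision floats into 16 bytes—the same 128-bit layout an SSE XMM register uses—and models a packed add. The helper names are invented for illustration; only the standard-library `struct` module is used:

```python
import struct

# A 128-bit SSE register (XMM) holds four 32-bit single-precision
# floats. struct packs/unpacks the same layout, illustrating what a
# single packed-add instruction (ADDPS) operates on.
def pack_xmm(f0, f1, f2, f3):
    return struct.pack("<4f", f0, f1, f2, f3)   # 16 bytes = 128 bits

def addps(xmm_a, xmm_b):
    # Model of the SSE packed add: one "instruction", four lane-wise adds.
    a = struct.unpack("<4f", xmm_a)
    b = struct.unpack("<4f", xmm_b)
    return pack_xmm(*(x + y for x, y in zip(a, b)))

a = pack_xmm(1.0, 2.0, 3.0, 4.0)
b = pack_xmm(0.5, 0.5, 0.5, 0.5)
print(len(a) * 8, "bits per register")        # 128 bits per register
print(struct.unpack("<4f", addps(a, b)))      # (1.5, 2.5, 3.5, 4.5)
```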
SSE is similar to MMX; in fact, besides being called KNI, SSE was called MMX-2 by some before it was released. In addition to adding more MMX-style instructions, the SSE instructions allow for floating-point calculations and use a separate unit within the processor instead of sharing the standard floating-point unit as MMX did.
SSE2 was introduced in November 2000, along with the Pentium 4 processor, and adds 144 additional SIMD instructions. SSE2 also includes all the previous MMX and SSE instructions.
SSE3 was introduced in February 2004, along with the Pentium 4 Prescott processor, and adds 13 new SIMD instructions to improve complex math, graphics, video encoding, and thread synchronization. SSE3 also includes all the previous MMX, SSE, and SSE2 instructions.
SSSE3 (Supplemental SSE3) was introduced in June 2006 in the Xeon 5100 series server processors, and in July 2006 in the Core 2 processors. SSSE3 adds 32 new SIMD instructions to SSE3.
SSE4 (also called HD Boost by Intel) was introduced in January 2008 in versions of the Intel Core 2 processors (SSE4.1) and was later updated in November 2008 in the Core i7 processors (SSE4.2). SSE4 consists of 54 total instructions, with a subset of 47 instructions comprising SSE4.1, and the full 54 instructions in SSE4.2.
Advanced Vector Extensions (AVX) was introduced in January 2011 in the second-generation Core i-series “Sandy Bridge” processors and is also supported by AMD’s “Bulldozer” processor family. AVX is a 256-bit instruction set extension to SSE, comprising 12 new instructions. AVX helps floating-point-intensive applications such as image and A/V processing, scientific simulations, financial analytics, and 3D modeling and analysis perform better. AVX is supported on Windows 7 SP1, Windows Server 2008 R2 SP1, and Linux kernel version 2.6.30 and higher. For AVX support on virtual machines running on Windows Server 2008 R2, see http://support.microsoft.com/kb/2517374 for a hotfix.
For more information about AVX, see http://software.intel.com/en-us/avx/. Although AMD has adopted Intel’s SSE3 and earlier instructions in the past, instead of adopting SSE4 it created a different set of only four instructions that it calls SSE4a. AMD had planned to develop its own instruction set called SSE5 and release it as part of its new “Bulldozer” processor architecture, but it decided to shelve SSE5 and create new instruction sets that use coding compatible with AVX. The new instruction sets include
- XOP—Integer vector instructions
- FMA4—Floating point instructions
- CVT16—Half-precision floating point conversion
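Software typically probes for these extensions at runtime before choosing an optimized code path. As a hedged, Linux-only sketch (the `flags` line of /proc/cpuinfo mirrors the CPUID feature bits; other platforms need a different mechanism, such as a direct CPUID check):

```python
def cpu_flags(path="/proc/cpuinfo"):
    # Return the set of feature flags reported for the first CPU,
    # or an empty set if the file is unavailable (non-Linux systems).
    try:
        with open(path) as f:
            for line in f:
                if line.startswith("flags"):
                    return set(line.split(":", 1)[1].split())
    except OSError:
        pass
    return set()

flags = cpu_flags()
for ext in ("mmx", "sse", "sse2", "ssse3", "sse4_1", "sse4_2", "avx"):
    print(ext, "yes" if ext in flags else "not reported")
```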
3DNow!
3DNow! technology was originally introduced as AMD’s alternative to the SSE instructions in the Intel processors. It included three generations: 3DNow!, Enhanced 3DNow!, and 3DNow! Professional (which added full support for SSE). AMD announced in August 2010 that it was dropping support for 3DNow!-specific instructions in upcoming processors.
Dynamic Execution
First used in the P6 (or sixth-generation) processors, dynamic execution enables the processor to execute more instructions in parallel, so tasks are completed more quickly. This technology innovation is composed of three main elements:
- Multiple branch prediction—Predicts the flow of the program through several branches
- Dataflow analysis—Schedules instructions to be executed when ready, independent of their order in the original program
- Speculative execution—Increases the rate of execution by looking ahead of the program counter and executing instructions that are likely to be necessary
Branch Prediction
Branch prediction is a feature formerly found only in high-end mainframe processors. It enables the processor to keep the instruction pipeline full while running at a high rate of speed. A special fetch/decode unit in the processor uses a highly optimized branch-prediction algorithm to predict the direction and outcome of the instructions being executed through multiple levels of branches, calls, and returns. It is similar to a chess player working out multiple strategies in advance of game play by predicting the opponent’s strategy several moves into the future. By predicting the instruction outcome in advance, the instructions can be executed with no waiting.
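The flavor of such an algorithm can be illustrated with the textbook two-bit saturating counter, one simple building block of dynamic branch predictors. This is a simulation for illustration only, not Intel’s actual (far more elaborate) algorithm:

```python
# Two-bit saturating-counter branch predictor: states 0-1 predict
# "not taken", states 2-3 predict "taken". Each actual outcome nudges
# the counter, so a branch must mispredict twice in a row before the
# prediction flips — well suited to loop branches.
def predict_run(outcomes):
    state, correct = 2, 0          # start in weakly "taken"
    for taken in outcomes:
        prediction = state >= 2
        correct += (prediction == taken)
        state = min(state + 1, 3) if taken else max(state - 1, 0)
    return correct

# A loop branch taken 9 times, then not taken on loop exit: the
# predictor is right on every iteration except the final exit.
loop = [True] * 9 + [False]
print(predict_run(loop), "of", len(loop), "predicted correctly")   # 9 of 10
```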
Dataflow Analysis
Dataflow analysis studies the flow of data through the processor to detect any opportunities for out-of-order instruction execution. A special dispatch/execute unit in the processor monitors many instructions and can execute these instructions in an order that optimizes the use of the multiple superscalar execution units. The resulting out-of-order execution of instructions can keep the execution units busy even when cache misses and other data-dependent instructions might otherwise hold things up.
Speculative Execution
Speculative execution is the processor’s capability to execute instructions in advance of the actual program counter. The processor’s dispatch/execute unit uses dataflow analysis to execute all available instructions in the instruction pool and store the results in temporary registers. A retirement unit then searches the instruction pool for completed instructions that no longer have data dependencies on other instructions or unresolved branch predictions. If any such completed instructions are found, the retirement unit commits their results to memory or the appropriate standard Intel architecture registers in the order they were originally issued. The instructions are then retired from the pool.
Dynamic execution essentially removes the constraint and dependency on linear instruction sequencing. By promoting out-of-order instruction execution, it can keep the instruction units working rather than waiting for data from memory. Even though instructions can be predicted and executed out of order, the results are committed in the original order so they don’t disrupt or change program flow. This enables the P6 to run existing Intel architecture software exactly as the P5 (Pentium) and previous processors did—just a whole lot more quickly!
Dual Independent Bus Architecture
The Dual Independent Bus (DIB) architecture was first implemented in the sixth-generation processors from Intel and AMD. DIB was created to improve processor bus bandwidth and performance. Having two (dual) independent data I/O buses enables the processor to access data from either of its buses simultaneously and in parallel, rather than in a singular sequential manner (as in a single-bus system). The main (often called front-side) processor bus is the interface between the processor and the motherboard or chipset. The second (back-side) bus in a processor with DIB is used for the L2 cache, enabling it to run at much greater speeds than if it were to share the main processor bus.
Two buses make up the DIB architecture: the L2 cache bus and the main CPU bus, often called FSB (front side bus). The P6 class processors, from the Pentium Pro to the Core 2, as well as Athlon 64 processors can use both buses simultaneously, eliminating a bottleneck there. The dual bus architecture enables the L2 cache of the newer processors to run at full speed inside the processor core on an independent bus, leaving the main CPU bus (FSB) to handle normal data flowing in and out of the chip. The two buses run at different speeds. The front-side bus or main CPU bus is coupled to the speed of the motherboard, whereas the back-side or L2 cache bus is coupled to the speed of the processor core. As the frequency of processors increases, so does the speed of the L2 cache.
DIB also enables the system bus to perform multiple simultaneous transactions (instead of singular sequential transactions), accelerating the flow of information within the system and boosting performance. Overall, DIB architecture offers up to three times the bandwidth performance over a single-bus architecture processor.
HT Technology
Intel’s Hyper-Threading (HT) Technology allows a single processor or processor core to handle two independent sets of instructions at the same time. In essence, HT Technology converts a single physical processor core into two virtual processors.
HT Technology was introduced on Xeon workstation-class processors with a 533 MHz system bus in March 2002. It found its way into standard desktop PC processors starting with the Pentium 4 3.06 GHz processor in November 2002. HT Technology predates multicore processors, so processors that have multiple physical cores, such as the Core 2 and Core i Series, may or may not support this technology depending on the specific processor version. A quad-core processor that supports HT Technology (like the Core i Series) appears as an eight-core processor to the OS; similarly, Intel’s six-core Core i7-3970X supports up to 12 threads. Internally, an HT-enabled processor has two sets of general-purpose registers, control registers, and other architecture components for each core, but both logical processors share the same cache, execution units, and buses. During operations, each logical processor handles a single thread.

A processor with HT Technology enabled can fill otherwise-idle time with a second process for each core, improving multitasking and the performance of single multithreaded applications.
Although the sharing of some processor components means that an HT-enabled system isn’t as fast as a processor with the same number of physical cores, speed increases of 25% or more are possible when multiple applications or multithreaded applications are being run.
To take advantage of HT Technology, you need the following:
- Processor supporting HT Technology—This includes many (but not all) Core i Series, Pentium 4, Xeon, and Atom processors. Check the specific processor model’s specifications to be sure.
- Compatible chipset—Some older chipsets may not support HT Technology.
- BIOS support to enable/disable HT Technology—Make sure you enable HT Technology in the BIOS Setup.
- HT Technology-enabled OS—Windows XP and later support HT Technology. Linux distributions based on kernel 2.4.18 and higher also support HT Technology.
To see if HT Technology is functioning properly, you can check Device Manager in Windows to see how many processors are recognized. When HT is supported and enabled, the Windows Device Manager shows twice as many processors as there are physical processor cores.
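Outside of Device Manager, most operating systems expose the logical processor count programmatically; for example, in Python (a quick, portable check):

```python
import os

# Number of logical processors the OS sees. On an HT-enabled CPU this
# is twice the physical core count (e.g., a quad-core with HT reports
# 8 logical processors), matching what Device Manager shows.
logical = os.cpu_count()
print("Logical processors reported by the OS:", logical)
```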
Multicore Technology
HT Technology simulates two processors in a single physical core. If multiple logical processors are good, having two or more physical processors is even better. A multicore processor, as the name implies, actually contains two or more processor cores in a single processor package. From outward appearances, it still looks like a single processor (and is considered as such for Windows licensing purposes), but inside there can be two, three, four, or even more processor cores. A multicore processor provides virtually all the advantages of having multiple separate physical processors, all at a much lower cost.
Both AMD and Intel introduced the first dual-core x86-compatible desktop processors in May 2005. AMD’s initial entry was the Athlon 64 X2, whereas Intel’s first dual-core processors were the Pentium Extreme Edition 840 and the Pentium D. The Extreme Edition 840 was notable for also supporting HT Technology, allowing it to appear as a quad-core processor to the OS. These processors combined 64-bit instruction capability with dual internal cores—essentially two processors in a single package. These chips were the start of the multicore revolution, which has continued by adding more cores along with additional extensions to the instruction set. Intel introduced the first quad-core processors in November 2006, called the Core 2 Extreme QX and Core 2 Quad. AMD subsequently introduced its first quad-core desktop PC processor in November 2007, called the Phenom.
Note: There has been some confusion about Windows and multi-core or Hyper-Threaded processors. Windows XP and later Home editions support only one physical CPU, whereas Windows Professional, Business, Enterprise, and Ultimate editions support two physical CPUs. Even though the Home editions support only a single physical CPU, if that chip is a multicore processor with HT Technology, all the physical and virtual cores are supported. For example, if you have a system with a quad-core processor supporting HT Technology, Windows Home editions will see it as eight processors, and all of them will be supported. If you had a motherboard with two of these CPUs installed, Windows Home editions would see the eight physical/virtual cores in the first CPU, whereas Professional, Business, Enterprise, and Ultimate editions would see all 16 cores in both CPUs.
Multi-core processors are designed for users who run multiple programs at the same time or who use multithreaded applications, which pretty much describes all users these days. A multithreaded application can run different parts of the program, known as threads, at the same time in the same address space, sharing code and data. A multithreaded program runs faster on a multicore processor or a processor with HT Technology enabled than on a single-core or non-HT processor.
The diagram below illustrates how a single-core processor (left) and a dual-core processor (right) handle multitasking:
It’s important to realize that multicore processors don’t improve single-task performance much. If you play non-multithreaded games on your PC, it’s likely that you would see little advantage in a multi-core or hyperthreaded CPU. Fortunately, more and more software (including games) is designed to be multithreaded to take advantage of multi-core processors. The program is broken into multiple threads, all of which can be divided among the available CPU cores.
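The structure of such a multithreaded program can be sketched in Python. (Note that CPU-bound pure-Python threads are serialized by the interpreter’s global lock, so the real speedup appears in compiled code or native libraries; the thread structure, though, is the same.)

```python
import threading

# Sketch of a multithreaded workload: the task is split into threads
# that the OS scheduler can place on separate cores (or HT logical
# processors), each computing a partial result.
def partial_sum(data, out, idx):
    out[idx] = sum(data)

data = list(range(1000))
chunks = [data[i::4] for i in range(4)]     # split the work four ways
results = [0] * 4
threads = [threading.Thread(target=partial_sum, args=(c, results, i))
           for i, c in enumerate(chunks)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sum(results))   # same answer as a single-threaded sum: 499500
```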
Virtualization Support
The ability to run multiple operating systems on a single computer, a technology known as virtualization, was originally developed for IBM mainframe computers in 1965. However, it has also been available on PCs for over a decade. By running multiple operating systems and applications on one machine, you can use a single computer for multiple tasks, support legacy applications that no longer run on a current operating system, and make technical support of a wide range of operating systems, utilities, applications, and web browser versions feasible from a single system.
Virtualization can take two forms:
- Hypervisor/client
- Host/guest
With either type of virtualization, a program called a virtual machine manager (VMM) is used to create and manage a virtual machine (VM), which is a section of RAM and hard disk space set aside for use by an operating system and its applications. After a VM is created and started, the user can install an operating system supported by the VMM, install any utilities designed to make the operating system work better inside the VM, and use the VM as if it were the only operating system running on the hardware. The disk space used by a VM is usually dynamic, expanding from a small size to a preset maximum only as needed, and the RAM allocated to a VM is available for other processes when the VM is closed.
In hypervisor/client virtualization, the virtual machine manager (VMM) runs directly on the hardware, enabling VMs to go through fewer layers of emulation for faster performance than with host/guest virtualization. This type of virtualization is sometimes referred to as Type 1 or bare-metal virtualization. This is the type of virtualization performed by Microsoft Hyper-V and most server-class virtualizers.
In host/guest virtualization, a host operating system runs a virtual machine manager (VMM) program. The VMM is used to create and manage operating systems loaded as guests. In this type of virtualization, the connections between hardware and the virtual machine must pass through two layers: the device drivers used by the host operating system and the virtualized drivers used by the VMM. The multilayered nature of host/guest virtualization makes this type of virtualization relatively slow. It is also referred to as Type 2 virtualization, and Microsoft Virtual PC and Windows Virtual PC are examples of this type of virtualization.
Most virtualization programs for Windows-based PCs, such as Microsoft Virtual PC 2004 and 2007, use host/guest virtualization. To enable virtualization to run faster and make virtualization more useful, both Intel and AMD have added hardware-assisted virtualization support to their processors. The original edition of Windows Virtual PC for Windows 7 required the use of processors with hardware-assisted virtualization support with the BIOS configured to enable this feature. Although the current version of Windows Virtual PC does not require the use of processors with hardware-assisted virtualization, this feature is highly desirable for any computer that will be used for virtualization.
Note: Windows 8 Pro does not support Windows Virtual PC. Instead, it includes Client Hyper-V, which you can enable through the Turn Windows Features On or Off dialog in the Control Panel.
AMD-V
AMD refers to its hardware-assisted virtualization support as AMD-V, although BIOS setup programs might also identify it as VMM or virtualization. Support for AMD-V is almost universal across its processors starting with models supporting Socket AM2 and their mobile counterparts up through current processors for AM3+ and FM2 sockets.
Intel VT-x and VT-d
Intel refers to its hardware-assisted virtualization support as Intel VT-x. Intel supports VT-x on all of its second-generation and third-generation Core i3/i5/i7 processors, and on certain models in older product families. Intel processors with VT-d support also virtualize directed I/O for faster performance of I/O devices in a virtualized environment.
VIA VT
VIA Technologies refers to its hardware-assisted virtualization support as VIA VT. It is present in all Nano-family processors as well as QuadCore and Eden X2 processor families.
Enabling Hardware-Assisted Virtualization Support
To enable hardware-assisted virtualization support on a computer, the following must occur:
- The installed processor must support hardware-assisted virtualization
- The BIOS must support hardware-assisted virtualization
- The BIOS settings for hardware-assisted virtualization must be enabled
- A VMM that supports hardware-assisted virtualization must be installed
Note: To determine whether a system includes a processor that supports hardware-assisted virtualization, use a utility such as CPU-Z, and check the BIOS settings.
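On Linux, the same information CPU-Z reports is visible in /proc/cpuinfo: the `vmx` flag indicates Intel VT-x support and `svm` indicates AMD-V. This is a hedged sketch—the BIOS can still have the feature disabled even when the processor flag is present:

```python
# Check /proc/cpuinfo for the hardware-virtualization feature flags
# ("vmx" for Intel VT-x, "svm" for AMD-V). Returns None on systems
# without /proc or without either flag.
def hw_virt_support(path="/proc/cpuinfo"):
    try:
        with open(path) as f:
            tokens = set(f.read().split())
    except OSError:
        return None                      # not a Linux /proc system
    if "vmx" in tokens:
        return "Intel VT-x"
    if "svm" in tokens:
        return "AMD-V"
    return None

print(hw_virt_support() or "no hardware virtualization flag reported")
```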
The following sections discuss the major features of these processors and the different approaches Intel and AMD take to bring 64-bit multicore computing to the PC.
Processor Socket and Slot Types
Intel and AMD have created a set of sockets and slots for their processors. Each socket or slot is designed to support a different range of original and upgrade processors. The table below shows the designations for the various standard processor sockets/slots and lists the chips that drop into them.
| Chip Class | Socket | Pins | Layout | Supported Processors | Introduced |
|---|---|---|---|---|---|
| Intel P4/Core | 423 | 423 | 39x39 SPGA | Pentium 4 FC-PGA | Nov. 2000 |
| | 478 | 478 | 26x26 mPGA | Pentium 4/Celeron FC-PGA2, Celeron D | Oct. 2001 |
| | T (LGA 775) | 775 | 30x33 LGA | Pentium 4/Extreme Edition, Pentium D, Celeron D, Pentium dual-core, Core 2 | June 2004 |
| | LGA 1156 (Socket H) | 1156 | 40x40 LGA | Pentium, Core i3/i5/i7, Xeon | Sept. 2009 |
| | LGA 1366 (Socket B) | 1366 | 41x43 LGA | Core i7, Xeon | Nov. 2008 |
| | LGA 1155 (Socket H2) | 1155 | 40x40 LGA | Core i3/i5/i7 | Jan. 2011 |
| | LGA 2011 | 2011 | 58x43 hexLGA | Core i7 | Nov. 2011 |
| AMD K8 | 754 | 754 | 29x29 mPGA | Athlon 64 | Sept. 2003 |
| | 939 | 939 | 31x31 mPGA | Athlon 64 v.2 | June 2004 |
| | 940 | 940 | 31x31 mPGA | Athlon 64 FX, Opteron | Apr. 2003 |
| | AM2 | 940 | 31x31 mPGA | Athlon 64/64 FX/64 X2, Sempron, Opteron, Phenom | May 2006 |
| | AM2+ | 940 | 31x31 mPGA | Athlon 64/64 X2, Opteron, Phenom X2/X3/X4, Phenom II X4 | Nov. 2007 |
| | AM3 | 941 | 31x31 mPGA | Athlon II, Phenom II, Sempron | Feb. 2009 |
| | AM3+ | 942 | 31x31 mPGA | "Bulldozer" processors | Mid-2011 |
| | F (1207 FX) | 1207 | 35x35 LGA | Athlon 64 FX, Opteron | Aug. 2006 |
| AMD A | FM1 | 905 | 31x31 mPGA | A4, A6, A8, Athlon II, E2, Sempron | Jul. 2011 |
| | FM2 | 904 | 31x31 mPGA | A4, A6, A8, A10 | Sept. 2012 |
Sockets 1, 2, 3, and 6 are 486 processor sockets and are shown together in the figure below so you can see the overall size comparisons and pin arrangements between these sockets.
486 Processor Sockets
Sockets 4, 5, 7, and 8 are Pentium and Pentium Pro processor sockets and are shown together in the figure below so you can see the overall size comparisons and pin arrangements between these sockets.
Pentium And Pentium Pro Processor Sockets
When the Socket 1 specification was created, manufacturers realized that if users were going to upgrade processors, they had to make the process easier. The socket manufacturers found that 100 lbs. of insertion force is required to install a chip in a standard 169-pin Socket 1 motherboard. With this much force involved, you could easily damage either the chip or the socket during removal or reinstallation. Because of this, some motherboard manufacturers began using low insertion force (LIF) sockets, which required a smaller 60 lbs. of insertion force for a 169-pin chip. Pressing down on the motherboard with 60–100 lbs. of force can crack the board if it is not supported properly. A special tool is also required to remove a chip from one of these sockets. As you can imagine, even the LIF sockets were only a relative improvement, and a better solution was needed if the average person was ever going to replace their CPU.
Manufacturers began using ZIF sockets in Socket 1 designs, and all processor sockets from Socket 2 and higher have been of the ZIF design. ZIF is required for all the higher-density sockets because the insertion force would simply be too great otherwise. ZIF sockets almost eliminate the risk involved in installing or removing a processor because no insertion force is necessary to install the chip and no tool is needed to extract one. Most ZIF sockets are handle-actuated: You lift the handle, drop the chip into the socket, and then close the handle. This design makes installing or removing a processor easy.
The following sections take a closer look at those socket designs you are likely to encounter in active PCs.
Socket LGA 775
Socket LGA 775 (also called Socket T) is used by the Core 2 Duo/Quad processors, the most recent versions of the Intel Pentium 4 Prescott processor, and the Pentium D and Pentium Extreme Edition processors. Some versions of the Celeron and Celeron D also use Socket LGA 775. Socket LGA 775, unlike earlier Intel processor sockets, uses a land grid array format, so the pins are on the socket, rather than the processor.
LGA uses gold pads (called lands) on the bottom of the processor to replace the pins used in PGA packages. It allows for much greater clamping forces via a load plate with a locking lever, with greater stability and improved thermal transfer (better cooling). The first LGA processors were the Pentium II and Celeron processors in 1997; in those processors, an LGA chip was soldered on the Slot-1 cartridge. LGA is a recycled version of what was previously called leadless chip carrier (LCC) packaging. This was used way back on the 286 processor in 1984, and it had gold lands around the edge only. (There were far fewer pins back then.) In other ways, LGA is simply a modified version of ball grid array (BGA), with gold lands replacing the solder balls, making it more suitable for socketed (rather than soldered) applications. Socket LGA 775 is shown in the figure below.
Socket LGA775 (Socket T)
The release lever on the left raises the load plate out of the way to permit the processor to be placed over the contacts.
Socket LGA 1156
Socket LGA 1156 (also known as Socket H) was introduced in September 2009 and was designed to support Intel Core ix-series processors featuring an integrated chipset northbridge, including a dual-channel DDR3 memory controller and optional integrated graphics. Socket LGA 1156 uses a land grid array format, so the pins are on the socket, rather than the processor. Socket LGA 1156 is shown in the figure below.
Socket LGA1156 (Socket H)
Because the processor includes the chipset northbridge, Socket LGA 1156 is designed to interface between a processor and a Platform Controller Hub (PCH), which is the new name used for the southbridge component in supporting 5x series chipsets. The LGA 1156 interface includes the following:
- PCI Express x16 v2.0—For connection to either a single PCIe x16 slot, or two PCIe x8 slots supporting video cards.
- DMI (Direct Media Interface)—For data transfer between the processor and the PCH. DMI in this case is essentially a modified PCI Express x4 v2.0 connection, with a bandwidth of 2 GB/s.
- DDR3 dual-channel—For direct connection between the memory controller integrated into the processor and DDR3 SDRAM modules in a dual-channel configuration.
- FDI (Flexible Display Interface)—For the transfer of digital display data between the (optional) processor integrated graphics and the PCH.
When processors with integrated graphics are used, the Flexible Display Interface carries digital display data from the GPU in the processor to the display interface circuitry in the PCH. Depending on the motherboard, the display interface can support DisplayPort, High Definition Multimedia Interface (HDMI), Digital Visual Interface (DVI), or Video Graphics Array (VGA) connectors.
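The 2 GB/s DMI figure quoted above follows directly from PCI Express 2.0 signaling arithmetic; a quick sketch (the 5 GT/s lane rate and 8b/10b encoding are standard PCIe 2.0 parameters, not specific to this text):

```python
# DMI on LGA 1156 is essentially a PCIe x4 v2.0 link.
# PCIe 2.0 signals at 5 GT/s per lane with 8b/10b encoding,
# so only 8 of every 10 transferred bits carry data.
lanes = 4
raw_gt_per_s = 5.0            # gigatransfers/sec per lane
encoding_efficiency = 8 / 10  # 8b/10b line code
bits_per_byte = 8

gb_per_s = lanes * raw_gt_per_s * encoding_efficiency / bits_per_byte
print(gb_per_s)  # 2.0 GB/s per direction, matching the DMI figure above
```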
Socket LGA 1366
Socket LGA 1366 (also known as Socket B) was introduced in November 2008 to support high-end Intel Core i7-series processors, which include an integrated triple-channel DDR3 memory controller but still require an external chipset northbridge, in this case called an I/O Hub (IOH). Socket LGA 1366 uses a land grid array format, so the pins are on the socket, rather than the processor. Socket LGA 1366 is shown in the figure below.
Socket LGA1366 (Socket B)
Socket LGA 1366 is designed to interface between a processor and an IOH, which is the new name used for the northbridge component in supporting 5x-series chipsets. The LGA 1366 interface includes the following:
- QPI (Quick Path Interconnect)—For data transfer between the processor and the IOH. QPI transfers two bytes per cycle at either 4.8 or 6.4 GT/s, resulting in a bandwidth of 9.6 or 12.8 GB/s.
- DDR3 triple-channel—For direct connection between the memory controller integrated into the processor and DDR3 SDRAM modules in a triple-channel configuration.
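The QPI bandwidth figures above are easy to verify: two bytes per transfer multiplied by the transfer rate gives the quoted per-direction bandwidth.

```python
# QPI is a 16-bit (2-byte) wide point-to-point link; per-direction
# bandwidth is simply bytes per transfer times the transfer rate.
def qpi_bandwidth_gb_s(gt_per_s, bytes_per_transfer=2):
    """Per-direction QPI bandwidth in GB/s."""
    return bytes_per_transfer * gt_per_s

print(qpi_bandwidth_gb_s(4.8))  # 9.6 GB/s
print(qpi_bandwidth_gb_s(6.4))  # 12.8 GB/s
```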
LGA 1366 is designed for high-end PC, workstation, or server use. It supports configurations with multiple processors.
Socket LGA 1155
Socket LGA 1155 (also known as Socket H2) was introduced in January 2011 to support Intel’s Sandy Bridge (second-generation) Core ix-series processors, which now include Turbo Boost overclocking. Socket LGA 1155 uses a land grid array format, so the pins are on the socket, rather than the processor. Socket LGA 1155 uses the same cover plate as Socket 1156, but is not interchangeable with it. Socket LGA 1155 is also used by Intel’s Ivy Bridge (third-generation) Core ix-series processors. LGA 1155 supports up to 16 PCIe 3.0 lanes and eight PCIe 2.0 lanes.
Socket LGA 1155 is shown in the figure below.
Socket LGA1155 (Socket H2) before installing a processor.
Socket LGA 2011
Socket LGA 2011 was introduced in November 2011 to support high-performance versions of Intel’s Sandy Bridge (second-generation) Core ix-series processors (Sandy Bridge-E), which now include Turbo Boost overclocking. LGA 2011 supports 40 PCIe 3.0 lanes, quad-channel memory addressing, and fully unlocked processor multipliers.
Socket LGA 2011 uses a land grid array format, so the pins are on the socket, rather than the processor. Socket LGA 2011 is shown in the figure below.
Socket LGA2011 before installing a processor.
Socket AM2/AM2+/AM3/AM3+
In May 2006, AMD introduced processors that use a new socket, called Socket AM2 (see figure below). AM2 was the first replacement for the confusing array of Socket 754, Socket 939, and Socket 940 form factors for the Athlon 64, Athlon 64 FX, and Athlon 64 X2 processors.
Socket AM2/AM2+: The arrow (triangle) at the lower left indicates pin 1.
Although Socket AM2 contains 940 pins—the same number that Socket 940 uses—Socket AM2 is designed to support the integrated dual-channel DDR2 memory controllers that were added to the Athlon 64 and Opteron processor families in 2006. Processors designed for Sockets 754, 939, and 940 include DDR memory controllers and are not pin compatible with Socket AM2. Sockets 939, 940, and AM2 support HyperTransport v2.0, which limits most processors to a 1 GHz FSB.
Socket AM2+ is an upgrade to Socket AM2 that was released in November 2007. Although Sockets AM2 and AM2+ are physically the same, Socket AM2+ adds support for split power planes and HyperTransport 3.0, allowing for FSB speeds of up to 2.6 GHz. Socket AM2+ chips are backward compatible with Socket AM2 motherboards, but only at reduced HyperTransport 2.0 FSB speeds. Socket AM2 processors can technically work in Socket AM2+ motherboards; however, this also requires BIOS support, which is not present in all motherboards.
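The HyperTransport clock speeds mentioned here translate into bandwidth in a straightforward way; a hedged sketch, assuming the common 16-bit link width (HT links can be narrower or wider):

```python
# HyperTransport is double-pumped: two transfers per clock cycle.
def ht_bandwidth_gb_s(clock_ghz, link_bits=16):
    """Per-direction bandwidth of a double-data-rate HT link in GB/s."""
    transfers_per_s = clock_ghz * 2           # DDR signaling
    return transfers_per_s * (link_bits / 8)  # bytes per transfer

print(ht_bandwidth_gb_s(1.0))  # HT 2.0 at 1 GHz   -> 4.0 GB/s
print(ht_bandwidth_gb_s(2.6))  # HT 3.0 at 2.6 GHz -> 10.4 GB/s
```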
Socket AM3 was introduced in February 2009, primarily to support processors with integrated DDR3 memory controllers such as the Phenom II. Besides adding support for DDR3 memory, Socket AM3 has 941 pins in a modified key pin configuration that physically prevents Socket AM2 or AM2+ processors from being inserted (see figure below).
Socket AM3: The arrow (triangle) at the lower left indicates pin 1.
Socket AM3+ is a modified version of AM3 designed for the new “Bulldozer” processors. It has 938 pins, and also supports processors made for AM3 sockets. The table below shows the essential differences between Socket AM2, AM2+, AM3, and AM3+:
| Socket | Pins | Supported Memory |
|---|---|---|
| AM2 | 940 | DDR2 (dual-channel) |
| AM2+ | 940 | DDR2 (dual-channel) |
| AM3 | 938 | DDR3 (dual-channel) |
| AM3+ | 938 | DDR3 (dual-channel) |
Here is a summary of the compatibility between AM2, AM2+, AM3, and AM3+ processors and motherboards:
- You cannot install Socket AM2 or AM2+ processors in Socket AM3 motherboards.
- You can install Socket AM2 processors in Socket AM2+ motherboards.
- You can install Socket AM3 or AM2+ processors in Socket AM2 motherboards; however, the BIOS must support the processor, the FSB will run at lower HT 2.0 speeds, and only DDR2 memory is supported.
- You can install Socket AM3 processors in Socket AM2+ motherboards, but the BIOS must support the processor, and only DDR2 memory is supported.
- You can install Socket AM3 processors in Socket AM3+ motherboards, but the BIOS must support the processor.
Although you can physically install newer processors in motherboards with older sockets, and they should theoretically work with reductions in bus speeds and memory support, this also requires BIOS support in the specific motherboard, which may be lacking. In general, you are best off matching the processor to a motherboard with the same type of socket.
Socket F (1207FX)
Socket F (also called 1207FX) was introduced by AMD in August 2006 for its Opteron line of server processors. Socket F is AMD’s first land grid array (LGA) socket, similar to Intel’s Socket LGA 775. It features 1207 pins in a 35-by-35 grid, with the pins in the socket instead of on the processor. Socket F normally appears on motherboards in pairs because it is designed to run dual physical processors on a single motherboard. Socket F was utilized by AMD for its Quad FX processors, which are dual-core processors sold in matched pairs, operating as a dual socket dual-core system. Future versions may support quad-core processors, for a total of eight cores in the system. Due to the high expense of running dual physical processors, only a limited number of nonserver motherboards are available with Socket F.
Socket FM1
Socket FM1 was introduced by AMD in July 2011 for use by accelerated processing units (APUs – CPU plus GPU) and CPUs based on the Llano core. These include the Ax-3xxx series APUs and some Athlon II CPUs, as well as the E2-3200 APU. FM1 has 905 pins in a 31-by-31 grid and uses a PGA socket similar to those used by previous AMD processors. Socket FM1 supports DDR3 memory. It was replaced by Socket FM2.
Socket FM2
Socket FM2 was introduced by AMD in September 2012 for use by its Trinity series of APUs. These include the Ax-5xxx series APUs. FM2 has 904 pins in a 31×31 grid and uses a PGA socket similar to those used by previous AMD processors. Socket FM2 supports DDR3 memory. The figure below illustrates Socket FM2:
Socket FM2 before installing a processor.
CPU Operating Voltages
One trend that is clear to anybody who has been following processor design is that the operating voltages keep getting lower. The benefits of lower voltage are threefold. The most obvious is that with lower voltage comes lower overall power consumption. By consuming less power, the system is less expensive to run, but more importantly for portable or mobile systems, it runs much longer on existing battery technology. The emphasis on battery operation has driven many of the advances in lowering processor voltage because this has a great effect on battery life.
The second major benefit is that with less voltage and therefore less power consumption, less heat is produced. Processors that run cooler can be packed into systems more tightly and last longer.
The third major benefit is that a processor running cooler on less power can be made to run faster. Lowering the voltage has been one of the key factors in enabling the clock rates of processors to go higher and higher. This is because the lower the voltage, the shorter the time needed to change a signal from low to high.
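These benefits can be made concrete with the standard dynamic-power approximation for CMOS logic, P ≈ C × V² × f (a simplified model that ignores static leakage; the example voltages below are illustrative, not tied to any specific processor):

```python
# Dynamic power in CMOS logic scales roughly as P = C * V^2 * f,
# so power falls with the square of the operating voltage.
def relative_dynamic_power(v_new, v_old):
    """Ratio of dynamic power after a voltage change, same C and f."""
    return (v_new / v_old) ** 2

# Illustrative drop from a 3.3 V to a 1.8 V core:
ratio = relative_dynamic_power(1.8, 3.3)
print(f"{(1 - ratio) * 100:.0f}% less dynamic power")  # prints "70% less dynamic power"
```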
Starting with the Pentium Pro, all newer processors automatically determine their voltage settings by controlling the motherboard-based voltage regulator. That’s done through built-in VID pins.
For overclocking purposes, many motherboards have override settings that allow for manual voltage adjustment if desired. Many people have found that, when attempting to overclock a processor, increasing the voltage by a tenth of a volt or so often helps. Of course, this increases the heat output of the processor and must be accounted for with adequate cooling.
Note: Although modern processors use VID pins to enable them to select the correct voltage, newer processors that use the same processor socket as older processors might use a voltage setting the motherboard does not support. Before upgrading an existing motherboard with a new processor, make sure the motherboard will support the processor’s voltage and other features. You might need to install a BIOS upgrade before upgrading the processor to ensure that the motherboard properly recognizes the processor.
Math Coprocessors (Floating-Point Units)
Older CPUs designed by Intel (and cloned by other companies) used an external math coprocessor chip to perform floating-point operations. However, when Intel introduced the 486DX, it included a built-in math coprocessor, and every processor built by Intel (and AMD and VIA/Cyrix, for that matter) since then includes a math coprocessor. Coprocessors provide hardware for floating-point math, which otherwise would create an excessive drain on the main CPU. Math chips speed your computer’s operation only when you are running software designed to take advantage of the coprocessor.
Note: Most applications that formerly used x87 floating-point math now use MMX/SSE instructions instead. These instructions are faster and produce more consistent results than x87 floating-point math.
Processor manufacturers use specialized equipment to test their own processors, but you have to settle for a little less. The best processor-testing device to which you have access is a system that you know is functional; you then can use the diagnostics available from various utility software companies or your system manufacturer to test the motherboard and processor functions.
Processors can also harbor design defects; perhaps the most infamous is the floating-point division bug in the early Pentium processors. This and a few other bugs are discussed in detail later in this chapter.
Because the processor is the brain of a system, most systems don’t function with a defective processor. If a system seems to have a dead motherboard, try replacing the processor with one from a functioning motherboard that uses the same CPU chip. You might find that the processor in the original board is the culprit. If the system continues to play dead, however, the problem is elsewhere, most likely in the motherboard, memory, or power supply. See the chapters that cover those parts of the system for more information on troubleshooting those components. I must say that in all my years of troubleshooting and repairing PCs, I have rarely encountered defective processors.
A few system problems are built in at the factory, although these bugs or design defects are rare. By learning to recognize these problems, you can avoid unnecessary repairs or replacements. Each processor section describes several known defects in that generation of processors, such as the infamous floating-point error in the Pentium. For more information on these bugs and defects, see the following sections, and check with the processor manufacturer for updates.
Microcode and the Processor Update Feature
All processors can contain design defects or errors. Many times, you can avoid the effects of any given bug by implementing hardware or software workarounds. Intel documents these bugs and workarounds well for its processors in the processor Specification Update manual that is available from Intel’s website. Most of the other processor manufacturers also have bulletins or tips on their websites listing any problems or special fixes or patches for their chips.
Previously, the only way to fix a processor bug was to work around it or replace the chip with one that had the bug fixed. Starting with the Intel P6 and P7 family processors, including the Pentium Pro through Pentium D and Core i7, many bugs in a processor’s design can be fixed by altering the microcode in the processor. Microcode is essentially a set of instructions and tables in the processor that control the way the processor operates. These processors incorporate a new feature called reprogrammable microcode, which enables certain types of bugs to be worked around via microcode updates. The microcode updates reside in either the motherboard ROM BIOS or Windows updates and are loaded into the processor by the motherboard BIOS during the POST or by Windows during the boot process. Each time the system is rebooted, the updated microcode is reloaded, ensuring that it will have the bug fix installed anytime the system is operating.
The updated microcode for a given processor is provided by Intel to either the motherboard manufacturers or to Microsoft so the code can be incorporated into the flash ROM BIOS for the board, or directly into Windows via Windows Update. This is one reason it is important to keep Windows up to date, as well as to install the most recent motherboard BIOS for your systems. Because it is easier for most people to update Windows than to update the motherboard BIOS, it seems that more recent microcode updates are being distributed via Microsoft than the motherboard manufacturers.
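On Linux, you can see which microcode revision the BIOS or OS has loaded by reading /proc/cpuinfo; a minimal sketch (the parsing assumes the usual "microcode : 0x..." line format, and the helper name is mine, not from any standard tool):

```python
def loaded_microcode_revision(cpuinfo_path="/proc/cpuinfo"):
    """Return the microcode revision reported for the first CPU, or None."""
    try:
        with open(cpuinfo_path) as f:
            for line in f:
                # /proc/cpuinfo lines look like "microcode\t: 0x28"
                if line.startswith("microcode"):
                    return line.split(":", 1)[1].strip()
    except OSError:
        return None  # no /proc/cpuinfo (non-Linux system)
    return None

print(loaded_microcode_revision())
```

Comparing this value before and after a BIOS or Windows update is a quick way to confirm that a newer microcode revision actually took effect.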
The Core i processor family replaced the Core 2 and includes two different microarchitectures: the first generation of Core i processors is based on the Nehalem microarchitecture, and the second generation uses the Sandy Bridge microarchitecture.
Nehalem Architecture
The Nehalem microarchitecture’s key features include the integration of the memory controller into the processor, and in some models, the entire northbridge including an optional graphics processor in a separate core on the processor die. The first Core i-series processor was the Core i7 introduced in November 2008. Initially built on a 45 nm process, later Core i-series processors were built on an improved 32 nm process allowing for smaller die, lower power consumption, and greater performance. All support DDR3 memory and include L3 cache, and some models include support for HT Technology. See the following table for details.
There are two main variants in the first-generation (Nehalem) Core i-series Family: high-end versions that use Socket LGA 1366 and more mainstream models that use Socket LGA 1156. The latter mainstream models include a fully integrated northbridge, including a dual-channel DDR3 memory controller, graphics interface, and even an optional full-blown graphics processor. Because the entire northbridge functionality is integrated into the processor, Socket LGA 1156 chips use a slower 2 GB/s DMI as the FSB connection to the Platform Controller Hub component on the motherboard.
Core i 900-series processors using Socket LGA 1366 include a triple-channel DDR3 memory controller and a high-performance FSB called QPI (Quick Path Interconnect) that connects to the northbridge component (called an I/O Hub or IOH) on the motherboard. The IOH implements the PCIe graphics interface.
Core i7 and i5 processors also support Turbo Boost (built-in overclocking), which increases the performance in heavily loaded processor cores while reducing performance to cores that are lightly loaded or have no work to perform. Turbo Boost is configured through the system BIOS.
The table below lists the Intel Core i-series family processors using Nehalem microarchitecture:
| Processor | Cores | CPU Speed | L2 | L3 | Core | Process | Power | HTT | Socket |
|---|---|---|---|---|---|---|---|---|---|
| Core i7 9xxX EE | 6 | 3.33-3.46 GHz | 1.5 MB | 12 MB | Gulftown | 32 nm | 130 W | Yes | LGA 1366 |
| Core i7 9xx EE | 4 | 3.2-3.33 GHz | 1 MB | 8 MB | Bloomfield | 45 nm | 130 W | Yes | LGA 1366 |
| Core i7 970 | 6 | 3.2 GHz | 1.5 MB | 12 MB | Gulftown | 32 nm | 130 W | Yes | LGA 1366 |
| Core i7 9xx | 4 | 2.66-3.2 GHz | 1 MB | 8 MB | Bloomfield | 45 nm | 130 W | Yes | LGA 1366 |
| Core i7 8xx | 4 | 2.66-3.06 GHz | 1 MB | 8 MB | Lynnfield | 45 nm | 82-95 W | Yes | LGA 1156 |
| Core i5 7xx | 4 | 2.4-2.8 GHz | 1 MB | 8 MB | Lynnfield | 45 nm | 95 W | No | LGA 1156 |
| Core i5 6xx | 2 | 3.2-3.6 GHz | 1 MB | 4 MB | Clarkdale* | 32 nm | 73-87 W | Yes | LGA 1156 |
| Core i3 5xx | 2 | 2.93-3.33 GHz | 1 MB | 4 MB | Clarkdale | 32 nm | 73 W | Yes | LGA 1156 |
* This CPU core also used by Pentium Processor G6950-60
The initial members of the Core i-series family included the Core i5 and i7 processors. These were later joined by the low-end i3 processors.
Sandy Bridge Architecture
Intel introduced the second generation of Core i-series processors, those based on the Sandy Bridge microarchitecture, in January 2011. The Sandy Bridge microarchitecture includes, as its predecessor did, an integrated memory controller and northbridge functions.
However, Sandy Bridge has many new features, including an in-core graphics processor on some models; the new AVX 256-bit SSE extensions; a new instruction cache for holding up to 1500 decoded micro-ops; a more accurate branch prediction unit; the use of physical registers to store operands; improved power management; Turbo Boost 2.0 for more scaled responses to adjustments in core usage, processor temperature, current, power consumption, and operating system states; and a dedicated video decoding/transcoding/encoding unit known as the multi-format codec (MFX). All Sandy Bridge processors use a 32 nm manufacturing process.
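The practical effect of AVX's wider registers is easy to quantify: a 256-bit AVX register holds twice as many packed elements as SSE's 128-bit registers, so each vector instruction can process twice the data. A quick sketch:

```python
# A SIMD instruction operates on as many elements as fit in the register.
def simd_lanes(register_bits, element_bits):
    """Number of elements processed per SIMD instruction."""
    return register_bits // element_bits

# Single-precision floats are 32 bits; doubles are 64 bits.
print(simd_lanes(128, 32), simd_lanes(256, 32))  # SSE: 4, AVX: 8 floats
print(simd_lanes(128, 64), simd_lanes(256, 64))  # SSE: 2, AVX: 4 doubles
```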
The table below lists the Intel Core i-series family processors using Sandy Bridge microarchitecture:
| Processor | Cores | CPU Speed | L2 | L3 | Power | TB 2.0 | HTT | Socket |
|---|---|---|---|---|---|---|---|---|
| Core i7 39xx | 6 | 3.2–4.0 GHz | 1 MB | 12-15 MB | 130–150 W | Yes | Yes | LGA 2011 |
| Core i7 38xx | 4 | 3.6 GHz | 1 MB | 10 MB | 130 W | Yes | Yes | LGA 2011 |
| Core i7 2xxx | 4 | 2.80–3.40 GHz | 1 MB | 8 MB | 65–95 W | Yes | Yes | LGA 1155 |
| Core i5 25xx | 4 | 2.3–3.3 GHz | 1 MB | 6 MB | 45–95 W | Yes | No | LGA 1155 |
| Core i5 24xx | 4 | 2.5–3.1 GHz | 1 MB | 6 MB | 65–95 W | Yes | No | LGA 1155 |
| Core i5 2390 | 2 | 2.7 GHz | 1 MB | 3 MB | 35 W | Yes | Yes | LGA 1155 |
| Core i5 23xx | 4 | 2.8–2.9 GHz | 1 MB | 6 MB | 95 W | Yes | No | LGA 1155 |
| Core i3 21xx | 2 | 2.5–3.1 GHz | 1 MB | 3 MB | 35–65 W | No | Yes | LGA 1155 |
Sandy Bridge processors using LGA 2011 processor sockets are classified as Sandy Bridge-E.
Sandy Bridge also includes Pentium processors in the 967-997, B940-B980, G620-G645T, and G840-G870 series. These processors feature lower clock speeds, less powerful integrated GPUs, and smaller cache sizes than Core i processors. Celeron processors in the B720, 847E, 787-797, 807-887, B710, B800-B840, G440-G465, and G350-G555 series are also based on Sandy Bridge but feature smaller cache sizes and slower clock speeds than Pentium processors based on Sandy Bridge.
Ivy Bridge Architecture
Intel introduced the third generation of Core i-series processors, those based on the Ivy Bridge microarchitecture, in April 2012. The Ivy Bridge microarchitecture represents an improved version of the Sandy Bridge microarchitecture. Ivy Bridge features support for PCI Express 3.0, a new 22 nm fabrication process, lower power consumption, support for low-voltage DDR3 memory, and support for DirectX 11 graphics with integrated HD Graphics 4000. Existing Sandy Bridge motherboards can use Ivy Bridge CPUs, but a BIOS update might be needed in some cases.
The table below lists Intel Core i-series family desktop processors using Ivy Bridge microarchitecture:
| Processor | Cores | CPU Speed | L2 | L3 | Power | TB 2.0 | HTT | Socket |
|---|---|---|---|---|---|---|---|---|
| Core i7 3770 series | 4 | 2.5-3.4 GHz | 1 MB | 8 MB | 45-77 W | Yes | Yes | LGA 1155 |
| Core i5 35xx | 4 | 2.3-3.4 GHz | 1 MB | 6 MB | 45-77 W | Yes | No | LGA 1155 |
| Core i5 34xx* | 4 | 2.9-3.2 GHz | 1 MB | 6 MB | 35-77 W | Yes | No | LGA 1155 |
| Core i5 33xx | 4 | 2.7-3.1 GHz | 1 MB | 6 MB | 35-77 W | Yes | No | LGA 1155 |
| Core i3 32xx | 2 | 2.8-3.4 GHz | 1 MB | 3 MB | 35-55 W | No | Yes | LGA 1155 |
Processors with power levels below 35 W are also available but not listed here.
Pentium processors in the G2100 series also use the Ivy Bridge microarchitecture but feature smaller cache sizes and have two cores without HTT, compared to the Core i3’s two cores with HTT (four threads).
Intel Atom
Intel introduced its Atom ultra low power processors in 2008 and refreshed the line with new models with integrated graphics (D25xx series) in 2012. Although a few vendors have created very low-end desktop computers using Atom, this processor is designed primarily for netbooks, tablets, home servers, and other specialized uses. It is a 64-bit processor fully compatible with x86 and 64-bit versions of Windows and other operating systems, and some models support HT Technology. However, it supports only SSSE3 instructions, has a 4 GB memory limit, and includes only two cores.
AMD K10 Processors (Phenom, Phenom II, Athlon II, Athlon X2, Sempron)
The K9 was a stillborn project within AMD, resulting in a skip from the K8 to the K10 architecture. The first K10 processors were the Phenom models released in November 2007.
The AMD Phenom family of processors was designed as a flexible family of chips available with 1–6 K10 cores in a single die. These include the Phenom, Phenom II, Athlon II, and some models of the Athlon X2 and Sempron processors. The initial versions used Socket AM2+, which included support for DDR2 memory. Later versions used Sockets AM3 and AM3+, which support DDR3 memory. The image below is of a Phenom II X6 processor for Socket AM3:
The Phenom X3, X4, and Athlon X2 processors were made on a 65 nm process, whereas the Phenom II, Athlon II, and Sempron 1xx processors use a smaller 45 nm process, resulting in a smaller die with overall lower power consumption and higher performance. The figure below illustrates the interior design of the Phenom II X6 processor:
A simplified diagram of the Phenom II X6 core’s major components.
The higher-end chips in this family include three, four, or six cores, L3 cache, and higher clock rates and HyperTransport bus speeds (2 GT/s).
The table below provides a detailed comparison of the various AMD K10 family processors:
| Processor | Cores | CPU Speed | Turbo Core | L2 | L3 | Core | Process | Power | Socket |
|---|---|---|---|---|---|---|---|---|---|
| Phenom II X6 | 6 | 2.6-3.3 GHz | Yes | 3 MB | 6 MB | Thuban | 45 nm | 95-125 W | AM3 |
| Phenom II X4 | 4 | 2.9-3.5 GHz | Yes | 2 MB | 6 MB | Zosma | 45 nm | 95-125 W | AM3 |
| Phenom II X4* | 4 | 2.5-3.7 GHz | No | 2 MB | 4-6 MB | Deneb | 45 nm | 95-140 W | AM3 |
| Athlon II X4 | 4 | 2.2-3.8 GHz | No | 2 MB | N/A | Propus | 45 nm | 45-95 W | AM3 |
| Phenom II X3 | 3 | 2.4-3.2 GHz | No | 1.5 MB | 6 MB | Heka | 45 nm | 65-95 W | AM3 |
| Athlon II X3 | 3 | 2.2-3.4 GHz | No | 1.5 MB | N/A | Rana | 45 nm | 45-95 W | AM3 |
| Phenom II X2 | 2 | 2.8-3.5 GHz | No | 1 MB | 6 MB | Callisto | 45 nm | 80 W | AM3 |
| Phenom II X2 | 2 | 3.4 GHz | No | 1 MB | 6 MB | Regor | 45 nm | 80 W | AM3 |
| Athlon II X2 | 2 | 1.6-3.3 GHz | No | 1-2 MB | N/A | Regor | 45 nm | 25-65 W | AM3 |
| Athlon II 1xxu | 1 | 1.8-2 GHz | No | 1 MB | N/A | Sargas | 45 nm | 20 W | AM3 |
| Sempron 1xx | 1 | 2.7-2.9 GHz | No | 1 MB | N/A | Sargas | 45 nm | 45 W | AM3 |
| Phenom X4 | 4 | 1.8-2.6 GHz | No | 2 MB | 2 MB | Agena | 65 nm | 65-140 W | AM2+ |
| Phenom X3 | 3 | 1.9-2.5 GHz | No | 1.5 MB | 2 MB | Toliman | 65 nm | 65-95 W | AM2+ |
| Athlon X2 | 2 | 2.3-2.8 GHz | No | 1 MB | 2 MB | Kuma | 65 nm | 95 W | AM2+ |
*Model 840 has no L3 cache
- Zosma = Thuban with two cores disabled
- Propus = Deneb with no (or disabled) L3 cache
- Heka = Deneb with one core disabled
- Rana = Propus with one core disabled
- Callisto = Deneb with two cores disabled
- Toliman = Agena with one core disabled
- Kuma = Agena with two cores disabled
AM3 processors can also be used in Socket AM2+ motherboards with an appropriate BIOS update.
AMD “Bulldozer” and “Piledriver” FX Processors
AMD introduced its follow-up to its K10 architecture, the Bulldozer architecture, in October 2011. Although FX processors in this family use the same Socket AM3+ as late-model K10 processors do, the internal design of Bulldozer processors is very different from that of its predecessors.
Note: Bulldozer is also known as K11, but Bulldozer is the more common name for this architecture.
Bulldozer processors are modular. Each module contains a single L1 instruction cache, a multi-branched instruction decoder, and a multilayer dispatch controller. The dispatch controller is connected to two integer processing clusters and a single floating-point unit. The results are connected to a write-coalescing cache, a core interface unit, and up to 2 MB of L2 cache. A module is commonly referred to as a dual-core processor, although only the integer clusters are duplicated. A Bulldozer CPU includes 8 MB of L3 cache memory, and Bulldozer CPUs were manufactured in eight-core, six-core, and four-core versions, known collectively as Zambezi.
A block diagram of an eight-core Bulldozer CPU.
Other features in Bulldozer include AMD’s Turbo Core (a built-in overclocking feature) and new CPU instructions (AES, AVX, FMA4, and XOP). These instructions support faster encryption, floating-point math, rendering, and video transcoding on software optimized for them. Bulldozer uses a 32 nm manufacturing process, compared to the 45 nm used by most K10-class parts. FX processors based on Bulldozer are completely unlocked for easier overclocking. AMD sells an optional liquid cooler for FX Bulldozer and Piledriver CPUs.
Bulldozer processors are optimized for multithreaded software, but performance benchmarks were disappointing, as most applications were not optimized for Bulldozer’s new architecture. Further specifications for Bulldozer processors are listed in the table below:
| Processor | Cores | CPU Speed | Turbo Core | L2 | Power |
|---|---|---|---|---|---|
| FX 81xx | 8 | 3.1-3.6 GHz | Yes | 4 MB | 125 W |
| FX 61xx | 6 | 3.3 GHz | Yes | 3 MB | 95 W |
| FX 41xx | 4 | 3.8 GHz | No | 2 MB | 125 W |
AMD introduced an improved version of its Bulldozer architecture, Piledriver, in October 2012. Compared to Bulldozer, Piledriver includes these improvements:
- More accurate branch predictor
- Support for new floating-point instructions (FMA3 and F16C)
- Improved L1 and L2 cache designs
- Faster clock speeds
The table below lists the FX processors using Piledriver microarchitecture. These processors use the Vishera core.
| Processor | Cores | CPU Speed | L2 | Power |
|---|---|---|---|---|
| FX 83xx | 8 | 3.5-4 GHz | 4 MB | 125 W |
| FX 63xx | 6 | 3.5 GHz | 3 MB | 95 W |
| FX 43xx | 4 | 3.8 GHz | 2 MB | 95 W |
AMD Fusion/HSA (Heterogeneous Systems Architecture) APUs
Fusion was the original name for a variety of AMD mobile, desktop, and server processors with in-core graphics, which are now classified under the Heterogeneous Systems Architecture (HSA) designation. AMD refers to these processors as accelerated processing units (APUs).
Note: AMD dropped the Fusion name after it was discovered that a Swiss firm, Arctic (originally Arctic Cooling), had been using Fusion for its power supply products since 2006, hence the change to the HSA designation.
AMD has released several lines of APUs, including the C-series (primarily for notebooks) and the E-series (used in notebooks and a few very low-cost desktops). However, the primary product line for desktops is the A-series, which has used two core designs. The initial A-series designs use the Llano core, based on the K10 architecture but with no L3 cache, while the second series uses the Trinity core, based on Piledriver, again with no L3 cache. The Llano core uses Socket FM1 and includes models with two, three, or four cores and up to 4 MB of L2 cache. The Trinity core uses Socket FM2 and provides faster clock speeds, better GPU performance, and better thermal management. It also features two to four cores with up to 4 MB of L2 cache. The table below compares these processors:
| Processor | Cores | CPU Speed | Turbo Core | L2 | GPU | Power | Unlocked | Core |
|---|---|---|---|---|---|---|---|---|
| A10-5800K | 4 | 3.8 GHz | Yes | 4 MB | HD 7600D | 100 W | Yes | Trinity |
| A10-5700 | 4 | 3.4 GHz | Yes | 4 MB | HD 7600D | 65 W | No | Trinity |
| A8-5600K | 4 | 3.6 GHz | Yes | 4 MB | HD 7560D | 100 W | Yes | Trinity |
| A8-5500 | 4 | 3.2 GHz | Yes | 4 MB | HD 7560D | 65 W | No | Trinity |
| A8-3870K | 4 | 3.0 GHz | No | 4 MB | HD 6550D | 100 W | Yes | Llano |
| A8-3850 | 4 | 2.9 GHz | No | 4 MB | HD 6550D | 100 W | No | Llano |
| A8-3800 | 4 | 2.4 GHz | Yes | 4 MB | HD 6550D | 65 W | No | Llano |
| A6-5400K | 2 | 3.6 GHz | Yes | 1 MB | HD 7540D | 65 W | Yes | Trinity |
| A6-3670K | 4 | 2.7 GHz | No | 4 MB | HD 6530D | 100 W | Yes | Llano |
| A6-3650 | 4 | 2.6 GHz | No | 4 MB | HD 6530D | 100 W | No | Llano |
| A6-3600 | 4 | 2.1 GHz | Yes | 4 MB | HD 6530D | 65 W | No | Llano |
| A6-3500 | 4 | 2.1 GHz | Yes | 3 MB | HD 6530D | 65 W | No | Llano |
| A4-5300 | 2 | 3.4 GHz | Yes | 1 MB | HD 7480D | 65 W | No | Trinity |
| A4-3400 | 2 | 2.7 GHz | No | 1 MB | HD 6410D | 65 W | No | Llano |
| A4-3300 | 2 | 2.5 GHz | No | 1 MB | HD 6410D | 65 W | No | Llano |
