I would like to start with the block diagrams of the Pentium 4, the Pentium III and AMD's latest Athlon processor. I spent considerable time with PowerPoint to create those diagrams, so please don't just disregard them. Even if they might look scary at this stage, I promise to explain them to you in the following text.
This is my personal P4-diagram, which became necessary because Intel wasn't able to supply one that was good enough. It follows the traditional top-to-bottom flowchart idea and should include all the important units that influence Pentium 4's performance. Here's a little glossary:
- BTB = 'Branch Target Buffer'. This table holds the target addresses of branches, i.e. the addresses that a branch will or could jump to. Athlon also uses a 'BHT' = 'Branch History Table', which records whether recent branches were taken or not, so that the processor can predict the outcome the next time it runs into the same branch. A software program uses branches to make decisions. The program asks a question, and depending on the answer the branch is taken or not.
- µOP = 'Micro-Operation'. This is the name Intel gives to instructions that can be directly understood by the execution units of the microprocessor. AMD calls them 'MacroOPs', because they are a bit more advanced and can contain more information than Intel's µOPs. Both kinds of 'OPs' have one important thing in common: they represent very simple instructions that the processor can carry out quickly. Unlike x86-instructions, those 'OPs' are of a defined size and can thus easily be fed into the execution pipeline. The decoder translates an x86-instruction into one or more 'OPs', unless the x86-instruction is so complex (and rare) that the 'Micro Instruction Sequencer' has to produce a sometimes rather long sequence of 'OPs', using the 'Micro Code ROM' found in any modern superscalar microprocessor. On average, most x86-instructions get decoded to about two 'OPs'. Some extremely simple instructions, e.g. an 'AND', 'OR', 'XOR' or 'ADD', often produce only one 'OP', while a 'DIV' or 'MUL', or an indirectly addressed operand, will produce more. Complex instructions such as trigonometric ones can easily produce up to hundreds of 'OPs', coming out of the 'Micro Instruction Sequencer'.
- ALU = 'Arithmetic Logic Unit'. This is what we call the 'Integer'-unit. Arithmetic operations like adding, multiplying and dividing, logic operations such as 'OR' and 'AND', and shifts and rotates such as 'SHL' and 'ROL' are carried out by the 'ALUs'. Those operations make up the vast majority of the code in most software programs.
- AGU = 'Address Generation Unit'. This unit is just as important as the 'ALU', because it calculates the address from which data is loaded or to which data is stored. Programs use absolute addressing only in rare cases. As soon as a program works on arrays of data, its code uses indirect addressing, which keeps the 'AGUs' busy.
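
To make the AGU's job a bit more tangible, here is a minimal Python sketch of the calculation behind indirect addressing. It assumes the usual x86 form of an effective address, base + index × scale + displacement, with 32-bit wrap-around. The function names, the array start address 0x1000 and the 4-byte element size are made up purely for illustration; a real AGU does this in hardware, not in software.

```python
def effective_address(base, index=0, scale=1, displacement=0):
    """Return base + index*scale + displacement, wrapped to 32 bits."""
    # x86 addressing modes only allow these four scale factors.
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale factor must be 1, 2, 4 or 8")
    return (base + index * scale + displacement) & 0xFFFFFFFF


# Hypothetical array of 4-byte integers starting at address 0x1000.
ARRAY_BASE = 0x1000
ELEMENT_SIZE = 4


def element_address(i):
    """Address of element i: the kind of work an AGU does for array[i]."""
    return effective_address(ARRAY_BASE, index=i, scale=ELEMENT_SIZE)


# Walking through an array means one address calculation per element.
addresses = [element_address(i) for i in range(4)]
print([hex(a) for a in addresses])
```

Every element access in a loop over an array needs one such calculation, which is why array-heavy code keeps the 'AGUs' busy.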