As processor core speeds increased, memory speeds could not keep up. How could you run a processor faster than the memory from which you fed it without having performance suffer terribly? The answer was cache. In its simplest terms, cache memory is a high-speed memory buffer that temporarily stores data the processor needs, allowing the processor to retrieve that data faster than if it came from main memory. But there is one additional feature of a cache over a simple buffer, and that is intelligence. A cache is a buffer with a brain.
A buffer holds random data, usually on a first-in, first-out basis or a first-in, last-out basis. A cache, on the other hand, holds the data the processor is most likely to need in advance of it actually being needed. This enables the processor to continue working at either full speed or close to it without having to wait for the data to be retrieved from slower main memory. Cache memory is usually made up of static RAM (SRAM) memory integrated into the processor die, although older systems with cache also used chips installed on the motherboard.
Recent low-cost processor designs typically include two levels of processor/memory cache: Level 1 (L1) and Level 2 (L2). Mid-range and high-end designs also have Level 3 cache. These caches and their functioning are described in the following sections.
Tip
Use the popular CPU-Z utility discussed earlier in this chapter to determine the types and sizes of cache memory in your computer’s CPUs.
Internal Level 1 Cache
All modern processors starting with the 486 family include an integrated L1 cache and controller. The integrated L1 cache size varies from processor to processor, starting at 8 KB for the original 486DX and now up to 128 KB or more in the latest processors.
NoteMulti-core processors include separate L1 caches for each processor core. Also, L1 cache is divided into equal amounts for instructions and data.
To understand the importance of cache, you need to know the relative speeds of processors and memory. The problem with this is that processor speed usually is expressed in MHz or GHz (millions or billions of cycles per second), whereas memory speeds are often expressed in nanoseconds (billionths of a second per cycle). Most newer types of memory express the speed in either MHz or in megabyte per second (MB/s) bandwidth (throughput).
Both are really time- or frequency-based measurements. You will note that a 233 MHz processor equates to 4.3-nanosecond cycling, which means you would need 4 ns memory to keep pace with a 200 MHz CPU. Also, note that the motherboard of a 233 MHz system typically runs at 66 MHz, which corresponds to a speed of 15 ns per cycle and requires 15 ns memory to keep pace. Finally, note that 60 ns main memory (common on many Pentium-class systems) equates to a clock speed of approximately 16 MHz. So, a typical Pentium 233 system has a processor running at 233 MHz (4.3 ns per cycle), a motherboard running at 66 MHz (15 ns per cycle), and main memory running at 16 MHz (60 ns per cycle). This might seem like a rather dated example, but in a moment, you will see that the figures listed here make it easy for me to explain how cache memory works.
Because L1 cache is always built into the processor die, it runs at the full-core speed of the processor internally. By full-core speed, I mean this cache runs at the higher clock multiplied internal processor speed rather than the external motherboard speed. This cache basically is an area of fast memory built into the processor that holds some of the current working set of code and data. Cache memory can be accessed with no wait states because it is running at the same speed as the processor core.
Using cache memory reduces a traditional system bottleneck because system RAM is almost always much slower than the CPU; the performance difference between memory and CPU speed has become especially large in recent systems. Using cache memory prevents the processor from having to wait for code and data from much slower main memory, thus improving performance. Without the L1 cache, a processor would frequently be forced to wait until system memory caught up.
Cache is even more important in modern processors because it is often the only memory in the entire system that can truly keep up with the chip. Most modern processors are clock multiplied, which means they are running at a speed that is really a multiple of the motherboard into which they are plugged. The only types of memory matching the full speed of the processor are the L1, L2, and L3 caches built into the processor core.
If the data that the processor wants is already in L1 cache, the CPU does not have to wait. If the data is not in the cache, the CPU must fetch it from the Level 2 or Level 3 cache or (in less sophisticated system designs) from the system bus—meaning main memory directly.
- Processor Specifications Explained
- Data I/O Bus, Address Bus, And Internal Registers
- Processor Modes: Real Mode
- IA-32 Mode: 32-Bit And Virtual Real
- IA-32e 64-Bit Extension Mode (x64, AMD64, x86-64, EM64T)
- Processor Benchmarks And Comparing Performance
- Processor Efficiency
- Cache Memory
- How Cache Works
- Level 2 And Level 3 Cache
- Cache Performance And Design
- Cache Organization
Shouldn't that be "Foreword?"
Should be 10^2, 10^3, and in the next para, 2^x.
This was very interesting, considering both instructions were supported even by the humble 8086.
These sections seem more or less unchanged, except for the mention of Ivy and Vishera, and i think the CPU-z screenshots are new as well.
This was very interesting, considering both instructions were supported even by the humble 8086.
https://en.wikipedia.org/wiki/X86-64#Older_implementations
Yet at the very least the 80386 supported them:
http://css.csail.mit.edu/6.858/2011/readings/i386/LAHF.htm
So it appears that it was an early-64 bit CPU issue only.
The Prescott introduced 64-bit to the Intel world, not the Core 2. Kind of common knowledge. The Athlon XP had a 36-bit address bus? I don't remember ever seeing that.
Then we go to the misinformation about the 8086/8088 to 386.
In actuality, there were four modes in the 80386. Real, Virtual 86, Protected 286, and Protected 386. Yup, four. And no, Windows 3.0 was not expected to run on an 8088 or 80286, because it DID use Virtual 86, which those processors could not support. You know, the part where they let you go from one DOS task to another. That was in the hardware. And that hardware started with the 80386.
Moreover, the 80286 did NOT have the same instruction set as the 8086. Only in real mode did it. And why do you suppose it was called real mode? Maybe because the addresses were not virtualized? The 80286, as mentioned above, did have virtual addresses in what was called the 80286 Protected Mode. It not only ran Real Mode apps much faster, but when in Protected Mode was very capable of running multitasking Operating Systems, something that could not be done well on the 8086. It also increased the memory bus to 24-bits, albeit still using 64K bit segments.
OS/2 1.x was the best example of an OS using 286 Protected mode, although any software using "Extended Memory" was taking advantage of the greater addressing of the 286, albeit in an inelegant way.
I stopped reading after page three, as it's just discouraging to think people are writing books without being accurate. OK, so we have the author that got it wrong, fair enough, but what about the people who are supposed to error check it. I certainly don't know everything, and I know this stuff, and it's pretty basic. No one caught this? Are you kidding me? The 286 stuff might be a bit far away, but not knowing that x86-64 first appeared in the Prescott line is really difficult to understand, and is very basic. This is made more so because of all the rumors that the processor was made to support it, but Intel was hiding it so as to not undercut the Itanium. In time, it was proven true.
Please, don't spread misinformation. Someone will repeat this stuff, and then someone else will, and it becomes 'fact' despite being wrong. If you publish a book, make a friggin effort! I'm sure I could errors the rest of the way, but it's just too annoying for me to wade through this rubbish.
By the way, the term CPU bus is an ambiguous one. The CPU has multiple buses, and if you used that term with me, I'd wonder which one you were referring to. Find a more accurate term, like PCI-E bus if that's what you are trying to say.
The Prescott introduced 64-bit to the Intel world, not the Core 2. Kind of common knowledge. The Athlon XP had a 36-bit address bus? I don't remember ever seeing that.
Then we go to the misinformation about the 8086/8088 to 386.
In actuality, there were four modes in the 80386. Real, Virtual 86, Protected 286, and Protected 386. Yup, four. And no, Windows 3.0 was not expected to run on an 8088 or 80286, because it DID use Virtual 86, which those processors could not support. You know, the part where they let you go from one DOS task to another. That was in the hardware. And that hardware started with the 80386.
Moreover, the 80286 did NOT have the same instruction set as the 8086. Only in real mode did it. And why do you suppose it was called real mode? Maybe because the addresses were not virtualized? The 80286, as mentioned above, did have virtual addresses in what was called the 80286 Protected Mode. It not only ran Real Mode apps much faster, but when in Protected Mode was very capable of running multitasking Operating Systems, something that could not be done well on the 8086. It also increased the memory bus to 24-bits, albeit still using 64K bit segments.
OS/2 1.x was the best example of an OS using 286 Protected mode, although any software using "Extended Memory" was taking advantage of the greater addressing of the 286, albeit in an inelegant way.
I stopped reading after page three, as it's just discouraging to think people are writing books without being accurate. OK, so we have the author that got it wrong, fair enough, but what about the people who are supposed to error check it. I certainly don't know everything, and I know this stuff, and it's pretty basic. No one caught this? Are you kidding me? The 286 stuff might be a bit far away, but not knowing that x86-64 first appeared in the Prescott line is really difficult to understand, and is very basic. This is made more so because of all the rumors that the processor was made to support it, but Intel was hiding it so as to not undercut the Itanium. In time, it was proven true.
Please, don't spread misinformation. Someone will repeat this stuff, and then someone else will, and it becomes 'fact' despite being wrong. If you publish a book, make a friggin effort! I'm sure I could errors the rest of the way, but it's just too annoying for me to wade through this rubbish.
By the way, the term CPU bus is an ambiguous one. The CPU has multiple buses, and if you used that term with me, I'd wonder which one you were referring to. Find a more accurate term, like PCI-E bus if that's what you are trying to say.