Sign in with
Sign up | Sign in

How Cache Works

Upgrading And Repairing PCs 21st Edition: Processor Specifications
By

To learn how the L1 cache works, consider the following analogy.

This story involves a person (in this case, you) eating food to act as the processor requesting and operating on data from memory. The kitchen where the food is prepared is the main system memory (typically double data rate [DDR], DDR2, or DDR3 dual inline memory module [DIMMs]). The cache controller is the waiter, and the L1 cache is the table where you are seated.

Okay, here’s the story. Say you start to eat at a particular restaurant every day at the same time. You come in, sit down, and order a hot dog. To keep this story proportionately accurate, let’s say you normally eat at the rate of one bite (byte? <grin>) every four seconds (233 MHz = about 4 ns cycling). It also takes 60 seconds for the kitchen to produce any given item that you order (60 ns main memory).

So, when you arrive, you sit down, order a hot dog, and you have to wait for 60 seconds for the food to be produced before you can begin eating. After the waiter brings the food, you start eating at your normal rate. Pretty quickly you finish the hot dog, so you call the waiter over and order a hamburger. Again, you wait 60 seconds while the hamburger is being produced. When it arrives, you again begin eating at full speed. After you finish the hamburger, you order a plate of fries. Again you wait, and after the fries are delivered 60 seconds later, you eat them at full speed. Finally, you decide to finish the meal and order cheesecake for dessert. After another 60-second wait, you can eat cheesecake at full speed. Your overall eating experience consists of a lot of waiting, followed by short bursts of actual eating at full speed.

After coming into the restaurant for two consecutive nights at exactly 6 PM and ordering the same items in the same order each time, on the third night the waiter begins to think, “I know this guy is going to be here at 6 PM, order a hot dog, a hamburger, fries, and then cheesecake. Why don’t I have these items prepared in advance and surprise him? Maybe I’ll get a big tip.” So you enter the restaurant and order a hot dog, and the waiter immediately puts it on your plate, with no waiting! You then proceed to finish the hot dog and right as you are about to request the hamburger, the waiter deposits one on your plate. The rest of the meal continues in the same fashion, and you eat the entire meal, taking a bite every four seconds, and you never have to wait for the kitchen to prepare the food. Your overall eating experience this time consists of all eating, with no waiting for the food to be prepared, due primarily to the intelligence and thoughtfulness of your waiter.

This analogy describes the function of the L1 cache in the processor. The L1 cache itself is a table that can contain one or more plates of food. Without a waiter, the space on the table is a simple food buffer. When it’s stocked, you can eat until the buffer is empty, but nobody seems to be intelligently refilling it. The waiter is the cache controller who takes action and adds the intelligence to decide which dishes are to be placed on the table in advance of your needing them. Like the real cache controller, he uses his skills to literally guess which food you will require next, and if he guesses correctly, you never have to wait.

Let’s now say on the fourth night you arrive exactly on time and start with the usual hot dog. The waiter, by now really feeling confident, has the hot dog already prepared when you arrive, so there is no waiting.

Just as you finish the hot dog, and right as he is placing a hamburger on your plate, you say “Gee, I’d really like a bratwurst now; I didn’t actually order this hamburger.” The waiter guessed wrong, and the consequence is that this time you have to wait the full 60 seconds as the kitchen prepares your brat. This is known as a cache miss, in which the cache controller did not correctly fill the cache with the data the processor actually needed next. The result is waiting, or in the case of a sample 233 MHz Pentium system, the system essentially throttles back to 16 MHz (RAM speed) whenever a cache miss occurs.

According to Intel, the L1 cache in most of its processors has approximately a 90% hit ratio. (Some processors, such as the Pentium 4, are slightly higher.) This means that the cache has the correct data 90% of the time, and consequently the processor runs at full speed (233 MHz in this example) 90% of the time. However, 10% of the time the cache controller guesses incorrectly, and the data has to be retrieved out of the significantly slower main memory, meaning the processor has to wait. This essentially throttles the system back to RAM speed, which in this example was 60 ns or 16 MHz.

In this analogy, the processor was 14 times faster than the main memory. Memory speeds have increased from 16 MHz (60 ns) to 333 MHz (3.0 ns) or faster in the latest systems, but processor speeds have also risen to 3 GHz and beyond. So even in the latest systems, memory is still 7.5 or more times slower than the processor. Cache is what makes up the difference.

The main feature of L1 cache is that it has always been integrated into the processor core, where it runs at the same speed as the core. This, combined with the hit ratio of 90% or greater, makes L1 cache important for system performance.

Ask a Category Expert

Create a new thread in the Reviews comments forum about this subject

Example: Notebook, Android, SSD hard drive

Display all 36 comments.
This thread is closed for comments
Top Comments
  • 11 Hide
    DelightfulDucklings , October 14, 2013 10:22 PM
    Very interesting article, I quite enjoyed the part about Cache memory
Other Comments
  • 5 Hide
    xkm1948 , October 14, 2013 9:13 PM
    Really nice intro article!
  • 11 Hide
    DelightfulDucklings , October 14, 2013 10:22 PM
    Very interesting article, I quite enjoyed the part about Cache memory
  • 2 Hide
    kindiana , October 14, 2013 10:27 PM
    nice article
  • 9 Hide
    burnley14 , October 14, 2013 10:29 PM
    One of the most interesting and informative articles I've ever read on the site. Great job!
  • 2 Hide
    aredflyingbird , October 14, 2013 10:41 PM
    Agreed, excellent article.
  • 2 Hide
    palladin9479 , October 14, 2013 10:55 PM
    Really good article, actually was spot on with how caching works.
  • 3 Hide
    AndrewJacksonZA , October 15, 2013 12:22 AM
    "Forward From The Editor"
    Shouldn't that be "Foreword?"
  • 3 Hide
    iam2thecrowe , October 15, 2013 2:23 AM
    I need more cache in my kitchen.
  • 0 Hide
    LalitMotagi , October 15, 2013 2:25 AM
    Great Article.
  • -1 Hide
    Rex Romero , October 15, 2013 2:30 AM
    Andrew. It's so advanced so it's forward. lol
  • 1 Hide
    groundrat , October 15, 2013 4:04 AM
    Excellent article.
  • 0 Hide
    ojas , October 15, 2013 5:22 AM
    Quote:
    For example, if you live on a street in which the address is limited to a two-digit (base 10) number, no more than 100 distinct addresses (00–99) can exist for that street (102). Add another digit, and the number of available addresses increases to 1,000 (000–999), or 103.

    Should be 10^2, 10^3, and in the next para, 2^x.
  • 0 Hide
    ojas , October 15, 2013 6:11 AM
    Quote:
    Note: Early versions of EM64T-equipped processors from Intel lacked support for the LAHF and SAHF instructions used in the AMD64 instruction set.

    This was very interesting, considering both instructions were supported even by the humble 8086.
  • 3 Hide
    spookyman , October 15, 2013 6:20 AM
    I have one still from 18 years ago. Still one of the best tech books I own. Though mine was just starting to touch the Intel Pentium processor. It even covered IBM's PS/2 computers and technology. Its amazing how much more hardware intensive PC's were back then they are now.
  • 0 Hide
    ojas , October 15, 2013 6:50 AM
    I ended up buying the 19th edition after last year's excerpts on Tom's Hardware, the 20th wasn't available in India then.

    These sections seem more or less unchanged, except for the mention of Ivy and Vishera, and i think the CPU-z screenshots are new as well.
  • 0 Hide
    AndrewJacksonZA , October 15, 2013 6:54 AM
    Quote:
    Quote:
    Note: Early versions of EM64T-equipped processors from Intel lacked support for the LAHF and SAHF instructions used in the AMD64 instruction set.

    This was very interesting, considering both instructions were supported even by the humble 8086.
    Apparently they were missing from the early 64bit CPUs from AMD and Intel. They appeared in March 2005 for AMD CPUs and June 2005 for Intel CPUs
    https://en.wikipedia.org/wiki/X86-64#Older_implementations

    Yet at the very least the 80386 supported them:
    http://css.csail.mit.edu/6.858/2011/readings/i386/LAHF.htm

    So it appears that it was an early-64 bit CPU issue only.
  • 0 Hide
    ta152h , October 15, 2013 9:04 AM
    I could only get to page three before being thoroughly disgusted by the lack of knowledge of the writer. How can he be writing books, without actually knowing the material?

    The Prescott introduced 64-bit to the Intel world, not the Core 2. Kind of common knowledge. The Athlon XP had a 36-bit address bus? I don't remember ever seeing that.

    Then we go to the misinformation about the 8086/8088 to 386.

    In actuality, there were four modes in the 80386. Real, Virtual 86, Protected 286, and Protected 386. Yup, four. And no, Windows 3.0 was not expected to run on an 8088 or 80286, because it DID use Virtual 86, which those processors could not support. You know, the part where they let you go from one DOS task to another. That was in the hardware. And that hardware started with the 80386.

    Moreover, the 80286 did NOT have the same instruction set as the 8086. Only in real mode did it. And why do you suppose it was called real mode? Maybe because the addresses were not virtualized? The 80286, as mentioned above, did have virtual addresses in what was called the 80286 Protected Mode. It not only ran Real Mode apps much faster, but when in Protected Mode was very capable of running multitasking Operating Systems, something that could not be done well on the 8086. It also increased the memory bus to 24-bits, albeit still using 64K bit segments.

    OS/2 1.x was the best example of an OS using 286 Protected mode, although any software using "Extended Memory" was taking advantage of the greater addressing of the 286, albeit in an inelegant way.

    I stopped reading after page three, as it's just discouraging to think people are writing books without being accurate. OK, so we have the author that got it wrong, fair enough, but what about the people who are supposed to error check it. I certainly don't know everything, and I know this stuff, and it's pretty basic. No one caught this? Are you kidding me? The 286 stuff might be a bit far away, but not knowing that x86-64 first appeared in the Prescott line is really difficult to understand, and is very basic. This is made more so because of all the rumors that the processor was made to support it, but Intel was hiding it so as to not undercut the Itanium. In time, it was proven true.

    Please, don't spread misinformation. Someone will repeat this stuff, and then someone else will, and it becomes 'fact' despite being wrong. If you publish a book, make a friggin effort! I'm sure I could errors the rest of the way, but it's just too annoying for me to wade through this rubbish.

    By the way, the term CPU bus is an ambiguous one. The CPU has multiple buses, and if you used that term with me, I'd wonder which one you were referring to. Find a more accurate term, like PCI-E bus if that's what you are trying to say.
  • -2 Hide
    ezorb , October 15, 2013 9:22 AM
    I feel that this is bellow the level of this website, even below the level of Maximum PC (which has a great podcast), this is the book my grandfather would buy if he wanted to try his hand at build a PC
  • -1 Hide
    hardrock40 , October 15, 2013 10:08 AM
    Really nice article well worth the read for sure.
  • -2 Hide
    ta152h , October 15, 2013 10:42 AM
    I could only get to page three before being thoroughly disgusted by the lack of knowledge of the writer. How can he be writing books, without actually knowing the material?

    The Prescott introduced 64-bit to the Intel world, not the Core 2. Kind of common knowledge. The Athlon XP had a 36-bit address bus? I don't remember ever seeing that.

    Then we go to the misinformation about the 8086/8088 to 386.

    In actuality, there were four modes in the 80386. Real, Virtual 86, Protected 286, and Protected 386. Yup, four. And no, Windows 3.0 was not expected to run on an 8088 or 80286, because it DID use Virtual 86, which those processors could not support. You know, the part where they let you go from one DOS task to another. That was in the hardware. And that hardware started with the 80386.

    Moreover, the 80286 did NOT have the same instruction set as the 8086. Only in real mode did it. And why do you suppose it was called real mode? Maybe because the addresses were not virtualized? The 80286, as mentioned above, did have virtual addresses in what was called the 80286 Protected Mode. It not only ran Real Mode apps much faster, but when in Protected Mode was very capable of running multitasking Operating Systems, something that could not be done well on the 8086. It also increased the memory bus to 24-bits, albeit still using 64K bit segments.

    OS/2 1.x was the best example of an OS using 286 Protected mode, although any software using "Extended Memory" was taking advantage of the greater addressing of the 286, albeit in an inelegant way.

    I stopped reading after page three, as it's just discouraging to think people are writing books without being accurate. OK, so we have the author that got it wrong, fair enough, but what about the people who are supposed to error check it. I certainly don't know everything, and I know this stuff, and it's pretty basic. No one caught this? Are you kidding me? The 286 stuff might be a bit far away, but not knowing that x86-64 first appeared in the Prescott line is really difficult to understand, and is very basic. This is made more so because of all the rumors that the processor was made to support it, but Intel was hiding it so as to not undercut the Itanium. In time, it was proven true.

    Please, don't spread misinformation. Someone will repeat this stuff, and then someone else will, and it becomes 'fact' despite being wrong. If you publish a book, make a friggin effort! I'm sure I could errors the rest of the way, but it's just too annoying for me to wade through this rubbish.

    By the way, the term CPU bus is an ambiguous one. The CPU has multiple buses, and if you used that term with me, I'd wonder which one you were referring to. Find a more accurate term, like PCI-E bus if that's what you are trying to say.
Display more comments