Intel Nehalem 55XX and L1 and L2 cache sizes. Small?

defender

Distinguished
Jan 27, 2009
8
0
18,510
I have been studying up on Intel's upcoming "Nehalem," or "Gainestown" Xeon 55XX processor (though I need to do a LOT more!).

I was wondering about the chip's use of caches.

Now I understand that the QuickPath technology is meant to stop the CPU from being starved for data and tapping its foot, as it does now while data is read from the hard drive, transferred to RAM and finally delivered to the CPU over an FSB. I also understand Nehalem will use an 8MB L3 cache.

But why the relatively small L1 and L2 caches? (64K L1 cache and a 256K L2 cache.)

What would be the harm in these caches being larger in a chip in the year 2009?

Wouldn't there be a performance increase if they were larger, say 512K L1 and 2MB L2? They're "on-die," so I would assume they're much faster than the L3 cache, and especially faster than RAM accessed over QuickPath.

The CPU would never be "starved" and tapping its foot waiting to be fed.

Any help understanding this is appreciated.

defender :hello:

P.S. Intel has announced that Nehalem will eventually gain 8 cores. Does anyone think they will use the 32nm process then, or the same 45nm process?

I'm wondering about socket compatibility and the possibility of upgrading to 8 core processors at a later time...


 

cheepstuff

Distinguished
Dec 13, 2008
416
0
18,790
The L2 cache is smaller because it is only meant for one core. The L3 is shared by all the cores and holds code and data for all of them, so they make it bigger. I don't speak for Intel, so I don't know for sure, but they probably can't fit very much memory on the die with the 45nm process. Holding an OS and all its components in full takes several GB of RAM, and so far they have only been able to fit a few MB in these caches, so they are nowhere near replacing RAM with on-die caches. They only put on enough to store the most critical and commonly used code and data.
 

Dekasav

Distinguished
Sep 2, 2008
1,243
0
19,310
I know one issue with using bigger L2 caches is that the L3 cache is inclusive: it holds a copy of all 4 cores' L2 caches inside it. Currently, about 1MB of the L3 (4 x 256K) is actually just copies of the 4 cores' L2. If you make the L2 cache bigger, you lose more and more of your L3 cache, which also opens up the possibility of a performance loss. The way it is now, it's tuned more for workloads that use multiple cores, as the L3 cache will be dominant when 2+ cores are working on similar things (that last sentence is opinion).
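To put rough numbers on that trade-off, here is a small Python sketch. It uses the sizes quoted in this thread (8MB L3, 256K L2 per core, 4 cores) and assumes the worst case, where every L2 line is duplicated in the L3. It ignores the L1 caches and any real replacement policy, and the function name `unique_l3_bytes` is made up for illustration.

```python
KB = 1024
MB = 1024 * KB


def unique_l3_bytes(l3_bytes, l2_bytes_per_core, cores):
    """L3 space left after holding a copy of every core's L2 (worst case)."""
    duplicated = l2_bytes_per_core * cores
    return l3_bytes - duplicated


# Sizes from the thread: 8 MB shared L3, 4 cores, 256 KB L2 per core today.
# The larger L2 sizes are hypothetical "what if" values.
for l2_kb in (256, 512, 1024, 2048):
    unique = unique_l3_bytes(8 * MB, l2_kb * KB, cores=4)
    print(f"L2 = {l2_kb:4d} KB/core -> L3 not spent on L2 copies: {unique / MB:.1f} MB")
```

Under these assumptions, the 2MB-per-core L2 suggested in the original question would leave an 8MB inclusive L3 with nothing but L2 copies in it.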
 

cheepstuff

No, I don't think Intel has created these CPUs without a little thought. Wouldn't they put some algorithms in the L3 to 'boot out' redundant stuff?
 

cheepstuff

The L3 cache is supposed to hold copies of the other caches, but what if multiple cores are doing similar things? It isn't smart to just let it hold stuff redundantly from the other caches. That's wasting L3. I don't think Intel spends thousands of hours on these chips without thinking of ways of trimming or compressing the L3. Would they want it redundant for logging or something?
 

defender

Thanks, cheepstuff, cjl, Dekasav, et al, for your informative, educational replies.

I have a lot to learn.

But thanx to you guys now I smart!

defender

 

It's redundant so that everything in all caches is in the L3. This means that if something isn't in the L3 cache, there's no need to go searching the L2 and L1 of every core in order to check if it is cached somewhere. See here for more details on the Nehalem cache architecture.
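A toy Python model may make that point concrete. This is only a sketch under simplifying assumptions: caches are plain sets of addresses, and both function names are hypothetical. It is not a description of how the hardware actually does the lookup.

```python
def lookup_inclusive(l3, private_caches, addr):
    """Inclusive L3: the L3 alone answers the question.

    Returns (found, private_probes). private_caches is never searched,
    because inclusion guarantees the L3 holds everything they hold.
    """
    return addr in l3, 0


def lookup_non_inclusive(l3, private_caches, addr):
    """Non-inclusive L3: an L3 miss forces a search of each core's cache."""
    if addr in l3:
        return True, 0
    probes = 0
    for cache in private_caches:
        probes += 1
        if addr in cache:
            return True, probes
    return False, probes


# Four cores, each with a private cache (L1/L2 lumped together here).
private = [{0x100}, {0x200}, set(), set()]

inclusive_l3 = {0x100, 0x200, 0x300}   # superset of every private cache
non_inclusive_l3 = {0x300}             # holds only lines no core has

print(lookup_inclusive(inclusive_l3, private, 0x400))
print(lookup_non_inclusive(non_inclusive_l3, private, 0x400))
```

In the inclusive case the miss is settled without touching any core's cache; in the other case all four private caches have to be checked first.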
 


That's a good, informative link - thanks. Now I smart too :)

As the article mentioned, the latency for L1 in Nehalem has increased from 3 to 4 cycles. If they made it larger, the latency might increase to the point of significantly hurting performance.
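The usual way to reason about this is average memory access time: hit time plus miss rate times miss penalty. Here is a rough Python sketch of that formula. The 3- and 4-cycle hit times are the figures mentioned above; the 5% miss rate and the 10-cycle penalty are placeholder assumptions, not measured Nehalem numbers.

```python
def amat(hit_cycles, miss_rate, miss_penalty_cycles):
    """Average memory access time in cycles: hit time + miss rate * miss penalty."""
    return hit_cycles + miss_rate * miss_penalty_cycles


def break_even_miss_rate_drop(extra_hit_cycles, miss_penalty_cycles):
    """How far the miss rate must fall (absolute) to pay for a slower hit."""
    return extra_hit_cycles / miss_penalty_cycles


# Hypothetical inputs: 5% L1 miss rate, 10 cycles to fetch from L2 on a miss.
MISS_PENALTY = 10
three_cycle_l1 = amat(3, 0.05, MISS_PENALTY)
four_cycle_l1 = amat(4, 0.05, MISS_PENALTY)  # same miss rate, one cycle slower

print(f"3-cycle L1: {three_cycle_l1:.2f} cycles on average")
print(f"4-cycle L1: {four_cycle_l1:.2f} cycles on average")
print(f"Miss rate must drop by {break_even_miss_rate_drop(1, MISS_PENALTY):.0%} "
      "(absolute) for one extra cycle to pay for itself")
```

With these made-up inputs, an extra cycle of L1 latency only pays off if the bigger cache cuts the miss rate by more than the miss rate it started with, which is the sense in which a larger L1 can end up slower overall.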

One disadvantage of exclusive caches, like the Phenom II uses, is that HyperTransport bandwidth has to be spent on cache snooping, since it's faster to get data from an on-chip cache than to go out to main memory. So there's continuous snooping going on to update each core about what the other cores might have stored in their exclusive caches. I think that if AMD had the transistor budget, they too would use inclusive caches.
 


This. To explain it a bit more: Nehalem's L3 keeps a copy of what is in the L1/L2 caches, so if a core needs to go back and reload something into its L2/L1, it will find it in the L3. That is faster than re-requesting it from anywhere else, even from memory, fast as QPI is.

And the reason the L1/L2 caches are not so large is that they don't need to be. The C2D/C2Q had large caches to avoid going out to memory as much, which was slower. Now that Nehalem has an IMC, it can access memory much faster than a C2D/C2Q could, and there is no real performance loss thanks to QPI. It's much like an AMD chip, which normally has a fairly small L1/L2.
 

defender

Thanks to everyone for educating me.

I agree: Intel would not do anything that would negatively affect the performance of the upcoming "Nehalem" or "Gainestown" 55XX processors.

I'm satisfied.

I guess I was thrown for a loop by this article from Tom's Hardware regarding L1 & L2 caches:

<http://www.tomshardware.com/reviews/cache-size-matter,1709-8.html>

defender

 
