Why is Conroe so good with cache?




Profile: journeyman
More Information

Seriously, with NetBurst there was not much of a payback from adding L2 cache beyond 1MB, but with Conroe it's all different.

Why do you think its performance increases so much with cache?

My little theory is that since the cores communicate through the cache, more is always better!


Profile: journeyman
More Information

It's because the architecture is much more efficient and... better.

The Pentium Ds were, in a sense, pretty inefficient and weren't designed for greater performance or power savings.

Profile: nimble knuckle
More Information

Cause Conroe is faster and needs more data.

Profile: journeyman
More Information

Well, this is my idea, so don't quote me or anything... this might be wrong.

First off, let's leave NetBurst out of the question because it just sucked to begin with.

When you look at Athlon 64 vs Core 2 it might be a little easier to explain. First, the A64 has a bigger L1 cache (64KB I and 64KB D), so that helps it at that level. Next, the A64 has an integrated memory controller, so it didn't really need (or benefit from) much cache, because memory access was so quick. The memory controller (and bigger L1) might explain why the A64 DIDN'T benefit much from extra L2.

Now, Core 2 has smaller L1s (32KB I and 32KB D), so it needs to make up for that somewhere else, which it does with an L2 that is a good bit bigger. Next, it doesn't have an integrated memory controller, so its memory access is (relatively) slow. When main memory access is slow, it helps a lot to make up for it with much faster L2 memory. This MIGHT be why Core 2 benefits so much from extra L2 cache.

Note that everything I said is simply an idea; I am probably wrong, so don't yell at me if I am.
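
To put rough numbers on that idea, here is a minimal Python sketch of the usual average-memory-access-time estimate for a two-level cache. Every latency and miss rate in it is a made-up illustrative value, not a measurement of an Athlon 64 or a Core 2, and the names `amat` and `l2_gain` are just picked for this example.

```python
def amat(l1_hit, l1_miss_rate, l2_hit, l2_miss_rate, mem_latency):
    """Average memory access time (in cycles) for a two-level cache."""
    return l1_hit + l1_miss_rate * (l2_hit + l2_miss_rate * mem_latency)


def l2_gain(mem_latency, small_l2_miss=0.30, big_l2_miss=0.15):
    """Cycles saved per access by the bigger L2 (assumed to halve L2 misses)."""
    # Assumed values: 3-cycle L1 hit, 5% L1 miss rate, 14-cycle L2 hit.
    small = amat(3, 0.05, 14, small_l2_miss, mem_latency)
    big = amat(3, 0.05, 14, big_l2_miss, mem_latency)
    return small - big


# Hypothetical main-memory latencies in cycles: "fast" stands in for an
# integrated memory controller, "slow" for going out over a front-side bus.
for label, latency in (("fast memory", 150), ("slow memory", 250)):
    print(f"{label}: bigger L2 saves {l2_gain(latency):.3f} cycles per access")
```

With these assumed numbers the bigger L2 saves more cycles per access when main memory is slower, which is the shape of the argument above. It says nothing about how large the effect is on real chips.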


Profile: journeyman
More Information

jaywalker256 wrote :

Note that everything I said is simply an idea; I am probably wrong, so don't yell at me if I am.



Nah, that's why I started this thread: for people to share their theories about why performance grows LOTS with more cache.

The P6 core did not get nearly as much of a performance boost either, which makes me wonder...

Profile: Faithful Poster
More Information

protokiller wrote :

My little theory is since the cores communicate through the Cache more is always better!


Your little theory is correct. The shared cache allows the cores to share information without using the bus. That is much faster, especially when more than one core is using the same dataset: all the cores can process the data at the same time, rather than working sequentially and passing data along to the next core via the bus.


---------------
+3 Bastage (RC-1,JPJ-2), -5 Thread Revival, +1 Llama Spit, + 5 Soul Satisfaction, +3 Pitiless, +5 I make no f'ing sense, +5 below the belt, +5 Cutting Sarcasm, +7.2 philosophical, +10 OQ, +5 Movie Killer, +5 Hippie Oppression, +5 Over their heads, +5 Over
Profile: journeyman
More Information

exit2dos wrote :

Your little theory is correct. The shared cache allows them to share information without using the bus. Very much faster, especially when more than one core is using the same dataset - this allows for all the cores to be able to process the data at the same time, rather than sequentially then passing data along to the next core via the bus.



So if the Cache is full they talk through the bus? If so I guess that would explain why they need so much cache.

I thought ALL communication was done through cache.

Profile: Faithful Poster
More Information

If the data isn't in cache, then it must be brought up from the main system memory via the bus, which is much, much slower.

This is another advantage that the C2Ds have over the old Pentiums. The C2D is better optimized, therefore much better at knowing what data needs to be stored in cache.

For example, the old P4s were really bad at recognizing small program loops. If the section of code was too small to be recognized as a loop, it would not be cached. Therefore, the data would need to be accessed from the main memory (very slow) for each iteration of the loop.

Also, the pipeline was too long on the P4. It has been said that they needed to run at over 4GHz in order to properly utilize a pipeline of that length. Intel planned to hit 10GHz at some point, but couldn't due to power draw and heat issues. Inefficient (compared to the C2D) branch prediction logic often made the P4 stop, flush the pipeline, refill it with new instructions, and then continue. That is another reason why a larger cache wasn't that effective on the P4s. (I'm over-simplifying here to illustrate the point.)
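
The point about data that isn't in cache can be illustrated with a toy model. The sketch below assumes a fully associative cache with LRU replacement and a program that loops over the same 1000 cache lines ten times. It is not a model of the real Core 2 or P4 caches, and `hit_rate` and the trace are hypothetical names and values chosen for the example.

```python
from collections import OrderedDict


def hit_rate(cache_lines, addresses):
    """Fraction of accesses served by a fully associative LRU cache."""
    if not addresses:
        return 0.0
    cache = OrderedDict()  # keys are cache-line numbers, oldest first
    hits = 0
    for line in addresses:
        if line in cache:
            hits += 1
            cache.move_to_end(line)  # mark as most recently used
        else:
            cache[line] = True
            if len(cache) > cache_lines:
                cache.popitem(last=False)  # evict the least recently used line
    return hits / len(addresses)


# A loop that sweeps a working set of 1000 lines, 10 times over.
trace = list(range(1000)) * 10

for size in (256, 512, 1024, 2048):
    print(f"cache of {size} lines: hit rate {hit_rate(size, trace):.2f}")
```

In this toy model a cache smaller than the working set gets no hits at all, because each line is evicted before the loop comes back to it. Once the whole working set fits, every pass after the first hits in cache. Real workloads are far less clear-cut, but it shows why a larger cache can matter a lot for some programs and not at all for others.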


Call me Ishmael.
Profile: nimble knuckle
More Information

Why is Conroe so good with cache?


Dang, and I thought it was because they had better accountants to handle all that cache.


:D


---------------
Athlon 64 AM2 6000+
Gigabyte M61P-S3
4 GB A-Data DDR2 800
Asus 4850 512mb
Profile: journeyman
More Information

exit2dos wrote :

If the data isn't in cache, then it must be brought up from the main system memory via the bus, which is much, much slower.




I was talking more about how the Conroes are "true dual core" and do their cross-communication through the CPU cache, while the Pentium Ds used only the FSB.

So the Conroe cores talk through the cache, but yeah, as with any CPU, when the cache is full they must access the memory.

