EXPERT: Question about CPU and CACHE Speeds

Norman331

Commendable
Aug 5, 2016
20
0
1,520
Even if the CPU speed is increased, the CPU can't read, modify, or add data in the cache any faster, since it has to wait for even the fastest cache, which is usually slower than the CPU once the CPU is overclocked by raising the multiplier. How does increasing the CPU clock speed improve performance when the CPU still has to wait for the slower cache to do anything?
 

Norman331

But take the 5960X, for example: a 4.6 GHz CPU clock speed vs. a 3.5 GHz cache clock speed. How can increasing the CPU clock speed improve performance when the cache is slower?
 
Well, depending on which CPU you are talking about, some caches have their own multiplier and can be "overclocked" independently of the core clock. Haswell is one example: its L3 cache has its own multiplier. Every overclocking guide I've read says to set the cache multiplier so that the cache clock stays within 300MHz of the core clock. The L3 cache is the slowest of the three levels of cache on Haswell; L1 is the fastest, and L2 is slower than L1 but faster than L3.
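To put numbers on that, here's a rough sketch of the multiplier arithmetic. It assumes a 100MHz base clock (check your BIOS, yours may differ), and the function names are made up for illustration. It turns multipliers into clock speeds and checks the 300MHz rule of thumb from the guides.

```python
BCLK_MHZ = 100  # assumed base clock; the real value depends on your BIOS settings

def clock_mhz(multiplier, bclk_mhz=BCLK_MHZ):
    """Clock speed in MHz for a given multiplier."""
    return multiplier * bclk_mhz

def cache_gap_mhz(core_mult, cache_mult, bclk_mhz=BCLK_MHZ):
    """How far the core clock sits above the cache clock, in MHz."""
    return clock_mhz(core_mult, bclk_mhz) - clock_mhz(cache_mult, bclk_mhz)

def within_rule_of_thumb(core_mult, cache_mult, max_gap_mhz=300, bclk_mhz=BCLK_MHZ):
    """True if the cache clock is within max_gap_mhz of the core clock."""
    return abs(cache_gap_mhz(core_mult, cache_mult, bclk_mhz)) <= max_gap_mhz

if __name__ == "__main__":
    # The 4.6 GHz core / 3.5 GHz cache case from the question above.
    print(clock_mhz(46), clock_mhz(35), cache_gap_mhz(46, 35))
    print(within_rule_of_thumb(46, 35))
```

With a 46x core and 35x cache the gap is 1100MHz, well outside what the guides suggest, so that setup leaves some L3 performance on the table without making the core overclock pointless.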

Older CPUs usually clock their cache at some set multiple of the base clock, so when you increase the CPU clock, you increase the cache clock proportionately.

In most cases the data the CPU needs will be sitting in one of the first two levels of cache, which are the fastest. If the data isn't in the level the CPU looks in first, that is registered as a cache miss. Obviously cache misses have an effect on performance. At the very least, a CPU designer hopes the design predicts what data is going to be needed and has it stored in at least one level of cache, as access to RAM or storage is very expensive.
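If it helps to picture the lookup order, here's a toy sketch. Each cache level is just a set of addresses, and the contents are made up; real hardware is far more involved. It only shows the idea of checking L1 first, then L2, then L3, and falling back to RAM when every level misses.

```python
# Toy cache hierarchy: each level is a set of addresses it currently holds.
# The contents are invented purely for illustration.
LEVELS = [
    ("L1", {0x10, 0x20}),
    ("L2", {0x30}),
    ("L3", {0x40}),
]

def lookup(address, levels=LEVELS):
    """Return (where the data was found, list of levels that missed)."""
    misses = []
    for name, contents in levels:
        if address in contents:
            return name, misses
        misses.append(name)  # a cache miss at this level, try the next one
    return "RAM", misses     # missed everywhere, the expensive case

if __name__ == "__main__":
    for addr in (0x10, 0x40, 0x99):
        found_in, missed = lookup(addr)
        print(hex(addr), "served by", found_in, "after missing", missed)
```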

 

Norman331

My 5960X cache multiplier is set at 35x. How does performance increase when I raise the CPU core multiplier from 35x to 40x while the cache stays at a constant 35x? Logically, it seems the CPU cannot do anything faster than 35x. Yet benchmark scores still increase despite the static cache speed.
 
CPU performance isn't fully dependent on the cache speed. The cache multiplier you are talking about is only for the L3 cache. Most of the time the data the CPU needs isn't in the L3 cache; it'll be in the L1 or L2 cache. Those levels run at a set frequency based on the CPU core frequency, so they get faster when you overclock.

Now if every piece of data the CPU needed was sitting in the L3 cache, then I could see it bottlenecking the performance of the CPU. However, this isn't the case. CPU / cache / predictor design is very sophisticated. The prefetchers are pretty good at "guessing" what data is going to be needed and pulling it into the faster caches before the CPU gets to the instruction that needs it, and the branch predictor guesses which way upcoming branches will go so the pipeline can be filled ahead of time. In this way, the CPU usually doesn't experience a slowdown, because the data was already in place before it was needed. Most CPU instructions don't complete in a single clock cycle, so the pipeline is kept full of in-flight instructions. The biggest hit to performance comes when the branch predictor gets it wrong and the entire pipeline has to be flushed and reloaded with instructions for the correct branch. This causes the CPU to stall while it waits for its pipeline to be refilled before it can proceed.
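Here's a back-of-the-envelope model of why benchmarks still go up. Everything in it is an assumption for illustration: the hit fractions and cycle counts are invented round numbers, not measurements of a 5960X, and the function name is made up. The idea is that L1 and L2 latency is counted in core cycles (so it shrinks as the core clock rises), while L3 latency is counted in cache cycles (so it stays put).

```python
# Illustrative numbers only, NOT measured on any real CPU.
HIT_FRACTION = {"L1": 0.90, "L2": 0.07, "L3": 0.03}  # share of accesses served per level
LATENCY_CYCLES = {"L1": 4, "L2": 12, "L3": 40}       # assumed latency in cycles

def avg_access_ns(core_ghz, cache_ghz):
    """Average data access time in ns under this toy model.

    cycles / GHz = nanoseconds. L1 and L2 are timed against the core
    clock, L3 against the separate cache clock.
    """
    l1 = HIT_FRACTION["L1"] * LATENCY_CYCLES["L1"] / core_ghz
    l2 = HIT_FRACTION["L2"] * LATENCY_CYCLES["L2"] / core_ghz
    l3 = HIT_FRACTION["L3"] * LATENCY_CYCLES["L3"] / cache_ghz
    return l1 + l2 + l3

if __name__ == "__main__":
    stock = avg_access_ns(core_ghz=3.5, cache_ghz=3.5)  # 35x core, 35x cache
    oc = avg_access_ns(core_ghz=4.0, cache_ghz=3.5)     # 40x core, 35x cache
    print(f"35x/35x: {stock:.3f} ns, 40x/35x: {oc:.3f} ns, speedup {stock / oc:.3f}x")
```

In this model the 40x core with a 35x cache is still faster than 35x/35x, just by a bit less than the full 40/35 ratio, since only the small L3 share of accesses is held back by the static cache clock.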
 
Solution


For the 5960X, the L1 cache runs at the same speed as the CPU core, always 1:1, so if you are running a 4GHz core clock, the L1 is going to be at 4GHz too.