Voltage and Clock Rate relation?

July 6, 2013 8:25:12 PM

Can someone please answer my last 2 posts on this question

http://www.tomshardware.com/answers/id-1724624/voltage-...

Thanks!!!!!!!!!!!!!


July 6, 2013 9:12:03 PM

I think I see where your confusion is. You want to know why different chips with a similar clock speed provide different amounts of computing power? It is mostly the internal composition of each chip. Desktop CPUs are built to run at higher power and laptop CPUs at very low power. Usually the lower power is achieved by reducing the clock speed, and thus the voltage, of a given architecture, but also by limiting energy-intensive features like cache memory, number of cores, hyperthreading, etc.

Here is a theoretical:

1GHz chip, 1 million transistors = 1 quadrillion 0 or 1s per second
2GHz chip, 500,000 transistors = 1 quadrillion 0 or 1s per second

So if I create a chip with 1 quadrillion transistors and run it at one Hertz, it would still be equivalent.
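
The arithmetic in this theoretical can be checked with a short sketch. It assumes the post's own deliberately crude model, in which every transistor produces one 0 or 1 per clock tick; the function name is made up for illustration.

```python
def bit_states_per_second(clock_hz: int, transistors: int) -> int:
    # Crude model from the post: one 0-or-1 per transistor per clock tick.
    return clock_hz * transistors

# 1 GHz with 1 million transistors vs. 2 GHz with 500,000 transistors
chip_a = bit_states_per_second(1_000_000_000, 1_000_000)
chip_b = bit_states_per_second(2_000_000_000, 500_000)

print(chip_a, chip_b, chip_a == chip_b)
# prints: 1000000000000000 1000000000000000 True
```

Both products come out to 10^15, so under this model the two chips are equivalent. Real processors do not behave this way, which is the point the later reply makes.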

This is all very simplified, and the internal workings of modern CPUs are beyond me, but here is a quick explanation of the voltage/current/power required by a CPU:

Inside the chip, each transistor represents an on/off state. While a transistor is switched ON it provides a path for current to flow, eventually, to ground. This consumes power, and if all the transistors were on at the same time it would essentially be a short to ground, pulling the supply to a theoretical 0 volts. Since they are on only some of the time, we can say that the chip runs at 1.2 volts and has a TDP of 84W, which works out to 84W / 1.2V = 70 Amps. Increasing the voltage while maintaining the current increases the power drawn, which is why chips run hotter when you do this. Increasing the clock rate increases the amount of ON time, which requires more voltage to maintain the current flowing through the chip. Too little voltage and some transistors won't switch when required, causing errors and then crashes.
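
The 70 Amp figure is just P = V x I rearranged. A minimal sketch of that arithmetic follows; the 1.3 V value is an illustrative assumption, not a number from the post, and the helper names are hypothetical.

```python
def current_amps(power_watts: float, voltage_volts: float) -> float:
    # P = V * I, so I = P / V
    return power_watts / voltage_volts

def power_watts(voltage_volts: float, current_a: float) -> float:
    # P = V * I
    return voltage_volts * current_a

# Figures from the post: 84 W TDP at 1.2 V
amps = current_amps(84.0, 1.2)
print(f"{amps:.1f} A")

# Assumed example: raise the voltage to 1.3 V while holding the current steady
raised = power_watts(1.3, amps)
print(f"{raised:.1f} W")
```

With the current held at 70 A, the small voltage bump already adds several watts of heat to get rid of.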
July 6, 2013 9:34:59 PM

I thought of a slightly better analogy than hammers: car engines.

Air/Fuel = Voltage
Liters = CPU
Work = Horsepower
Exhaust = (well this still equals heat)

As you go up in size you produce more power, but fuel/air consumption goes up too. To "overclock" an engine you can add a turbocharger. This forces more air into the engine, which means you need more fuel to burn. The engine produces more power, but also more waste heat.

And here is my fake chart, based a little bit on fact, but mostly just examples.

TDP 11W tablet i5 1.6GHz = 1.2 Liter engine 50HP
TDP 15W dual core laptop i3 2.0GHz = 1.8 Liter engine 80HP
TDP 20W dual core laptop i5 2.0GHz = 2.4 Liter engine 100HP
TDP 25W quad core laptop i5 2.4GHz = 2.4 Liter engine with a turbocharger 120HP (turbo mode, make more sense now :)  )
TDP 30W quad core laptop i7 2.4GHz = 3.0 Liter engine running at low RPM 150HP, less fuel
TDP 55W desktop i3 3.1GHz = 3.0 Liter engine 180HP
TDP 77W desktop i5 3.2GHz = 3.6 Liter engine 240HP
TDP 84W desktop i7 3.5GHz = 5.0 Liter engine 340HP
TDP 110W OC desktop i7 4.2GHz = 5.0 Liter engine with turbocharger 420HP

Best solution

July 6, 2013 9:46:30 PM

^ No it wouldn't.
- The analogy isn't outright wrong, but it does not factor in most common sense aspects of processor design.
- Instructions per clock cycle do not scale linearly with the transistor count of the core (inclusive of L1 cache or not).

Transistor count these days is usually more than 70% L2 and L3 cache, and the more transistors are dedicated to cores the more cache is required as the interface to memory becomes a massive bottleneck.
- Doubling the transistor count by adding more cache won't double performance, it just decreases the cache miss rate.
- If the cache miss rate lowers from 50% to 0% as a result (highly unlikely) then, technically, it might.

You can halve the transistor count of the 'raw core' (excluding L1 cache) without halving performance!
- There are many processors that perform fine with just 3.3 to 10.0 million transistors, and taking those designs to 6.6 to 20.0 million transistors will not magically double the performance!

Consider that a 733MHz Pentium III with a 133MHz FSB outperforms a 750MHz Pentium III with a 100MHz FSB.

It takes 1.8 million transistors to have just 32KB of L1 cache last time I checked.
- L2 and L3 caches may require fewer transistors per bit by using a hybrid, SRAM-like design.
- I say this as you're likely to have an extra 4,732 bytes per 32KB of L1 cache for management of the cache statistics, etc.
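
The 1.8 million figure can be reproduced with a short sketch. It assumes a classic six-transistor (6T) SRAM cell per bit, which the post does not state; with that assumption, 32KB of data plus the 4,732 bytes of overhead lands exactly on 1.8 million. The names below are hypothetical.

```python
BITS_PER_BYTE = 8
TRANSISTORS_PER_BIT = 6  # assumption: 6T SRAM cell, not stated in the post

def sram_transistors(data_bytes: int, overhead_bytes: int = 0) -> int:
    # Total transistors for the data array plus any bookkeeping bytes.
    return (data_bytes + overhead_bytes) * BITS_PER_BYTE * TRANSISTORS_PER_BIT

data_only = sram_transistors(32 * 1024)
with_overhead = sram_transistors(32 * 1024, 4_732)

print(data_only)      # prints: 1572864
print(with_overhead)  # prints: 1800000
```

A cache built with fewer transistors per bit would lower `TRANSISTORS_PER_BIT`, which is the trade-off the L2/L3 bullet is pointing at.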

Generally speaking, if you double the voltage on given silicon you can increase the clock speed by +41.42136% (a factor of √2).
- This is only true if the die is the same size, the transistor count is the same, and the surface area to contact ratio permits the processor(s) to be cooled effectively.
- The heat output and leakage may rise more than linearly (compared to your starting voltage) and the CPU may become hotter per cubic mm than a nuclear reactor!
- The power efficiency will be the reciprocal of this; as the voltage doubles, the performance per watt will decrease by 29.28932% (a factor of 1/√2).
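
The percentages in this rule of thumb are what you get from two assumptions, which are my reading of the post rather than a model of real silicon: attainable clock scales with the square root of voltage, and power scales linearly with voltage (current held constant). A minimal sketch with hypothetical function names:

```python
import math

def clock_gain(voltage_ratio: float) -> float:
    # Assumed model: clock speed scales with sqrt(voltage).
    return math.sqrt(voltage_ratio) - 1.0

def perf_per_watt_change(voltage_ratio: float) -> float:
    # Assumed model: performance follows clock, power follows voltage,
    # so performance per watt scales with sqrt(v) / v = 1 / sqrt(v).
    return math.sqrt(voltage_ratio) / voltage_ratio - 1.0

print(f"{clock_gain(2.0):+.5%}")            # prints: +41.42136%
print(f"{perf_per_watt_change(2.0):+.5%}")  # prints: -29.28932%
```

Doubling the voltage in this model buys roughly 41% more clock at the cost of roughly 29% worse performance per watt.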

Another simple comparison is the Pentium II to the Pentium III
- 7.5 million transistors for P-II core --- Deschutes (80523)
- 9.5 million transistors for P-III core --- Katmai (80525)
- Over +25% more transistors per core, and the Pentium III is not +25% faster at executing x86-32bit code!
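
The transistor counts above can be turned into a quick performance-per-transistor check. The transistor figures come from the post; the 10% speedup is a purely hypothetical value chosen for illustration, not a benchmark result.

```python
p2_transistors = 7.5e6  # Pentium II core (Deschutes), figure from the post
p3_transistors = 9.5e6  # Pentium III core (Katmai), figure from the post

extra_transistors = p3_transistors / p2_transistors - 1.0
print(f"{extra_transistors:.1%}")  # prints: 26.7%

def perf_per_transistor_change(speedup: float) -> float:
    # speedup is P-III performance divided by P-II performance at the same clock
    return speedup / (p3_transistors / p2_transistors) - 1.0

# Hypothetical: if the P-III were 10% faster at the same 450 MHz
print(f"{perf_per_transistor_change(1.10):+.1%}")  # prints: -13.2%
```

Any speedup below about 26.7% means performance per transistor went down, even though the newer core is faster.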

These two processors are available with very similar specs:
- 250nm fabrication (so the voltage and die size would be equal, all else being equal)
- They both have 16 + 16 kB (Data + Instructions) L1 caches, using a Harvard or Hybrid Harvard Superscalar Architecture.
- They both have 512KB of L2 cache external to the processor
- The L2 cache is clocked at half the core frequency in both cases
- They are both available at 450 MHz (important for equal comparison)
- They both have a 100MHz FSB (less important for equal 'core' comparison, but important due to the availability of data).
- They both support MMX
- They both have a 2.0 Volt Vcore.
- Only one of them supports SSE

They do not have equal performance 'per transistor'; it depends on the workload.

Heck, if anything your argument is an argument for a reduction in transistor counts of processor cores and having 64 CPU cores with 16MB (or more) of on-die cache shared between all the CPU cores intelligently, as some cores may not be executing code that benefits from L2 cache at all, while others may benefit from 8MB, or more, of L2 cache alone.
- I am in no way against this idea
- Add a 32MB to 256MB L3 cache, and Quad-channel DDR4-SDRAM (which would be so fast as to be considered a L5 cache instead of 'System RAM').
- I think we'd all be very happy indeed with such a 2 billion transistor core processor (and associated motherboard chipset).

Noting that:
- The transistors for L2 and L3 cache (per KB or MB) can differ as there are different ways of making cache. (Faster with more transistors per bit, or slower and larger with fewer transistors per bit; some of these methods may have patents protecting them. Thankfully we have patents or there would be no technological improvement :-) ).


PS: There used to be an OpenSPARC CPU design website, and I put forward ideas there about pre-fetching and building systems that did not require Rambus style memory interfaces to scale. I think they're gone now but the 4KB prefetching was definitely used in something. (Obvious enough not to be patented and makes systems far more economical per dollar while keeping performance pretty darn close to just having gigabytes of SRAM or hyper Rambus implementations).
May 15, 2014 11:27:42 PM

Tabris DarkPeace said:
^ No it wouldn't.

COULD YOU HAVE MADE THIS ANY MORE CONFUSING!!! AT ALL!!!! IF THAT IS EVEN POSSIBLE???? I THOUGHT I WAS JUST STARTING TO UNDERSTAND IT A LITTLE TILL YOU CHIMED IN WITH THE L2 CACHE THIS AND L3 CACHE THAT. I GET THE FACT THAT YOU ARE REALLY REALLY SMART AND KNOW WHAT YOU ARE SAYING, BUT SOMETIMES, ACTUALLY MOST TIMES, YOU GUYS OVER-EXPLAIN THINGS AND MAKE PEOPLE JUST LEARNING NOT ONLY CONFUSED, BUT YOU OUTRIGHT SCARE US FROM WANTING TO LEARN ANY MORE. AFTER READING ALL THIS, NOW I AM MORE CONFUSED THAN EVER AND NOW I HAVE A HEADACHE. SOMETIMES IT IS JUST BETTER TO KEEP IT AS SIMPLE AS YOU CAN. K THANKS