High data per cycle vs high frequency?

December 31, 2012 5:05:27 PM

I've wondered why certain games, programs, and emulators require a CPU of ~3GHz. But what if you instead got a CPU that does extremely high work per cycle (let's say double the other) and clocked it at 1.5GHz? What effect would that have on the program?
December 31, 2012 5:19:55 PM

Most games and emulators are heavily dependent upon single-threaded performance and will perform better on CPUs with fewer cores but high IPC and high clock rates. This is why Sandy Bridge and Ivy Bridge Pentiums and i3s can outperform stock-clocked FX63xx/83xx chips in many if not most games.

In more heavily threaded applications though (3D rendering, video editing, physics, etc.), the FX8300 can give even the i7 a run for its money.

The ultimate example of heavily threaded processing on the desktop is GPUs: once the game/application's pixel/vertex shader code gets converted into whatever instruction set the GPU can work with, the code gets dispatched to as many as 2048 shader units. While those "CPUs" are running at only ~1GHz, their collective throughput is massive.
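The aggregate-throughput point above can be sketched with some quick arithmetic. A minimal toy model in Python, using illustrative round numbers (the core counts, clocks, and ops-per-cycle figures are assumptions for the sake of the example, not real hardware specs):

```python
# Toy comparison of aggregate throughput: a few fast cores vs. many slow
# shader units. All figures are illustrative, not real hardware specs.

def aggregate_throughput(units, clock_ghz, ops_per_cycle):
    """Peak operations per second = units * clock * ops issued per cycle."""
    return units * clock_ghz * 1e9 * ops_per_cycle

cpu = aggregate_throughput(units=4, clock_ghz=3.0, ops_per_cycle=4)
gpu = aggregate_throughput(units=2048, clock_ghz=1.0, ops_per_cycle=1)

print(f"CPU peak: {cpu / 1e9:.0f} GOPS")   # 48 GOPS
print(f"GPU peak: {gpu / 1e9:.0f} GOPS")   # 2048 GOPS
print(f"GPU advantage on perfectly parallel work: {gpu / cpu:.1f}x")
```

Even with each shader unit running at a third of the clock and a quarter of the per-unit issue rate, the sheer number of units gives the GPU a massive lead, provided the work is parallel enough to keep them all busy.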
December 31, 2012 5:24:19 PM

A more efficient architecture is generally capable of executing more work per cycle; clock speed alone makes little difference.

For instance, take something like a Core i5 760 vs a Phenom II 965: you'll find the i5, while lower clocked, beats out the 965 in practically every piece of software due to the better chip design. Work should get executed in as few cycles as possible.

There is also the factor of multiple cores and threads to take into account. You'll find that software capable of executing across multiple threads can hold an advantage even against chips with a better per-core design. So even if a chip can do more work per cycle, you might find that a particular application runs worlds better on a chip that has, say, 2 to 4 more cores on die.
December 31, 2012 5:41:12 PM

CDdude55 said:
A more efficient architechture is generally capable of executing more work per cycle, clock speed overall makes little difference.

They are both every bit as important.

A Core2/i3/i5/i7 at 1.5GHz is not going to perform anywhere near a 3GHz version of the same or a 3GHz version of any AMD chip with the same number of cores and similar vintage. There are limits to how large a clock gap architecture alone can bridge.

GPUs have very low per-shader-thread IPC and very low clock speeds compared to general-purpose CPUs but they make up for it by being massively multi-threaded and efficient at doing things that way.
December 31, 2012 5:55:17 PM



Quote:
A Core2/i3/i5/i7 at 1.5GHz is not going to perform anywhere near a 3GHz version of the same or a 3GHz version of any AMD chip with the same number of cores and similar vintage. There are limits to how large a clock gap architecture alone can bridge.


This is true, clock speed is still a factor in performance, just not the only one. There's less of an emphasis on clock speed nowadays, with software performance increasing through other on-die resources. That's all I pointed out; clock speed is still certainly relevant.

Quote:
GPUs have very low per-shader-thread IPC and very low clock speeds compared to general-purpose CPUs but they make up for it by being massively multi-threaded and efficient at doing things that way.


Yes that's very true.
January 1, 2013 11:15:53 AM

CDdude55 said:
Quote:
A Core2/i3/i5/i7 at 1.5GHz is not going to perform anywhere near a 3GHz version of the same or a 3GHz version of any AMD chip with the same number of cores and similar vintage. There are limits to how large a clock gap architecture alone can bridge.


This is true, clock speed is still a factor in performance, just not the only one. There's less of an emphasis on clock speed nowadays, with software performance increasing through other on-die resources. That's all I pointed out; clock speed is still certainly relevant.

Quote:
GPUs have very low per-shader-thread IPC and very low clock speeds compared to general-purpose CPUs but they make up for it by being massively multi-threaded and efficient at doing things that way.


Yes that's very true.

Then what about CPUs with multiple cores per thread? Do they perform epically? I saw something like that on supercomputers like Watson.
January 1, 2013 12:07:42 PM

1zacster said:
Then what about CPUs with multiple cores per thread? Do they perform epically? I saw something like that on supercomputers like Watson.


Performance increases in software that is coded to take advantage of the multiple threads, versus chips with fewer cores/threads to do work on. If you have a piece of software that relies on a single thread, then having more cores is irrelevant and performance is more dependent on both clock speed and the design of the CPU itself.

Supercomputers are an extreme example of the need for cores; the workloads they are generally required to run need tons of new data processed frequently. They generally use hundreds or even thousands of CPU cores.
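The single-thread-bottleneck point above is exactly what Amdahl's law formalizes. A minimal sketch (the function name and example fractions are illustrative, not from any specific benchmark):

```python
# Amdahl's law: the speedup from n cores when only a fraction p of the
# work can run in parallel. Extra cores are irrelevant to the serial part.

def amdahl_speedup(p, n):
    """p = parallel fraction (0..1), n = number of cores."""
    return 1.0 / ((1.0 - p) + p / n)

for p in (0.0, 0.5, 0.95):
    print(f"parallel fraction {p:.0%}: "
          f"4 cores -> {amdahl_speedup(p, 4):.2f}x, "
          f"64 cores -> {amdahl_speedup(p, 64):.2f}x")
```

With a 0% parallel fraction (a purely single-threaded program), piling on cores yields exactly 1.0x, which is why clock speed and per-core design dominate there; supercomputer workloads sit near the other extreme.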
January 1, 2013 1:47:19 PM

CDdude55 said:
Performance increases in software that is coded to take advantage of the multiple threads, versus chips with fewer cores/threads to do work on. If you have a piece of software that relies on a single thread, then having more cores is irrelevant and performance is more dependent on both clock speed and the design of the CPU itself.

Supercomputers are an extreme example of the need for cores; the workloads they are generally required to run need tons of new data processed frequently. They generally use hundreds or even thousands of CPU cores.

I think you misunderstood my question. I'm saying multiple cores per thread, meaning say an imaginary CPU (or uber Xeon + custom OS) has 2 cores per thread: a quad-core CPU behaving like a dual core, kind of the opposite of Hyper-Threading.
January 1, 2013 2:13:57 PM

There is no such thing as two cores per thread; it's the opposite. Every core has a single thread, and then you have things like HT that create an extra virtual thread per core.

January 1, 2013 5:15:03 PM

CDdude55 said:
There is no such thing as two cores per thread; it's the opposite. Every core has a single thread, and then you have things like HT that create an extra virtual thread per core.

The extra threads running on a given core are not any more or less real than the first thread per core; they are all treated exactly the same way.

The only difference between Intel's HT and most other SMT architectures is where they are applied. HT is applied to CPUs fundamentally optimized for single-threaded performance, which therefore do not suffer much when HT is disabled. Full-on SMT (4+ threads per core) gets applied to architectures optimized for massively threaded environments such as GPUs, servers and supercomputers, and these would suffer a catastrophic performance loss from disabling their "extra" threads, since they rely on interleaving the other threads' instructions during each thread's stalls from mispredicts, cache misses, dependency resolution, etc.
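The stall-interleaving idea above can be illustrated with a toy cycle-count model. Everything here (the function, the stall frequency, the penalty) is an idealized assumption chosen to make the mechanism visible, not a model of any real core:

```python
# Toy model of SMT latency hiding. One thread alone spends 1 cycle per
# instruction plus a long stall every so often (cache miss, mispredict).
# With SMT, other threads' instructions fill those stall cycles.

def run_threads(n_instr, stall_every, stall_cycles, threads):
    """Return (cycles without SMT, cycles with idealized SMT)."""
    per_thread_stall = (n_instr // stall_every) * stall_cycles
    # No SMT: run each thread back to back, eating every stall.
    serial = threads * (n_instr + per_thread_stall)
    # Idealized SMT: stalls overlap with other threads' issue slots, so
    # the core is bounded by total issue cycles or one thread's own span.
    smt = max(threads * n_instr, n_instr + per_thread_stall)
    return serial, smt

serial, smt = run_threads(n_instr=1000, stall_every=10, stall_cycles=20, threads=2)
print(f"no SMT: {serial} cycles, SMT: {smt} cycles, speedup {serial / smt:.1f}x")
```

With stalls making up two thirds of each thread's runtime, even idealized 2-way interleaving doubles throughput, and the dependence on those extra threads is why a heavily SMT-optimized design would collapse with them disabled.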
January 1, 2013 5:37:37 PM

InvalidError said:
The extra threads running on a given core are not any more or less real than the first thread per core; they are all treated exactly the same way.

The only difference between Intel's HT and most other SMT architectures is where they are applied. HT is applied to CPUs fundamentally optimized for single-threaded performance, which therefore do not suffer much when HT is disabled. Full-on SMT (4+ threads per core) gets applied to architectures optimized for massively threaded environments such as GPUs, servers and supercomputers, and these would suffer a catastrophic performance loss from disabling their "extra" threads, since they rely on interleaving the other threads' instructions during each thread's stalls from mispredicts, cache misses, dependency resolution, etc.


The extra threads are simply called virtual threads under HT; I would never say they're any more "real" than any other thread, it's their own terminology.

The rest is true.
January 1, 2013 5:39:51 PM


I'm skeptical about "Anaphase" gaining much interest.

The basic idea is to use extra cores for speculative execution, so if one thread gets a branch mispredict, execution can pass over to the other thread/core that took the correct branch. Sounds nice in theory, but in practice this means a fair bit of hardware/software complication to keep the cores/threads in sync at every fork... and also burning the power of twice as many cores to do the same job for 10% extra performance at best.

You could probably get most of that 10% if AMD/Intel simply implemented methods of saving branch flags and providing compile-time static predictions. (e.g.: the CPU no longer needs to guess which branches are loops because the compiler already flagged them as such, and no longer needs to guess which if/else-if/else branch is most likely to get taken because it is flagged as such.)
January 1, 2013 9:21:51 PM

InvalidError said:
I'm skeptical about "Anaphase" gaining much interest.

The basic idea is to use extra cores for speculative execution, so if one thread gets a branch mispredict, execution can pass over to the other thread/core that took the correct branch. Sounds nice in theory, but in practice this means a fair bit of hardware/software complication to keep the cores/threads in sync at every fork... and also burning the power of twice as many cores to do the same job for 10% extra performance at best.

You could probably get most of that 10% if AMD/Intel simply implemented methods of saving branch flags and providing compile-time static predictions. (e.g.: the CPU no longer needs to guess which branches are loops because the compiler already flagged them as such, and no longer needs to guess which if/else-if/else branch is most likely to get taken because it is flagged as such.)


Yeah, for everyday use it is pointless, but when you get a massive corporation or science lab with practically unlimited power and resources (well, at least when building supercomputers), 10% per thread times several thousand cores can be a lot.
January 1, 2013 10:53:28 PM

1zacster said:
Yeah, for everyday use it is pointless, but when you get a massive corporation or science lab with practically unlimited power and resources (well, at least when building supercomputers), 10% per thread times several thousand cores can be a lot.

If you are building a supercomputer or cluster, you are going to be far more interested in making each CPU run at its native number of threads and as close as possible to 100% of theoretical throughput than in sacrificing half the threads/cores to increase the remaining threads/cores' performance by 10%, which would still be only about 55% of max theoretical throughput.

Anaphase is for when you have a multi-threaded/multi-cored CPU running poorly threaded code and that would actually make it most relevant for "everyday use" since most of the stuff normal people use on an everyday basis is poorly threaded.

For servers and supercomputers which are often already massively threaded, Anaphase is largely pointless since there are few to no spare cores to waste on it when the server/HPC is actually under load.
January 2, 2013 1:41:27 AM

Frankly, clock speed and IPC both have the same weighting as far as pure performance goes. E.g.: doubling the clock speed would yield the same results as doubling the IPC.
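To first order, that claim answers the original question directly: single-thread performance scales with IPC times clock, so double the work per cycle at half the clock is a wash. A minimal sketch (the function and the example IPC/clock figures are illustrative assumptions; real chips deviate because things like memory latency do not scale with the core clock):

```python
# First-order model: single-thread performance ~ IPC * clock, so
# doubling either factor has the same effect.

def perf(ipc, clock_ghz):
    """Instructions retired per nanosecond."""
    return ipc * clock_ghz

baseline = perf(ipc=2.0, clock_ghz=3.0)    # the ~3GHz CPU in the question
wide_slow = perf(ipc=4.0, clock_ghz=1.5)   # double the work/cycle, half the clock

print(baseline, wide_slow)  # equal to first order
```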