I am STILL noticing some folks making false claims about processors based on their number of cores.
I wanted to share some info/good reads about multi-core environments.
Hopefully some folks will take away a better understanding of the single vs dual vs quad core processors and how they affect performance and why.
Things discussed are memory usage, OS utilization/scheduling, IO, threading and multiple processes/multitasking, even when you are not multitasking.
The Dr Dobbs article about coding for a shared cache is very very interesting as it defines what "could" be an advantage of the C2D over the K10 (although recent docs have stated that maybe a smaller shared cache will be used with the K10 too). This advantage is only recognized if the code is correct in its usage of the shared cache (see the table in the article).
Understand these reads are mostly from a developer standpoint but they give a very good basis for both SMP and Multi-Core.
Considering the Core 2 Quad, the 4th scenario is not as sound as Tian Tian makes it out to be. The L2 cache is only shared between 2 cores, not all 4 cores. So, additional logic (developer or OS) would need to be inserted so that all 4 cores see the same shared resource in cache -- which, in turn, is a performance hit.
Now considering the K8L, the 4th scenario is viable because the L3 cache is shared across all 4 cores. However, I'm not sure about performance of L3 Cache as compared to L2 Cache.
It feels weird to use an Intel article to show (at least in one case) where the competition's product is potentially more efficient.
However, using processor/core affinity would correct that issue, would it not?
If you are on Core 0 and use affinity to keep ya there, you would be doing just fine as that core ONLY has access to that L2 cache segment.
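To make the affinity idea concrete, here is a minimal sketch of pinning the current process to one logical CPU. It assumes Linux, where Python exposes os.sched_getaffinity and os.sched_setaffinity; the helper name pin_to_cpu is my own, and other OSes use different calls.

```python
import os

def pin_to_cpu(cpu, pid=0):
    """Restrict process `pid` (0 = the calling process) to one logical CPU.

    Returns the previous affinity set so the caller can restore it later.
    pin_to_cpu is a hypothetical helper name, not part of any library.
    """
    previous = os.sched_getaffinity(pid)
    if cpu not in previous:
        raise ValueError(f"CPU {cpu} is not in the allowed set {sorted(previous)}")
    os.sched_setaffinity(pid, {cpu})
    return previous

if __name__ == "__main__":
    # Pick the lowest-numbered CPU we are allowed to run on (often CPU 0).
    target = min(os.sched_getaffinity(0))
    old = pin_to_cpu(target)
    print("pinned to:", sorted(os.sched_getaffinity(0)))
    # Put the original mask back so the rest of the program is unaffected.
    os.sched_setaffinity(0, old)
```

Nothing in this code knows how the L2 is laid out, which is the point about porting: the same call works whether the cache is split per pair of cores or shared by all four.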
Thing is, if you were to code an app with affinity enabled (through the OS) and later PORT it to a truly native quad core with L2 shared between all cores, it would NOT require any changes to code.
This gets back to fully understanding the hardware of the system you are on.
Using tip #3 in the article keeps everyone playing nice while in the shared cache. Again, using tip #1 in conjunction with tip #3 gives you the performance advantage they were talking about, plus the later advantage of going to a native Intel quad with NO code changes.
Makes me wonder how most OSes would treat Core 2 Quad core affinity. Will the OS see it as a generic quad core CPU or, more correctly, as 2 separate dual core CPUs?
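One way to check what the OS actually sees, at least on Linux, is to ask which logical CPUs share each L2 cache. The sketch below assumes the sysfs layout /sys/devices/system/cpu/cpuN/cache/indexM/ with its level and shared_cpu_list files; the function names are my own, and the sample inputs in the usage note are hypothetical, not measured on real hardware.

```python
from pathlib import Path

def parse_cpu_list(text):
    """Parse a Linux cpulist string such as "0-1,4" into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus

def cache_groups(shared_lists):
    """Collapse per-CPU sharing lists into the distinct groups sharing a cache."""
    return sorted({tuple(sorted(parse_cpu_list(s))) for s in shared_lists})

def read_l2_sharing(sysfs_root="/sys/devices/system/cpu"):
    """Collect shared_cpu_list for every level-2 cache entry sysfs exposes."""
    found = []
    for index_dir in Path(sysfs_root).glob("cpu[0-9]*/cache/index*"):
        try:
            if (index_dir / "level").read_text().strip() == "2":
                found.append((index_dir / "shared_cpu_list").read_text())
        except OSError:
            continue  # entry missing or unreadable, e.g. inside a container
    return found

if __name__ == "__main__":
    print("L2 sharing groups on this machine:", cache_groups(read_l2_sharing()))
```

With a hypothetical "two dual cores in one package" layout, cache_groups(["0-1", "0-1", "2-3", "2-3"]) gives two groups of two, while a fully shared cache, cache_groups(["0-3"] * 4), gives one group of four. The actual CPU numbering on a given chip may differ.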
How does Hyperthreading actually work? Are there 4 complete sets of registers -- 2 sets of User/Supervisor registers, along with a unique register that points to the set of registers currently in use? Or is there some sort of superfast register store/fetch?
Also, remember that Intel may offer an Intel specific library that may ensure affinity is correct. Limiting parent and child processes to the same core grouping would not be that difficult.
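Pseudo Code
core = get_core(Parent);   /* core grouping the parent is bound to */
if (core != NONE)          /* core 0 is valid, so do not test for truthiness */
{
    push.child(core);      /* bind the child to the same grouping */
}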
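For what it's worth, on Linux you get the parent/child behaviour without any special library: a child process inherits its parent's affinity mask. The following is a rough sketch under that assumption (Linux only, Python 3.7+); child_affinity is a hypothetical helper name of my own.

```python
import os
import subprocess
import sys

def child_affinity():
    """Start a child Python process and return the CPU set it reports for itself."""
    code = "import os; print(','.join(map(str, sorted(os.sched_getaffinity(0)))))"
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True, text=True, check=True,
    )
    return {int(cpu) for cpu in result.stdout.strip().split(",")}

if __name__ == "__main__":
    original = os.sched_getaffinity(0)
    group = {min(original)}          # stand-in for "the parent's core grouping"
    os.sched_setaffinity(0, group)   # bind the parent
    try:
        # The child was never pinned explicitly, yet it reports the same set.
        print("parent:", sorted(group), "child:", sorted(child_affinity()))
    finally:
        os.sched_setaffinity(0, original)
```

So the logic in the pseudo code mostly matters when the child needs a different grouping from the parent, or on an OS that does not pass the mask down.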
Hyper-Threading Technology
Hyper-Threading Technology (HT Technology) was developed by Intel Corporation to bring the simultaneous multi-threading approach to the Intel architecture. With HT Technology, two threads can execute on the same single processor core simultaneously in parallel rather than context switching between the threads. Scheduling two threads on the same physical processor core allows better use of the processor's resources.
HT Technology is available on Intel Xeon processors and some Intel Pentium 4 processors. HT Technology adds circuitry and functionality into a traditional processor to enable one physical processor to appear as two separate processors. Each processor is then referred to as a logical processor. The added circuitry enables the processor to maintain two separate architectural states and separate Advanced Programmable Interrupt Controllers (APIC) which provides multi-processor interrupt management and incorporates both static and dynamic symmetric interrupt distribution across all processors. The shared resources include items such as cache, registers, and execution units to execute two separate programs or two threads simultaneously. Requirements to enable HT Technology are system equipped with a processor with HT Technology, an OS that supports HT Technology and BIOS support to enable/disable HT Technology.
Figure 2. Processor equipped with Hyper-Threading Technology
You can find some additional, more complete and technical descriptions of HT Technology in the Intel Technology Journal.
Note that it is also possible to have a dual processor system that contains two HT Technology enabled processors which would provide the ability to run up to 4 programs or threads simultaneously. This capability is currently available on Intel Xeon processors and these systems are currently available from several OEM making and selling Intel Xeon processor-based DP systems.
The key being this part:
Quote:
With HT Technology, two threads can execute on the same single processor core simultaneously in parallel rather than context switching between the threads.
Each logical processor maintains a complete set of the architecture state. The architecture state consists of registers including the general-purpose registers, the control registers, the advanced programmable interrupt controller (APIC) registers, and some machine state registers. From a software perspective, once the architecture state is duplicated, the processor appears to be two processors.
The general-purpose registers, control registers, and the Advanced Programmable Interrupt Controller (APIC), as well as some machine state registers have been duplicated to form the two architectural states.
That makes the most sense to ensure the highest performance is maintained.
Thanks for the help on that ches111 and JumpingJack!
I think that's the crux of the issue. There was a very good thread recently where levicki(?) responded in re: this issue, to the effect that there weren't any decent compilers provided by either Intel OR AMD to enable programmers to easily code multi-core / multi-threaded apps. Until decent tools are developed, the above is not likely to happen.
If I were Intel / AMD, I'd spend some R&D $ making tools to allow developers to easily catch up to the technology.
Remember the TI 9904? if / then / else execution in a single cpu cycle. Zero crossing switching, +5 vdc - 5 vdc cpu. Still makes a great PLLC controller for industry, but not a great PC. Why? no generic compilers. LLC compilers yes...
I am bumping this thread given the release of newer dual cores expected along with possibly confusing pricing between dual and quad core processors from Intel.
I am also resurrecting this thread given the developments (Barcelona) from the AMD camp.
If I can find some more AMD specific articles on this topic or someone could provide them, I will update the first post with that info.
This link is strictly technical and is presented by IBM.
Please keep in mind that it does state SMP, or Symmetrical Multi Processing. This is in essence the same thing as multi-core (I do know the differences, but much of this still applies):
They are sometimes a pain in the rear... I understand... ...
Maybe we can write a more comprehensive review, include some of the frequently asked questions (E6850 vs. Q6600 for instance), and ask turpit to make it a sticky.
I just don't like how people barge into a forum, put on capslock, and start typing their questions without doing a little searching around first. If you need help, we'll help you. But that doesn't mean you come in here unprepared.