
Is Multi-Core a temporary solution?


Will multi-core DESKTOPS stop at 4?

Total: 70 votes

  • Yes: 12%
  • No: 89%
August 5, 2006 2:18:33 PM

After more than a year we can surely say that dual core is a must for boosting performance in general. Next year is going to give us quad-cores from Intel and AMD, but how much is it worth going from 2 to 4, 8 ... 32 cores?
A dual-core CPU running a well-threaded app performs at best with a factor of 1.9 (95% efficiency) compared to a single core at the same frequency. A recent benchmark of an Intel quad-core saw that index drop to 3.23 (76% efficiency), mostly due to the extra thread synchronization required. The more cores, the bigger the loss, so do we really need many of them (32, etc.)?
Another way to increase performance is what they did with Conroe: widening data busses and adding more logical units that work in parallel. A Conroe has 3 SSE units compared to only 1 in a K8 CPU; that alone would be enough to explain the performance increase. This is a much more logical way to go: equip a core with more and more logical units. So I doubt we will see 32-core desktops in the next 10 years, or ever. Mother Nature is the greatest designer of all and she only gave us one brain; however, it's got a lot of units and is highly multithreaded, which is why we only need one of it.
So can we go back to a single powerful core in a few years?
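
For anyone who wants to experiment with these scaling figures, here is a minimal sketch (an editorial addition, not part of the original post) that applies Amdahl's law; the 95% parallel fraction is an assumption chosen to roughly reproduce the 1.9x dual-core figure above, not a measured value.

Code:
#include <stdio.h>

/* Amdahl's law: speedup = 1 / ((1 - p) + p / n), where p is the
 * parallel fraction of the work and n the number of cores.
 * p = 0.95 is an assumed value matching the ~1.9x dual-core claim. */
int main(void)
{
    const double p = 0.95;
    for (int n = 1; n <= 32; n *= 2) {
        double speedup = 1.0 / ((1.0 - p) + p / n);
        printf("%2d cores: %5.2fx speedup (%3.0f%% efficiency)\n",
               n, speedup, 100.0 * speedup / n);
    }
    return 0;
}

Under this assumption, 32 cores already yield less than a 13x speedup, which is in line with the "performs like 12 or 16 cores" estimate made later in the thread.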
August 5, 2006 2:34:04 PM

I can see 16- to 64-core CPUs in the next 6-10 years from Intel.
It does not matter if Joe Six-Pack uses them or not; they will do it to make their product appear better than the competitors' lol

I'm sure Microsoft will tweak Windows to utilize more and more cores more efficiently, so all those extra cores will serve a purpose in the end. Imagine one entire core running a virus scan while another runs Outlook, while 4 cores are being used to play some video game (and perhaps 8 cores are handling the GPU duties).
August 5, 2006 2:43:33 PM

You totally missed the point; synchronizing all those cores increases inefficiency whatever they do. You'll have to control 64 busses, 64 CPUs, 64 of everything; it's not as easy as controlling one big unit. A car with two 150 hp engines cannot go as fast as one with a 300 hp engine, though summing the power gives the same result: you have more FRICTION!
August 5, 2006 2:48:40 PM

Quote:
I can see 16- to 64-core CPUs in the next 6-10 years from Intel.


Honestly, I can't see much past 16, MAYBE 32. But like m25 said... it's going to get to a point where it's just too much, I think. Even though manufacturing processes are getting smaller, I don't think they will be able to effectively fit all of those cores into one manageable package. The chip would be HUGE at 64, don't you think? Maybe I'm wrong, but check out a topic I started yesterday... lots of good info.

http://forumz.tomshardware.com/hardware/Great-Quad-Core...
August 5, 2006 2:55:02 PM

Quote:
You totally missed the point; synchronizing all those cores increases inefficiency whatever they do. You'll have to control 64 busses, 64 CPUs, 64 of everything; it's not as easy as controlling one big unit. A car with two 150 hp engines cannot go as fast as one with a 300 hp engine, though summing the power gives the same result: you have more FRICTION!


I’m telling you how Intel’s marketing people can still pitch the “more cores is better” idea even if programmers only optimize their software for 1 or 2 cores in the future.

As for your bus problem, they will solve it sooner or later.

So imagine each core running one process/application. Each time you start Notepad, one core handles that task until Notepad is closed. Having 32 or 64 cores would be nice; each process or app launched could have one core totally dedicated to it.
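
A minimal sketch of what "one core per process" looks like in practice, here using the Linux sched_setaffinity call (the choice of Linux and of core 0 are illustrative assumptions; Windows offers SetProcessAffinityMask for the same job):

Code:
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

/* Pin the calling process to core 0, so the scheduler dedicates
 * that core to it. Build with: gcc pin.c -o pin */
int main(void)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(0, &set);   /* core 0 is an arbitrary example choice */
    if (sched_setaffinity(0, sizeof(set), &set) != 0) {
        perror("sched_setaffinity");
        return 1;
    }
    printf("pinned to core 0\n");
    return 0;
}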
August 5, 2006 3:02:55 PM

It's perfectly possible even now to make a 32-core CPU; the problem is that it would only perform like 12 or 16 cores. On the other hand, we can design a core with 2X, 4X ... 64X the processing units of a current core and dynamic multithreading, meaning that it activates as many units as it needs. This way we don't waste CPU power synchronizing cores and CPU busses.
August 5, 2006 3:21:38 PM

Quote:
I can see 16- to 64-core CPUs in the next 6-10 years from Intel.


Honestly, I can't see much past 16, MAYBE 32. But like m25 said... it's going to get to a point where it's just too much, I think. Even though manufacturing processes are getting smaller, I don't think they will be able to effectively fit all of those cores into one manageable package. The chip would be HUGE at 64, don't you think? Maybe I'm wrong, but check out a topic I started yesterday... lots of good info.

http://forumz.tomshardware.com/hardware/Great-Quad-Core...
Hmmm, if Intel can pack 4 cores at 65nm, they can probably pack 32 cores at 45nm or 32nm.
August 5, 2006 3:27:01 PM

Quote:
You totally missed the point; synchronizing all those cores increases inefficiency whatever they do. You'll have to control 64 busses, 64 CPUs, 64 of everything; it's not as easy as controlling one big unit. A car with two 150 hp engines cannot go as fast as one with a 300 hp engine, though summing the power gives the same result: you have more FRICTION!

Single cores do indeed make better use of processor real estate than multi-core, but developing complex high-performance single-core CPUs is more expensive than integrating two cores on the same die.
August 5, 2006 3:44:30 PM

Quote:
After more than a year we can surely say that dual core is a must for boosting performance in general. Next year is going to give us quad-cores from Intel and AMD, but how much is it worth going from 2 to 4, 8 ... 32 cores?
A dual-core CPU running a well-threaded app performs at best with a factor of 1.9 (95% efficiency) compared to a single core at the same frequency. A recent benchmark of an Intel quad-core saw that index drop to 3.23 (76% efficiency), mostly due to the extra thread synchronization required. The more cores, the bigger the loss, so do we really need many of them (32, etc.)?
Another way to increase performance is what they did with Conroe: widening data busses and adding more logical units that work in parallel. A Conroe has 3 SSE units compared to only 1 in a K8 CPU; that alone would be enough to explain the performance increase. This is a much more logical way to go: equip a core with more and more logical units. So I doubt we will see 32-core desktops in the next 10 years, or ever. Mother Nature is the greatest designer of all and she only gave us one brain; however, it's got a lot of units and is highly multithreaded, which is why we only need one of it.
So can we go back to a single powerful core in a few years?


Just because Intel has problems with thread synchronization in its multi-core implementation, resulting in decreased efficiency as the number of cores increases, it doesn't mean that the industry will not develop along multi-core lines. In the first place, Intel has yet to develop a CPU truly designed for multi-core support. Its soon-to-be-released quad-cores are just two C2D CPUs made to work together in a single package.

As process technology improves, more transistors can be packed into the silicon. In layman's terms, it is a choice between redesigning CPUs to accommodate more logical units, or designing a more efficient way to have multiple cores work together. We have seen how the first option worked in the CPU's evolution from the 8086 to present-day single-core processors. On the other hand, the second option is what we see in supercomputers, which are basically clusters of CPUs made to work together.

The demand will always be constant: more processing power. The critical factor may prove to be software support. If the software industry supports multi-core processing soon, then we might be seeing the last of the single-core CPUs. If the software industry continues to lag and either Intel or AMD comes up with yet another winning core redesign which improves performance, then single-core may stay.

My guess is that multi-core processing will prevail. We've seen it work in supercomputers; we will soon see it in our desktops. After all, this is what we commonly refer to as miniaturization. Who would have thought back then that ENIAC's processing power, multiplied more than a thousand times, could today sit on one's table with plenty to spare for other accessories?
August 5, 2006 3:46:55 PM

Quote:

Here Casewhite has an argument. I wonder how they get all those huge arrays of CPUs to work together, delivering all those teraflops; peruse this site:

www.top500.org

There are no issues scaling the number of cores.... as AMD once published in their technobabble: thread-level parallelism is the holy grail of computing.

Jack


Wait a moment, this is a different thing. What you are speaking about here are CLUSTERS; the main difference is that in a cluster the main memory is not shared by nodes, unlike in multi-core CPUs. Instead, each node in a cluster is more or less a separate computer which communicates with the other machines via a network connection (Ethernet).

For multi-core multithreading, the real problem is indeed sharing memory and, worse, sharing the CHANGES in memory made by one thread/core with the other cores. That is why it does not scale up well.

Mirek
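
To make Mirek's point concrete, here is a minimal sketch (an editorial illustration, not code from the thread) of the cost of sharing changes in memory between cores: two threads write to counters on the same cache line, so the modified line bounces between cores (false sharing).

Code:
#include <pthread.h>
#include <stdio.h>

/* Two threads bump adjacent counters that likely share one cache
 * line; every write forces the line to migrate between cores.
 * Aligning each counter to its own 64-byte line (the commented
 * alternative) removes the contention and runs far faster. */
#define ITERS 100000000L

static volatile long a, b;  /* adjacent: likely one cache line */
/* static volatile long a __attribute__((aligned(64))),
 *                      b __attribute__((aligned(64)));  // one line each */

static void *bump_a(void *p) { for (long i = 0; i < ITERS; i++) a++; return p; }
static void *bump_b(void *p) { for (long i = 0; i < ITERS; i++) b++; return p; }

int main(void)
{
    pthread_t t1, t2;   /* build with: gcc -O2 -pthread share.c */
    pthread_create(&t1, NULL, bump_a, NULL);
    pthread_create(&t2, NULL, bump_b, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("a=%ld b=%ld\n", a, b);
    return 0;
}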
August 5, 2006 4:29:11 PM

What you are thinking of as inefficient only really happens when you try to split up a single process into several threads. If, like many other people said, you run one process per core, each core can run relatively independently of the others, thereby dramatically reducing the inefficiency. So in the future I can see developers running separate processes for different parts of a game and synchronizing them via one central process, each on individual cores. If Intel builds a computer with a trillion cores, some software company somewhere will find a way to fully utilize its potential. It's like saying "we're never going to use more than 256 kbits of RAM..." :roll: (something Bill Gates did actually say about a decade and a half ago, and many high-end computers today are pushing 8 or even 16 GB of RAM).
August 5, 2006 5:16:14 PM

There will always be something bumping up against a limit; once the cores don't matter, there's going to be something else. That's what's so great about it: never-ending advancements.
August 5, 2006 5:35:59 PM

It's good that Intel and AMD are having a tech and price war. We all benefit from their competition. I'm really thankful to their engineers who push the tech further.

I'd rather pay $500 for a 64-core processor in the near future than pay $500 for only 2 cores. In other words, I'll be very happy to get a better processor for the same money we are willing to pay.

It's great that design engineers find ways to increase the performance of CPUs. It's not so easy to push yourself trying to get ahead of your competitor, and their competition makes for better and cheaper products.

To all Intel, AMD, ATI, NVIDIA, memory and hard-drive makers: keep on pushing the technology to the limit.

We should all be grateful for the efforts of the people in the whole IT business. If not for them we wouldn't have a forum like this wherein we can exchange ideas and points of view and help other people in distress, like that Swedish gal Istaria :wink:
August 5, 2006 5:44:47 PM

It's not really a matter of more cores as it is a matter of what type of cores.

If we are talking about 32 "full" cores like we have today, then yes, it will be very difficult to accomplish given the die sizes that would require. Perhaps your 10-year estimate is correct in that regard.

However, I think in the future we will see a transition from a few full cores to many smaller cores. Take Kiefer:

http://www.tomshardware.com/2006/07/10/project_keifer_3...

My interpretation of that design is that each core is not the full core that we see today. Otherwise I can't see how they are making that thing, with all the L3 caches, IMCs, and interconnects, on a 32nm process.

I believe that each of those cores is increasingly cell-like (kind of like increasingly multithreaded with HT). Now, that means that in order to take full advantage of all 32 cores, software will have to be rewritten. Fine. However, that doesn't mean that current software is useless. This is where Core Multiplexing or RHT steps in. RHT could allow the 4 cores in each node to join together to form a "full core" as we have now. That would allow Kiefer to appear as an 8-core chip to today's software or a 32-core chip to tomorrow's. I could also see HT or some form of 2-thread processing per core in order to yield 64-threaded performance. What this means is a very flexible architecture. Mind you, this isn't likely to come to a desktop near you any time soon, since my interpretation is that Kiefer is an Itanium 2 family chip rather than a Xeon.

In any case, for desktops the multi-core transition will slow down. I read somewhere that the push is to quad cores, and then both AMD and Intel will slow down, releasing octo-cores mainly for the server market. Most applications are still single-threaded, and the "multi-threaded" consumer applications we have right now are generally optimized for dual cores. Besides, in the case of AMD, their 4x4 platform with quad cores would already offer 8 cores, so I can't imagine them pushing too hard beyond that. 8 cores should be more than sufficient for whatever "megatasking" most people do.
August 5, 2006 6:02:38 PM

The car analogy of dual cores to 2 engines is wrong. Dual core is more like 2 cylinders. Given that we are only at the start of the computing technology evolution, most of us lucky enough to have a machine have been running on one-cylinder lawn-mower engines for ages.
August 5, 2006 6:14:22 PM

Quote:
I believe that each of those cores is increasingly cell-like (kind of like increasingly multithreaded with HT). Now, that means that in order to take full advantage of all 32 cores, software will have to be rewritten. Fine. However, that doesn't mean that current software is useless. This is where Core Multiplexing or RHT steps in. RHT could allow the 4 cores in each node to join together to form a "full core" as we have now. That would allow Kiefer to appear as an 8-core chip to today's software or a 32-core chip to tomorrow's. I could also see HT or some form of 2-thread processing per core in order to yield 64-threaded performance. What this means is a very flexible architecture. Mind you, this isn't likely to come to a desktop near you any time soon, since my interpretation is that Kiefer is an Itanium 2 family chip rather than a Xeon.


You seem to be a knowledgeable person. :) 

Do you really think that "current" software will still be used when 32-core µPs come out? Even if such software is still used, will it require something like RHT to work properly? With the number of cores increasing, the power of each core is increasing as well, right?

Why should RHT even be researched? Why can't all the work that RHT is supposed to do be left to the compiler? That is the compiler's job, after all, right?

I thought RHT was just a stop-gap solution to make older software run faster on multi-core machines. Something like a runtime optimizing translator/compiler. But when multi-core designs become common, the compilers will do all the parallelising work themselves, right? Then why involve hardware at execution time?
August 5, 2006 6:29:26 PM

I'm not saying current software as in Office 2003; I'm saying current software as in the current software design philosophy for "full" cores. Basically, future cores are going to be more specific toward certain tasks and cell-like (probably SPE-like), not like the "general purpose" cores we have now. That will require new software to operate properly. However, software written for general-purpose cores can operate on Kiefer, for example, by having the 4 specific cores in a node combine to mimic the features and function of a general-purpose core. The hardware for RHT is there to allow standard software to run with fewer or no problems on "cell-like" processors, while newer software can be written to run natively.
August 5, 2006 6:39:34 PM

Quote:
I'm not saying current software as in Office 2003; I'm saying current software as in the current software design philosophy for "full" cores. Basically, future cores are going to be more specific toward certain tasks and cell-like (probably SPE-like), not like the "general purpose" cores we have now. That will require new software to operate properly. However, software written for general-purpose cores can operate on Kiefer, for example, by having the 4 specific cores in a node combine to mimic the features and function of a general-purpose core. The hardware for RHT is there to allow standard software to run with fewer or no problems on "cell-like" processors, while newer software can be written to run natively.


Ok... Thanks...

So, in your opinion, RHT is only for a transition period, till almost all software is optimized for the new architecture; and after that, it's all compiler, right?
August 5, 2006 6:44:58 PM

Quote:
You totally missed the point; synchronizing all those cores increases inefficiency whatever they do. You'll have to control 64 busses, 64 CPUs, 64 of everything; it's not as easy as controlling one big unit. A car with two 150 hp engines cannot go as fast as one with a 300 hp engine, though summing the power gives the same result: you have more FRICTION!


They do it just fine with the Itanium.
August 5, 2006 6:46:03 PM

Quote:
You totally missed the point; synchronizing all those cores increases inefficiency whatever they do. You'll have to control 64 busses, 64 CPUs, 64 of everything; it's not as easy as controlling one big unit. A car with two 150 hp engines cannot go as fast as one with a 300 hp engine, though summing the power gives the same result: you have more FRICTION!


Step back and think about what you are saying; you are not thinking this completely through.....

Here Casewhite has an argument. I wonder how they get all those huge arrays of CPUs to work together, delivering all those teraflops; peruse this site:

www.top500.org

There are no issues scaling the number of cores.... as AMD once published in their technobabble: thread-level parallelism is the holy grail of computing.

Jack

Maybe I haven't figured out the real problem, but the FRICTION concept remains: better to have one large unit (split into specialized parts according to need) than many small IDENTICAL ones. The only 100% benefit is maybe multitasking, but I doubt anybody is going to work on 32 heavy apps on a desktop.
August 5, 2006 6:49:30 PM

Quote:
You totally missed the point; synchronizing all those cores increases inefficiency whatever they do. You'll have to control 64 busses, 64 CPUs, 64 of everything; it's not as easy as controlling one big unit. A car with two 150 hp engines cannot go as fast as one with a 300 hp engine, though summing the power gives the same result: you have more FRICTION!


They do it just fine with the Itanium.

A server is a heavy multitasker, and that really is the case where multi-core works better (because the cores are not sharing the resources of the same process).
I mentioned it earlier; you can't have somebody working on 32 intensive apps on a desktop.
August 5, 2006 6:49:35 PM

Quote:
You totally missed the point; synchronizing all those cores increases inefficiency whatever they do. You'll have to control 64 busses, 64 CPUs, 64 of everything; it's not as easy as controlling one big unit. A car with two 150 hp engines cannot go as fast as one with a 300 hp engine, though summing the power gives the same result: you have more FRICTION!


Step back and think about what you are saying; you are not thinking this completely through.....

Here Casewhite has an argument. I wonder how they get all those huge arrays of CPUs to work together, delivering all those teraflops; peruse this site:

www.top500.org

There are no issues scaling the number of cores.... as AMD once published in their technobabble: thread-level parallelism is the holy grail of computing.

Jack

Maybe I haven't figured out the real problem, but the FRICTION concept remains: better to have one large unit (split into specialized parts according to need) than many small IDENTICAL ones. The only 100% benefit is maybe multitasking, but I doubt anybody is going to work on 32 heavy apps on a desktop.

Already taken care of.
August 5, 2006 6:56:16 PM

From what we have seen so far, far from all software developers really care about the speed and good optimisation of their programs. One example: Windows Vista :) .
I think that the main problem is the x86 code itself. Instructions are very dependent on previous ones, so running more instructions per clock starts to be difficult. If you put in 20 integer units, 17 of them will spend almost all their time without work.
The same goes for multithreading. Switching and synchronizing threads properly isn't that easy to write. In my opinion, in order for x86 code to be multi-core effective, it must be modified/updated in general.
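
A small sketch of the dependency problem described above (an editorial illustration with an assumed array size): the first loop is one long dependency chain, so extra integer units sit idle; the second keeps four independent chains that an out-of-order core can run in parallel.

Code:
#include <stdio.h>

#define N 1000000   /* assumed divisible by 4 */
static int v[N];

/* One long chain: every add must wait for the previous sum. */
static long sum_chained(void)
{
    long s = 0;
    for (int i = 0; i < N; i++)
        s += v[i];
    return s;
}

/* Four independent accumulators the CPU can execute in parallel. */
static long sum_split(void)
{
    long s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    for (int i = 0; i < N; i += 4) {
        s0 += v[i];
        s1 += v[i + 1];
        s2 += v[i + 2];
        s3 += v[i + 3];
    }
    return s0 + s1 + s2 + s3;
}

int main(void)
{
    for (int i = 0; i < N; i++) v[i] = 1;
    printf("%ld %ld\n", sum_chained(), sum_split());
    return 0;
}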
August 5, 2006 6:56:48 PM

They say Speculative Threading (something like the rumored Reverse Hyper-Threading) will be needed to utilize all the extra cores. Better core design, no extra cores, no need for Speculative Threading.
August 5, 2006 7:04:21 PM

Roughly speaking, it's parallel data transfer and manipulation? Parallel is faster but requires more hardware; serial is cheaper but sucks, right?
Good concept, but it does not apply to current and near-future multi-cores.
August 5, 2006 7:07:42 PM

Quote:
In my opinion, in order for x86 code to be multi-core effective, it must be modified/updated in general.

In order for a revolutionary performance gain you really need to move away from x86 and discard legacy code. That's both a blessing and a curse. That was the philosophy behind Itanium, and that worked out very well.
August 5, 2006 7:11:40 PM

Quote:
In my opinion, in order for x86 code to be multi-core effective, it must be modified/updated in general.

In order for a revolutionary performance gain you really need to move away from x86 and discard legacy code. That's both a blessing and a curse. That was the philosophy behind Itanium, and that worked out very well.
You're right, but we continue to build x86 CPUs, and legacy gets ever stronger while Itanium becomes ever more isolated in its crystal ball.
August 5, 2006 7:24:08 PM

You couldn't have been clearer... So, YOUR opinion? Will desktops go to more than 4 cores in a few years, or will that be server-only technology?

P.S.: It's looking to me like half a stupid question, because we will surely see a wide gamut of specialized sectors within the DESKTOP area.
August 5, 2006 7:35:20 PM

Quote:
In my opinion, in order for x86 code to be multi-core effective, it must be modified/updated in general.

In order for a revolutionary performance gain you really need to move away from x86 and discard legacy code. That's both a blessing and a curse. That was the philosophy behind Itanium, and that worked out very well.

Ahhhh, you tripped over my pet peeve :)  --- Itanium (IA-64) has built-in parallelization instructions that work very well -- I will dig out the references at some point if strongly challenged. IA-64 was also a new instruction set built from the ground up, and if employed correctly it can be quite powerful.

Alas, a complete discarding of the x86 IA was blocked completely when AMD decided to extend it to 64-bit; otherwise the natural break to revamp and modernize the x86 platform would logically have been at the 32-bit to 64-bit transition --- my opinion: we are stuck with x86 for quite a while longer.

Jack

Actually, I am not sure that IA-64 works "very well"... One would expect spectacular performance gains, but they are nowhere to be seen. The best that can be said about IA-64 is that it is competitive with other, classical ISA architectures.

So much for "revolutionary" approaches.

Mirek
August 5, 2006 7:53:51 PM

The thing I love about mankind is that regardless of the limits put in front of it... it always seems to work something out, eventually.

As of now, the technology we have could not support 32 cores, but 10 years is a long time in today's world, and breakthroughs could be made.

You never know; 2 years down the road some genius out of Podunk might have a better understanding of computer architecture than the geniuses of today. Or maybe Intel/AMD will get lucky with some great R&D staff recruits.

The sky is the limit, I'd say, and even then... we've been to space! (Another thing people said could never be done at one point.)
August 5, 2006 7:54:51 PM

Quote:

The ultimate point is, we are stuck with a legacy IA for quite some time, and this is because AMD gave the industry the 'easy' way out by extending the x86 instructions from 32 to 64 bits. This did not happen at the 16-to-32 transition; a new instruction set was introduced there, as it is the natural 'breaking point' to do so.


Wrong. The x86 16->32 transition was much less revolutionary than the current 32->64 one.

Mirek
August 5, 2006 8:11:16 PM

Quote:

The ultimate point is, we are stuck with a legacy IA for quite some time, and this is because AMD gave the industry the 'easy' way out by extending the x86 instructions from 32 to 64 bits. This did not happen at the 16-to-32 transition; a new instruction set was introduced there, as it is the natural 'breaking point' to do so.


Wrong. The x86 16->32 transition was much less revolutionary than the current 32->64 one.

Mirek

Incorrect; I did not say revolutionary or whatnot. I simply stated that the instruction set changed and was not an extension --- 16-bit code could not run natively as such.... The 32-->64-bit transition, as you put it, actually locks us to legacy instructions; that is not revolutionary.

Incorrect. Current CPUs can still run legacy real mode, protected 16-bit mode, legacy protected 32-bit mode, and AMD64/EM64T mode.

That is no different from the 16/32-bit transition. A new CPU mode and a new ISA were added, just like back then (but I guess you cannot remember the 286 :) ).

You cannot run 32-bit code in 64-bit mode just as you cannot run 16-bit code in 32-bit mode. (Meanwhile, you CAN run 16-bit code on a 32-bit OS just like you can run 32-bit AND 16-bit code on a 64-bit OS - but that is a different thing.)

As for "legacy", 16-bit ISA is more similar to 32-bit ISA than is 32-bit ISA to 64-bit ISA.

Speaking of which, there were in fact TWO 16-bit ISAs: the original 8086 ("real mode") and the 80286 ("16-bit protected mode"). Then came the 80386, which added 16 bits to each of the registers, improved some addressing modes, and added virtual memory support; that is the current 32-bit architecture.

Mirek
August 5, 2006 8:13:35 PM

I think before we get to something ridiculous like 128 cores, we'll move to some completely new thing like quantum computing and such. Or, since they probably won't have figured out everything about quantum computing in 10-15 years, quantum-aided computing, where there's a quantum processor and a physical processor, each doing what it does best.
August 5, 2006 8:22:52 PM

Quote:

Incorrect, I did not say revolutionary or whatnot; I simply stated that the instruction set changed (actually expanded) and is not an extension --- 16-bit code could not run natively as such.... it was interpreted into 32-bit, hence slower-performing. The 32-->64-bit transition, as you put it, actually locks us to legacy instructions, which were nothing more than expanding registers to 64 bits; that is not revolutionary. It was a good engineering/design implementation, but it was a marketing ploy from the word go.


I see you have corrected your message...

Nope, that is bullshit. 16-bit code was NEVER interpreted and still is not, even on current CPUs (where the hell did you hear that?).

64 bits is much more than just expanding registers. If it were just expanding the registers by 32 bits, then AMD64 would perform slower in 64-bit mode, as all RISC architectures that have gone through such a transition do (I mean, keeping the ISA and just extending the registers). (Surprise? Yes, 64-bit mode is SLOWER on any other dual-mode ISA, because 64-bit mode requires twice as much data to store pointers and therefore puts much higher demands on the cache and memory subsystem.)

There are two things that make AMD64 faster, and neither of them has anything to do with 64 bits:

- 8 additional integer registers. That is what improves integer performance.

- Floating-point operations are exclusively handled by scalar SSE2 instructions. While this is of course in theory possible with any SSE2-capable CPU, most software compiled for 32 bits stays away from doing so for compatibility reasons. For this, AMD64 brings a nice clean cut ;) 

Mirek
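
Mirek's second point can be seen directly from a compiler (a sketch assuming GCC; the file name f.c is arbitrary): the same scalar double math compiles to x87 stack code in 32-bit mode and to scalar SSE2 code in 64-bit mode, because the x86-64 ABI passes and computes floating point in XMM registers.

Code:
/* gcc -m32 -S f.c  ->  x87 stack instructions (fld/fmul/fadd)
 * gcc -m64 -S f.c  ->  scalar SSE2 instructions (mulsd/addsd) */
double f(double x, double y)
{
    return x * y + y;
}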
August 5, 2006 9:32:18 PM

That kind of reminds me of people saying 640K or 16MB or x amount of memory is enough....... it is never enough. If we think 4 cores is enough, 4 years from now we will be proven wrong.
August 5, 2006 9:39:24 PM

Quote:
You totally missed the point; synchronizing all those cores increases inefficiency whatever they do. You'll have to control 64 busses, 64 CPUs, 64 of everything; it's not as easy as controlling one big unit.


You should look at how it's done on Cell.

August 5, 2006 10:00:19 PM

Yeah, but I like diagrams. :)  It's interesting to note that they use a ring bus, so they have fixed latency between the cores, whilst in the x86 world we're using switches, and it looks like Intel's CSI also uses a switch. It'll be interesting to see which is more scalable, as we're only using 2-4 cores rather than up to 9.

I'll write about it later on; been up all night drinking, so yeah.
August 5, 2006 10:50:17 PM

From what I heard from my professor (he has been doing research on thermal design for CPUs), it is hard to control parts of the CPU if they just try to make a single core faster and faster. To gain more clock speed, the general idea is to add more pipeline stages or change the layout on the chip. Well, we saw that NetBurst had more pipeline depth and faster clock speeds, but overall it was slower.

I think one advantage that Intel now has is that their chips are on 65nm, which makes them use less power, clock faster, and cost less overall. That's why they can add more cores. I do believe that if AMD gets its hands on 65nm or 45nm technology with the same architecture design, it may actually beat the Core 2 Duo.
August 5, 2006 11:46:48 PM

Quote:
Nope, that is bullshit. 16-bit code was NEVER interpreted and still is not, even on current CPUs (where the hell did you hear that?).

I beg to differ; all x86-32 and x86-64 IAs have backwards compatibility for x86-16, and x86-8 for that matter. They decode, execute, and retire somewhat the same, just depending on the operating mode.

Some highlighted information you should read:
Source 1. I know it's Wiki, but I don't really feel obligated to go to Intel's site and pull up their PDFs.
Quote:
...The x86 general purpose registers are not really as general purpose as their name implies... The x86 general purpose registers further subdivide into registers specializing in data and others specializing in addressing...

Quote:
...with the advent of the 64-bit extensions to x86 in AMD64... General purpose registers are now truly general purpose and they can be used interchangeably. This does not affect the 32-bit architecture, however...

Quote:
...8-bit and 16-bit subsets of these registers are also accessible. For example, the lower 16-bits of the 32-bit EAX registers can be accessed by calling it the AX register. Some of the 16-bit registers can be further subdivided into 8-bit subsets too; for example, the upper 8-bit half of AX is called AH, and the lower half is called AL. Similarly, EBX is subdivided into BX (16-bit), which in turn is divided into BH and BL (8-bit)...

Quote:
...All of the four following registers may be used as general purpose registers. However each has some specialized purpose as well. Each of these registers also have 16-bit or 8-bit subset names.

EAX (At 000) Dedicated accumulator which is used for all major calculations.
ECX (At 001) The universal loop counter which has a special interpretation for loops.
EDX (At 010) The data register, which is an extension to the accumulator, stores data relevant to the operation applied to the accumulator.
EBX (At 011) Currently used for free storage but was originally used as a pointer in 16-bit mode...

Quote:
...Used only for address pointing. They have 16-bit subset names, but no 8-bit subsets.

ESP (At 100) Stack pointer. Is used to hold the top address of the stack.
EBP (At 101) Base pointer. Is used to hold the address of the current stack frame. It is also sometimes used as free storage.
ESI (At 110) Source index. Commonly used for string operations. It has a one-byte opcode for loading data from memory to the accumulator.
EDI (At 111) Destination index. Commonly used for string operations. Has a one-byte STOS instruction to write data out of the accumulator.
EIP Instruction pointer. Holds the current instruction address...

Quote:
64 bits is much more than just expanding registers. If it were just expanding the registers by 32 bits, then AMD64 would perform slower in 64-bit mode, as all RISC architectures that have gone through such a transition do (I mean, keeping the ISA and just extending the registers). (Surprise? Yes, 64-bit mode is SLOWER on any other dual-mode ISA, because 64-bit mode requires twice as much data to store pointers and therefore puts much higher demands on the cache and memory subsystem.)

Excuse me; do you even understand what you are saying? This is what x86-64 (AMD64) is:

Source 2; again, I know Wiki sucks.

Quote:
...All general-purpose registers (GPRs) are expanded from 32 to 64 bits, and all arithmetic and logical operations, memory-to-register and register-to-memory operations, etc., are now directly supported for 64-bit integers...

Quote:
...In addition to increasing the size of the general-purpose registers, their number is increased from eight (i.e. eax,ebx,ecx,edx,ebp,esp,esi,edi) in x86-32 to 16..

Quote:
...Similarly, the number of 128-bit XMM registers (used for Streaming SIMD instructions) is also increased from 8 to 16...

Quote:
...A number of "system programming" features of the x86 architecture are not used in modern operating systems and are not available on AMD64 in long (64-bit and compatibility) mode. These include segmented addressing (although the FS and GS segments remain in vestigial form, for compatibility with Windows code), the task state switch mechanism, and Virtual-8086 mode. These features do of course remain fully implemented in "legacy mode," thus permitting these processors to run 32-bit and 16-bit operating systems without modification...

Quote:
...long mode...The intended primary mode of operation of the architecture; it is a combination of the processor's native 64-bit mode and a 32-bit/16-bit compatibility mode. It is used by 64-bit operating systems. Under a 64-bit operating system, 64-bit, 32-bit and 16-bit (or 80286) protected mode applications may be supported...

Quote:
...legacy mode...The mode used by 16-bit (protected mode or real mode) and 32-bit operating systems. In this mode, the processor acts just like an x86 processor, and only 16-bit or 32-bit code can be executed. 64-bit programs will not run...

Quote:
There are two things that make AMD64 faster, and neither of them has anything to do with 64 bits:

- 8 additional integer registers. That is what improves integer performance.

- Floating-point operations are exclusively handled by scalar SSE2 instructions. While this is of course in theory possible with any SSE2-capable CPU, most software compiled for 32 bits stays away from doing so for compatibility reasons. For this, AMD64 brings a nice clean cut

Those registers are not accessible unless in 64-bit mode, which may very well change in the future, but right now, at this very moment, in 64-bit mode the K8 has access to those registers.

As for the SSE statement, you are partially right. While it is true SSE is becoming more predominant in code, that is only because it does more accurate math and is faster than x87 at floating-point operations.

But in 64-bit mode, x87 stacks are not permitted by any 64-bit operating system. The reason is that these systems are not required to preserve the x87 stack across interrupts and context switches. But in 32-bit mode a programmer can safely use either.

Additionally, for the record, the SSE engine is what handles the instructions, not SSE2. SSE2/3/4 are additional instructions/algorithms to help SSE deal with much more complex and robust math.
August 5, 2006 11:49:24 PM

Quote:
You totally missed the point; synchronizing all those cores increases inefficiency whatever they do. You'll have to control 64 busses, 64 CPUs, 64 of everything; it's not as easy as controlling one big unit. A car with two 150 hp engines cannot go as fast as one with a 300 hp engine, though summing the power gives the same result: you have more FRICTION!


They do it just fine with the Itanium.

A server is a heavy multitasker, and that really is the case where multi-core works better (because the cores are not sharing the resources of the same process).
I mentioned it earlier; you can't have somebody working on 32 intensive apps on a desktop.

Why do you need to be working on 32 tasks? Why not 4 or 8 tasks?
August 5, 2006 11:51:17 PM

Quote:
You totally missed the point; synchronizing all those cores increases inefficiency whatever they do. You'll have to control 64 busses, 64 CPUs, 64 of everything; it's not as easy as controlling one big unit. A car with two 150 hp engines cannot go as fast as one with a 300 hp engine, though summing the power gives the same result: you have more FRICTION!


They do it just fine with the Itanium.

A server is a heavy multitasker, and that really is the case where multi-core works better (because the cores are not sharing the resources of the same process).
I mentioned it earlier; you can't have somebody working on 32 intensive apps on a desktop.

Why do you need to be working on 32 tasks? Why not 4 or 8 tasks?

word
August 5, 2006 11:51:45 PM

Or even 1 task like a game.
August 5, 2006 11:52:02 PM

Quote:
You totally missed the point; synchronizing all those cores increases inefficiency whatever they do. You'll have to control 64 busses, 64 CPUs, 64 of everything; it's not as easy as controlling one big unit. A car with two 150 hp engines cannot go as fast as one with a 300 hp engine, though summing the power gives the same result: you have more FRICTION!


They do it just fine with the Itanium.

A server is a heavy multitasker, and that really is the case where multi-core works better (because the cores are not sharing the resources of the same process).
I mentioned it earlier; you can't have somebody working on 32 intensive apps on a desktop.

Why do you need to be working on 32 tasks? Why not 4 or 8 tasks?

word

Word.
Hmm Sexy Lady!
August 5, 2006 11:55:41 PM

Quote:
Nope, that is bullshit. 16-bit code was NEVER interpreted and still is not, even on current CPUs (where the hell did you hear that?).

I beg to differ; all x86-32 and x86-64 IAs have backwards compatibility for x86-16, and x86-8 for that matter. They decode, execute, and retire somewhat the same, just depending on the operating mode.

Some highlighted information you should read:
Source 1. I know it's Wiki, but I don't really feel obligated to go to Intel's site and pull up their PDFs.
Quote:
...The x86 general purpose registers are not really as general purpose as their name implies... The x86 general purpose registers further subdivide into registers specializing in data and others specializing in addressing...

Quote:
...with the advent of the 64-bit extensions to x86 in AMD64... General purpose registers are now truly general purpose and they can be used interchangeably. This does not affect the 32-bit architecture, however...

Quote:
...8-bit and 16-bit subsets of these registers are also accessible. For example, the lower 16-bits of the 32-bit EAX registers can be accessed by calling it the AX register. Some of the 16-bit registers can be further subdivided into 8-bit subsets too; for example, the upper 8-bit half of AX is called AH, and the lower half is called AL. Similarly, EBX is subdivided into BX (16-bit), which in turn is divided into BH and BL (8-bit)...

Quote:
...All of the four following registers may be used as general purpose registers. However each has some specialized purpose as well. Each of these registers also have 16-bit or 8-bit subset names.

EAX (At 000) Dedicated accumulator which is used for all major calculations.
ECX (At 001) The universal loop counter which has a special interpretation for loops.
EDX (At 010) The data register, which is an extension to the accumulator, stores data relevant to the operation applied to the accumulator.
EBX (At 011) Currently used for free storage but was originally used as a pointer in 16-bit mode...

Quote:
...Used only for address pointing. They have 16-bit subset names, but no 8-bit subsets.

ESP (At 100) Stack pointer. Is used to hold the top address of the stack.
EBP (At 101) Base pointer. Is used to hold the address of the current stack frame. It is also sometimes used as free storage.
ESI (At 110) Source index. Commonly used for string operations. It has a one-byte opcode for loading data from memory to the accumulator.
EDI (At 111) Destination index. Commonly used for string operations. Has a one-byte STOS instruction to write data out of the accumulator.
EIP Instruction pointer. Holds the current instruction address...

Quote:
64 bits is much more than just expanding registers. If it were just expanding the registers by 32 bits, then AMD64 would perform slower in 64-bit mode, as all RISC architectures that have gone through such a transition do (I mean, keeping the ISA and just extending the registers). (Surprise? Yes, 64-bit mode is SLOWER on any other dual-mode ISA, because 64-bit mode requires twice as much data to store pointers and therefore puts much higher demands on the cache and memory subsystem.)

Excuse me; do you even understand what you are saying? This is what x86-64 (AMD64) is:

Source 2; again, I know Wiki sucks.

Quote:
...All general-purpose registers (GPRs) are expanded from 32 to 64 bits, and all arithmetic and logical operations, memory-to-register and register-to-memory operations, etc., are now directly supported for 64-bit integers...

Quote:
...In addition to increasing the size of the general-purpose registers, their number is increased from eight (i.e. eax,ebx,ecx,edx,ebp,esp,esi,edi) in x86-32 to 16..

Quote:
...Similarly, the number of 128-bit XMM registers (used for Streaming SIMD instructions) is also increased from 8 to 16...

Quote:
...A number of "system programming" features of the x86 architecture are not used in modern operating systems and are not available on AMD64 in long (64-bit and compatibility) mode. These include segmented addressing (although the FS and GS segments remain in vestigial form, for compatibility with Windows code), the task state switch mechanism, and Virtual-8086 mode. These features do of course remain fully implemented in "legacy mode," thus permitting these processors to run 32-bit and 16-bit operating systems without modification...

Quote:
...long mode...The intended primary mode of operation of the architecture; it is a combination of the processor's native 64-bit mode and a 32-bit/16-bit compatibility mode. It is used by 64-bit operating systems. Under a 64-bit operating system, 64-bit, 32-bit and 16-bit (or 80286) protected mode applications may be supported...

Quote:
...legacy mode...The mode used by 16-bit (protected mode or real mode) and 32-bit operating systems. In this mode, the processor acts just like an x86 processor, and only 16-bit or 32-bit code can be executed. 64-bit programs will not run...

Quote:
There are two things that make AMD64 faster, and neither of them has anything to do with 64 bits:

- 8 additional integer registers. That is what improves integer performance.

- Floating-point operations are exclusively handled by scalar SSE2 instructions. While this is of course in theory possible with any SSE2-capable CPU, most software compiled for 32 bits stays away from doing so for compatibility reasons. For this, AMD64 brings a nice clean cut

Those registers are not accessed unless in 64-bit mode, which may very well change in the future, but right now, at this very moment, AMD64 is faster because in 64-bit mode it has access to those registers.

As for the SSE statement, you are partially right. It is true SSE does more, and it is becoming more predominant in code because it does much more accurate math and is faster than x87.

But that is only true in 64-bit mode, because in 64-bit mode x87 stacks are not permitted by 64-bit operating systems, which are not required to preserve the x87 stack across interrupts and context switches. But in 32-bit mode that is not 100% the case, so you have a half-truth. Additionally, for the record, SSE is what handles the instructions, not SSE2; SSE2/3/4 are additional instructions/algorithms to help SSE deal with much more complex math.

Damn spud,
I didn't know you had it in you.
Hats off, and I second that.
Word
August 6, 2006 12:20:36 AM

Quote:
Darn spud --- you saved me a lot of time --- but I was incorrect on the 16->32 bit transition. I had read up on this a few years back and thought I recalled that the instruction set was completely revamped; alas, it was simply extended, similar to the 64-bit extension, albeit differently.

Just wanted to correct the record: 16-bit code ran well enough under protected mode --- what I recall is that IA-32, when it was released, ran 32-bit code slower than 16-bit code at the time.

Jack


You weren't incorrect; you just lacked correct information. But it doesn't matter, because in my eyes and those of many other forum members you are the holy grail of knowledge and can't be declared wrong, because, well, you're a techno holy man to me.
August 6, 2006 12:24:21 AM

From the things I've read... I'd take Jack's knowledge over the Inquirer any day. :D 
August 6, 2006 12:26:04 AM

Quote:
From the things I've read... I'd take Jack's knowledge over the Inquirer any day. :D 


Word.
August 6, 2006 12:29:07 AM

I don't know. All I know is that I am building a new PC for the first time and I need some suggestions from people on what I should go with... starting with processors and motherboards. I am thinking AMD just because I think all the games I played on AMD looked better than on Intel, but that's just me...
ANY SUGGESTIONS?
August 6, 2006 12:32:41 AM

Quote:
Darn spud --- you saved me a lot of time --- but I was incorrect on the 16->32 bit transition. I had read up on this a few years back and thought I recalled that the instruction set was completely revamped; alas, it was simply extended, similar to the 64-bit extension, albeit differently.

Just wanted to correct the record: 16-bit code ran well enough under protected mode --- what I recall is that IA-32, when it was released, ran 32-bit code slower than 16-bit code at the time.

Jack


You weren't incorrect; you just lacked correct information. But it doesn't matter, because in my eyes and those of many other forum members you are the holy grail of knowledge and can't be declared wrong, because, well, you're a techno holy man to me.

Thanks, Spud; that is indeed a humbling compliment. But on the flip side, I cannot claim to be objective and fair-minded if I cannot admit an error once in a while --- I am happy when people hold me accountable and I need to retract; the lesson learned is stronger.... I just wish he would have done it a little more politely... but, eh... I can be gruff too.

Jack

I agree; talking to you in that manner doesn't win a Word from me. Moo is more likely.
August 6, 2006 1:36:49 AM

Quote:
Or even 1 task like a game.


By the way, I love the new avatar!