
Just where is Intel ahead?

January 17, 2007 1:58:47 AM

Quote:
I have spent the last 35 years as an electrical engineer, the last 13 before retirement at Sandia National Labs, and based on the testing done there you can confine Intel's superiority to single-socket processors with 32-bit operations.

"“While not number one in speed, in terms of scalability, Red Storm is the best in the world,” says Bill Camp, director of Sandia’s Computation, Computers, Information, and Math center.

Scalability refers to a supercomputer’s computational efficiency as the number of processors on a job is increased. “You want to use more processors to get large jobs done more quickly,” says Camp, “but if the computer doesn’t scale well you can lose much of that speedup.” Red Storm loses little efficiency on large numbers of processors." http://www.physorg.com/news82830306.html

Here are the relevant benchmarks for 64-bit. The cutoff date was Nov 6, 2006, and 31 Intel entries were made, including 1 QFX6700. http://www.hpcchallenge.org/ Based on Sandia's experience with the Intel Thunderbird design, Intel is 3-4 years behind AMD in scalability and 64-bit design. Broomfield is the first chip in Intel's roadmap that will begin to offer the scalability needed in the future. The situation is such that DARPA left Intel out completely in the second round of the Petascale computer competition. The competition was narrowed to IBM, Sun and Cray/AMD. IBM and Cray/AMD were the ultimate winners. http://www.theregister.co.uk/2006/11/21/darpa_petascale...

This story explains what will be necessary for performance leads in the future, and CPU speed isn't one of them.
"“The interconnect is important because a lot of the applications rely on low latency and high bandwidth,” said Bill Kramer, NERSC general manager. “We run highly parallel applications. One application may make use of 50 or 100 individual nodes.”

Because high-performance computing applications are increasingly spread out over so many processors, how fast they perform comes down in large part to how fast individual nodes can communicate with one another." http://www.gcn.com/print/25_5/40021-1.html
You may want to broaden your reading horizons to publications like physorg.com, gcn.com, hpcwire.com, linuxelectron.com, and some of the engineering-based websites, and not the amateur sites.

Dear God, someone just randomly PMed me this.. :p 


January 17, 2007 2:09:18 AM


Looks like someone is trying to show their internet intelligence or something.

I didn't know Intel had a Thunderbird named CPU. I thought AMD had a Thunderbird named CPU. Maybe I'm wrong.

To me, it looks like someone just cut and pasted something and sent it to you, in hopes of proving their "smarts" to you. I guess.

If someone can't simplify it to layman's terms, they are just quoting something they heard, in my opinion. I mean, if someone asked me how my tool works, I won't go into great detail of every process that happens, but I will try to make it easy for the person to understand. Not so much dumb it down, but make it more easily digestible. If they ask for more detail, then I will get more detailed, but why do that right from the start? I can cut and paste the process to someone, but without actually breaking it down, it's just a case of "Look what I found, and I'll use it to show my big brain".

Of course, this is just my opinion...I would hate to get something like that in a PM, though.

LOL :lol: 
January 17, 2007 2:14:44 AM


Ignore that moron.
January 17, 2007 2:19:34 AM

Has anyone else gotten the same copy and paste job..or was this one made specifically for me? 8)
January 17, 2007 2:20:25 AM

Quote:
Has anyone else gotten the same copy and paste job..or was this one made specifically for me? 8)


I never got one, thank goodness. 8O
January 17, 2007 3:14:18 AM

That man is a loon. :roll:
January 17, 2007 3:43:48 AM

This is a little off topic, but can someone explain why Intel needs to make both a Conroe and an Itanium? If the Itanium is better (and costs a lot more), let that be the high-end server/workstation design, and desktops will get smaller, older versions later.
If the Conroe is better, just kill the Itanium.
There is a lot of x86 software already; just make an emulator, like Mac OS X running OS 9 stuff. The newer chip SHOULD be so much faster that you won't notice the emulator lag.
Anyway, I've been wondering about this for years.
January 17, 2007 3:50:23 AM

You CANNOT compare the two.

Itanium isn't for desktop PCs.
January 17, 2007 3:55:53 AM

It's his modus operandi. He can't push his crap in the public forum because it gets shredded in minutes, so he PMs people. That way he doesn't have to prove himself or suffer being publicly disproven, and he thinks it makes him look knowledgeable.
January 17, 2007 4:13:34 AM

OK,
So in my simple vocabulary, the Itanium is a better IDEA than Conroe.
AMD x64 won because it was WAY cheaper. Cheap and simple won over better.
Like you said, the radically different compilers didn't catch on.
Strange, I thought the Pentium Pro should have been the LAST x86 chip.
January 17, 2007 4:30:10 AM

Itanium is (and I may be the only person who believes this) actually the future of processing.

Let me qualify that by saying that conceptually, Itanium is the future; the current implementation of the concept is, um, ah, ah... a work in progress.

Itanium uses something called EPIC - explicitly parallel instruction computing - which in the old days we called VLIW - very long instruction word.

Unlike NetBurst, Athlon, P4, etc., which try to extract parallelism at runtime, EPIC, because it has a long instruction word, actually submits at least 3, sometimes more, inherently parallel instructions to the CPU for simultaneous execution.

In a modern CPU roughly half the logic circuitry (excluding cache) is actually used to re-order, reschedule, and otherwise bend, fold, spindle and mutilate the data to try to extract more than one instruction per clock cycle - in EPIC this is done prior to runtime, in the compiler. In effect, since it happens before runtime, it happens, from an execution point of view, at infinite speed.

On code specifically written and compiled for Itanium (OK, there are actually only 147 fully debugged lines in the known universe, but leaving that aside), the Itanium is actually amazingly fast.

At some point CPUs will hit a wall in terms of how fast a single thread can go. At that point you must go wider to go faster. Itanium and EPIC (or something conceptually similar), by virtue of doing all the heavy scheduling work at compile time rather than run time, has to be, at some point, the future.
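
To make that concrete, here is a rough C sketch (purely illustrative; the function names are made up and this is not real IA-64 code). The first loop has three independent operations per iteration that an EPIC/VLIW compiler could pack into one wide bundle; the second is a serial dependency chain that no amount of compile-time bundling can speed up.

#include <stddef.h>

/* Independent work: the updates to a[i], b[i] and c[i] don't depend on
 * each other, so a VLIW/EPIC compiler can schedule all three into one
 * wide instruction bundle and issue them in the same cycle. */
void scale_three_arrays(float *a, float *b, float *c, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        a[i] = a[i] * 2.0f;
        b[i] = b[i] + 1.0f;
        c[i] = c[i] - 3.0f;
    }
}

/* Dependent chain: every add needs the result of the previous add, so
 * there is nothing for the compiler to bundle - the wide issue slots
 * would sit empty and the loop runs at one add per step. */
float sum_array(const float *a, size_t n)
{
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++)
        sum = sum + a[i];       /* each iteration waits on the last */
    return sum;
}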
January 17, 2007 8:26:27 AM

He is a sick moron!
I've received similar BS, tried to explain once and then ignored him!
Quote:
I distinctly remember some K6-2 500+'s (and K6-2 550+, 600+ and 650+) that would kick a P3 650 around, along with any P2 ever made. I also remember Pentium Pro 233's that would kick Pentium 3 800's around on Win 2k. The Pentium Pro 333 overdrive was not matched until the P4 came around. I was certified as a relay engineer by IEEE in 1975. Obviously you do not understand the design use of hyperthreading. In two years' time I predict that there will be no single-core processors left in the mainstream market; everything will be at least dual core and more likely quad core. That is where reverse hyperthreading makes sense. What AMD's design statements say is that there will be no more single cores in the future. Just so you know, I was engineering manager for a 9,800-processor Pentium Pro supercomputer.


My first job was as a hardware servicer, at the age of 14 in 1996, at Infoproject Computers, 1000 Skopje, R. Macedonia (ex SFR Yugoslavia). Since then I have worked on and tested more than 2000 computer configurations. I have had different working experiences, as a network administrator, hardware servicer, programmer and others not related to computers. I am 24 now and I am at the end of my studies in power engineering. When Yugoslavia was divided I went to the army to defend my country, thereby wasting 2 years of my life.
Before being so sure about something, try to find some benchmarks. It looks like you never tested the K6 against the Pentium 2/3. I have my conclusions thanks to the benchmarks we've made. This is a nice article from Anand, it might help: http://www.anandtech.com/showdoc.aspx?i=1029&p=1
Obviously you do not know what I understand and what I know.
I understand exactly what HT means, and why it is useful for one architecture and useless for another.
Parallel computing was a subject that was current more than 30 years ago, and then abandoned for a while. It was more important to develop fast single-core chips, simply because the apps were single-threaded and were starving from the low performance of one core.
What I was talking about is the same thing the scientists concluded in that period: you can't boost the performance of a single thread with parallel processing.
Because there is an order in doing the jobs, one job can't be done before another.
Parallel computing is useful and implemented today because we are in the era of multimedia. Multimedia is becoming more general purpose, and general purpose is becoming more multimedia. That means a lot of independent data that can be processed independently and in parallel. So we find multi-core chips useful today.
Parallel processing does not mean reverse HT. I am not sure what you think, but I know that if you split a linearly dependent thread to be done in parallel you will get worse performance than if you don't, due to the per-core calculation time that remains the same, the logic needed for splitting and synthesis, bus latencies, etc.
These 2 or 3 years are the period of transition from single-threaded to multithreaded OSes and apps; I mentioned this in my post. Also, there will still be single-core chips for cheap low-end office systems. See the roadmaps from both AMD & Intel.
I guess my name didn't give you a clue. Look up Sandia National Laboratory. I supervised the construction of ASCI Red, 9,800 Pentium Pros and 256 terabytes of memory. You might follow up with Red Storm and ASCI Purple. The K6-2 you are quoting the benchmarks for is not the same K6-2+ that I am talking about. The K6-2 550+ had 128 of L1 cache and 128 of L2 cache and was originally a laptop-only processor on the 180nm die. When AMD released the Athlon they were phased out. The K6-2+ is not even closely related to the K6-2 or K6-3 you are thinking of or AnandTech benchmarked. The standard K6-2 is a 250nm die processor. I was certified by the IEEE (the Institute of Electrical and Electronic Engineers) as a professional relay engineer in 1975. I have a bachelor's degree in EE from Rice University and a master's degree from the California Institute of Technology. Both are top-ten universities in engineering in the United States. We used to have an ongoing game at Sandia about when someone would release a processor that would outperform the Pentium Pro 233. That didn't happen until the AMD Athlon Thunderbird 750 came out with 256k of full-speed L2 cache. Pentium Pro 233's had either 1MB or 2MB of L2 cache.
January 17, 2007 10:24:36 AM

Quote:
Itanium is (and I may be the only person who believes this) actually the future of processing.


Itanium is history. Everybody who knows a dime about modern CPU architecture knows that pretty well. The VLIW approach does not scale for general-purpose computing. There are some small areas like signal processing where VLIW is able to provide a cost advantage, but for a high-performance server or HPC CPU it is not a good idea.

Quote:

At that point you must go wider to go faster. Itanium and EPIC (or something conceptually similar), by virtue of doing all the heavy scheduling work at compile time rather than run time, has to be, at some point, the future.


Yes, that is a naive understanding and basic ignorance of modern CPU architecture and its bottlenecks.

The real problem now is not extracting parallelism from the instruction stream; that has been solved for decades. The problem is to perform as many operations as possible in parallel, and that is where in-order cores like EPIC simply suck.

Out-of-order execution is the present and future of CPU design. EPIC is history.

Mirek
January 17, 2007 11:00:33 AM

Jumping Jack, I was a little interested when I first saw the top of this thread, so I went and read through your link.

I have gotten a dozen or so... it prompted me to call him out:

http://forumz.tomshardware.com/hardware/modules.php?nam...

"Memo to self don't argue with Jumping Jack or if you do forget anything Wiki related"
January 17, 2007 12:59:20 PM


You might want to make another "memo to self".

Link links properly ;)  :lol: 
January 17, 2007 1:18:15 PM

Check



"Sinks into own puddle of noobness"
January 17, 2007 1:32:45 PM

Conceptually, my take on VLIW is that it is something like programming the microcode of a CPU directly. In today's world, the x86 instruction set is broken down into smaller instructions that are then executed. Basically, the CPU has to do a lot of extra work for each instruction. This can be tough to optimize through silicon.

VLIW attempts to simplify the CPU design, allowing it to execute more things in parallel and at higher speeds. The pipelines can be shorter and simpler (less of a need to break down the instructions) and there can be many more of them. The downside is that the burden of optimization falls on the compiler writers and, to a lesser degree, on the programmers themselves.

So the argument is, do you make the CPU do all the hard work, or the compiler? Emulation performance and compatibility are arguably the make-or-break pieces of technology that will determine which method wins in the long run. If it can't run the trillions of lines of code already written out there at least as well as the hardware we have now, then what is the point?
January 17, 2007 2:08:40 PM

Intel Thunderbird heh....

From a very unreliable source,
I heard AMD is going to resurrect the Northwood design!
January 17, 2007 4:12:54 PM

Quote:
Conceptually, my take on VLIW is that it is something like programming the microcode of a CPU directly. In today's world, the x86 instruction set is broken down into smaller instructions that are then executed.


Not only x86. Any modern high-performance OOO CPU has to do that.

Quote:

Basically, the CPU has to do a lot of extra work for each instruction. This can be tough to optimize through silicon.


Yes, OOO execution needs more work per instruction. But it also yields much better performance.

Quote:

The downside is that the burden of optimization falls on the compiler writers and, to a lesser degree, on the programmers themselves.


Well, that is a half-truth. The real trouble is that the compiler or programmer is able to optimize only for specific data.

The optimal way of execution strongly depends on the data processed. EPIC is in trouble as it has only a single fixed execution path.

Quote:

If it can't run the trillions of lines of code already written out there at least as well as the hardware we have now, then what is the point?


This argument is incorrect. Trillions of lines are not written in assembler. The whole truth is that the EPIC concept fails as a general-purpose CPU.
January 17, 2007 4:40:07 PM

Quote:


The downside is that the burden of optimization falls on the compiler writers and, to a lesser degree, on the programmers themselves.


Well, that is a half-truth. The real trouble is that the compiler or programmer is able to optimize only for specific data.

The optimal way of execution strongly depends on the data processed. EPIC is in trouble as it has only a single fixed execution path.

Quote:

If it can't run the trillions of lines of code already written out there at least as well as the hardware we have now, then what is the point?


This argument is incorrect. Trillions of lines are not written in assembler. The whole truth is that the EPIC concept fails as a general-purpose CPU.

I'm not sure what you mean by "only for specific data".

As for my trillion lines of code comment, it still stands. Not every bit of code is going to be recompiled for the new instruction set, so any new platform like this is going to need binary compatibility or one hell of a good emulator. Legacy applications (for argument's sake, anything that cannot be recompiled) will still need to function if it is going to make headway in today's x86 dominated world.

VLIW/EPIC is just a very different animal and it is a lot bigger problem than even Apple faced with its transition from the PowerPC to x86. I wouldn't say it fails as a general purpose CPU, though. It just commands a very different mentality than what we are used to today. Probably too different and too big of a bite to try at this stage of the game. If Itanium had come out 10 years ago (or back when Windows NT was running on several hardware platforms) it could have been written as a different story.

It would be interesting if Intel tried a multi-core approach that had both Core and Itanium cores, assuming they could come up with some compelling reason to do that. Which begs the question of what is Itanium so good at that it makes the x86 stuff seem like "old technology"? Anything? Does it have that potential?
January 17, 2007 4:42:19 PM

The Itanium and Woodcrest CPUs are *extremely* different even though they're both 64-bit dual-core CPUs made by Intel.

The Itanium was designed to compete with the likes of IBM's PowerPC CPUs and other RISC CPUs. It has its own special instruction set that is purely 64-bit and is in no way, shape, or form compatible with any of Intel's other CPUs. The Itanium can dispatch instructions across 11 issue ports on a relatively short pipeline. Itaniums don't do out-of-order or "branchy" code very well - in fact, they're terrible at it, just like the Cell and far, far worse than the branchy-code-hater we all know, the Pentium 4. Like the Cell, the compiler is EXTREMELY important in making code run fast on the Itanium. But when the compiler and code work well on the Itanium, it's an absolute beast at doing iterative simulations and the like, which is one reason why the Itanium is generally only seen in places that do this kind of work (my university has about 500 Itanium 2 CPUs in a data center working on various simulations). Itanium units are frequently run as clusters, with many units working on one problem. Itanium setups are expensive, so it's rare to see small installations with Itanium CPUs - it's generally large data centers or HPC setups that use these CPUs.

The Woodcrest is a general-purpose x86 CPU that can handle 16-bit, 32-bit, and 64-bit instructions, depending on what mode it is set into by the OS just after boot-up (all x86 CPUs start in 16-bit mode IIRC.) It has a pipeline of 14 stages, which is very average for a modern x86 chip of its clock speed. The Woodcrest has good out-of-order code execution, so "branchy" code or the use of a suboptimal compiler won't reduce its throughput very much. The Woodcrest has particularly good SSE execution abilities and excellent integer math computation power. You will see Woodcrest units being used in any manner of small-server and workstation setups where one needs 3 or 4 CPU cores available. It is common to see Woodcrest CPUs in file servers, Web servers, rendering and CAD workstations, and even some "professional-level" desktops.
January 17, 2007 5:14:21 PM

Quote:

I'm not sure what you mean by "only for specific data".


Consider e.g. L2 cache and memory latency. If the data are not in L1, EPIC has to stall processing. Woodcrest simply continues with an independent instruction.

EPIC has "prefetch" instructions to help with this (to load data ahead), but the trouble is that it is not always known at compile time what kind of prefetch the algorithm will need.

Quote:

As for my trillion lines of code comment, it still stands. Not every bit of code is going to be recompiled for the new instruction set, so any new platform like this is going to need binary compatibility or one hell of a good emulator. Legacy applications (for argument's sake, anything that cannot be recompiled) will still need to function if it is going to make headway in today's x86 dominated world.


Well, that certainly does not seem to be the problem for PowerPC.

Quote:

I wouldn't say it fails as a general purpose CPU, though. It just commands a very different mentality than what we are used to today.


I have heard this many times, but nobody ever explained what that different mentality should be..

Quote:

Probably too different and too big of a bite to try at this stage of the game. If Itanium had come out 10 years ago (or back when Windows NT was running on several hardware platforms) it could have been written as a different story.


Itanium came out 7 years ago. Do you think those 3 years would have made such a difference?

Also, it was the last non-x86 platform for Windows.

Quote:

Which begs the question of what is Itanium so good at?


There are surely applications where Itanium shines - simply because it really has a lot of FP units.
January 17, 2007 5:32:30 PM

Yeah, people can't just accept that Intel is on top, and that is where they will stay until AMD gets something new. It's an ongoing cycle, but if you don't back the winner you lose out. Also, Blue Gene beats Red Storm's ass.
January 17, 2007 5:41:59 PM

Quote:
I have heard this many times, but nobody ever explained what that different mentality should be..


The vast majority of programmers today do not think about optimization until it becomes an issue. On VLIW/EPIC machines they would have to change the way they develop code. They could use much the same code (and languages) but they would have to spend much more time on optimization and be required to have a much deeper knowledge of the architecture they program for.

This is a complete reversal of the last 15 years. We, as an industry, have bred a lot of lazy programmers, and it would be an uphill climb to retrain them after all these years. It is too easy to "optimize later" in today's world of fast-paced product releases. That mentality is easily what drives the need for faster and faster processors today.
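
As a small illustration of what that extra work looks like in practice (plain C, nothing Itanium-specific; the function names are made up, and a good compiler may already handle this simple case for you): a naive reduction is one long dependency chain, so to keep a wide in-order machine busy the programmer has to split it into independent accumulators by hand.

#include <stddef.h>

/* Naive version: a single accumulator means every add depends on the
 * previous one, so most of a wide core's issue slots go to waste. */
double dot_naive(const double *x, const double *y, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += x[i] * y[i];
    return sum;
}

/* Hand-optimized version: four independent accumulators give the
 * scheduler four parallel chains to fill the wide issue slots with.
 * (Assumes n is a multiple of 4 to keep the sketch short.) */
double dot_unrolled(const double *x, const double *y, size_t n)
{
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    for (size_t i = 0; i < n; i += 4) {
        s0 += x[i]     * y[i];
        s1 += x[i + 1] * y[i + 1];
        s2 += x[i + 2] * y[i + 2];
        s3 += x[i + 3] * y[i + 3];
    }
    return (s0 + s1) + (s2 + s3);
}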

Quote:
Itanium came out 7 years ago. Do you think those 3 years would have made such a difference?

Also, it was the last non-x86 platform for Windows.


To be honest, I doubt it, based on hindsight. But the possibility was still there. It is a radical design and would have stood out better back then. Granted, the first generation of Itanium was horrid, and that was the nail in the coffin. Had they started with Itanium 2, then I suspect those 3 years would have made a significant difference.
January 17, 2007 5:50:47 PM

Quote:
I guess my name didn't give you a clue. Look up Sandia National Laboratory. I supervised the construction of ASCI Red, 9,800 Pentium Pros and 256 terabytes of memory. You might follow up with Red Storm and ASCI Purple. The K6-2 you are quoting the benchmarks for is not the same K6-2+ that I am talking about. The K6-2 550+ had 128 of L1 cache and 128 of L2 cache and was originally a laptop-only processor on the 180nm die. When AMD released the Athlon they were phased out. The K6-2+ is not even closely related to the K6-2 or K6-3 you are thinking of or AnandTech benchmarked. The standard K6-2 is a 250nm die processor. I was certified by the IEEE (the Institute of Electrical and Electronic Engineers) as a professional relay engineer in 1975. I have a bachelor's degree in EE from Rice University and a master's degree from the California Institute of Technology. Both are top-ten universities in engineering in the United States. We used to have an ongoing game at Sandia about when someone would release a processor that would outperform the Pentium Pro 233. That didn't happen until the AMD Athlon Thunderbird 750 came out with 256k of full-speed L2 cache. Pentium Pro 233's had either 1MB or 2MB of L2 cache.
Uh is that for real???
January 17, 2007 5:50:50 PM

Quote:
I have heard this many times, but nobody ever explained what that different mentality should be..


The vast majority of programmers today do not think about optimization until it becomes an issue. On VLIW/EPIC machines they would have to change the way they develop code. They could use much the same code (and languages) but they would have to spend much more time on optimization and be required to have a much deeper knowledge of the architecture they program for.


Sorry, but this is still just bold ranting.

Please, tell me exactly HOW we poor programmers are supposed to optimize for EPIC. You will find that there is little we can do for general code. (Some very special FP-intensive code is an exception, but that is already pretty well handled by compilers.)
January 17, 2007 6:04:37 PM

Just LIES...

And if it's true, I'll remember that when I build a supercomputer... 20 years from now, whenever I have $1+ million in spare change in my pocket. Until then, I'll get a new PC with a Core 2 Duo/Quad.

Moron!!!!!
January 17, 2007 8:21:07 PM

Quote:
It would be interesting if Intel tried a multi-core approach that had both Core and Itanium cores, assuming they could come up with some compelling reason to do that. Which begs the question of what is Itanium so good at that it makes the x86 stuff seem like "old technology"? Anything? Does it have that potential?

Good question.
January 17, 2007 8:35:17 PM


That's definitive proof that casewhite is off his rocker. The guy is accusing you of not knowing what a K6-2+ is, when he himself has no clue either! :roll:

For anyone wondering, the K6-2+ sucked. It suffered the same problems as any K6 (i.e., office apps were its strong point; anything FPU-intensive required 3DNow! optimizations for it to be competitive). While the addition of full-speed on-die L2 cache increased the performance of the K6-2+, K6-III, and K6-III+ over the K6-2, it still wasn't enough to best a similarly clocked Pentium II in most benchmarks. BTW, the K6-2+, like every K6, only has 64KB of L1 cache, and was never offered from the factory at speeds higher than 570MHz. A Coppermine Pentium III blows it out of the water.

As for the Pentium Pro argument this clown makes, the Pentium II 300 outclassed the fastest Pentium Pro, and the Pentium II 450 surely trounced the 333MHz Overdrive for Socket 8. It's not even a close contest.
January 18, 2007 2:09:04 AM

On the matter of private messages: if you feel that a PM to you is spam, abusive, or in any way merits attention, please quote and forward the PM to a moderator like Jake_Barnes or RCPilot, along with an explanation of why you are reporting it. Action can be taken to disable the user's PM feature if the moderators and/or administrator deem it warranted, to prevent such things from occurring again. It is generally considered ill-mannered and impolite to post a PM on the public discussion board without prior consent.
Regards,
Ninja!