AMD's K8L dead in the water?

October 4, 2006 3:40:52 PM

From http://www.vr-zone.com/?i=4109

The Core arch is ~35% faster than the K8 arch at equivalent clock rates.

Altair / K8L will clock at 2.9 GHz max.
Yorkfield XE will clock at 3.73 GHz max.

Let's give a K8 at 2.9 GHz a baseline performance rating of 1.0. The claimed performance increase of K8L over K8 is about 40%, so the Altair / K8L will rate at 1.4.

Let's assume the Intel Yorkfield XE only performs as well as an existing Core arch chip, clock for clock. This will underestimate its performance, but let's assume this for now. Rounding its clock down to 3.7 GHz, the Yorkfield XE will rate at: 3.7/2.9 * 1.35 = 1.72

Therefore:
Intel Yorkfield XE will be 1.72/1.4 = ~23% faster than the AMD K8L!

Now let's be more generous to Intel and say that, clock for clock, the monolithic Yorkfield will be 10% faster than the existing Core arch. (Remember, AMD claims a 40% advantage for the move to a monolithic quad core.) Therefore, the Yorkfield XE will rate at: 3.7/2.9 * 1.35 * 1.1 = 1.89

Therefore:
Intel Yorkfield XE will be 1.89/1.4 = ~35% faster than the AMD K8L!
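
For anyone who wants to rerun these numbers with different guesses, here is a minimal Python sketch of the same arithmetic. The function name and parameters are made up for illustration, it assumes performance scales linearly with clock speed, and the inputs are only the rumored figures quoted above, not measurements.

```python
def relative_rating(clock_ghz, base_clock_ghz, per_clock_factor):
    """Rating relative to a K8 at base_clock_ghz (which rates 1.0).

    Assumes performance scales linearly with clock speed, which is the
    same simplification the estimate above makes.
    """
    return clock_ghz / base_clock_ghz * per_clock_factor


K8_BASE_GHZ = 2.9

# K8L: same 2.9 GHz clock, claimed 40% faster than K8.
k8l = relative_rating(2.9, K8_BASE_GHZ, 1.40)

# Yorkfield XE: 3.7 GHz, Core arch taken as 35% faster than K8 per clock.
yorkfield = relative_rating(3.7, K8_BASE_GHZ, 1.35)

# Same again, with a further hypothetical 10% per-clock gain.
yorkfield_plus = relative_rating(3.7, K8_BASE_GHZ, 1.35 * 1.10)

for name, rating in [("Yorkfield XE", yorkfield), ("Yorkfield XE +10%", yorkfield_plus)]:
    print(f"{name}: rating {rating:.2f}, {rating / k8l - 1:.0%} ahead of K8L")
```

Change any of the three factors and the conclusion moves with it, which is the main weakness of an estimate like this.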

This is just on performance alone. Given that the Yorkfield uses a 45 nm process, it should use about half the power of the 65 nm K8L too.

8O for AMD
:trophy: for Intel


October 4, 2006 3:46:30 PM

It's useless to speculate, but that's an interesting way of doing it. I can't imagine AMD fanboys are going to let this one go. I have to say, I really hope the difference isn't that much, or there won't be any competition to speak of, which is bad for us in the long run.
October 8, 2006 10:02:35 AM

That is an interesting way to look at it all right, but it doesn't really make sense. Using the same logic you would have said that the P4 would kill the K8 because P4 was faster and clocked higher than K7.

Unfortunately there are just so many unknowns right now about both Altair (K8L) and Yorkfield that we just don't know which one is going to be better.

I hope AMD can come up with a strong response to Core2. It would be a pity to go back to the days when your only choice is Intel, or Intel. Prices will be higher, and progress slower if AMD can't keep Intel on their toes.

Can't wait to see the 2 chips battle it out though! Should be very interesting!
October 8, 2006 11:22:14 AM

Interesting... but it's still speculation. At the end of the day, Yorkfield might not have a huge performance boost over Kentsfield and might really just cut down power consumption, but then again it could be even faster. We just do not know at the moment. Same goes for Altair. To be honest, I really want Altair to perform at the same level as Kentsfield and Yorkfield. Why? Because competition is healthy, and that means prices for EVERYTHING will be forced down on both sides. Cheaper processors make consumers happy, and cheaper top-of-the-range processors keep enthusiasts happy as well. AMD will come up with a strong contender; they have to in order to hold on to the market share they gained with K7/K8 in the P4 days.
October 8, 2006 5:35:56 PM

I wouldn't say dead in the water until it comes out and we can see actual data stating a for or against case.
That being said, the Horde will be arriving anytime soon.
October 8, 2006 5:48:00 PM

Quote:
I wouldn't say dead in the water until it comes out and we can see actual data stating a for or against case.
That being said, the Horde will be arriving anytime soon.


Hello. I am here to represent the horde. 8)

My thoughts on this: that is only speculation based off current designs.

Core 2 doesn't have a 40% advantage over AMD right now. In certain applications, yes, it got as high as 40%, but as an overall average it is more like 20%. Anywhere from 0% to 40%.

The horde has spoken. :p 
October 8, 2006 5:52:30 PM

I think it needs to be specified: is it a 20% or so overall performance increase over a certain AMD chip or over the whole field of AMD chips? And is it comparing the family or just one Core 2? As Wusy would say, you have to look at more than just benchmarks in order to determine these things.
October 8, 2006 8:16:07 PM

Quote:
From http://www.vr-zone.com/?i=4109

The Core arch is ~35% faster than the K8 arch at equivalent clock rates.

Altair / K8L will clock at 2.9 Ghz max.
Yorkfield XE will clock at 3.73 Ghz max.

Lets give a K8 at 2.9 Ghz a baseline performance rating of 1.0. Now the claimed performance increase of K8L over K8 is about 40%. Then the Altair / K8L will rate at 1.4.

Lets assume the Intel Yorkfield XE only performs as well as an existing Core arch chip, clock for clock. This will underestimate its performance, but lets assume this for now. Therefore, the Yorkfield XE will rate at: 3.7/2.9 * 1.35 = 1.72

Therefore:
Intel Yorkfield XE will be 1.72/1.4 = ~23% faster than the AMD K8L!

Now lets be more generous to Intel and say that, clock for clock, the monolithic Yorkfield will be 10% faster than the existing core arch. (Remember AMD claims a 40% advantage for the move to a monolithic quad core). Therefore, the Yorkfield XE will rate at: 3.7/2.9 * 1.35 * 1.1 = 1.89

Therefore:
Intel Yorkfield XE will be 1.89/1.4 = ~35% faster than the AMD K8L!

This is just on performance alone. Given that the Yorkfield uses a 45 nm process it should use about half the power than the 65 nm K8L does too.

8O for AMD
:trophy: for Intel



Where have you seen AMD claim a 40% increase over K8? The theoretical gains will be closer to 80% in integer and close to 150% in FP. They are doubling loads (and, realistically, retires), widening the L1 to 256-bit, adding an extra FP unit and doubling the SSE FP to execute two instructions per cycle.


That is much larger than 40%. But we'll see.
October 8, 2006 8:51:59 PM

I generally figure 20% for Core 2 vs K8..

http://www.xbitlabs.com/news/cpu/display/20060731233200...
First they claim 4x4 will be 80% better at Cinebench.
http://www.amd.com/us-en/assets/content_type/Downloadab...
Here claims of 60% better "performance per watt". Geez, isn't comparing by clocks confusing enough?
http://www.hkepc.com/bbs/itnews.php?tid=678736&starttim...
Then here claiming 40% improvement.
http://www.vr-zone.com/?i=4109
If it's true that it's 40% better than K8, that should make it about 17% better than Conroe per clock... A 2.9 GHz Altair should theoretically, in the best case, be about as fast as a 3.4 GHz Conroe. Yorkfield should bring that down (possibly even gaining with all the new features), and VR-Zone reports that these parts will come in at 3.46-3.73 GHz (nullifying the per clock gains of Altair).

http://badhardware.blogspot.com/2006/10/amds-k8l-reveal...
AMD fan blog claims 15% improvement per clock, and 25% clock gain. Even more confusing. We should know more when AMD demonstrates their quads late this year :wink:
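
To make the comparison in this post easier to check, here is a rough Python sketch of the ratios involved. The constant names are made up for illustration, the inputs are the rule-of-thumb 20% for Core 2 over K8 and the 40% K8L figure from the links above, and the last line assumes the blog's per-clock and clock-speed gains simply multiply.

```python
# Per-clock performance relative to K8 (assumed figures from this post).
CORE2_OVER_K8 = 1.20   # "I generally figure 20% for Core 2 vs K8"
K8L_OVER_K8 = 1.40     # the 40% improvement claim

# If both are measured against K8, the ratio gives K8L vs Core 2 per clock.
k8l_over_core2 = K8L_OVER_K8 / CORE2_OVER_K8

# Clock a Conroe would need to match a 2.9 GHz Altair under that ratio.
conroe_equiv_ghz = 2.9 * k8l_over_core2

# The fan blog's numbers: 15% per clock combined with a 25% clock gain.
blog_total = 1.15 * 1.25

print(f"K8L vs Core 2 per clock: {k8l_over_core2 - 1:.0%}")
print(f"Conroe clock to match a 2.9 GHz Altair: {conroe_equiv_ghz:.2f} GHz")
print(f"Blog's combined gain over K8: {blog_total - 1:.0%}")
```

The blog's two figures compound to roughly the same 40-odd percent, so the sources may be less contradictory than they look, assuming they are all talking about total performance over K8.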
October 8, 2006 9:15:10 PM

I would like to say that's 100% true, but unfortunately I can't.

I have always been an Intel fanboy, even when I knew the truth (those 4 ugly years when AMD was the best bet and Intel was dead with Netburst).

But now, with the Core architecture, the upcoming Kentsfield and Yorkfield, and the 45nm tech, omg, I just can't wait for an engineering sample review of Yorkfield.
October 8, 2006 9:16:42 PM

Quote:
Baron, you an add 200% more SIMD's, 300% more FPUs and make them wider this will not translate into gains. The efficiency to get IPC is dependent upon the ability of the decoders and reorder buffer to maximize the use of the execution units.

If AMD stays 3 issue, then it will not show nearly as big of 'theoretical' performance boost that you are claiming.



The pictures of the die show an "enhanced IPC core" I assume that means it will peak above 3 IPC. If they don't know more than you do about how to increase IPC I guess they are doomed.
October 8, 2006 9:19:14 PM

How about you just wait for it to be released and tested by people who know what they're doing? Speculation is one thing, but damning a product that hasn't even seen the light of day is just a bit fanatical.

Besides, let's not forget that the K8L is only a revamp of an existing architecture, and not a complete rewrite as the C2D was. Ideally, it would only be fair to compare new architectures when they are both available for hard testing.

Or are you just another Intel fanboy who wants to gloat, whilst secretly fearing that the C2D era could be short lived? Be patient, and all will be revealed.
October 8, 2006 9:43:12 PM

Quote:
Baron, you an add 200% more SIMD's, 300% more FPUs and make them wider this will not translate into gains. The efficiency to get IPC is dependent upon the ability of the decoders and reorder buffer to maximize the use of the execution units.

If AMD stays 3 issue, then it will not show nearly as big of 'theoretical' performance boost that you are claiming.



The pictures of the die show an "enhanced IPC core" I assume that means it will peak above 3 IPC. If they don't know more than you do about how to increase IPC I guess they are doomed.

Those are CAD pictures of what they hope the die will look like, Baron. Those are the slides the shareholders get to see because they aren't tech savvy; take them with a grain of salt.
October 8, 2006 9:54:21 PM

Quote:
The pictures of the die show an "enhanced IPC core"

The picture of you shows an "extremely stupid moron"
October 8, 2006 9:56:08 PM

Quote:
Baron, you an add 200% more SIMD's, 300% more FPUs and make them wider this will not translate into gains. The efficiency to get IPC is dependent upon the ability of the decoders and reorder buffer to maximize the use of the execution units.

If AMD stays 3 issue, then it will not show nearly as big of 'theoretical' performance boost that you are claiming.



The pictures of the die show an "enhanced IPC core" I assume that means it will peak above 3 IPC. If they don't know more than you do about how to increase IPC I guess they are doomed.

The IPC can be enhanced in all sorts of ways. Without any fusion tricks, the max theoretical would be directly the width of the core, a 3-issue core would be able to do a max of 3 instructions per clock. However, since code is interdependent, average IPC is always lower than what the core is capable of doing.

Intel, in my opinion, made a nice leap forward in IPC by rectifying one major dependency which would cause all other dispatches to wait, and that is load before store (memory disambiguation), AMD is working this into K8L, so yeah I expect a good boost for AMD --- but 40% sounds about right, not 80 or 150%.

Having said that, and on the discussion of IPC, I have posted this article before:
http://www.ece.utexas.edu/projects/ece/lca/ps/deepu-icc...

Now, it is more detailed than this argument, but the data in table 5 is what I draw your attention to. Here you can see that the IPC is always less than the issue width, and at some point the IPC saturates even as the width goes up. This saturation point is about 4. This makes sense in this architecture, as on average stores are about 1/4 of the code mix, loads are about 1/3, and math and logic functions make up the rest.

In other words, what if Intel zoomed up and hit the IPC max for the x86 code base? If this is the case, then the best AMD can do is meet, not exceed. If this is true, then it will fall back to who can get clock speed up within the power envelope... Intel has that one by a long shot.
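
The point that average IPC sits below the issue width and flattens out as the core gets wider can be illustrated with a toy model. The Python sketch below is not taken from the linked paper and does not model K8L or Core 2: it assumes unit latency, an unlimited scheduling window, no branches, and a made-up random dependency pattern. The function names are hypothetical.

```python
import random


def random_program(n, seed=0, dep_prob=0.6, window=4):
    """Build n instructions; each may depend on one recent earlier instruction."""
    rng = random.Random(seed)
    deps = []
    for i in range(n):
        d = []
        if i > 0 and rng.random() < dep_prob:
            d.append(rng.randrange(max(0, i - window), i))
        deps.append(d)
    return deps


def simulate_ipc(deps, width):
    """Issue up to `width` ready instructions per cycle, oldest first.

    deps[i] lists earlier instructions whose results instruction i needs.
    An instruction is ready once all of them issued in an earlier cycle.
    """
    n = len(deps)
    done_cycle = [None] * n
    remaining = list(range(n))
    cycle = 0
    while remaining:
        cycle += 1
        issued = []
        for i in remaining:
            if len(issued) == width:
                break
            if all(done_cycle[d] is not None for d in deps[i]):
                issued.append(i)
        # Mark results only after selection, so same-cycle dependents wait.
        for i in issued:
            done_cycle[i] = cycle
        issued_set = set(issued)
        remaining = [i for i in remaining if i not in issued_set]
    return n / cycle


program = random_program(1000, seed=1)
for width in (1, 2, 3, 4, 6, 8):
    print(f"issue width {width}: average IPC {simulate_ipc(program, width):.2f}")
```

A fully serial chain of instructions gives an IPC of 1.0 no matter how wide the core is, and only fully independent code reaches the full width, which is the argument being made here in miniature.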

Jack


Again, if they don't know as much as you they're DOOMED. I don't care how many links you post.
October 8, 2006 10:04:48 PM

I pray he never post pictures of black people in reference to himself again. He just set us back 50 years.
October 8, 2006 10:04:55 PM

Quote:
Again, if they don't know as much as you they're DOOMED. I don't care how many links you post.

The key to your success is to study this book:

and ignore everything else.
October 8, 2006 10:17:34 PM

Quote:
Baron, you an add 200% more SIMD's, 300% more FPUs and make them wider this will not translate into gains. The efficiency to get IPC is dependent upon the ability of the decoders and reorder buffer to maximize the use of the execution units.

If AMD stays 3 issue, then it will not show nearly as big of 'theoretical' performance boost that you are claiming.



The pictures of the die show an "enhanced IPC core" I assume that means it will peak above 3 IPC. If they don't know more than you do about how to increase IPC I guess they are doomed.

Those are CAD pictures of what they hope the die will look like, Baron. Those are the slides the shareholders get to see because they aren't tech savvy; take them with a grain of salt.

Is he going on about that CAD plot, or is he referring to the picture of the wafer in Charlie D.'s lap on the Inquirer?


That CAD plot or whatever is direct from AMD and clearly says (ambiguous though it may be) "IPC-enhanced core." Ask them what it means. A logical person would say it means MORE THAN THREE.

Isn't it time for you and goJdo's date?
October 8, 2006 10:19:50 PM

Quote:
Isn't it time for you and goJdo's date?


....
October 8, 2006 10:31:46 PM

Alright its time to back off. This has gotten out of hand. Apologize.
October 8, 2006 10:34:17 PM

He has to grow up sometime... I am being childish too.
Sorry for mentioning you in my lame post. I've deleted the contents.
October 8, 2006 11:41:55 PM

The basic answer is that both quad core and Quadfather are a generation out of date now that IBM is making the 1U and larger versions of Roadrunner (1.6 petaflops, 16,000 single-core Opterons and 16,000 Cell chips) available for the desktop. http://www-03.ibm.com/technology/splash/qs20/ Roadrunner will be 100 gigaflops per Opteron/Cell chip pair, about 125% of the very best quad cores, and use 33% of the power. The QS20 series will not require multiple licenses from Microsoft to operate, since it will use either a single core or dual core Opteron. Do note that IBM says that the Cell is an accelerator chip. You can buy this today and you can't buy either Kentsfield or Quadfather. The QS20 is already installed at the University of Manchester among others.
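
As a quick sanity check on those figures, here is a short Python sketch. The variable names are made up for illustration, and the 125% and 33% values are simply the claims from this post, taken at face value.

```python
# Figures as stated in this post (not independently verified).
roadrunner_flops = 1.6e15      # 1.6 petaflops total
opteron_cell_pairs = 16_000    # one Opteron plus one Cell per pair

gflops_per_pair = roadrunner_flops / opteron_cell_pairs / 1e9

# Claimed: 125% of the best quad core's performance at 33% of its power.
relative_performance = 1.25
relative_power = 0.33
perf_per_watt_advantage = relative_performance / relative_power

print(f"{gflops_per_pair:.0f} gigaflops per Opteron/Cell pair")
print(f"about {perf_per_watt_advantage:.1f}x the performance per watt")
```

So the per-pair number is consistent with the system total, and the two percentages together would imply close to a fourfold performance-per-watt advantage if they hold up.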
For those unfamiliar with roadrunner:

"The Roadrunner system, along with the Protein Explorer and the seventh-fastest supercomputer, Tokyo Institute of Technology's Tsubame system built by Sun Microsystems (SC Online READERS' CHOICE PRODUCT OF 2005: Sun Microsystems Sun Fire servers), illustrate a new trend in supercomputing: combining general-purpose processors with special-purpose accelerator chips.

"IBM's BladeCenter systems are amenable to the hybrid approach. A single chassis can accommodate both general-purpose Opteron blade servers and Cell-based accelerator systems. The BladeCenter chassis includes a high-speed communications links among the servers, and one source said the blades will be used in Roadrunner.

"Advanced Micro Devices' Opteron processor is used in supercomputing "cluster" systems that spread computing work across numerous small machines joined with a high-speed network. In the case of Roadrunner, the Cell processor, designed jointly by IBM, Sony and Toshiba, provides the special-purpose accelerator.

"Cell originally was designed to improve video game performance in the PS3 console. The single chip's main processor core is augmented by eight special-purpose processing cores that can help with calculations such as simulating the physics of virtual worlds. Those engines also are amenable to scientific computing tasks, IBM has said.


"On average, Cell is eight times faster and at least eight times more power-efficient than current Opteron and Itanium processors, despite the fact that Cell's peak double-precision performance is fourteen times slower than its peak single-precision performance. If Cell were to include at least one fully usable pipelined double-precision floating-point unit, as proposed in the Cell+ implementation, these performance advantages would easily double." http://www.supercomputingonline.com/article.php?sid=118...

Here is the IEEE paper that explains why quad cores are obsolete and accelerators are the future of computing:
"However, with the addition of the customized Reconfigurable Data Cache, the resulting system runs 5× faster and outperforms the reference microprocessor." A 2005 paper by Prof Ron Sass Univ. of Kansas and one of his grad students Pradeep Nalabalapu

http://csdl2.computer.org/persagen/DLAbsToc.jsp?resourc...

Anybody that wants to read the complete paper send me your email and I will send you the PDF if you are not a dues paying member of IEEE.
October 8, 2006 11:45:18 PM

All this is speculation, but I have the following points to note:

1) It seems odd that in a year's time and with a new process, AMD would not be able to increase clock speed at all when going from native dual-core to native quad-core. That would suggest some premature design limitation in the K8L, or an underdeveloped manufacturing process. In the same time, Intel supposedly would be going through a similar process shrink, doubling of core count, and cache increase, but on top of this they're going to increase the clock speed by 25%.

Perhaps Intel is using some of the headroom found in today's C2D's, as we've all noticed their unusual overclock potential. When was the last time people overclocked a top-of-the-line chip by 30% on air?

2) Historically, the only time AMD gained the upper hand with an equivalent-generation architecture was when Intel released the poorly performing Pentium 4 about 6 years ago. I was here reading on THG, and the reviewer was quite displeased with the initial performance, causing me to purchase my first-ever AMD-based system. All other times, Intel's process edge gave them the performance and cost advantage over other desktop CPU designs, regardless of minor weaknesses in their own architecture.

3) Seeing that AMD probably won't be keeping up with Intel on the manufacturing front, unless Intel releases another grossly underperforming architecture, I don't see how AMD would grab back the performance crown except with outrageously elaborate setups - e.g., a four-socket K6-III setup sure may overwhelm the 1P Pentium Pros, but at what cost?

4) Performance isn't everything. As we see on the forums, most people don't go out and buy an FX-62 or an X6800, even though review sites so many times refer to both. I think AMD should go after niche markets, and Torrenza is an excellent start.
October 9, 2006 12:00:06 AM

Quote:

If Cell were to include at least one fully usable pipelined double-precision floating-point unit, as proposed in the Cell+ implementation, these performance advantages would easily double." http://www.supercomputingonline.com/article.php?sid=118...


Do you know what's happening with regard to Cell+?
October 9, 2006 12:08:13 AM

Quote:
The basic answer is both quad core and quad father are a generation out fo date now that IBM is making the 1U an larger versions of Roadrunner


While the numbers would appear to make next year's processors already out-of-date, you're comparing them with something built from the ground up not to run x86 or general apps but a different specialized language altogether. Cell is very powerful, but so are Altivec, UltraSparc T1, Alpha 21164, and so forth.

If you try video encoding on IBM's system, you'll probably get the now-measly performance of an Athlon because the Cell processor isn't made to encode video, nor is there a video encoder to my knowledge programmed in Cell language.

But if you try Folding@Home, then with a patched version incorporating a few Cell math libraries you probably could gain that order of magnitude of performance.

In the future, I don't see why accelerator boards and Torrenza plug-in chips can't coexist with multicore CPUs - it could be an enthusiast's dream. The CPUs are all-purpose and will multitask anything you throw at them, while the accelerators are specialized and fast, but require individually patched program code.
October 9, 2006 1:45:26 AM

Yeah, some good points there. I don't see AMD gaining a performance lead for a long while. Intel has a massive lead on them, and it's going to take Intel bringing out a shocker of a CPU for AMD to take the performance crown.

But who knows, they now have ATI to help them develop, as well as IBM, so maybe they could develop something better.
October 9, 2006 1:50:57 AM

Quote:


2) Historically, the only time AMD gained the upper hand with an equivalent-generation architecture was when Intel released the poorly performing Pentium 4 about 6 years ago. I was here reading on THG, and the reviewer was quite displeased with the initial performance, causing me to purchase my first-ever AMD-based system. All other times, Intel's process edge gave them the performance and cost advantage over other desktop CPU designs, regardless of minor weaknesses in their own architecture.

3) Seeing that AMD probably won't be keeping up with Intel on the manufacturing front, unless Intel releases another grossly underperforming architecture, I don't see how AMD would grab back the performance crown except with outrageously elaborate setups - e.g., a four-socket K6-III setup sure may overwhelm the 1P Pentium Pros, but at what cost?


You know, those are extremely interesting points, the clear implication being that the only reason AMD's chips outperformed Intel's was not superior design on AMD's part, but inferior design on Intel's. OK, so that's an old and oft-made point, but the future implication, that AMD will never again experience a mainstream advantage unless Intel screws up once more, is, in its way, new.

Many people, myself included, presumed it would be a matter of time before AMD regained a performance lead through uarch design (if they didn't screw up), which it would eventually lose again to Intel. Back and forth, back and forth, etc. From your perspective, which sadly I can find no real flaw with (much as I would like to), AMD will never regain a mainstream advantage. Bummer. High priced exotic systems and high priced underperforming mainstream parts aren't going to save AMD.
October 9, 2006 1:53:18 AM

Out of those two, IBM looks to be of more help than ATI.
October 9, 2006 2:45:26 AM

Quote:
That CAD plot or whatever is direct from AMD and clearly says (ambiguous though it maybe) "IPC-enhanced core." Ask them what it means. A logical person woud say it means MORE THAN THREE.

I don't think that's necessarily true. I don't think any 3 issue core whether K8 or Banias averages the full 3 IPC from issue to retire. By the same token Core 2 doesn't average the full 4 IPC from issue to retire either. I'm pretty sure a 3 issue core would be lucky to average between 2 to 2.5 IPC. Hannibal here said that average 2.5 IPC is already the most you can expect in most code.

http://arstechnica.com/articles/paedia/cpu/hyperthreadi...

Quote:
Note that this is an exceedingly common scenario, since research has shown the average ILP that can be extracted from most code to be about 2.5 instructions per cycle. (Incidentally, this is why the Pentium 4, like many other processors, is equipped to issue at most 3 instructions per cycle to the execution core.)

What Intel's 4+1 issue core and all its fusions and other features allow it to do is get the average higher, but even then I doubt they get the average higher than 3 IPC very often. So essentially, Intel put a lot of effort and resources into getting that additional 0.5 average increase in IPC. That's not to say 4+1 issue is completely useless of course, since there are peak conditions that can exploit it, just not on average.

Now there have been several sources that have said K8L will remain a 3 issue core, including Phil Hester, the Chief Technology Officer of AMD.

http://translate.google.com/translate?hl=en&sl=ja&u=htt...

Quote:
As for AMD, although it increases to 2 times, the x86 order decoder section under order fetch, it stops order fetch while they are 3 decoder constitution.

 “Several minor expansion were done even in the order decoder, but they are not big ones”, that Hester talks. Hester recognized the fact that either the maximum number of decoding orders does not change in 3 orders/the cycle. This is the point which differs from Core MA which increased the decoder to 4 largely.

X-Bit Labs' K8L summary also confirms K8L is a 3 issue core. In contrast, the only indication that K8L may be a 4 issue core is a several month old die plot that shows 4 micro-ROMs, which does not necessarily mean there are 4 decoders, and we don't know how the K8L architecture has changed since that time.

What AMD means by enhanced IPC core is K8L's ability to get closer to the full 3 IPC potential of the core. If AMD can do it without resorting to the more brute force approach of Core 2 then good for them. How will AMD be accomplishing this? Mainly through enhanced OoO execution with load before load ability and with the widened L1 cache buses.

The 40% figure for K8L improvement over K8 is actually AMD's own claim.

http://www.hkepc.com/bbs/itnews.php?tid=678736&starttim...

That claim is fairly reasonable based on the architectural improvements. We've already discussed before my stance on why 80-100% average improvements are not likely. I'd need to have confirmation that the entire cache system has been widened not just the L1 links, doubling and widening of the FMISC unit, some type of load before store ability, improved FP/SSE subsystem scheduling logic, etc. before I think that type of average improvement is possible.

One of the things people get enthusiastic about is the doubling of SSE resources. The thing is that it isn't as much a doubling as a splitting of the 2 FP/SSE units into 2 dedicated FP units and 2 dedicated SSE units. This means that to see the doubled performance you would need to have 2 FP macro-ops and 2 SSE macro-ops executing in parallel. Otherwise if you have 4 SSE instructions or even 2 SSE instructions you won't see much improvement at all over K8 because you still have 2 units even if they are now dedicated. I should also point out that momentum is to convert FP code into SSE code which means the situations where FP and SSE code appear in parallel are reducing not increasing.

Further, the FP/SSE subsystem in K8 is limited by poor scheduling, where they use a round robin system. For instance, macro-op 1 goes to unit 1, macro-op 2 to unit 2, op 3 to unit 3, op 4 to unit 1, op 5 to unit 2, etc. This doesn't take into account the actual usage patterns of the units. For example, if unit 2 has a long queue and unit 1 is completely empty, and the round robin system says that it's unit 2's turn, unit 2 will get the macro-op even if using unit 1 would be faster. There are also latency scheduling limitations that happen in a similar way. Without improved scheduling logic, which AMD hasn't mentioned, widening the FP/SSE subsystem isn't useful at all.
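
To show why round robin dispatch can hurt, here is a small Python sketch comparing it with a "send it to the least busy unit" policy. This is a deliberately simplified model and not AMD's actual scheduler: it treats the three units as interchangeable (which the real units are not), ignores queues draining over time, and uses made-up latencies. The function names are hypothetical.

```python
def round_robin_backlog(latencies, n_units=3):
    """Assign op i to unit i % n_units; return the busiest unit's total work."""
    loads = [0] * n_units
    for i, latency in enumerate(latencies):
        loads[i % n_units] += latency
    return max(loads)


def least_loaded_backlog(latencies, n_units=3):
    """Assign each op to the unit with the least queued work so far."""
    loads = [0] * n_units
    for latency in latencies:
        target = loads.index(min(loads))  # ties go to the lowest-numbered unit
        loads[target] += latency
    return max(loads)


# Made-up mix: every third op is slow (4 cycles), the rest take 1 cycle.
ops = [4, 1, 1] * 3

print("round robin, busiest unit:", round_robin_backlog(ops), "cycles")
print("least loaded, busiest unit:", least_loaded_backlog(ops), "cycles")
```

With this pattern round robin keeps handing the slow ops to the same unit while the other two sit mostly idle, which is the kind of imbalance described above.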

The issue with the FMISC unit is that it acts as the FSTORE unit. AMD's presentations have shown that it remains unwidened at 64-bits and there is only 1 of them. The FMISC would then be another major bottleneck of the widened FP/SSE subsystem unless it is improved too. There are a few other issues too, but I've said enough.

In AMD's defence, as an architecture K8L should on average be as fast if not faster than Core 2. I've mentioned it before that K8L's advantage over Core 2 will probably be around 10% on average, with FP code taking the lead for AMD followed by SSE and integer still being in Intel's favour. However, Intel can counter that lead with higher clock speeds which is what they are looking to do.
October 9, 2006 2:50:13 AM

Quote:
From http://www.vr-zone.com/?i=4109

8O for AMD
:trophy: for Intel


I heard a rumour from a friend that AMD is folding after reading your scientific calculations.
October 9, 2006 3:16:58 AM

Quote:
All this is speculation, but I have the following points to note:

1) It seems odd that in a year's time and with a new process, AMD would not be able to increase clock speed at all when going from native dual-core to native quad-core. That would suggest some premature design limitation in the K8L, or an underdeveloped manufacturing process. In the same time, Intel supposedly would be going through a similar process shrink, doubling of core count, and cache increase, but on top of this they're going to increase the clock speed by 25%.

Perhaps Intel is using some of the headroom found in today's C2D's, as we've all noticed their unusual overclock potential. When was the last time people overclocked a top-of-the-line chip by 30% on air?

2) Historically, the only time AMD gained the upper hand with an equivalent-generation architecture was when Intel released the poorly performing Pentium 4 about 6 years ago. I was here reading on THG, and the reviewer was quite displeased with the initial performance, causing me to purchase my first-ever AMD-based system. All other times, Intel's process edge gave them the performance and cost advantage over other desktop CPU designs, regardless of minor weaknesses in their own architecture.

3) Seeing that AMD probably won't be keeping up with Intel on the manufacturing front, unless Intel releases another grossly underperforming architecture, I don't see how AMD would grab back the performance crown except with outrageously elaborate setups - e.g., a four-socket K6-III setup sure may overwhelm the 1P Pentium Pros, but at what cost?

4) Performance isn't everything. As we see on the forums, most people don't go out and buy an FX-62 or an X6800, even though review sites so many times refer to both. I think AMD should go after niche markets, and Torrenza is an excellent start.


Well said.

No, the QS20 will run Windows Server 2003 if you wish; after all, almost everything for 1U is written for Server 2003. I have yet to see any 1U run the supercomputer type languages you mention. The installation at the University of Manchester uses Suse Linux 10.1 and the one at Stuttgart uses RedHat 4.0. IBM already has the necessary drivers for Windows and support for Linux. As for Roadrunner, it will run either Linux or Fortran, not the languages you mention. The cost of the module is in line with anyone else's 1U. Based on the costs for Roadrunner, figure somewhere around $3000 for a 1U with case, video card, hard drives and everything but the OS. Total cost for Roadrunner is somewhere around $55 million and that includes the Voltaire Infiniband server network. When Cell+ gets here the performance margin will at least double. For those who don't keep up, Stanford Medical School has a system based on the Opteron/ATI graphics card that is good for around 3 times what QS20 is capable of today (350 gigaflops per core/GPU combo). It runs on your computer just like Folding at Home.

Cell+ looks like it will be here next summer. As to the mainstream board makers not supporting dual socket, they will be out of business if they don't. The typical Torrenza for the desktop will be an Opteron 2xx or later with the Cell+. I think that for everyday use AMD will be the chip of choice, because if you look at Dell's offerings, for the money you spend on 4 Intel dual cores you can buy 5 AMDs. And as noted in the article here at Tom's on performance/power ratio, AMD has Intel beat. For government agencies and state and local agencies operating on federal grant money, the new GSA purchasing guidelines spell out efficiency ratios (200 watts total) that Intel has a very hard time meeting but most of the AM2s like the X2-3800 do (the higher end doesn't make it either).
The other thing that everyone seems to miss is that the Cell/Opteron numbers are double precision (64 bit) and Intel's numbers are single precision (32 bit). Take a look at Thunderbird (EM64T): when it switches from single precision Rpeak to double precision Rmax it drops 42%. Red Storm drops only 16%. So you better think about what 64 bit Vista will do. As to the gaming enthusiast market, there is nothing more irrelevant in the business. The total enthusiast market is 100-150,000 CPUs a year. DOE bought 500,000 CPUs between June 1 and Sept 30 this year. That is 1 fiscal year's cycle. None of them were Intel despite the presence of Core 2. Note that Roadrunner's RFP was issued in May and Intel did provide Woodcrest engineering samples. There will be 200,000 for Blue Gene L, 19,000 Opteron quad cores for Baker at Oak Ridge (if the IBM Cell proposal had been available at the time of the RFP it would not have been quad core), 5200 for exchanging 152's for 185's on Jaguar at Oak Ridge and 21,000 for the expansion of Jaguar to 250 teraflops. Red Storm will get 11,000 to replace 150's with 185's, and a further 20,000 in expansion. Coyote A and Coyote B will get 3200 upgrade CPUs and 20,000 for expansion. Peloton at Lawrence Livermore is 16,000 Opterons, 1000 285's to replace 252's for Gauss and 19,000 Opterons for Franklin at NERSC. If you want to know how well Core 2 does in double precision go to the NERSC web site and contact OPI. The results of SSP are public record. http://www.lbl.gov/CS/Archive/news081006.html
DOE does not use 32 bit benchmarks like the trade press does, all tests are double precision 64 bit.

Anyone who reads the IEEE paper by Ron Sass will understand why Intel is behind the curve at this point. You will also understand why power-hungry quad cores of any brand are, like Netburst, a dead end. As the IBM marketing people say, for servers we are at the point where operating costs for utilities are more than the cost of the hardware. Quad core requires 3 times the electric power to operate and three times the AC as an Opteron accelerator combo. "A watt is about a dollar a year if you have the things on all the time, so 10 megawatts per year equates to $10 million in operating expenses."
That is for Roadrunner. Baker will cost more like $30 million to operate and Intel's quad cores won't be any more energy efficient. Let's put it this way: with $3/gal. gasoline are you going to buy a Cadillac Escalade that gets 12 miles per gallon or a Toyota Prius that gets 38 miles per gallon? That is why quad cores, be they AMD or Intel, are dead ends right now. There is no economic model where either quad core makes any sense at all. The measurement in industry is now performance per watt, not performance per clock cycle. K8L would not see the light of day if the Baker contract at Oak Ridge didn't exist.
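
The "a watt is about a dollar a year" rule of thumb quoted above can be sanity-checked with a few lines of Python. This is only a rough sketch: the electricity price is an assumed value picked for illustration (it is not given in the post), and cooling overhead is ignored.

```python
HOURS_PER_YEAR = 24 * 365  # 8760 hours, ignoring leap years


def annual_cost_usd(watts, usd_per_kwh):
    """Cost of running a constant load of `watts` for one full year."""
    kwh_per_year = watts * HOURS_PER_YEAR / 1000.0
    return kwh_per_year * usd_per_kwh


# ASSUMPTION: roughly 11.4 cents per kWh, chosen so that 1 W costs about $1/year.
ASSUMED_USD_PER_KWH = 0.114

one_watt = annual_cost_usd(1, ASSUMED_USD_PER_KWH)
ten_megawatts = annual_cost_usd(10_000_000, ASSUMED_USD_PER_KWH)

print(f"1 W for a year:   ${one_watt:.2f}")
print(f"10 MW for a year: ${ten_megawatts / 1e6:.1f} million")
```

At a cheaper or more expensive rate the figure scales linearly, so the rule only holds near the assumed price.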
October 9, 2006 3:24:38 AM

Quote:
From http://www.vr-zone.com/?i=4109

The Core arch is ~35% faster than the K8 arch at equivalent clock rates.

Altair / K8L will clock at 2.9 Ghz max.
Yorkfield XE will clock at 3.73 Ghz max.

Lets give a K8 at 2.9 Ghz a baseline performance rating of 1.0. Now the claimed performance increase of K8L over K8 is about 40%. Then the Altair / K8L will rate at 1.4.

Lets assume the Intel Yorkfield XE only performs as well as an existing Core arch chip, clock for clock. This will underestimate its performance, but lets assume this for now. Therefore, the Yorkfield XE will rate at: 3.7/2.9 * 1.35 = 1.72

Therefore:
Intel Yorkfield XE will be 1.72/1.4 = ~23% faster than the AMD K8L!

Now lets be more generous to Intel and say that, clock for clock, the monolithic Yorkfield will be 10% faster than the existing core arch. (Remember AMD claims a 40% advantage for the move to a monolithic quad core). Therefore, the Yorkfield XE will rate at: 3.7/2.9 * 1.35 * 1.1 = 1.89

Therefore:
Intel Yorkfield XE will be 1.89/1.4 = ~35% faster than the AMD K8L!

This is just on performance alone. Given that the Yorkfield uses a 45 nm process it should use about half the power than the 65 nm K8L does too.

8O for AMD
:trophy: for Intel


History is repeating itself: Pentium 4 EE vs Athlon FX.
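
For anyone who wants to replay the quoted estimate, here is a small Python sketch of the same arithmetic. The function name is made up for illustration, and the inputs are the thread's own speculative figures (2.9 GHz K8 baseline, 40% claimed K8L gain, ~35% Core advantage per clock, 3.7 GHz as used in the quoted post), not measurements.

```python
def relative_rating(clock_ghz, baseline_clock_ghz, per_clock_factor):
    """Performance relative to a baseline chip rated 1.0 at baseline_clock_ghz."""
    return (clock_ghz / baseline_clock_ghz) * per_clock_factor


K8_CLOCK = 2.9  # baseline: a K8 at 2.9 GHz is rated 1.0

k8l = relative_rating(2.9, K8_CLOCK, 1.40)                    # claimed +40% over K8
yorkfield = relative_rating(3.7, K8_CLOCK, 1.35)              # Core ~35% faster per clock
yorkfield_plus = relative_rating(3.7, K8_CLOCK, 1.35 * 1.10)  # extra 10% per clock

lead = yorkfield / k8l - 1
lead_plus = yorkfield_plus / k8l - 1

print(f"K8L: {k8l:.2f}, Yorkfield XE: {yorkfield:.2f} ({lead:.0%} ahead)")
print(f"With +10% per clock: {yorkfield_plus:.2f} ({lead_plus:.0%} ahead)")
```

Every result is only as good as the three guessed inputs; changing the assumed Core advantage from 35% to 20%, as later posts argue, shrinks the lead considerably.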
October 9, 2006 3:30:09 AM

Quote:

Lets assume the Intel Yorkfield XE only performs as well as an existing Core arch chip, clock for clock. This will underestimate its performance, but lets assume this for now. Therefore, the Yorkfield XE will rate at: 3.7/2.9 * 1.35 = 1.72



Underestimate? Don't make a joke here. Yorkfield's architecture is the same as today's C2D, with the exception of the monolithic quad core and some enhancements: PCI-E 2.0, 45nm....
October 9, 2006 4:52:43 AM

thanx mr frog
October 9, 2006 5:27:16 AM

Quote:
That CAD plot or whatever is direct from AMD and clearly says (ambiguous though it maybe) "IPC-enhanced core." Ask them what it means. A logical person woud say it means MORE THAN THREE.

I don't think that's necessarily true. I don't think any 3 issue core whether K8 or Banias averages the full 3 IPC from issue to retire. By the same token Core 2 doesn't average the full 4 IPC from issue to retire either. I'm pretty sure a 3 issue core would be lucky to average between 2 to 2.5 IPC. Hannibal here said that average 2.5 IPC is already the most you can expect in most code.

http://arstechnica.com/articles/paedia/cpu/hyperthreadi...

Quote:
Note that this is an exceedingly common scenario, since research has shown the average ILP that can be extracted from most code to be about 2.5 instructions per cycle. (Incidentally, this is why the Pentium 4, like many other processors, is equipped to issue at most 3 instructions per cycle to the execution core.)

What Intel's 4+1 issue core and all its fusions and other features allow it to do is get the average higher, but even then I doubt they get the average higher than 3 IPC very often. So essentially, Intel put a lot of effort and resources into getting that additional 0.5 average increase in IPC. That's not to say 4+1 issue is completely useless of course, since there are peak conditions that can exploit this, just not on average.

Now there have been several sources that have said K8L will remain a 3 issue core, including Phil Hester, the Chief Technology Officer of AMD.

http://translate.google.com/translate?hl=en&sl=ja&u=htt...

Quote:
As for AMD, although it increases to 2 times, the x86 order decoder section under order fetch, it stops order fetch while they are 3 decoder constitution.

 “Several minor expansion were done even in the order decoder, but they are not big ones”, that Hester talks. Hester recognized the fact that either the maximum number of decoding orders does not change in 3 orders/the cycle. This is the point which differs from Core MA which increased the decoder to 4 largely.

X-Bit Labs' K8L summary also confirms K8L is a 3 issue core. In contrast, the only indication that K8L may be a 4 issue core is a several-month-old die plot that shows 4 micro-ROMs, which does not necessarily mean there are 4 decoders, and we don't know how the K8L architecture has changed since that time.

What AMD means by enhanced IPC core is K8L's ability to get closer to the full 3 IPC potential of the core. If AMD can do it without resorting to the more brute force approach of Core 2 then good for them. How will AMD be accomplishing this? Mainly through enhanced OoO execution with load before load ability and with the widened L1 cache buses.

The 40% figure for K8L improvement over K8 is actually AMD's own claim.

http://www.hkepc.com/bbs/itnews.php?tid=678736&starttim...

That claim is fairly reasonable based on the architectural improvements. We've already discussed before my stance on why 80-100% average improvements are not likely. I'd need to have confirmation that the entire cache system has been widened not just the L1 links, doubling and widening of the FMISC unit, some type of load before store ability, improved FP/SSE subsystem scheduling logic, etc. before I think that type of average improvement is possible.

One of the things people get enthusiastic about is the doubling of SSE resources. The thing is that it isn't as much a doubling as a splitting of the 2 FP/SSE units into 2 dedicated FP units and 2 dedicated SSE units. This means that to see the doubled performance you would need to have 2 FP macro-ops and 2 SSE macro-ops executing in parallel. Otherwise, if you have 4 SSE instructions or even 2 SSE instructions, you won't see much improvement at all over K8 because you still have 2 units even if they are now dedicated. I should also point out that the momentum is toward converting FP code into SSE code, which means the situations where FP and SSE code appear in parallel are becoming rarer, not more common.
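
A toy Python model of that point, for illustration only. It follows the simplified description in this post (2 shared FP/SSE units on K8 versus 2 dedicated FP plus 2 dedicated SSE units on K8L); the function names are hypothetical and this is not a description of the real execution pipelines.

```python
def k8_peak_ops(fp_ready, sse_ready):
    """Shared design (as described above): any 2 FP or SSE ops per cycle."""
    return min(fp_ready + sse_ready, 2)


def k8l_peak_ops(fp_ready, sse_ready):
    """Split design (as described above): up to 2 FP ops plus up to 2 SSE ops per cycle."""
    return min(fp_ready, 2) + min(sse_ready, 2)


# (FP macro-ops ready, SSE macro-ops ready) in a given cycle
mixes = [(2, 2), (0, 4), (0, 2), (4, 0)]

for fp, sse in mixes:
    print(f"FP={fp} SSE={sse}: K8 issues {k8_peak_ops(fp, sse)}, "
          f"K8L issues {k8l_peak_ops(fp, sse)}")
```

Under these assumptions only the mixed FP+SSE case doubles; pure SSE or pure FP streams issue the same 2 ops per cycle on both designs.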

Further, the FP/SSE subsystem in K8 is limited by poor scheduling, where they use a round robin system. For instance, macro-op 1 goes to unit 1, macro-op 2 to unit 2, op 3 to unit 3, op 4 to unit 1, op 5 to unit 2, etc. This doesn't take into account the actual usage patterns of the units. For example, if unit 2 has a long queue and unit 1 is completely empty, and the round robin system says that it's unit 2's turn, unit 2 will get the macro-op even if using unit 1 would be faster. There are also latency scheduling limitations that happen in a similar way. Without improved scheduling logic, which AMD hasn't mentioned, widening the FP/SSE subsystem isn't useful at all.
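
The round robin problem can be seen in a small Python sketch. This is a toy model under stated assumptions: three interchangeable units that each work through their queue serially, and made-up latencies. Real units are not interchangeable, and "least loaded" is just one possible alternative policy, not something AMD has announced.

```python
def round_robin(latencies, n_units=3):
    """Assign op i to unit i mod n_units, ignoring how busy each unit is."""
    load = [0] * n_units
    for i, latency in enumerate(latencies):
        load[i % n_units] += latency
    return load


def least_loaded(latencies, n_units=3):
    """Assign each op to the unit with the least queued work (ties: lowest index)."""
    load = [0] * n_units
    for latency in latencies:
        target = load.index(min(load))
        load[target] += latency
    return load


# Hypothetical macro-op latencies in cycles: every second op in each group is slow.
ops = [1, 8, 1, 1, 8, 1, 1, 8, 1]

rr = round_robin(ops)
ll = least_loaded(ops)

print("round robin  queues:", rr, "-> finishes after", max(rr), "cycles")
print("least loaded queues:", ll, "-> finishes after", max(ll), "cycles")
```

With this pattern round robin piles all the slow ops onto one unit while the other two sit idle, which is the behaviour described above.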

The issue with the FMISC unit is that it acts as the FSTORE unit. AMD's presentations have shown that it remains unwidened at 64-bits and there is only 1 of them. The FMISC would then be another major bottleneck of the widened FP/SSE subsystem unless it is improved too. There are a few other issues too, but I've said enough.

In AMD's defence, as an architecture K8L should on average be as fast if not faster than Core 2. I've mentioned it before that K8L's advantage over Core 2 will probably be around 10% on average, with FP code taking the lead for AMD followed by SSE and integer still being in Intel's favour. However, Intel can counter that lead with higher clock speeds which is what they are looking to do.


If K8 is as fast as it is with the averaging of IPC, how fast will it be with increasing the average? That was my point. Hannibal has said that it will catch up and maybe surpass Core 2 in integer perf and by doubling the SSE FP and regular FP, it should get MAJOR increases in encoding.

AMD is a major CPU company and I believe that they are trying to beat their own processors, not Intel's. By concentrating on that they can get 60-80% increases.
Phil Hester was also quoted as saying he doesn't know the specifics, just that the engineers are telling him it will be much faster because of technical things out of his scope.

Again, I don't know what enhanced IPC core means and neither do you. We can only base any increases on what K8 currently does, not on what Core 2 does.

If they put all of that extra real estate into the chip and can't get more instructions retired then they can't expect to call this a next gen chip.

Out of order stores are touted as some major enhancement, but it outdistanced Intel's own chips by much more than AMD's.

Because AMD relies on low latency operation and L3 will certainly reduce latency in combination with the reported enhanced prefetch, this is yet another area where AMD will get increases.


Theoretical increases based on specs shows at least 60% and perhaps even 80% on average (combining the increases between SSE, FP and Integer).

After all of the BS about it, I hope it flops so everyone can have a laugh. NetBust flopped and everyone stayed around. Any fairweather friends can get off the bus now.

AMD is working on improving K8. I don't care if it's faster than Core 2, only that it's sufficient for my workload.

Though AMD was accused of making excuses, FPS is no longer the factor, since GPU is more responsible with the increasing complexity of graphics.

GPUs would kill any CPU so AMD just has to improve by their promoted 40-60% increase - depending on whether you talk to an engineer or an executive. Dirk Meyer would be the authority for CPU manuf processes and theoretical increases. I believe he quoted 80% in his Analyst Day presentation.
October 9, 2006 5:30:23 AM

Ya know, I've pointed this out to you before. But I'll do it again. Please don't try to use this stat as proof of anything, as it is pure BS and proves nothing.

Quote:
DOE bought 500,000 cpu's between June 1 and Sept 30 this year. That is 1 fiscal year's cycle. None of which were Intel despite the presence of Core2


1-Core 2 Duo was not available to anyone in retail form prior to 23 July, including the US gov.
2-OEMs were not able to release Core 2 Duo systems until the same time
3-The contracts of which you speak were announced, bid, settled and finalized BEFORE 23 July, thereby effectively negating the possibility that any vendor could have supplied systems with those CPUs
4-Since you want to take stats out of context, and you obviously have no clue as to how a government contract works, let's talk about the DOD NMCI contract.

Navy Marine Corps Intranet (NMCI):
NMCI Timeline
Following are milestones defining the EDS and Department of the Navy (DoN) relationship since the signing of the Navy Marine Corps Intranet (NMCI) contract:

Oct 6, 2000
DoN awards NMCI contract to EDS. Contract is valued at $6.9 billion ($4.1 billion for the five-year base period plus one additional three-year option for a total value of $6.9 billion).
Oct 2000
Department of Defense (DoD) authorizes the DoN to order up to 60,000 seats.
Sep 11, 2001
EDS begins helping DoN reconstitute its IT infrastructure at the Pentagon following terrorist attack. Crisis and relocation efforts are completed. EDS helps the DoN recreate all Navy IT capabilities lost in the Pentagon and bring approximately 700 people back online within a week of terrorist attack.
Sep 28, 2001
EDS and DoN announce signing of a seven-month, $9 million task order under the NMCI contract to help implement the Navy’s Task Force Web initiative.
May 3, 2002
EDS successfully completes Contractor Testing and Evaluation. DoD authorizes Navy to order 100,000 additional seats. (The total number of seats authorized to date is 160,000.)
Oct 30, 2002
DoN announces that the base period of the NMCI contract has been extended from five to seven years, bringing the minimum guaranteed value of the contract to $6 billion.
Dec 11, 2002
Navy briefs DoD on successful completion of the NMCI Operational Assessment.
Feb 2003
Congress lifts 60,000 cutover seat cap on NMCI, allowing EDS to cut over all NMCI seat orders.
Feb 2003
DoD authorizes Navy to order an additional 150,000 seats as a result of EDS meeting service level agreements on 20,000 seats cut over. (The new total number of seats authorized to date becomes 310,000.)
Mar 24, 2003
EDS assumes responsibility for first U.S. Marine Corps seats at Marine Corps Base Quantico in Quantico, Va.
August 2003
EDS assumes responsibility for NMCI seats at Marine Corps bases in Japan.
Oct 4, 2004
EDS reaches agreement with DoN on modifications to the NMCI contract and establishes new service levels based on commercial IT best practices, a critical step to 100 percent billing.
Dec 31, 2005
The network stopped 20 million unauthorized access attempts in 2005, and it trapped, quarantined and disinfected 70,000 viruses. In 2005, 292,230 seats were cut over to NMCI.
Mar 24, 2006
DoN and EDS sign NMCI contract extension to 2010 http://www.eds.com/sites/nmci/timeline/


So please, spare me and everyone else here the BS that Core2Duo "lost" any DOE contracts in an attempt to prove your points


Oh by the way, not a single solitary NMCI seat, not a one, was an AMD

CLIN TITLE
0001AA Fixed Workstation - Red Seat - $2958.12 per year. Pentium 800MHz. Provides performance for use with 2-D and light 3-D graphics or engineering related applications, applications that require additional processing capability.
0001AB Fixed Workstation - White Seat - $2863.68 per year. Pentium III 733MHz. Ideal for the typical user of Microsoft Office Professional software.
0001AC Fixed Workstation - Blue Seat - $2788.08 per year. Celeron 566MHz. Provides adequate performance for daily office productivity applications. Ideal for administrative functions.
0001AD Fixed Workstation - Thin Client - $2335.92 per year.
0002 Portable Seat - $3699.00 per year. Dell Latitude C600. Provides excellent performance for office productivity software. Supports users needing remote access to NMCI. Enables high-quality presentations while on travel.
October 9, 2006 5:31:07 AM

Quote:
Yorkfield architecture is same as today's c2d, with the exception of monolithic quad core and some enhancement; pci-e 2.0, 45nm....


Umm...doesn't that pretty much make it a completely different architecture? I've never seen 45nm uarch before.

I think that alone will be a huge factor in how well these quads perform. If Intel can start really culling power consumption and boosting overclockability on 45nm then who knows what they'll be able to do?
October 9, 2006 5:52:31 AM

I wonder how an overclocked 65nm K8L would compare against an overclocked 45nm Yorkfield.
October 9, 2006 5:55:26 AM

Quote:
i wonder how an overclocked 65nm K8L compare agaisnt an overclocked 45nm Yorkfield.


Interesting!! And can AMD's 65nm wafers compete with Intel's 45nm wafers?
October 9, 2006 8:12:42 AM

Quote:
i wonder how an overclocked 65nm K8L compare agaisnt an overclocked 45nm Yorkfield.


interesting !! , and can AMD's 65nm wafers compete with Intel's 45nm wafers?

i would love to compare it with Cadbury chocolate wafer.
October 9, 2006 7:44:54 PM

Quote:

The 40% figure for K8L improvement over K8 is actually AMD's own claim.


As usual, a pleasure to go through another of your typical posts.

What I might add is a conjecture on the underlying reason why AMD claims such a suspicion-inducing improvement on an incremental upgrade over an existing microarchitecture: why 40%, indeed?

Up until now (and as far as I know), transistor gate straining techniques have been focused in one single direction of the gate's oxide lattice (in either pMOS or nMOS); hence, these techniques have been termed uniaxial, for obvious reasons, as far as electron mobility within the lattice is concerned.
At last year's IEDM, AMD presented its forecast for third-generation strained-SOI, where an interesting technique - in conjunction with the usual IBM/AMD transistor typicals, tensile-stress liner, compressive-stress liner, stress memorization & embedded SiGe - was also proposed, a 'mixed' straining approach ("Hybrid Orientation Technology" - HOT), which allows higher drive currents by using a single, bivalent stressor for both pMOS & nMOS. This new technique has already been tested (AMD/IBM/Toshiba/Sony) using partially-depleted SOI, in a 90nm process scaled down to 65nm.
The reported results (IBM/Sematech), from tests made on a generalization of the HOT technique (biaxial strain fully developed in the entire wafer), show an average increase in drive current of approx. 40% (pMOS=53%; nMOS=32%; average=42.5%), as quoted:

Quote:
PMOS and NMOS saturation drive current increased by 53% and 32%, respectively, leading to 40% higher product speed.


Although HOT & FD-SOI do appear on the "Mobility-Enhancement Roadmap" (IBM/Sematech, Fig. 2) by around the end of 2006, comments on such improvements (end of 2005) are not very favourable on what concerns technical hurdles, costs & implementation at the 65nm node.

This was more than a year ago; I'm led to believe that, given the time frame, AMD/IBM/Toshiba could already have achieved a mature enough process, in order to implement it (*) in AMD's K8L (contributing for the launch delay?).
Knowing, beforehand, that such an improvement does not translate into a direct gain in the final product, the claimed & reported 40% improvements do coincide; and, in accordance with your quotation of AMD's Phil Hester, this "improvement" could have more to do with process than with architectural tweaks (both adding up, of course). It could also add up to AMD's option to fab lower-end parts at its most sophisticated facilities (300mm wafer/90nm node), due to mature-process delay.

My conjecture, anyway. :wink:

http://www.reed-electronics.com/semiconductor/index.asp?layout=articlePrint&articleID=CA6294195

Edit: Context issue: it (*), being Hybrid Orientation Technology, since, according to IBM, HOT «has resulted in 20% reduction in gate delays on bulk silicon.»


Cheers!
October 9, 2006 10:36:15 PM

Quote:
http://www.amd.com/us-en/assets/content_type/Downloadab...
There, they claim 60% increase in 'performance per watt'.


AMD claims.

Quote:
http://www.hkepc.com/bbs/news.php?tid=678736&starttime=...
There, 40%, not specifying the increase per clock, per price, per watt.


AMD claims.

Quote:
http://badhardware.blogspot.com/2006/10/amds-k8l-reveal...
AMD fan blog claims 15% improvement per clock, and 25% increase in clock, for a total of 40%.


Nice... blog claims.

How does it feel to know that, «Clock rise of 25% + some 15% boost due to architectural enhancements, that is what gives supposed K8L's 40% performance rise!»?

What I'm after (hopelessly, so far), is how K8L (or whatever) is going to achieve that "40% increase" over K8, knowing that the [pretense] 25% clock rise is, naturally, due to process.
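
As a quick check of the blog's arithmetic, here is a short Python sketch showing how a clock gain and a per-clock gain compound. The 25% and 15% figures are the blog's claims as quoted above, not confirmed numbers.

```python
clock_gain = 0.25      # claimed clock speed increase (blog figure)
per_clock_gain = 0.15  # claimed per-clock improvement (blog figure)

# Gains multiply rather than add: (1 + a) * (1 + b) - 1
combined_gain = (1 + clock_gain) * (1 + per_clock_gain) - 1

print(f"Combined gain: {combined_gain:.1%}")
```

The product comes out a little above 40%, so the blog's two numbers are at least consistent with the headline figure. It says nothing about whether either input is realistic.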


Cheers!
October 9, 2006 11:04:22 PM

I'd be surprised with a 2.9 Ghz cap since K8 is at 3 Ghz on 90nm.... I'd expect that it would scale higher since it will be based on 65nm since it is a revision and not a major architecture change.
October 9, 2006 11:10:28 PM

Quote:
Lets assume the Intel Yorkfield XE only performs as well as an existing Core arch chip, clock for clock. This will underestimate its performance, but lets assume this for now.

Remember 8088 vs 8086, 80386DX vs 80386SX (I think), Willamette vs Tualatin , Prescott vs Northwood or even Banias vs P4M?

Not that it changes your conclusion but an equivalent performance might be a rather optimistic expectation. Intel's history is full of new core releases or upgrades which performed lower than their predecessors
October 10, 2006 1:37:39 AM

Quote:
I wouldn't say dead in the water until it comes out and we can see actual data stating a for or against case.
That being said, the Horde will be arriving anytime soon.


Hello. I am here to represent the horde. 8)

My thoughts on this: that is only speculation based off current designs.

Core 2 doesn't have a 40% advantage over AMD right now. In certain applications, yes, it got as high as 40%, but as an overall average it is more like 20%. Anywhere from 0% to 40%.

The horde has spoken. :p 

Get off it, you are not a HORDE member. You are too rational and accept logical argument and make good arguments in return. Don't do this again. :) 

Jack

Way to go JACK. How am I supposed to get into the club now with you throwing my quality traits around? Thanks, all I wanted to do was belong.

:lol: 
October 10, 2006 2:46:54 AM

I think Intel has a better chance increasing Quad clock from 2.66 to 3.46/3.73[~30/40%] rather than AMD getting a 40% per clock improvement.

The Yorkfield info comes from an admin of the site, which is partly sponsored by Intel now.
http://forums.vr-zone.com/showthread.php?t=97656
Without factoring in any core improvements for Yorkfield, since all we know is that it's a shrink with SSE4.
http://www.hkepc.com/bbs/itnews.php?tid=678736&starttim...
At 2.9 max and a '40% improvement'. Pretty huge, I doubt we'll see that.
So 2.9*1.4 < 3.46*1.2[advantage of Conroe over K8]
3.73 Yorkfield should be ~10% better than the 2.9, while 3.46 should be ~2.3%.
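
The comparison above can be reproduced with a few lines of Python. The inputs are this post's own assumptions (2.9 GHz K8L with a 40% gain over K8, and a 20% Conroe advantage over K8 per clock); the variable and function names are just for illustration.

```python
K8L_GAIN = 1.40     # AMD's claimed improvement over K8
CONROE_GAIN = 1.20  # assumed Conroe advantage over K8 per clock


def score(clock_ghz, per_clock_factor):
    """Clock speed times per-clock factor, in 'K8 GHz equivalents'."""
    return clock_ghz * per_clock_factor


k8l_29 = score(2.9, K8L_GAIN)
yorkfield_346 = score(3.46, CONROE_GAIN)
yorkfield_373 = score(3.73, CONROE_GAIN)

print(f"K8L 2.9:        {k8l_29:.3f}")
print(f"Yorkfield 3.46: {yorkfield_346:.3f} ({yorkfield_346 / k8l_29 - 1:+.1%})")
print(f"Yorkfield 3.73: {yorkfield_373:.3f} ({yorkfield_373 / k8l_29 - 1:+.1%})")
```

Note how much smaller the gap is here than in the original post: the only change is assuming a 20% rather than 35% per-clock advantage for Core.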
October 10, 2006 3:16:24 AM

So Intel's pipe dream is that a shrink to 45 nanos will allow
1 a 37% speed bump
2 a 50% increase in logic transistors
3 a 200% increase in cache
4 while maintaining power usage.
Meanwhile, AMD will go nowhere on a by then mature process.
Why do I doubt?
It could happen; after all, you just have to look at the magic of Conroe.
Then again, Core was 4 years in the making.
BTW, as has been said before, A64 has a 20% gap to make up, not 35 or 40%.
To those who would have us believe that AMD is having trouble with the 65 nano node, how does that fit with a one month earlier release?
Think 2.6 is too slow for a new node? Think again. It's about the same speed as core EE. AMD is a conservative outfit, they make the node work, then work on ramping.
October 10, 2006 3:26:22 AM

Quote:
So Intel's pipe dream is that a shrink to 45 nanos will allow
1 a 37% speed bump
2 a 50% increase in logic transistors
3 a 200% increase in cache
4 while maintaining power usage.

Not too sure if you can call it a pipe dream seeing what Rahul Sood said about the Kentsfield. Intel will most likely be able to do such again, if the Kentsfield is any indication of performance and innovation.
Quote:
To those who would have us believe that AMD is having trouble with the 65 nano node, how does that fit with a one month earlier release?

Simple, you state an overprojection to the media so when you actually get to production you are technically ahead of the schedule. Companies do it all the time.
October 10, 2006 3:39:25 AM

You have come a long way since our first encounter.
You have earned my respect.
When Intel first came out with the added issue core, the numbers they were tossing around were up to 10% higher IPC. That is on top of the other enhancements, of course. They will only get close to that when the SSE dog and pony show has all its puppies lined up in a row.
Once again, Intel has used those extensions to their advantage.
Will AMD be able to catch up by using the same tool?
In most of the benches, the difference between the A64s and the Conroes has been the use of streaming. If AMD can learn to do SSE right, we will have a horse race. If AMD has other working tricks up their sleeve, they just might take the crown back.
!