AMD's K8L dead in the water?

Wombat2

Distinguished
Jul 17, 2006
518
0
18,980
From http://www.vr-zone.com/?i=4109

The Core arch is ~35% faster than the K8 arch at equivalent clock rates.

Altair / K8L will clock at 2.9 GHz max.
Yorkfield XE will clock at 3.73 GHz max.

Let's give a K8 at 2.9 GHz a baseline performance rating of 1.0. The claimed performance increase of K8L over K8 is about 40%, so the Altair / K8L will rate at 1.4.

Let's assume the Intel Yorkfield XE only performs as well as an existing Core arch chip, clock for clock. This will underestimate its performance, but let's assume it for now. Therefore, the Yorkfield XE (rounding its clock down to 3.7 GHz) will rate at: 3.7/2.9 * 1.35 = 1.72

Therefore:
Intel Yorkfield XE will be 1.72/1.4 = ~23% faster than the AMD K8L!

Now let's be more generous to Intel and say that, clock for clock, the monolithic Yorkfield will be 10% faster than the existing Core arch. (Remember, AMD claims a 40% advantage for the move to a monolithic quad core.) Therefore, the Yorkfield XE will rate at: 3.7/2.9 * 1.35 * 1.1 = 1.89

Therefore:
Intel Yorkfield XE will be 1.89/1.4 = ~35% faster than the AMD K8L!
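
For anyone who wants to replay these numbers, here is a rough Python sketch of the same back-of-the-envelope arithmetic. The function and variable names are only illustrative, and it assumes (as the estimate above does) that performance scales linearly with clock speed and that the rumored 35% and 40% figures hold.

```python
def relative_rating(clock_ghz, per_clock_factor, base_clock_ghz=2.9):
    """Rating relative to a K8 at base_clock_ghz (= 1.0).

    Assumes performance scales linearly with clock speed, which is a
    simplification, not a measured result.
    """
    return clock_ghz / base_clock_ghz * per_clock_factor


# Inputs taken from the post: all of them are rumors or claims.
k8l = relative_rating(2.9, 1.40)                    # K8L: +40% over K8
yorkfield = relative_rating(3.7, 1.35)              # Core arch: +35% per clock
yorkfield_plus10 = relative_rating(3.7, 1.35 * 1.10)  # extra 10% per clock

print(f"K8L rating:              {k8l:.2f}")
print(f"Yorkfield XE rating:     {yorkfield:.2f}")         # about 1.72
print(f"Yorkfield XE +10% IPC:   {yorkfield_plus10:.2f}")  # about 1.89
print(f"Lead over K8L:           {(yorkfield / k8l - 1) * 100:.0f}%")
print(f"Lead over K8L (+10%):    {(yorkfield_plus10 / k8l - 1) * 100:.0f}%")
```

Plugging in other guesses for per_clock_factor shows how sensitive the conclusion is to those inputs.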

This is on performance alone. Given that the Yorkfield uses a 45 nm process, it should also use about half the power that the 65 nm K8L does.

8O for AMD
:trophy: for Intel
 

lcandy

Distinguished
Jul 14, 2006
260
0
18,780
It's useless to speculate, but that's an interesting way of doing it. I can't imagine AMD fanboys are going to let this one go. I have to say, I really hope the difference isn't that much, or there won't be any competition to speak of, which is bad for us in the long run.
 

ricardo

Distinguished
Apr 11, 2004
130
0
18,680
That is an interesting way to look at it all right, but it doesn't really make sense. Using the same logic you would have said that the P4 would kill the K8 because P4 was faster and clocked higher than K7.

Unfortunately there are just so many unknowns right now about both Altair (K8L) and Yorkfield that we just don't know which one is going to be better.

I hope AMD can come up with a strong response to Core2. It would be a pity to go back to the days when your only choice is Intel, or Intel. Prices will be higher, and progress slower if AMD can't keep Intel on their toes.

Can't wait to see the 2 chips battle it out though! Should be very interesting!
 

Bluefinger

Distinguished
Mar 10, 2006
531
0
18,980
Interesting... but it's still speculation. At the end of the day, Yorkfield might not bring a huge performance boost over Kentsfield and might instead just cut power consumption, but then again it could be even faster. We just do not know at the moment. Same goes for Altair. To be honest, I really want Altair to perform at the same level as Kentsfield and Yorkfield. Why? Because competition is healthy, and that means prices for EVERYTHING will be forced down on both sides. Cheaper processors make consumers happy, and cheaper top-of-the-range processors keep enthusiasts happy as well. AMD will come up with a strong contender; they have to in order to hold on to the market share they gained with K7/K8 in the P4 days.
 

corvetteguy

Distinguished
Jan 15, 2006
1,545
0
19,780
I wouldn't say dead in the water until it comes out and we can see actual data making the case for or against.
That being said, the Horde will be arriving any time now.

Hello. I am here to represent the horde. 8)

My thoughts on this: that is only speculation based on current designs.

Core 2 doesn't have a 40% advantage over AMD right now. In certain applications, yes, it got as high as 40%, but as an overall average it is more like 20%, ranging anywhere from 0% to 40%.

The horde has spoken. :p
 
I think it needs to be specified: is it a 20% or so overall performance increase over a certain AMD chip or over the whole field of AMD chips? And is it comparing the whole family or just one Core 2? As Wusy would say, you have to look at more than just benchmarks in order to determine these things.
 

BaronMatrix

Splendid
Dec 14, 2005
6,655
0
25,790
Where have you seen AMD claim a 40% increase over K8? The theoretical gains will be closer to 80% in integer and close to 150% in FP. They are doubling loads and (realistically) retires, widening the L1 to 256-bit, adding an extra FP unit, and doubling the SSE FP to execute two instructions per cycle.


That is much larger than 40%. But we'll see.
 

r0ck

Distinguished
Oct 8, 2006
469
0
18,780
I generally figure 20% for Core 2 vs K8..

http://www.xbitlabs.com/news/cpu/display/20060731233200.html
First they claim 4x4 will be 80% better at Cinebench.
http://www.amd.com/us-en/assets/content_type/DownloadableAssets/PhilHesterAMDAnalystDayV2.pdf
Here claims of 60% better "performance per watt". Geez, isn't comparing by clocks confusing enough?
http://www.hkepc.com/bbs/itnews.php?tid=678736&starttime=0&endtime=0
Then here claiming 40% improvement.
http://www.vr-zone.com/?i=4109
If AMD's claim is true and it's 40% better than K8, that should make it about 17% better than Conroe per clock (taking Conroe as 20% better than K8). A 2.9 GHz Altair should then theoretically, in the best case, perform about like a 3.4 GHz Conroe. Yorkfield should narrow that gap (possibly even pulling ahead with all the new features), and VR-Zone reports that these parts will come in at 3.46-3.73 GHz, nullifying the per-clock gains of Altair.

http://badhardware.blogspot.com/2006/10/amds-k8l-revealed-in-cray-rainier.html
AMD fan blog claims 15% improvement per clock, and 25% clock gain. Even more confusing. We should know more when AMD demonstrates their quads late this year :wink:
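
A quick Python sketch of that per-clock math, for what it's worth. The constants are just the rumored figures quoted above (40% for K8L over K8, the 20% guess for Core 2 over K8, and the blog's 15% and 25%), not measurements, and the variable names are made up for illustration.

```python
# Rumored / assumed per-clock advantages over K8 (not measured data).
K8L_OVER_K8 = 1.40
CORE2_OVER_K8 = 1.20

# Per-clock edge of K8L over Conroe if both figures hold.
k8l_over_conroe = K8L_OVER_K8 / CORE2_OVER_K8

# Conroe clock that a 2.9 GHz Altair would be equivalent to.
altair_clock_ghz = 2.9
conroe_equivalent_ghz = altair_clock_ghz * k8l_over_conroe

# The fan blog's version: 15% per clock times a 25% clock gain.
blog_total_gain = 1.15 * 1.25

print(f"K8L per-clock edge over Conroe: {(k8l_over_conroe - 1) * 100:.0f}%")
print(f"2.9 GHz Altair ~ {conroe_equivalent_ghz:.2f} GHz Conroe")
print(f"Blog figures compound to: {(blog_total_gain - 1) * 100:.0f}% total")
```

If these inputs are right, anything clocked above roughly 3.4 GHz on the Intel side cancels Altair's per-clock edge, which is why the 3.46-3.73 GHz Yorkfield rumor matters. The blog's two numbers multiply out to roughly the same 40% overall, so the claims may be less contradictory than they look.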
 

slim142

Distinguished
Jan 29, 2006
2,704
0
20,780
I would like to say that's 100% true, but unfortunately I can't.

I have always been an Intel fanboy, even when I knew the truth (those 4 ugly years when AMD was the best bet and Intel was dead with NetBurst).

But now with the Core architecture and the upcoming Kentsfield and Yorkfield, plus the 45 nm tech, omg, I just can't wait for an engineering sample review of Yorkfield.
 

BaronMatrix

Splendid
Dec 14, 2005
6,655
0
25,790
Baron, you can add 200% more SIMDs and 300% more FPUs and make them wider, and this will not translate into gains. IPC depends on the ability of the decoders and reorder buffer to maximize the use of the execution units.

If AMD stays 3-issue, then it will not show nearly as big a 'theoretical' performance boost as you are claiming.


The pictures of the die show an "enhanced IPC core." I assume that means it will peak above 3 IPC. If they don't know more than you do about how to increase IPC, I guess they are doomed.
 

Cabletwitch

Distinguished
Feb 3, 2006
103
0
18,680
How about you just wait for it to be released and tested by people who know what they're doing? Speculation is one thing, but damning a product that hasn't even seen the light of day is just a bit fanatical.

Besides, let's not forget that the K8L is only a revamp of an existing architecture, and not a complete rewrite as the C2D was. Ideally, it would only be fair to compare new architectures when both are available for hard testing.

Or are you just another Intel fanboy who wants to gloat, whilst secretly fearing that the C2D era could be short-lived? Be patient, and all will be revealed.
 

spud

Distinguished
Feb 17, 2001
3,406
0
20,780
Those are CAD pictures of what they hope the die will look like, Baron. Those are the slides the shareholders get to see because they aren't tech savvy. Take them with a grain of salt.
 

BaronMatrix

Splendid
Dec 14, 2005
6,655
0
25,790
IPC can be enhanced in all sorts of ways. Without any fusion tricks, the theoretical max is simply the width of the core: a 3-issue core can do at most 3 instructions per clock. However, since code is interdependent, average IPC is always lower than what the core is capable of.

Intel, in my opinion, made a nice leap forward in IPC by rectifying one major dependency that would cause all other dispatches to wait: load before store (memory disambiguation). AMD is working this into K8L, so yeah, I expect a good boost for AMD, but 40% sounds about right, not 80% or 150%.

Having said that, and on the subject of IPC, I have posted this article before:
http://www.ece.utexas.edu/projects/ece/lca/ps/deepu-iccd-2000.pdf

Now, it is more detailed than this argument, but the data in Table 5 is what I draw your attention to. There you can see that the IPC is always less than the issue width, and at some point the IPC saturates even as the width goes up. This saturation point is about 4. This makes sense in this architecture, since on average stores are about 1/4 of the code mix, loads are about 1/3, and math and logic functions make up the rest.

In other words, what if Intel zoomed up and hit the IPC max for the x86 code base? If that is the case, then the best AMD can do is meet it, not exceed it. It will then fall back to who can get clock speed up within the power envelope, and Intel has that one by a long shot.

Jack
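
To illustrate the saturation idea, here is a toy Python model (not the paper's method or its data). It assumes unit latency, an unlimited scheduling window, and a made-up instruction stream with four independent dependency chains, so the available parallelism is 4 by construction. The function name simulate_ipc is hypothetical.

```python
def simulate_ipc(deps, width):
    """Idealized scheduler: each cycle, issue up to `width` instructions
    whose producers all issued in an earlier cycle (unit latency,
    unlimited window, oldest first). `deps[i]` lists the earlier
    instructions that instruction i depends on.
    """
    n = len(deps)
    issue_cycle = [None] * n
    remaining = list(range(n))
    cycle = 0
    while remaining:
        issued = []
        for i in remaining:
            if len(issued) == width:
                break
            ready = all(
                issue_cycle[d] is not None and issue_cycle[d] < cycle
                for d in deps[i]
            )
            if ready:
                issued.append(i)
        for i in issued:
            issue_cycle[i] = cycle
        issued_set = set(issued)
        remaining = [i for i in remaining if i not in issued_set]
        cycle += 1
    return n / cycle


# Made-up stream: instruction i depends on instruction i - 4,
# which gives four independent chains (assumed, for illustration only).
deps = [[i - 4] if i >= 4 else [] for i in range(240)]

for width in (1, 2, 3, 4, 6, 8):
    print(f"issue width {width}: IPC = {simulate_ipc(deps, width):.2f}")
```

In this toy stream IPC tracks the issue width up to 4 and then stays flat, because the dependencies, not the width, become the limit. Real code has a messier dependency mix, so where the curve flattens for actual x86 workloads is an empirical question.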


Again, if they don't know as much as you they're DOOMED. I don't care how many links you post.
 

gOJDO

Distinguished
Mar 16, 2006
2,309
1
19,780
The key to your success is to study this book:
CPSlogo.gif

and ignore everything else.
 

BaronMatrix

Splendid
Dec 14, 2005
6,655
0
25,790
Is he going on about that CAD plot, or is he referring to the picture of the wafer in Charlie D.'s lap on the Inquirer?


That CAD plot or whatever is direct from AMD and clearly says (ambiguous though it may be) "IPC-enhanced core." Ask them what it means. A logical person would say it means MORE THAN THREE.

Isn't it time for you and goJdo's date?
 

casewhite

Distinguished
Apr 11, 2006
106
0
18,680
The basic answer is that both quad core and Quadfather are a generation out of date now that IBM is making the 1U and larger versions of Roadrunner (1.6 petaflops, with 16,000 single-core Opterons and 16,000 Cell chips) available for the desktop. http://www-03.ibm.com/technology/splash/qs20/ Roadrunner will be 100 gigaflops per Opteron/Cell pair, about 125% of the very best quad cores, and use 33% of the power. The QS20 series will not require multiple licenses from Microsoft to operate, since it will use either a single-core or dual-core Opteron. Do note that IBM says that the Cell is an accelerator chip. You can buy this today, and you can't buy either Kentsfield or Quadfather. The QS20 is already installed at the University of Manchester, among others.
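
For reference, the per-chip figure is just the headline numbers divided out. A small Python sketch, taking the 1.6 petaflops, 16,000 pairs, 125% and 33% figures above at face value (they are claims, not benchmarks; the variable names are illustrative):

```python
# Headline figures as quoted in this post (claims, not measurements).
total_gflops = 1_600_000      # 1.6 petaflops expressed in gigaflops
opteron_cell_pairs = 16_000   # one Opteron plus one Cell per pair

gflops_per_pair = total_gflops / opteron_cell_pairs

# Claimed position relative to the best quad core.
relative_performance = 1.25   # "about 125%"
relative_power = 0.33         # "33% of the power"
perf_per_watt_ratio = relative_performance / relative_power

print(f"Per Opteron/Cell pair: {gflops_per_pair:.0f} GFLOPS")
print(f"Implied performance per watt vs. quad core: {perf_per_watt_ratio:.1f}x")
```

Note that this is peak throughput on code written for the accelerator, so it says nothing about how ordinary desktop software would run.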
For those unfamiliar with roadrunner:

"The Roadrunner system, along with the Protein Explorer and the seventh-fastest supercomputer, Tokyo Institute of Technology's Tsubame system built by Sun Microsystems (SC Online READERS' CHOICE PRODUCT OF 2005: Sun Microsystems Sun Fire servers), illustrate a new trend in supercomputing: combining general-purpose processors with special-purpose accelerator chips.

"IBM's BladeCenter systems are amenable to the hybrid approach. A single chassis can accommodate both general-purpose Opteron blade servers and Cell-based accelerator systems. The BladeCenter chassis includes a high-speed communications links among the servers, and one source said the blades will be used in Roadrunner.

"Advanced Micro Devices' Opteron processor is used in supercomputing "cluster" systems that spread computing work across numerous small machines joined with a high-speed network. In the case of Roadrunner, the Cell processor, designed jointly by IBM, Sony and Toshiba, provides the special-purpose accelerator.

"Cell originally was designed to improve video game performance in the PS3 console. The single chip's main processor core is augmented by eight special-purpose processing cores that can help with calculations such as simulating the physics of virtual worlds. Those engines also are amenable to scientific computing tasks, IBM has said.


"On average, Cell is eight times faster and at least eight times more power-efficient than current Opteron and Itanium processors, despite the fact that Cell's peak double-precision performance is fourteen times slower than its peak single-precision performance. If Cell were to include at least one fully usable pipelined double-precision floating-point unit, as proposed in the Cell+ implementation, these performance advantages would easily double." http://www.supercomputingonline.com/article.php?sid=11894

Here is the IEEE paper that explains why quad cores are obsolete and accelerators are the future of computing:
"However, with the addition of the customized Reconfigurable Data Cache, the resulting system runs 5× faster and outperforms the reference microprocessor." A 2005 paper by Prof. Ron Sass, Univ. of Kansas, and one of his grad students, Pradeep Nalabalapu.

http://csdl2.computer.org/persagen/DLAbsToc.jsp?resourcePath=/dl/proceedings/&toc=comp/proceedings/ipdps/2005/2312/04/2312toc.xml&DOI=10.1109/IPDPS.2005.121#search=%22ron%20sass%20reconfigured%22

Anybody who wants to read the complete paper and is not a dues-paying member of IEEE, send me your email and I will send you the PDF.
 

WR

Distinguished
Jul 18, 2006
603
0
18,980
All this is speculation, but I have the following points to note:

1) It seems odd that in a year's time and with a new process, AMD would not be able to increase clock speed at all when going from native dual-core to native quad-core. That would suggest some premature design limitation in the K8L, or an underdeveloped manufacturing process. In the same time, Intel supposedly would be going through a similar process shrink, doubling of core count, and cache increase, but on top of this they're going to increase the clock speed by 25%.

Perhaps Intel is using some of the headroom found in today's C2D's, as we've all noticed their unusual overclock potential. When was the last time people overclocked a top-of-the-line chip by 30% on air?

2) Historically, the only time AMD gained the upper hand with an equivalent-generation architecture was when Intel released the poorly performing Pentium 4 about 6 years ago. I was here reading on THG, and the reviewer was quite displeased with the initial performance, causing me to purchase my first-ever AMD-based system. All other times, Intel's process edge gave them the performance and cost advantage over other desktop CPU designs, regardless of minor weaknesses in their own architecture.

3) Seeing that AMD probably won't be keeping up with Intel on the manufacturing front, unless Intel releases another grossly underperforming architecture, I don't see how AMD would grab back the performance crown except with outrageously elaborate setups - e.g., a four-socket K6-III setup sure may overwhelm the 1P Pentium Pros, but at what cost?

4) Performance isn't everything. As we see on the forums, most people don't go out and buy an FX-62 or an X6800, even though review sites so many times refer to both. I think AMD should go after niche markets, and Torrenza is an excellent start.
 

WR

Distinguished
Jul 18, 2006
603
0
18,980
The basic answer is both quad core and Quadfather are a generation out of date now that IBM is making the 1U and larger versions of Roadrunner

While the numbers would appear to make next year's processors already out-of-date, you're comparing them with something built from the ground up not to run x86 or general apps but a different specialized language altogether. Cell is very powerful, but so are Altivec, UltraSparc T1, Alpha 21164, and so forth.

If you try video encoding on IBM's system, you'll probably get the now-measly performance of an Athlon, because the Cell processor isn't made to encode video, nor, to my knowledge, is there a video encoder programmed in Cell language.

But if you try Folding@Home, then with a patched version incorporating a few Cell math libraries you probably could gain that order of magnitude of performance.

In the future, I don't see why accelerator boards and Torrenza plug-in chips can't coexist with multicore CPUs - it could be an enthusiast's dream. The CPUs are all-purpose and will multitask anything you throw at them, while the accelerators are specialized and fast, but require individually patched program code.
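
One way to put rough numbers on that trade-off is Amdahl's law, which is my own framing here rather than anything from the posts above: if only a fraction of a program's run time can be handed to an accelerator, the overall speedup is capped by the part that stays on the CPU. A small Python sketch, using the 8x Cell figure quoted earlier purely as an example input (the function name and the fractions are made up):

```python
def overall_speedup(accelerated_fraction, accelerator_speedup):
    """Amdahl's law: speedup of the whole program when only
    `accelerated_fraction` of its run time benefits from a unit that is
    `accelerator_speedup` times faster.
    """
    serial_part = 1.0 - accelerated_fraction
    return 1.0 / (serial_part + accelerated_fraction / accelerator_speedup)


# Illustrative fractions of run time that have been patched to use
# the accelerator; 8x is the "eight times faster" claim quoted above.
for fraction in (0.0, 0.5, 0.9, 1.0):
    print(f"{fraction:.0%} accelerated: {overall_speedup(fraction, 8):.2f}x overall")
```

An unpatched program (0%) gets nothing from the accelerator, and even a half-patched one stays under 2x, which is why the general-purpose cores still matter.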