Barcelona to have double digit lead in integer

I agree talk is cheap, but if these numbers are true, to what extent will Barcelona have to excel in integer, knowing that SOI won't OC much better even at 65nm? Or will the posts of the future be: who needs to OC when my Barcelona is faster already?
 

Dade_0182

I agree talk is cheap, but if these numbers are true, to what extent will Barcelona have to excel in integer, knowing that SOI won't OC much better even at 65nm? Or will the posts of the future be: who needs to OC when my Barcelona is faster already?
But we enthusiasts love to OC...
 

accord99

Too true LOL, but I am asking some heady questions, and to add to that, what effect or advantage, if any, does FP have on the desktop?
Well, based on what little information AMD has released, the 42% advantage will be completely irrelevant on the desktop. The 42% advantage isn't in FP in general; rather, it's in SPECfp_rate2000, which is heavily dependent on memory bandwidth. AMD's platform is very well suited to it, but it doesn't correlate with desktop applications. A dual-socket Opteron 2220SE, roughly analogous to a QuadFX FX-72 system, scores 90.8, while a Xeon 3220 system, basically a Q6600, scores 64.2. And yet, as seen in numerous reviews, the Q6600 is a match for the FX-74 on desktop applications.
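
For anyone who wants to check the arithmetic, here is a quick sketch using only the two SPECfp_rate2000 scores quoted above. The helper name `percent_lead` is made up for illustration; nothing here is an AMD or SPEC tool.

```python
# SPECfp_rate2000 scores quoted in the post above.
opteron_2220se_score = 90.8  # dual-socket Opteron 2220SE
xeon_3220_score = 64.2       # Xeon 3220 system, basically a Q6600


def percent_lead(a: float, b: float) -> float:
    """Return how much larger a is than b, as a percentage of b."""
    return (a / b - 1.0) * 100.0


lead = percent_lead(opteron_2220se_score, xeon_3220_score)
print(f"Opteron lead in SPECfp_rate2000: {lead:.1f}%")
```

So the current Opteron platform is already around 41% ahead on this particular benchmark, while the desktop reviews show nothing like that gap.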
 
It is a marketing piece. A double digit lead could be as little as 10%. There are no independent benchmarks of how well it runs applications.

And apparently, R600 is going to be even later.

john
 
So they're touting their IMC? Will this advantage have any effect on multimedia apps? Decoding graphics/movies? Or, since it's all 2-dimensional, will it have no effect? I'm just trying to understand, as I'm new to this.
 

accord99

So they're touting their IMC? Will this advantage have any effect on multimedia apps? Decoding graphics/movies? Or, since it's all 2-dimensional, will it have no effect? I'm just trying to understand, as I'm new to this.
Yeah, AMD's NUMA architecture works very well with SPECfp_rate. But as AMD has only given two rough numbers (this one plus a database-related benchmark), there's not enough information to say anything other than that the two architectures will be quite close to each other on the desktop. It'll probably end up being decided like it used to be, by CPU clock.
 

LordPope

http://www.eetimes.com/news/latest/showArticle.jhtml?articleID=197700269 Is this enough? Is this enough for the enthusiast class?


File this under "More Beast Sightings."

You were laughed at for calling the K10 a beast.

Looking more and more like you will laugh last.

Although the rabid INTELIOTS are eerily quiet...
 

BaronMatrix

http://www.eetimes.com/news/latest/showArticle.jhtml?articleID=197700269 Is this enough? Is this enough for the enthusiast class?


File this under "More Beast Sightings."

You were laughed at for calling the K10 a beast.

Looking more and more like you will laugh last.

I was basing it on real specs, not assumptions or hopes or fanboyism. I said I expect to see SPEC benches soon. They are getting closer with the R600 Demo having happened.

1 TFLOP with two Streams and probably two Barcelonas is like WOW!! Hell even 4 CPUs and two CPUs is like Wow. AMD has now won another race: TFLOPS IN A BOX.

The little engine that could is pulling a Beast. I am almost willing to bet that AMD will drop a box this year that will crack the top of the TPC-C. It's currently owned by 64P Itanium and Power.

Since we can see that 8xxx does have 4 HT links, they will have a cache-coherent 1-hop 32P box by Sept/Oct.
 

Periander

When they claim a "double digit" lead in integer performance and a 40% advantage on fp performance, are they claiming it on a clock for clock basis? If not, what clock speed K10 are they comparing to what clock speed Kentsfield for this benefit?

C2Ds have a lot of headroom on their current process; it would not be difficult to put out a bump or two in clock speed as a response.
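
To make the clock-for-clock question concrete, here is a rough sketch under the usual simplification that performance scales as IPC times clock speed. Every number in it is hypothetical (a 10% per-clock lead, 2.4 GHz parts, a bump to 2.66 GHz); none of them are published K10 or Kentsfield figures, and `relative_performance` is just an illustrative helper.

```python
def relative_performance(ipc_ratio: float, clock_ghz: float, rival_clock_ghz: float) -> float:
    """Performance relative to the rival under the simple model perf = IPC * clock."""
    return ipc_ratio * clock_ghz / rival_clock_ghz


# Hypothetical inputs, chosen only to illustrate the point.
ipc_lead = 1.10     # assume a 10% per-clock advantage
k10_clock = 2.4     # assumed clock in GHz
rival_clock = 2.4   # assumed rival clock in GHz

same_clock = relative_performance(ipc_lead, k10_clock, rival_clock)
rival_bumped = relative_performance(ipc_lead, k10_clock, 2.66)

print(f"rival at same clock:  {same_clock:.2f}x")
print(f"rival bumped to 2.66: {rival_bumped:.2f}x")
```

Under those assumptions a single speed-bin bump is enough to erase a 10% per-clock lead, which is why it matters whether the claim is clock for clock or not.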
 

sweetpants

When they claim a "double digit" lead in integer performance and a 40% advantage on fp performance, are they claiming it on a clock for clock basis? If not, what clock speed K10 are they comparing to what clock speed Kentsfield for this benefit?

C2Ds have a lot of headroom on their current process; it would not be difficult to put out a bump or two in clock speed as a response.

I'm not knowledgeable/articulate enough to explain this, but there is an article that I've posted and here is a snippet of it... It explains SSE128, and I think it may delve a little bit into your question/comment.

http://anandtech.com/cpuchipsets/showdoc.aspx?i=2939&p=3
 

jeff_2087

Question for those that have been reading all the new articles on K10, because I don't have time right now to read them myself.

Has AMD run any real benchmarks on K10? I don't mean proudly pronouncing theoretical numbers.

If the answer is yes: What are these numbers and are they meaningful? Or are they just a simulated floating point task that depends heavily on memory?

If the answer is no: Then nothing has actually changed, so what's with the sudden barrage of K10 threads? They're still just claiming the superiority of something that doesn't exist yet, or if it does exist, it's not yet as superior as they'd like everyone to believe.


EDIT: Reading over my post, I don't want people to think I'm dismissing AMD or K10. I've just skimmed the threads here, and all I've seen are claims but no proven results, so I'd like to know what is actually concrete.
 

sweetpants

just to quote some from the article:


In general, the accuracy of a CPU's branch predictor determines how wide and how deep of a design you can make. The average number of instructions before the predictor mispredicts governs how many instructions you can have in flight, which in turn controls how many execution units you can realistically keep fed on a regular basis. The K8's branch predictor was quite good and very well optimized for its architecture, but there were some advancements Intel introduced in the Pentium M and Pentium 4 that AMD could stand to benefit from.

Barcelona adds a 512-entry indirect predictor which, believe it or not, predicts indirect branches. An indirect branch is one where the target of the branch is a location pointed to by an address in memory, in other words, a branch with multiple targets. Instead of branching directly to a label indicated by the branch instruction, an indirect branch sends the CPU to a memory location that contains the location of the instruction that it should branch to.
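
To give a feel for what an indirect predictor does, here is a toy model. It is not from the article and is not a description of Barcelona's actual design, which the excerpt doesn't detail. It assumes the simplest possible scheme: a 512-entry table that remembers the last target seen for each branch address. The class and function names are invented for the example.

```python
class LastTargetPredictor:
    """Toy indirect-branch predictor: remembers the last target seen per table slot."""

    def __init__(self, entries: int = 512):
        self.entries = entries
        self.table = [None] * entries  # slot -> last observed target address

    def predict(self, branch_addr: int):
        # Index the table with the low bits of the branch address.
        return self.table[branch_addr % self.entries]

    def update(self, branch_addr: int, actual_target: int) -> None:
        self.table[branch_addr % self.entries] = actual_target


def hit_rate(predictor, trace):
    """Feed (branch_addr, target) pairs to the predictor; return the fraction predicted correctly."""
    hits = 0
    for branch_addr, target in trace:
        if predictor.predict(branch_addr) == target:
            hits += 1
        predictor.update(branch_addr, target)
    return hits / len(trace)


# Hypothetical trace: one indirect branch at 0x4000 (think switch statement or
# virtual call) that goes to 0x5000 eight times, then to 0x6000 twice.
trace = [(0x4000, 0x5000)] * 8 + [(0x4000, 0x6000)] * 2
rate = hit_rate(LastTargetPredictor(), trace)
print(f"hit rate: {rate:.0%}")
```

The predictor only misses the first time it sees the branch and the first time the target changes, so a branch that mostly goes to the same place is cheap. A branch that keeps switching targets would defeat a scheme this simple.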


And more:

Barcelona widens the execution units that handle SSE operations from 64-bits to 128-bits, so now 128-bit SSE operations don't have to be broken up into two 64-bit operations. This also means that you get more usable decode bandwidth since 128-bit SSE instructions now map to a single micro-op instead of two. The FP scheduler can now handle these 128-bit SSE operations as well.

It's the increase to SSE execution width that drove a number of other changes within the core. Since you effectively have more decode bandwidth when executing 128-bit SSE instructions AMD discovered a new bottleneck: instruction fetch bandwidth. These 128-bit SSE instructions tend to be quite large, and in order to maximize the number decoded in parallel the Barcelona core can now fetch 32-bytes per cycle, up from 16-bytes in K8. The 32B instruction fetch not only benefits SSE code but also seems to benefit integer code as well. Bigger instructions in general will see a performance boost here.

Now that you can fetch and decode more instructions, you need to be able to get more data to the execution core and thus AMD widened the interface between the L1 data cache and Barcelona's SSE registers. Barcelona can now perform two 128-bit SSE loads per cycle from the L1-D cache compared to two 64-bit loads per cycle in K8. AMD then widened the interface between the L2 cache and the memory controller so that now 128-bits can be transferred per cycle, once again to balance out all of the aforementioned changes.
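
Putting the per-cycle figures from the excerpt side by side, as a back-of-the-envelope sketch. The numbers come straight from the quoted text; the dictionary layout and helper names are just for illustration, and these are peak widths, not measured throughput.

```python
# Per-cycle figures taken from the article excerpt above.
K8 = {"fetch_bytes": 16, "loads_per_cycle": 2, "load_bits": 64, "uops_per_sse128": 2}
BARCELONA = {"fetch_bytes": 32, "loads_per_cycle": 2, "load_bits": 128, "uops_per_sse128": 1}


def l1_load_bytes_per_cycle(core: dict) -> int:
    """Peak bytes loaded from the L1 data cache per cycle."""
    return core["loads_per_cycle"] * core["load_bits"] // 8


def uops_for_sse128(core: dict, instructions: int) -> int:
    """Micro-ops needed to execute a given number of 128-bit SSE instructions."""
    return instructions * core["uops_per_sse128"]


for name, core in (("K8", K8), ("Barcelona", BARCELONA)):
    print(
        f"{name}: fetch {core['fetch_bytes']} B/cycle, "
        f"L1 loads {l1_load_bytes_per_cycle(core)} B/cycle, "
        f"{uops_for_sse128(core, 100)} micro-ops per 100 SSE128 instructions"
    )
```

On paper each of these doubles (or halves, for micro-ops), but that says nothing about how much real code speeds up.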
 

jeff_2087

But that's exactly what I mean by 'proudly pronouncing'. Sure, sounds great, but it doesn't really show anything. As interesting as it is, I want to know if they've actually demonstrated anything, instead of just listing specifications that put the Starship Enterprise to shame.

Frankly, I don't care if there's a 512-entry indirect predictor under that heat spreader or a team of leprechauns sporting tiny pencils and paper pads. Either way, have they shown how well it works?
 

sweetpants

But that's exactly what I mean by 'proudly pronouncing'. Sure, sounds great, but it doesn't really show anything. As interesting as it is, I want to know if they've actually demonstrated anything, instead of just listing specifications that put the Starship Enterprise to shame.

Frankly, I don't care if there's a 512-entry indirect predictor under that heat spreader or a team of leprechauns sporting tiny pencils and paper pads. Either way, have they shown how well it works?

So do they have benchmarks out yet? Not that I'm aware of...

It's one thing to know the answer; it's something else to know why the answer is what it is...

It's not the destination that matters, it's the journey :)
 

Periander

The Anand article doesn't address anything in terms of concrete performance numbers.

Is AMD making IPC claims, or are they comparing a particular clock to a particular C2Q clock? Since the C2's are expected to run at higher clock speeds, this can rather dramatically impact the performance claims they are making.
 

jeff_2087

Okay, so basically all these new K10 threads are responses to what is currently just unfounded marketing hype from AMD (regardless of whether or not it will be proven true in the future)? That's disappointing. :(
 

sweetpants

The Anand article doesn't address anything in terms of concrete performance numbers.

Is AMD making IPC claims, or are they comparing a particular clock to a particular C2Q clock? Since the C2's are expected to run at higher clock speeds, this can rather dramatically impact the performance claims they are making.

No, they are not making claims...

The AnandTech article is showing the differences in Barcelona's architecture vs. previous core architectures. Nowhere will it post a visual benchmark. You're kind of missing the point of the article. The intention was not to benchmark or make claims. The intention of the article was to give information on the core changes.

I'm not making claims that Barcelona will be better; in fact, I didn't read anywhere in the article that the author said that either. I was trying to point to an informative source that manages to look objectively at both sides. From now on, I'll be sure to add "no benchmarks will be found here" at the end of my posts.
 

jeff_2087

I'm not making claims that Barcelona will be better; in fact, I didn't read anywhere in the article that the author said that either. I was trying to point to an informative source that manages to look objectively at both sides. From now on, I'll be sure to add "no benchmarks will be found here" at the end of my posts.

Yeah, I didn't mean to suggest either you or the author was. I was just curious whether there had been benchmarks or something, because today there are seven new Barcelona threads on this page alone, which suggested there was something going on.


I like the articles and statements with them saying Intel is cheating. Reminds me of 4th grade.

Haha yeah, I was reading earlier about how AMD was complaining that one of Intel's benchmark presentations favoured Intel CPUs over AMD. I thought it was pretty funny considering AMD is spouting its own superiority claims, but at least Intel used benchmarks, even if they may have been biased.