K8L 10-15% faster in SPECint 40% in SPECfp

qurious69ss

Distinguished
Mar 4, 2006
474
0
18,780
According to this guy, the 40% number that AMD announced is for floating point only. For integer they expect an increase of 10-15% over the clovertown. Still wonder how this translates to real world apps though and if this 40% advantage is only when comparing clock for clock.

http://blogs.zdnet.com/Ou/?p=415
 

BaronMatrix

Splendid
Dec 14, 2005
6,655
0
25,790
According to this guy, the 40% number that AMD announced is for floating point only. For integer they expect an increase of 10-15% over the clovertown. Still wonder how this translates to real world apps though and if this 40% advantage is only when comparing clock for clock.

http://blogs.zdnet.com/Ou/?p=415

Because he doesn't realize that Barcelona will have dual core derivatives, I'd take his report with a grain of salt. Maybe I should make that my sig. Yeah, I think I will.

He also notes a significant deficit for AMD with AM2 vs. Core2 which is far from the actual situation.

We will hear SPEC numbers soon and TPC numbers will follow.

Hey, I actually said that on one of the forums I visit.
 

qurious69ss

According to this guy, the 40% number that AMD announced is for floating point only. For integer they expect an increase of 10-15% over the clovertown. Still wonder how this translates to real world apps though and if this 40% advantage is only when comparing clock for clock.

http://blogs.zdnet.com/Ou/?p=415

Because he doesn't realize that Barcelona will have dual core derivatives, I'd take his report with a grain of salt. Maybe I should make that my sig. Yeah, I think I will.

He also notes a significant deficit for AMD with AM2 vs. Core2 which is far from the actual situation.

We will hear SPEC numbers soon and TPC numbers will follow.

Hey, I actually said that on one of the forums I visit.

So what does him not realizing that K8L will have dual core derivatives have to do with numbers??? Do you believe that they will be larger with dual cores?
 

m25

Distinguished
May 23, 2006
2,363
0
19,780
According to this guy, the 40% number that AMD announced is for floating point only. For integer they expect an increase of 10-15% over the clovertown. Still wonder how this translates to real world apps though and if this 40% advantage is only when comparing clock for clock.

http://blogs.zdnet.com/Ou/?p=415
Well, if AMD boasts 40%, that is clearly a maximum; however, the figures are pretty unclear at this point without any benchmarks. In one article I even saw a claim of 80% per core over K8, which would mean 50%+ over Core2.
Until we have at least one proven benchmark, speculation will be the predominant form of input we'll get :roll:
 

accord99

Distinguished
Jan 31, 2004
325
0
18,780
Not specINT and specFP, but TPC-C (a database/transaction processing benchmark) and specFP_rate assuming that they are continuing to use the two released benchmarks from December.

http://xtremesystems.org/forums/showpost.php?p=1873442&postcount=42

The following are my estimated scores for the Barcelona system:

SPECfp_rate:
fastest 2S Opteron 2220SE score is 96.0 (peak)
fastest 2S Xeon 5160 score is 83.4
fastest 2S Xeon 5355 score is 104

40% faster than the 2220SE gives a score of about 135-140 for a 2S, quad-core Barcelona. By comparison, a 4S Opteron 8220SE has a best score of 178.
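
As a rough check on that arithmetic, here is a minimal Python sketch. It assumes, as the estimate above does, that AMD's 40% claim is measured against the 2S Opteron 2220SE score; the helper name `project` is made up for illustration.

```python
def project(baseline: float, gain_pct: float) -> float:
    """Scale a baseline score by a claimed percentage gain."""
    return baseline * (1 + gain_pct / 100)

# 2S SPECfp_rate scores quoted in this post
opteron_2220se = 96.0   # peak
xeon_5355 = 104.0       # Clovertown

# Assumption: the 40% gain is relative to the 2220SE system
barcelona = project(opteron_2220se, 40)

# Resulting lead over the fastest 2S Clovertown score, in percent
lead = (barcelona / xeon_5355 - 1) * 100

print(f"Projected 2S Barcelona SPECfp_rate: {barcelona:.1f}")
print(f"Lead over Xeon 5355: {lead:.1f}%")
```

Under that assumption the projection lands at the low end of the 135-140 range, which works out to a lead of a bit under 30% over the 5355 rather than 40%.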

SPECfp_rate is heavily memory bandwidth dependent and really isn't indicative of desktop application performance or of most server-type applications.

Results are from here:

http://www.spec.org/cpu2000/results/rfp2000.html

OLTP benchmark:

I think the existing systems mentioned in the test are:

Opteron 2220SE - HP DL385G2 with a score of 139,693
Xeon Woodcrest - HP DL380G5 with a score of 140,246
Xeon Clovertown - HP BL480c with a score of 222,117 (~60% increase versus the Woodcrest system, which matches the graph)

A 70% increase over the Opteron system gives a score of around 235,000-240,000.

The best score for a Clovertown system currently is the HP ML370G5 with a score of 240,737.

The best score for a 4S Opteron system is the HP ProLiant DL585G2 with 8220SE with a score of 262,989. The best score for a 4S Intel system is 331,087 from the IBM x3950 with 3.5GHz Tulsa-based Xeons.
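
The same kind of sanity check works for the OLTP numbers. This is only a sketch using the TPC-C scores listed above; the 70% figure is the increase over the Opteron system used in the estimate above, and the variable and function names are made up for illustration.

```python
def pct_gain(new: float, old: float) -> float:
    """Percentage increase of `new` over `old`."""
    return (new / old - 1) * 100

# TPC-C scores quoted in this post
opteron_2220se = 139_693      # HP DL385G2
woodcrest = 140_246           # HP DL380G5
clovertown_bl480c = 222_117   # HP BL480c
clovertown_best = 240_737     # HP ML370G5

# Clovertown vs. Woodcrest, to compare against the "~60%" read off the graph
clovertown_vs_woodcrest = pct_gain(clovertown_bl480c, woodcrest)

# Assumption: Barcelona is 70% above the 2S Opteron 2220SE system
barcelona_projected = opteron_2220se * 1.70

# How the projection compares with the best existing Clovertown result
gap_to_best_clovertown = pct_gain(barcelona_projected, clovertown_best)

print(f"Clovertown over Woodcrest: {clovertown_vs_woodcrest:.1f}%")
print(f"Projected Barcelona TPC-C: {barcelona_projected:,.0f}")
print(f"Versus best Clovertown:    {gap_to_best_clovertown:+.1f}%")
```

With these inputs the projection falls inside the 235,000-240,000 range and slightly below the ML370G5 result.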
 

casewhite

Distinguished
Apr 11, 2006
106
0
18,680
synthetic benches suck donkey

spec.org has the only IEEE/ASTM/ANSI approved benchmarks, since they 1. are peer reviewed for accuracy, 2. are designed to prevent testing manipulation, and 3. represent real operating power, since you can't use benchmarks that fit in CPU cache.

"The SPEC CPU2000 benchmarks are intended to exercise the CPU itself, the memory hierarchy, and the compilers. How much memory do they actually use?

The data collected here show that SPEC met its goals for memory footprint: most benchmarks are larger than common cache sizes, many are larger than 100MB, and none are larger than 200MB.

* It is useful to have benchmarks that are larger than common caches, because SPEC would like to differentiate its benchmarks from "toy benchmarks" that are too easy to run or that simply reflect MHz.
* It is useful to keep the benchmarks under 200MB so that the suite leaves a reasonable margin on a 256MB machine. The other 56MB are available for the operating system, graphics system, network daemons, etc, without using 'single user mode' on Unix systems, or killing processes on NT systems. (Such measures may not be representative of how most people use their systems.)

The SPEC CPU2000 benchmarks are derived from real applications, and they exercise more of the system than just the CPU chip. " http://www.spec.org/cpu/analysis/memory/

"SPEC's Background

The System Performance Evaluation Cooperative, now named the Standard Performance Evaluation Corporation (SPEC), was founded in 1988 by a small number of workstation vendors who realized that the marketplace was in desperate need of realistic, standardized performance tests. The key realization was that an ounce of honest data was worth more than a pound of marketing hype.

SPEC has grown to become one of the more successful performance standardization bodies with more than 60 member companies. SPEC publishes several hundred different performance results each quarter spanning across a variety of system performance disciplines." http://www.spec.org/spec/

The operative words are "toy benchmarks". So if you have a problem with SPEC's benchmarks then you believe everything George W. Bush has said.
 

BaronMatrix

synthetic benches suck donkey

spec.org has the only IEEE/ASTM/ANSI approved benchmarks, since they 1. are peer reviewed for accuracy, 2. are designed to prevent testing manipulation, and 3. represent real operating power, since you can't use benchmarks that fit in CPU cache.

"The SPEC CPU2000 benchmarks are intended to exercise the CPU itself, the memory hierarchy, and the compilers. How much memory do they actually use?

The data collected here show that SPEC met its goals for memory footprint: most benchmarks are larger than common cache sizes, many are larger than 100MB, and none are larger than 200MB.

* It is useful to have benchmarks that are larger than common caches, because SPEC would like to differentiate its benchmarks from "toy benchmarks" that are too easy to run or that simply reflect MHz.
* It is useful to keep the benchmarks under 200MB so that the suite leaves a reasonable margin on a 256MB machine. The other 56MB are available for the operating system, graphics system, network daemons, etc, without using 'single user mode' on Unix systems, or killing processes on NT systems. (Such measures may not be representative of how most people use their systems.)

The SPEC CPU2000 benchmarks are derived from real applications, and they exercise more of the system than just the CPU chip. " http://www.spec.org/cpu/analysis/memory/

"SPEC's Background

The System Performance Evaluation Cooperative, now named the Standard Performance Evaluation Corporation (SPEC), was founded in 1988 by a small number of workstation vendors who realized that the marketplace was in desperate need of realistic, standardized performance tests. The key realization was that an ounce of honest data was worth more than a pound of marketing hype.

SPEC has grown to become one of the more successful performance standardization bodies with more than 60 member companies. SPEC publishes several hundred different performance results each quarter spanning across a variety of system performance disciplines." http://www.spec.org/spec/

The operative words are "toy benchmarks". So if you have a problem with SPEC's benchmarks then you believe everything George W. Bush has said.

You saved me the trouble. Thx.
 

BaronMatrix

Not specINT and specFP, but TPC-C (a database/transaction processing benchmark) and specFP_rate assuming that they are continuing to use the two released benchmarks from December.

http://xtremesystems.org/forums/showpost.php?p=1873442&postcount=42

The following are my estimated scores for the Barcelona system:

SPECfp_rate:
fastest 2S Opteron 2220SE score is 96.0 (peak)
fastest 2S Xeon 5160 score is 83.4
fastest 2S Xeon 5355 score is 104

40% faster than the 2220SE gives a score of about 135-140 for a 2S, quad-core Barcelona. By comparison, a 4S Opteron 8220SE has a best score of 178.

SPECfp_rate is heavily memory bandwidth dependent and really isn't indicative of desktop application performance or of most server-type applications.

Results are from here:

http://www.spec.org/cpu2000/results/rfp2000.html

OLTP benchmark:

I think the existing systems mentioned in the test are:

Opteron 2220SE - HP DL385G2 with a score of 139,693
Xeon Woodcrest - HP DL380G5 with a score of 140,246
Xeon Clovertown - HP BL480c with a score of 222,117 (~60% increase versus the Woodcrest system, which matches the graph)

A 70% increase over the Opteron system gives a score of around 235,000-240,000.

The best score for a Clovertown system currently is the HP ML370G5 with a score of 240,737.

The best score for a 4S Opteron system is the HP ProLiant DL585G2 with 8220SE with a score of 262,989. The best score for a 4S Intel system is 331,087 from the IBM x3950 with 3.5GHz Tulsa-based Xeons.


Barcelona will allow AMD to own TPC-H forever. Clusters will be beating up Itanium 2 and Power.

If their OLTP numbers are accurate, they will crack TPC-C.
 

gOJDO

Distinguished
Mar 16, 2006
2,309
1
19,780
According to this guy, the 40% number that AMD announced is for floating point only. For integer they expect an increase of 10-15% over the clovertown. Still wonder how this translates to real world apps though and if this 40% advantage is only when comparing clock for clock.

http://blogs.zdnet.com/Ou/?p=415
There isn't any info about integer performance. Also, it is impossible to boost ALU performance out of nothing, and K8L has no improvements to the ALU.
TPC-C is not an ALU benchmark but an online transaction processing benchmark. It mostly measures system bandwidth.
That quote is borrowed from David Kanter's article at RealWorldTech.
http://www.realworldtech.com/page.cfm?ArticleID=RWT012707024759
However, what happens beyond the middle of the year is subject to quite a bit of uncertainty, with rhetoric issuing from both camps. AMD has claimed an advantage based on performance models, which are extremely accurate but may not account for faster speed grades from Intel, of around 10-15% for TPC-C and 40% for SPECfp_rate. The latter is likely to be somewhat of an outlier, but it is clear that AMD will be strongest in high performance computing workloads. AMD’s performance is largely attributed to microarchitectural improvements and a high level of system integration.
Those claims come directly from AMD's mouth. Until we see a K8L ES benchmark, we can't conclude how fast Barcelona will be.
 

BaronMatrix

If their OLTP numbers are accurate, they will crack TPC-C.
If their OLTP numbers are accurate, they're already matched by an existing Clovertown system.

Go to www.tpc.org

Click on Non clustered TPC-H

Opteron OWNS the 100GB and 300GB DB sizes. Only Power's and Itanium's ability to expand beyond 8-way is keeping Opteron at bay for larger sizes and TPC-C. Barcelona should reach 16-way (a whopping 64 cores) at least, with the L3 helping with cache coherency. Even an 8-way makes 32 Opteron+ cores.

The clustered results show Opteron leading the way at most DB sizes.
 

accord99

If their OLTP numbers are accurate, they will crack TPC-C.
If their OLTP numbers are accurate, they're already matched by an existing Clovertown system.

Go to www.tpc.org

Click on Non clustered TPC-H
Who cares about TPC-H? The benchmark shown by AMD is TPC-C, and in that the score is already achieved by an existing Clovertown system.
 

DavidC1

Distinguished
May 18, 2006
494
67
18,860
According to this guy, the 40% number that AMD announced is for floating point only. For integer they expect an increase of 10-15% over the clovertown. Still wonder how this translates to real world apps though and if this 40% advantage is only when comparing clock for clock.

Buddy, did you even read the article that you linked?? It looks like you didn't. The article states 10-15% for TPC-C. And it's not SpecFP, it's SpecFP_Rate. I can show you why:

David Kanter: AMD has claimed an advantage based on performance models, which are extremely accurate but may not account for faster speed grades from Intel, of around 10-15% for TPC-C and 40% for SPECfp_rate.
 

NightlySputnik

Distinguished
Mar 3, 2006
638
0
18,980
According to this guy, the 40% number that AMD announced is for floating point only. For integer they expect an increase of 10-15% over the clovertown. Still wonder how this translates to real world apps though and if this 40% advantage is only when comparing clock for clock.

http://blogs.zdnet.com/Ou/?p=415

I didn't read any other answer, so I might sound off track.

I couldn't care less about SPEC numbers. What I want to know is how fast it'll encode my home video and play my games... all at the same time. For everything else it's not worth my time.

Gotta say though that I'm desperately looking for real app numbers. Keep me posted if you can find some. :wink:
 

BaronMatrix

According to this guy, the 40% number that AMD announced is for floating point only. For integer they expect an increase of 10-15% over the clovertown. Still wonder how this translates to real world apps though and if this 40% advantage is only when comparing clock for clock.

http://blogs.zdnet.com/Ou/?p=415
There isn't any info about integer performance. Also, it is impossible to boost ALU performance out of nothing, and K8L has no improvements to the ALU.
TPC-C is not an ALU benchmark but an online transaction processing benchmark. It mostly measures system bandwidth.
That quote is borrowed from David Kanter's article at RealWorldTech.
http://www.realworldtech.com/page.cfm?ArticleID=RWT012707024759
However, what happens beyond the middle of the year is subject to quite a bit of uncertainty, with rhetoric issuing from both camps. AMD has claimed an advantage based on performance models, which are extremely accurate but may not account for faster speed grades from Intel, of around 10-15% for TPC-C and 40% for SPECfp_rate. The latter is likely to be somewhat of an outlier, but it is clear that AMD will be strongest in high performance computing workloads. AMD’s performance is largely attributed to microarchitectural improvements and a high level of system integration.
Those claims come directly from AMD's mouth. Until we see a K8L ES benchmark, we can't conclude how fast Barcelona will be.

Then maybe you should ask Jack. The widened L1-L2, OoO loads, 2x128-bit loads/retires per cycle, enhanced prediction, larger branch history, updated stack handler, better TLBs, SSE4A (yes, there are SSE int instructions), etc. will probably enhance int performance significantly (>10%). I would say even more (closer to 20%), but we'll see.
 

BaronMatrix

According to this guy, the 40% number that AMD announced is for floating point only. For integer they expect an increase of 10-15% over the clovertown. Still wonder how this translates to real world apps though and if this 40% advantage is only when comparing clock for clock.

http://blogs.zdnet.com/Ou/?p=415

I didn't read any other answer, so I might sound off track.

I couldn't care less about SPEC numbers. What I want to know is how fast it'll encode my home video and play my games... all at the same time. For everything else it's not worth my time.

Gotta say though that I'm desperately looking for real app numbers. Keep me posted if you can find some. :wink:

You can believe that SPEC is a relevant benchmark for determining general performance. Much better than 3DMark.
 

DavidC1

Then maybe you should ask Jack. The widened L1-L2, OoO loads, 2x128-bit loads/retires per cycle, enhanced prediction, larger branch history, updated stack handler, better TLBs, SSE4A (yes, there are SSE int instructions), etc. will probably enhance int performance significantly (>10%). I would say even more (closer to 20%), but we'll see.

Sorry Baron, despite your fancy marketing talk, the majority of the advantage comes from simpler reasons. Relative to the platform, Barcelona is crappier than Clovertown is to the 1333MHz FSB. Or to put it a different way, Barcelona is equal to Clovertown per core, but it gains an advantage from a better memory subsystem.

Desktop apps couldn't care less about platform superiority. It's different in servers, however, which is why Core microarchitecture CPUs will be uncompetitive in greater-than-4P environments due to the crappier platform (despite the superior core).
 

BaronMatrix

If their OLTP numbers are accurate, they will crack TPC-C.
If their OLTP numbers are accurate, they're already matched by an existing Clovertown system.

Go to www.tpc.org

Click on Non clustered TPC-H
Who cares about TPC-H? The benchmark shown by AMD is TPC-C, and in that the score is already achieved by an existing Clovertown system.

The lead score in TPC-C is IBM Power 5+. The Top 10 doesn't contain an X86 processor.

The Opteron owns the other relevant bench for transactions.
 

BaronMatrix

Yep, that is a wide range all right :) ... people have caught on to the play on words, very much like AMD's 40% improvement for 65 nm announcement last year .... which of course has yet to materialize.


The spin doctor says I believe they meant "native" 65nm designs.
 

DavidC1

What did C2D score in SPECfp before release? This would give us an idea of actual averages.

The talk here is not Core 2 Duo, but the Xeon 5100. Plus, AMD is talking about SpecFP_Rate. The first benchmarks that were touted on the Xeon were SpecFP, the single threaded benchmark. AMD is really talking about 2P systems.