Thoroughbred 1800+ (1.53GHz) Overclocked to 2GHz - Page 2

Tags:
  • CPUs
  • Overclocking
  • Font
May 3, 2002 10:51:50 PM

Quote:
Let's get one thing straight: Currently the P4 is the ONLY CPU which has SSE2. It is thus natural that any discussion about SSE2 implementation must revolve around the current reality, until we see something else.


No, it isn't natural. You are confusing SSE2 itself with the P4's SSE2 engine. A chip can support SSE2 code without using the same design the P4 uses to run it, and per-clock SSE2 performance is as variable as regular IPC.

If you took whatever the P4 has to run SSE2 and put two of them on a die, and that die ran at half the speed of the P4 die, it would still have the same SSE2 performance.
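
To put rough numbers on the "two units at half the clock" point, here is a minimal sketch. It assumes a deliberately simplified peak-rate model (every unit finishes one op every `cycles_per_op` cycles, with no stalls); the function name, the unit counts and the clock values are hypothetical and do not describe any real chip.

```python
def peak_ops_per_second(clock_hz, units, cycles_per_op=1):
    """Peak op rate of a hypothetical core.

    Assumes each execution unit completes one op every `cycles_per_op`
    cycles and is never stalled (a best-case model, not a measurement).
    """
    return clock_hz * units / cycles_per_op


# One unit at full clock vs. two identical units at half the clock.
single_unit_full_clock = peak_ops_per_second(3.0e9, units=1)
two_units_half_clock = peak_ops_per_second(1.5e9, units=2)

print(single_unit_full_clock == two_units_half_clock)  # True
```

Under this model clock speed is only one of three factors, so a lower-clocked design can match a higher-clocked one by adding units or cutting cycles per op.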

What you are doing is claiming that SSE2 ITSELF is core-speed dependent, which is blatantly wrong.

The Clawhammer will not NECESSARILY be smoked (as you claim) by a 3GHz P4 in PURE SSE2 applications, because the SSE2 core of the Clawhammer may be more efficient/faster than the P4's SSE2 core.


So claiming that a 3.0GHz P4 will beat the 2.0GHz Clawhammer in SSE2 calculations, based on the flawed idea that ALL SSE2 CORES are clock-speed dependent, is wrong. And that's all I have been saying.

If the Clawhammer used the exact same SSE2 core layout as the P4, then yes, your claim would be correct. But the odds are heavily in favor of AMD using its own tweaked and modified SSE2 execution, which, like AMD's SSE (which is faster than Intel's, additional command or not burger), may be faster than the P4's SSE2.


Furthermore, AMD would not put SSE2 into its chips if doing so would ensure that a faster-clocked P4 would win. It makes zero sense. Sure, they could claim SSE2 capability, but what would be the point? "We can use SSE2, but unfortunately it [-peep-] performance"?


I don't mean to come off as attacking Copenhagen, but your statement that SSE2 is dependent on clock speed is incorrect, and I am trying to get you to realize the difference between an extension library and the silicon core which actually runs that library.


:wink: The Cash Left In My Pocket,The BEST Benchmark :wink:
May 3, 2002 11:55:08 PM

The claim of "as little as one clock cycle" does not cover all circumstances. x86 processors as early as the 486 were able to execute certain integer instructions in as little as one clock cycle; however, in reality this only applied to simple instructions like ADD, MOV (move), et al., and only under the best of circumstances (meaning no cache miss and no recent branch mispredict). Not to mention that complex integer instructions like MUL (multiply) and DIV (divide) took upwards of ten clock cycles to complete (and often still do). Still, it was often said that the 486 completed instructions in "as little as one clock cycle," just as your link states now for SSE2.

Can SSE2 instructions complete in as little as one clock cycle? Technically, yes, but only with certain simple instructions, and only under the best of circumstances. Your article proposes a hypothetical situation to help somewhat lesser minds understand the benefits of SSE2. There is definite room for improvement in SSE2, even without a change in clock speed.

You really need to take your own argument with a grain of salt. In short, you're slipping into the kind of thinking that screwed many people into believing that the Willamette was faster than the Athlon.

<i>If a server crashes in a server farm and no one pings it, does it still cost four figures to fix?</i>
May 4, 2002 10:02:16 PM

<blockquote><font size=1>In reply to:</font><hr><p>If you took whatever the P4 has to run SSE2 and put two of them on a die, and that die ran at half the speed of the P4 die, it would still have the same SSE2 performance.<p><hr></blockquote><p>True, if CH implements what I would call two SSE2 "execution units", a 2.0GHz CH would smoke a 3.0GHz Northwood in "raw" SSE2 performance. And if a 4.0GHz Prescott does the same it would smoke the CH, and if ... and if ...
<blockquote><font size=1>In reply to:</font><hr><p>
What you are doing is claiming that SSE2 ITSELF is core-speed dependent, which is blatantly wrong.<p><hr></blockquote><p>Rubbish, the "basic" SSE2 execution "unit" must be able to complete one SSE2 instruction in one clock cycle. Thus it follows that under optimal conditions SSE2 performance scales in proportion to the core frequency, i.e. SSE2 performance is (under optimal conditions) DIRECTLY dependent on the core frequency. If you take whatever core with, say, 10 SSE2 "execution units", the SSE2 performance of this specific core will also (under optimal conditions) be DIRECTLY dependent on the core frequency. I guess the Palomino is able to perform no more than one SSE2 instruction per clock cycle, but the high overall IPC and strong x86 FPU help it to achieve a decent FPU performance. I expect the CH to do likewise.

<i>/Copenhagen - Clockspeed will make the difference... in the end</i> :cool:
May 4, 2002 10:12:41 PM

First off, I'd rather listen to Matisaro than you. Second, the Palomino doesn't have SSE2. It has first-gen SSE. Read up on your tech stuff before you post. And what you're talking about is rubbish.
May 4, 2002 10:32:39 PM

<blockquote><font size=1>In reply to:</font><hr><p>First off, I'd rather listen to Matisaro than you. <p><hr></blockquote><p>Do whatever you feel like, I couldn't care less ...

<blockquote><font size=1>In reply to:</font><hr><p>Second, the Palomino doesn't have SSE2.<p><hr></blockquote><p>Correct, I meant SSE, not SSE2. I know perfectly well that SSE2 was invented by Intel and that AMD is lagging behind as usual.

<i>/Copenhagen - Clockspeed will make the difference... in the end</i> :cool:
May 4, 2002 10:38:14 PM

Quote:
that AMD is lagging behind as usual.

I think that that's kinda overexaggerating. AMD has had quite a few innovations as well that can't be overlooked.

My firewall tastes like burning. :eek: 
May 4, 2002 10:51:10 PM

<blockquote><font size=1>In reply to:</font><hr><p>I think that that's kinda overexaggerating.<p><hr></blockquote><p>In the past AMD has been copying everything that Intel did, but lately, I admit, they have come up with some nice things. SSE and SSE2 they have copied (licensed) from Intel, both innovations that suit very high-clocked CPUs better than the old x86 FPU.

The guy gave me an unpleasant impression, so maybe my response was a bit provocative.

<i>/Copenhagen - Clockspeed will make the difference... in the end</i> :cool:
May 4, 2002 10:52:58 PM

Quote:
Correct, I meant SSE, not SSE2. I know perfectly well that SSE2 was invented by Intel and that AMD is lagging behind as usual.


AMD is not lagging behind; Intel wouldn't license SSE2 to them until recently. Furthermore, I would rather have a good core (AXP) than a shitty core with SSE2 (Willamette), because most of my apps don't use SSE2.

Your stand that SSE2 (THE LANGUAGE/CODE ETC.) is clock-speed dependent is MORONIC. One could design an SSE2 engine which could only do 1 SSE2 op in 2 clock cycles; IT WOULD STILL BE AN SSE2 ENGINE!!!

Also, if you would pay attention to kelledin (who knows what he's talking about), that 1 op is a simple one, and complex ops take longer.

Having said that, it is possible to TWEAK an SSE2 execution core to do larger ops in fewer clock cycles AND to make an SSE2 core which can do more than one simple op per clock tick. THIS IS STILL AN SSE2 CORE.

Thus BOTH your possible points are wrong: SSE2 itself is NOT CLOCK-SPEED DEPENDENT,
and neither are the units which execute said code.



PS: LOL atmo!

:wink: The Cash Left In My Pocket,The BEST Benchmark :wink:
May 4, 2002 10:54:04 PM

Quote:
In the past AMD has been copying everything that Intel did, but lately, I admit, they have come up with some nice things. SSE and SSE2 they have copied (licensed) from Intel, both innovations that suit very high-clocked CPUs better than the old x86 FPU.


Wrong, AMD's SSE runs better than the P3's at the same clock speed. There's proof right there about SSE, Copenhagen. Why do you stick to this flawed argument?

:wink: The Cash Left In My Pocket,The BEST Benchmark :wink:
May 4, 2002 11:07:26 PM

<blockquote><font size=1>In reply to:</font><hr><p>Your stand that SSE2 (THE LANGUAGE/CODE ETC.) is clock-speed dependent is MORONIC. One could design an SSE2 engine which could only do 1 SSE2 op in 2 clock cycles; IT WOULD STILL BE AN SSE2 ENGINE!!!<p><hr></blockquote><p>I repeat the definition of SSE/SSE2:

<i>SSE is a <b>single instruction multiple data (SIMD)</b> processing scheme. SIMD combines several intensive computations into a single instruction. The single instruction can then be processed in as little as one CPU clock cycle, thus allowing for much improved performance over traditional core operations.</i>

By increasing the clock speed, you get more clock cycles per second, and thus you can execute more SSE2 instructions (under perfect conditions), i.e. "raw" SSE2 performance depends DIRECTLY on clock speed.


<i>/Copenhagen - Clockspeed will make the difference... in the end</i> :cool:
May 4, 2002 11:11:16 PM

Hmm then I wonder how now Intel is the one who will lag behind for x86-64, ya?

--
Thunderbirds in wintertime, Northwoods in summertime! :lol: 
May 4, 2002 11:12:10 PM

Quote:
By increasing the clock speed, you get more clock cycles per second, and thus you can execute more SSE2 instructions (under perfect conditions), i.e. "raw" SSE2 performance depends DIRECTLY on clock speed.


No no no no no no no!


First off, "can be processed in one clock cycle" is NOT the same as "is processed in a single clock cycle".

SSE2 is just like x86 in that it is a standardized set of instructions which are run on a core to produce results. The core you run it on is the SOLE DETERMINING FACTOR of how fast it runs.

SSE2 can be implemented on one core so that it executes faster per clock than another implementation; in effect an SSE2 core has IPC, just like a regular core.

This is PROVEN!!
SSE is faster on an AXP than on a P3, so you have NO argument.

SSE2 is exactly the same as SSE in this regard; your claim that the Hammer WILL lose to a higher-MHz P4 in SSE2 is WRONG.


You should let it go. You are wrong, and it's only going to go downhill from here, Copen.

:wink: The Cash Left In My Pocket,The BEST Benchmark :wink:
May 4, 2002 11:17:16 PM

<blockquote><font size=1>In reply to:</font><hr><p>Wrong, AMD's SSE runs better than the P3's at the same clock speed,<p><hr></blockquote><p>No it doesn't; it's because NO real-life programs are made out of SSE instructions only. Most applications need to read and write from/to external memory once in a while, so here memory bandwidth plays a role, and there are also other areas that influence the result. It IS NOT proof of the Athlon being able to execute an SSE instruction faster than the P3.


<i>/Copenhagen - Clockspeed will make the difference... in the end</i> :cool:
May 4, 2002 11:19:41 PM

<blockquote><font size=1>In reply to:</font><hr><p>Hmm then I wonder how now Intel is the one who will lag behind for x86-64, ya?<p><hr></blockquote><p>I'll just quote myself from a previous post:

<i>but lately I admit, they have come up with some nice things.</i>


<i>/Copenhagen - Clockspeed will make the difference... in the end</i> :cool:
May 4, 2002 11:25:29 PM

Quote:
No it doesn't; it's because NO real-life programs are made out of SSE instructions only. Most applications need to read and write from/to external memory once in a while, so here memory bandwidth plays a role, and there are also other areas that influence the result. It IS NOT proof of the Athlon being able to execute an SSE instruction faster than the P3.


Actually, a website took a benchmark (I think it was SYSmark, the one that had the "GenuineIntel" issue)
and ran the test on an AXP without SSE, then enabled SSE with the patch; the AXP gained something like 7% from this.

However, they then altered the app to not run with SSE on the P3, and it lost only 4% of its score. What does that tell you?
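
The two figures are quoted on different baselines (a gain from turning SSE on, a loss from turning it off), so a quick conversion puts them on the same footing. This is a minimal sketch that treats the recalled "about 7%" and "4%" as exact, which they are not; the function names are made up for illustration.

```python
def speedup_from_gain(pct_gain):
    """Score ratio (SSE on / SSE off) when enabling SSE adds pct_gain percent."""
    return 1 + pct_gain / 100


def speedup_from_loss(pct_loss):
    """Score ratio (SSE on / SSE off) when disabling SSE costs pct_loss percent."""
    return 1 / (1 - pct_loss / 100)


axp_speedup = speedup_from_gain(7)  # AXP: gained ~7% with SSE enabled
p3_speedup = speedup_from_loss(4)   # P3: lost ~4% with SSE disabled

print(f"AXP: {axp_speedup:.3f}x, P3: {p3_speedup:.3f}x")  # AXP: 1.070x, P3: 1.042x
```

On the same baseline that is roughly a 7% benefit against a 4.2% benefit, for whatever a single recalled application benchmark is worth.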



From this debate I get the impression you have a hard time distinguishing between SSE2 and a CPU's SSE2 engine.


"SSE2 on the Hammer will run slower than on a faster-clocked P4" Copenhagen.

^ completely FALSE.

You don't know what kind of SSE2 core the Hammer will have, or what optimisations it will include.

SSE and SSE2 are LANGUAGES/EXTENSIONS, not fixed core functions (in the sense that SSE would perform the same on every chip).

You are incorrect. I'm sorry if you have trouble accepting it, but note how no one is agreeing with you; the reason is that you are not right in your assessment of SSE2 performance and functionality.


:wink: The Cash Left In My Pocket,The BEST Benchmark :wink:
May 4, 2002 11:30:26 PM

<blockquote><font size=1>In reply to:</font><hr><p>SSE2 is exactly the same as SSE in this regard; your claim that the Hammer WILL lose to a higher-MHz P4 in SSE2 is WRONG.<p><hr></blockquote><p>I'll bet I can code a small program loop which will run faster on a 3.0GHz P4 than on a 2.0GHz CH, provided that CH doesn't implement more than one SSE2 "execution unit". This will be a "raw" SSE2 instruction execution benchmark.

If you take a real-life application, that's a different story altogether, because a number of other factors, which don't involve SSE2 instructions, will influence the result.

<i>/Copenhagen - Clockspeed will make the difference... in the end</i> :cool:
May 4, 2002 11:41:34 PM

<blockquote><font size=1>In reply to:</font><hr><p>Actually, a website took a benchmark (I think it was SYSmark, the one that had the "GenuineIntel" issue)
and ran the test on an AXP without SSE, then enabled SSE with the patch; the AXP gained something like 7% from this.

However, they then altered the app to not run with SSE on the P3, and it lost only 4% of its score. What does that tell you?<p><hr></blockquote><p>It doesn't tell anything. I think I have made it pretty clear that real-life applications are complex programs that are a mixture of SSE, SSE2 and x86 FPU and a lot of other things. It is therefore impossible to say anything about the "raw" SSE or SSE2 performance.


<i>/Copenhagen - Clockspeed will make the difference... in the end</i> :cool:
May 4, 2002 11:46:51 PM

Copenhagen, if you're not talking about real-world performance then, frankly, who cares?

:wink: <b><i>"A penny saved is a penny earned!"</i></b> :wink:
May 5, 2002 1:35:09 AM

Glad I don't sound like AmdMeltdown. Must be your brother.
May 5, 2002 1:37:16 AM

I agree with you, AMD_Man. Copenhagen, quit doing "what if, what if". Facts are what count.
May 5, 2002 2:24:18 AM

When we discuss these things, theoretical facts are nice, but cold concrete proof is even better. I mean, if it doesn't happen in real life, then why bother? If we can't tell, then does it really matter? If it only "theoretically" happens, but in performance terms doesn't help, isn't this optimization or whatever a total failure? As they say, seeing is believing.

My firewall tastes like burning. :eek: 
May 5, 2002 7:34:45 AM

<blockquote><font size=1>In reply to:</font><hr><p>Copenhagen, if you're not talking about real-world performance then, frankly, who cares? <p><hr></blockquote><p>You have a point, which is also why I said the following at a much earlier point in this thread, quote:

<i>We'll probably see P4 taking the lead in applications extremely well suited and optimized for SSE2, while CH could dominate applications which are only medium suited for SSE2 and totally dominate all other non-SSE2 optimized applications.</i>

This pretty much reflects my view on this matter. It all depends on the application. An SSE2 optimized version of Microsoft Word would probably run faster on a 2.0GHz CH than on a 3.0GHz Northwood.

This was my opinion and still is, I haven't heard anything yet that changes that.

<i>/Copenhagen - Clockspeed will make the difference... in the end</i> :cool:
May 5, 2002 10:23:15 AM

I am very afraid Intel will introduce a new variety of x86-64 with Prescott that will be incompatible with AMD's implementation. Guess Intel is too proud to just license it from AMD, and wants Microsoft to bring out three versions of 64-bit Windows XP: IA-64, x86-64 and Yamhill-x86-64. I guess that OS will be expensive ...

Bikeman

<i>Then again, that's just my opinion</i>
May 5, 2002 10:44:15 AM

indeed

prefer to use a processor that has excellent ALU and FPU BEFORE any additional optimisations such as SSE2 or 3dnow.

many scientific apps will never get SSE2 optimisation. or 3dnow.


<font color=purple>Win ME Slayer. And PROUD of it!</font color=purple>
May 5, 2002 11:54:46 AM

<blockquote><font size=1>In reply to:</font><hr><p>prefer to use a processor that has excellent ALU and FPU BEFORE any additional optimisations such as SSE2 or 3dnow.<p><hr></blockquote><p>Right here and right now, most currently available applications don't take advantage of SSE2, so in that respect the Athlon has been a far more well-balanced CPU, just like the P3. But raw power (in terms of high clock frequency) and the increasing adoption of SSE2 in the software community are currently making the P4 the most powerful CPU.

<blockquote><font size=1>In reply to:</font><hr><p>many scientific apps will never get SSE2 optimisation. or 3dnow.<p><hr></blockquote><p>They will, if there is a significant gain to be obtained by doing it, and with increasing clock speeds it could happen.



<i>/Copenhagen - Clockspeed will make the difference... in the end</i> :cool:
May 5, 2002 2:42:01 PM

That would screw us, and the end user. It would be stupid. I am more than sure they won't do this money-losing trick, to have 3 OS to confuse consumers, and just for Intel, when AMD got their prize for once? Sorry but I don't agree.

--
Thunderbirds in wintertime, Northwoods in summertime! :lol: 
May 5, 2002 2:44:03 PM

It may be different from Intel's so that Intel's can also use clock speed to win, but it can also be like SSE and continue to run better on Athlon no matter what. AMD can add many things so that clock-speed-dependent apps using SSE2 would still benefit more from AMD's. To me, it seems SSE2 CAN run better than on P4s, but again that's how we both see this argument.

--
Thunderbirds in wintertime, Northwoods in summertime! :lol: 
May 5, 2002 11:47:35 PM

Quote:
AMD can add many things so that clock-speed-dependent apps using SSE2 would still benefit more from AMD's. To me, it seems SSE2 CAN run better than on P4s, but again that's how we both see this argument



Except SSE2 is not clock-speed dependent, and it is not implemented in one fixed fashion on every core.



:wink: The Cash Left In My Pocket,The BEST Benchmark :wink:
May 6, 2002 12:51:43 AM

I agree that in many cases SSE2 acceleration will make a difference, but the apps I'm talking about are the scientific freeware variety, for Linux platforms and Beowulf clusters. SSE2 optimisation would take a lot of reprogramming, and the resources/know-how/person time required just don't exist.

Case in point: UD - cure for cancer is Intel sponsored, yet to my knowledge it has no SSE, SSE2 or 3DNow optimisations. All programming effort is directed towards bug fixing, program streamlining, and website/server maintenance related to the distributed nature of UD.

<font color=purple>Be gentle with yourself, but not too gentle when browsing porn :smile: .</font color=purple>
May 6, 2002 3:50:42 AM

If Intel was desperate to put AMD out of business, this tactic might work. Due to Intel's greater market share and clout, far more server administrators would be installing the Yamhill version of Win XP on their machines than the x86-64 version. Because of this, developers would tend to compile the Yamhill versions of their apps first, with x86-64 as an afterthought (similar to SSE vs. 3DNow). Eventually some developers would quit coding x86-64 altogether due to the extra expense for little profit (similar to how many game developers don't code for Macs or take *forever* to do it).

But I don't think Intel wants to shut AMD down because then they'll essentially have a monopoly on the market, which would lead to possibly dangerous lawsuits. It'll be interesting to see what next year brings and how Intel responds.

One thing's for sure: it won't matter worth a damn to us down here on the desktop end. For gamers and general users, it'll still be 32-bit for a good while.

Ritesh
May 6, 2002 5:19:30 AM

Instruction sets such as 3DNow!, SSE and SSE2 are "just" extended instructions that are implemented directly in the CPU.

If an application is SSE2 optimized, any CPU supporting the same instruction set is able to take advantage of it as well.

This is why AMD implemented SSE, and then SSE2, in its CPUs: to kick Intel's ass on its own battlefield (for all SSE/SSE2-optimized applications). The defeat of the 3DNow! instruction set could also explain AMD's attitude toward Intel's instruction sets.

Besides, you can't see SSE or SSE2 advantages with Word or other desktop applications, because those applications don't use them.


<i>if <b>you know</b> <font color=white>you don't know<font color=black>, the way could be more easy ...
May 6, 2002 4:42:38 PM

Back to SSE2 optimized applications...

Note: I am not intending to bash you. I am trying to help you open your mind to the fact that one could engineer a process that implements the SSE2 instruction set in a different way than Intel does.

Quote:
I repeat the definition of SSE/SSE2:

SSE is a single instruction multiple data (SIMD) processing scheme. SIMD combines several intensive computations into a single instruction. The single instruction can then be processed in as little as one CPU clock cycle, thus allowing for much improved performance over traditional core operations.

<b><font color=blue>SSE is a</b></font color=blue> single instruction multiple data (SIMD) <b><font color=blue>processing scheme.</b></font color=blue>

There are a lot of different ways to get the same process requirements accomplished using different processing setups. That is why Industrial Engineers exist. (Efficiencies are gained by utilizing differing production methods.)

<b>"Sometimes you can't hear me because I'm talking in parenthesis" - Steven Wright</b> :lol: 
May 6, 2002 6:08:22 PM

<blockquote><font size=1>In reply to:</font><hr><p>I am trying to help you open your mind ...<p><hr></blockquote><p>Sorry... Mission Impossible.


<i>/Copenhagen - Clockspeed will make the difference... in the end</i> :cool:
May 6, 2002 11:39:48 PM

Just a little education for Copenhagen on SIMD:

Quote:
SSE is a single instruction multiple data (SIMD) processing scheme. SIMD combines several intensive computations into a single instruction.

SIMD simply performs the same operation on multiple sets of like data in a single <b>instruction</b> (instruction <> cycle):

A + B = C
5 + 4 = 9
2 + 3 = 5
81+19 = 100

All of the above ops would have been completed in a single instruction. This has huge benefits because the instruction doesn't have to be issued again for each pair of values, although the data generally still has to be loaded, depending on the number of variables and sets of data.
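
The same idea as a minimal sketch, using the three pairs from the table above. `packed_add` is a made-up name that only models the semantics of a packed add (one call standing in for one instruction, one list element per lane); it says nothing about how many cycles a real core needs.

```python
def packed_add(a, b):
    """Model a single packed-add instruction: add every pair of lanes at once."""
    if len(a) != len(b):
        raise ValueError("both operands need the same number of lanes")
    return [x + y for x, y in zip(a, b)]


a = [5, 2, 81]
b = [4, 3, 19]

# One "instruction" (one call) produces all three sums.
print(packed_add(a, b))  # [9, 5, 100]
```

For reference, a 128-bit SSE2 register holds four 32-bit integers, so a real packed add of that width works on four such lanes per instruction.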

Quote:
The single instruction can then be processed in as little as one CPU clock cycle, thus allowing for much improved performance over traditional core operations.

Being processed in "as little as" one CPU clock cycle only means that some, not all (and in practice very few), SIMD ops are processed in a single cycle. In most cases, processing these "matrixes" of data requires many cycles.

In addition, most modern procs divide instructions into "micro-ops" that are the actual smallest op unit in the proc. If AMD writes more efficient micro-ops for their implementation of SSE2 (like they did for SSE on AXP), it will take fewer cycles to process the same SSE2 instructions.

The only comparison where your point would have any validity is if one were to make up an app purely out of 1-cycle SSE2 instructions (the minority of SSE2 instructions). Any other pure SSE2 app with varying (multiple or single) cycle SSE2 instructions could be broken down into fewer clock cycles on the CH (i.e. a 4-cycle SSE2 instruction on the P4 could take only 1, 2 or 3 cycles on the CH due to core optimization).
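
A minimal sketch of that comparison. Every number here is hypothetical (the instruction counts, the 4-cycle vs. 2-cycle "complex" op, and the clocks taken from the thread's 3.0GHz vs. 2.0GHz scenario), and the model simply adds up cycles with no pipelining or overlap.

```python
def run_time_seconds(mix, cycles_per_instr, clock_hz):
    """Time to run an instruction mix on a hypothetical core.

    mix: {instruction name: how many times it runs}
    cycles_per_instr: {instruction name: cycles each one takes}
    Cycles are summed serially, which ignores pipelining and overlap.
    """
    total_cycles = sum(count * cycles_per_instr[name] for name, count in mix.items())
    return total_cycles / clock_hz


# Hypothetical cores: same 1-cycle simple ops, but the complex op is
# 4 cycles on the higher-clocked core and 2 cycles on the other.
fast_clock_cycles = {"simple": 1, "complex": 4}
slow_clock_cycles = {"simple": 1, "complex": 2}

only_simple = {"simple": 1_000_000, "complex": 0}
mixed = {"simple": 1_000_000, "complex": 1_000_000}

for label, mix in (("only 1-cycle ops", only_simple), ("mixed ops", mixed)):
    t_fast = run_time_seconds(mix, fast_clock_cycles, 3.0e9)
    t_slow = run_time_seconds(mix, slow_clock_cycles, 2.0e9)
    print(f"{label}: 3.0GHz core {t_fast * 1e3:.3f} ms, 2.0GHz core {t_slow * 1e3:.3f} ms")
```

With these made-up numbers the 3.0GHz core wins the pure 1-cycle case and loses the mixed case, which is the distinction being drawn: the outcome depends on the cycle counts of the implementation, not on clock alone.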

Since none of us (or the tech media for that matter) has been able to perform this kind of (or real world) performance testing, it is all speculation.

To recap:
Copenhagen would only be right with 1-cycle SSE2 instructions (that can't be optimized to 2 instructions per cycle on the CH). This will never happen in SSE2 optimized apps or the rest of the world.

SSE2 is NOT dependent purely on clock speed. While clock speed can help (as with any other instruction), SSE2 is no more dependent on it than standard x86 CISC instructions are.

If the thought I thought I thought had been the thought I thought, I wouldn't have thought so much.