'Reverse Hyperthreading' is coming in K10, but what about K8L?

May 2, 2006 1:23:24 PM

as i said in the topic, it's coming in the k10, but what about the k8l?

The K10 won't be released until 2011 at the earliest, so who the hell gives a damn.

honestly, i don't think AMD will be able to beat conroe, even after the k10, because by that time intel will have developed and released their own 'reverse hyperthreading', and even added some extra new stuff.

AND this article by anandtech says that

http://www.anandtech.com/cpuchipsets/showdoc.aspx?i=274...

intel's core architecture has no disadvantages, so we conclude that AMD will be history in 2 or 3 years.

so goodbye AMD
May 2, 2006 1:48:27 PM

I doubt AMD will die or go away, but then again, I do think we are going to see some serious competition. I would hate to see AMD die, because then Intel are going to monopolise and over-price EVERYTHING, and eventually the chip quality would decrease, because there would be no other alternative to compare it with. So saying goodbye to some good competition is foolish indeed.
May 2, 2006 1:57:22 PM

MadModMike did a thread on this not too long ago; the news is a little old.
May 2, 2006 2:23:46 PM

Yeah, I read the thread quite a while ago... even added a comment as well... which is why I focused on pointing out to the guy why he shouldn't say goodbye to AMD, otherwise his 'beloved' Intel will rape him with over-priced processors.... :lol: 
May 2, 2006 2:35:16 PM

Quote:
Yeah, I read the thread quite a while ago... even added a comment as well... which is why I focused on pointing out to the guy why he shouldn't say goodbye to AMD, otherwise his 'beloved' Intel will rape him with over-priced processors.... :lol: 
Ha!
May 2, 2006 2:36:26 PM

Quote:
intel's core architecture has no disadvantages, so we conclude that AMD will be history in 2 or 3 years.

hmmmm no. AMD was always behind intel until athlon XP, so if they didn't go broke when they were behind intel, why would they fall after losing the crown? where were you in the last 10 years? you are probably a 15 year old kid who likes to spend your father's money

Quote:
so goodbye AMD

yeah, then you can buy a $2000 processor from intel's monopoly

some people are just so stupid...
May 2, 2006 3:23:39 PM

Dear Stranger,
Please do not bother to post things you obviously know nothing about.

thank you,
Your Friendly Fellow Members, Etc., Etc.......
Anonymous
May 2, 2006 5:16:54 PM

Well, nobody seems to have paid attention to the AT link he gave.
It's a really interesting read, and not even Lacostiade seems to have read it carefully.

After reading that, I had some insight into what you should expect from K8L. They keep saying K8L is evolutionary and not revolutionary. I can see quite a few tweaks AMD could make to the K8 to bring it back on par with the (expected) Conroe performance. If you look at the life of the K7/K8 architecture, you will see that every evolution provided a good bump in performance and that this architecture has the potential to scale much further before a total redesign has to take place; that's unlike NetBurst and much more like the P6/Banias/Dothan/Yonah.

My feeling on this right now is that Conroe should have a good edge over K8 when it comes out. After that, K8L should be able to catch up if AMD's engineers do a good job. And then we might see a good fight between the two companies, and that would be the best for every consumer.

I really don't see why people think AMD will get nailed for 3 years just because Intel releases a new architecture. Intel NEEDED a brand new architecture because they obviously did the wrong thing with NetBurst and they couldn't continue doing so. AMD, on the other hand, has a good, strong, scalable architecture. AMD's engineers haven't been sleeping either, and I'm sure they can make some of the tweaks Intel just made with Core and bring performance back on par.

I don't see how some people think AMD is unbeatable and that Core will suck, but that's another story.
May 2, 2006 5:25:55 PM

Another thread that wastes forum space and members' time.
I don't understand what that article from AnandTech has to do with K8L, K10, or whatever you think the name of any future AMD architecture will be.
It doesn't say a word about K8L, K10 or anything else....
What is this reverse hyperthreading and how is it possible? I want to know, please explain it to me.... :x :evil: 
May 2, 2006 5:34:34 PM

the ridiculous thing at anandtech is that he is licking intel's balls before they release their final products. I don't know if the guys that come up with posts like this noticed that reputable sites, like THG for example, don't even mention the fact that intel made a performance preview.

by the way, the concept behind "reverse hyperthreading" is to make multiple cores on a cpu appear as only one cpu to the operating system; thus, it's like hyperthreading (one cpu shows up as multiple), but in reverse.
May 2, 2006 5:50:46 PM

Quote:
by the way, the concept behind "reverse hyperthreading" is to make multiple cores on a cpu appear as only one cpu to the operating system; thus, it's like hyperthreading (one cpu shows up as multiple), but in reverse.

I know what this wish means, but it seems impossible and illogical to me....
So if someone can explain, I'd like to hear it.....
May 2, 2006 5:52:43 PM

I actually read the whole article, AND I don't disagree with your comments. That doesn't mean I agree with the way the person started this post or his blunt stupidity. My K6-2 wasn't a very good cpu at its time. If AMD were going to die, it would have then. Moral of the story. If you want to talk trash, better be able to back it. Haven't seen him respond yet.
May 2, 2006 6:01:50 PM

Firstly... Word.

Secondly, Reverse hyperthreading is a step forward towards true parallel computing, by having two cores processing one thread. Imagine a quad core cpu, and having it run on a dual-thread configuration, meaning two cores per thread. The OS will see the cpu as a dual-core system in terms of functionality, but in terms of hardware, is still a quad-core. By having more than one core processing a thread, you are able to get a bigger performance boost out of the processor. It does this by spreading a potentially heavy task across the multiple cores assigned to that thread, allowing the task to be completed a lot more quickly. The technology required to do this would be quite awesome, since the amount of communication that would have to occur between the cores would be tremendous, however, AMD did perfect the SOI process for 90nm chips, and soon for 65nm chips, so in terms of ability, AMD have the stuff it takes to pull this sort of stuff. Such technology would see its greatest benefit in single-threaded apps like games and such, though even then, threaded apps should also get a meaty boost. Want more information, this link will be your friend.
May 2, 2006 6:40:10 PM

Quote:
by the way, the concept behind "reverse hyperthreading" is to make multiple cores on a cpu appear as only one cpu to the operating system; thus, it's like hyperthreading (one cpu shows up as multiple), but in reverse.

I know what this wish means, but it seems impossible and illogical to me....
So if someone can explain, I'd like to hear it.....

In the past, a CPU did one and only one task at a time. It was the easiest way to do it.

Think of a CPU as a train engine. If you had two separate loads and needed to transport one to Chicago and another to Walla Walla Washington, you could either send the same train to two places (single processor), or send one train to Chicago and the other to Walla Walla (dual processor).

Now imagine you have these two train engines, but you only have one really big load to send to Chicago. The obvious answer is to link the engines together and send it all to Chicago. The railroad companies actually do this. It gets the load where it's going on time. That is what reverse hyperthreading is like. You use the entire CPU capability to concentrate on one task.

In theory, it should be up to twice as fast as using a single core. In practice, it may wind up being a 15-30% speed boost, depending on how efficiently they can segment the program to take advantage of both cores.
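To put rough numbers on that guess, here is a minimal sketch using Amdahl's law. That formula is not something the post names; it is brought in here as an assumption, and the parallel fractions below are illustrative values, not measurements of any real CPU. If a fraction p of the single thread's work could be spread over n cores, the ideal speedup is 1 / ((1 - p) + p / n).

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Ideal speedup when only `parallel_fraction` of the work can be spread over `cores`."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Hypothetical fractions, picked only to bracket the 15-30% guess above.
    for p in (0.26, 0.46, 1.0):
        print(f"p={p:.2f}: {amdahl_speedup(p, 2):.2f}x on 2 cores")
```

Under this model, a 15-30% boost on two cores would mean that somewhere between a quarter and a half of the thread's work can actually be split, and the full 2x only appears when all of it can.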
May 2, 2006 6:52:27 PM

Quote:
Firstly... Word.

Secondly, Reverse hyperthreading is a step forward towards true parallel computing, by having two cores processing one thread. Imagine a quad core cpu, and having it run on a dual-thread configuration, meaning two cores per thread. The OS will see the cpu as a dual-core system in terms of functionality, but in terms of hardware, is still a quad-core. By having more than one core processing a thread, you are able to get a bigger performance boost out of the processor. It does this by spreading a potentially heavy task across the multiple cores assigned to that thread, allowing the task to be completed a lot more quickly. The technology required to do this would be quite awesome, since the amount of communication that would have to occur between the cores would be tremendous, however, AMD did perfect the SOI process for 90nm chips, and soon for 65nm chips, so in terms of ability, AMD have the stuff it takes to pull this sort of stuff. Such technology would see its greatest benefit in single-threaded apps like games and such, though even then, threaded apps should also get a meaty boost. Want more information, this link will be your friend.

Quote:
In the past, a CPU did one and only one task at a time. It was the easiest way to do it.

Think of a CPU as a train engine. If you had two separate loads and needed to transport one to Chicago and another to Walla Walla Washington, you could either send the same train to two places (single processor), or send one train to Chicago and the other to Walla Walla (dual processor).

Now imagine you have these two train engines, but you only have one really big load to send to Chicago. The obvious answer is to link the engines together and send it all to Chicago. The railroad companies actually do this. It gets the load where it's going on time. That is what reverse hyperthreading is like. You use the entire CPU capability to concentrate on one task.

In theory, it should be up to twice as fast as using a single core. In practice, it may wind up being a 15-30% speed boost, depending on how efficiently they can segment the program to take advantage of both cores.

Again, these are wishes; there is nothing technical in the explanation.
Tell me how one linear thread can be divided and processed by many cores. Any logical explanation.
It is better to explain it like this, and see that it is impossible and only the wish of an uneducated fanboy:
If one woman can bear 1 child in 9 months, then 9 women can bear 9 children in 9 months. Does that mean 9 mothers can bear 1 child in 1 month?
May 2, 2006 7:17:13 PM



Thanks for trying though!
May 2, 2006 7:33:14 PM

can you tell me, technically, how intel's hyperthread works?
May 2, 2006 7:43:50 PM

If we COULD tell you TECHNICALLY how all this works, we would be PAID NOT TO. Do you honestly think there are that many people in the world that are educated enough in the field that one would be so bored and un-employed to spend his day posting on TG? I'm not saying people here are un-employed, I'm saying that that type of info would only be released by someone who wasn't employed or under contract. Heck, If I knew all the details on this, you betcha I'd get me a job working in the field. Look up the average salary for an Intel employee.
May 2, 2006 7:49:18 PM

hey, i'm at work right now, i'm not unemployed :p 

for gOJDO, intel's HT is out for years already, and i cant see anything more technical than "allow your OS to see one cpu as two"... pretty technical huh?
May 2, 2006 7:53:19 PM

WHAT?!?!?! We're both working? :p 
I hope you read the whole post, wasn't saying that you weren't :D 
May 2, 2006 8:17:07 PM

Hey, I just tried to explain it as best as I could. Unfortunately, I have not seen any technical blueprints or detailed reports on Reverse Hyperthreading, so i can't be as technical as you want me to be. Either way, I'm interested to see how AMD might develop such a concept into reality.
May 2, 2006 8:40:38 PM

Quote:
intel's core architecture has no disadvantages, so we conclude that AMD will be history in 2 or 3 years.

hmmmm no. AMD was always behind intel until athlon XP, so if they didn't go broke when they were behind intel, why would they fall after losing the crown? where were you in the last 10 years? you are probably a 15 year old kid who likes to spend your father's money

Quote:
so goodbye AMD

yeah, then you can buy a $2000 processor from intel's monopoly

some people are just so stupid...

How old are you? Do you remember a chip called the Thunderbird?
I do, and I also remember it kicking ass all over the P3.

So no, AMD was not always behind until the XP series.

I sell intel I think they both make great products
I get sidepipe just setting up any system socket 775 or 939
providing they arent cheap celeron pricks
May 2, 2006 8:45:44 PM

Quote:
hey, i'm at work right now, i'm not unemployed :p 

for gOJDO, intel's HT is out for years already, and i cant see anything more technical than "allow your OS to see one cpu as two"... pretty technical huh?

Yes, but what about Reverse Hyperthreading, which some participants in this thread have mentioned and explained more as a wish than as an idea?
I know exactly how HT works, but the following article can explain it better and in more detail than I can:
http://www.digit-life.com/articles/pentium4xeonhyperthreading/
May 2, 2006 10:30:11 PM

Well, as long as reverse hyperthreading doesn't come in K9, because if it's a pit bull, I'll be reverse hyperthreading in the opposite direction.
May 2, 2006 10:35:23 PM

Reverse Hyperthreading does not exist! Impossible! If you can't understand why, then just accept it as something impossible.
May 2, 2006 10:47:20 PM

Oh right! Just like the boogieman doesn't exist! *Checks under his bed*

:lol:  Joke.... The thing is, there's enough speculation on the web to say that SOMETHING along the lines of Reverse Hyperthreading is being developed for the later AMD architectures. Just like how effective the Conroe processor really is, Reverse HT is mostly speculation, so instead of denying its existence, why not discuss it? Or even better... google it! :lol: 
May 2, 2006 11:11:57 PM

It was discussed. The speculation comes from an AMD fanboy blogger and was posted here on the forum by other AMD fanboys. Neither understands what they are saying. Point to 1 link from a qualified person with any clue and any technical explanation.
You can make one core act as many slower ones, but there is no way to make several cores act as one faster one.
It is very simple: job B waits for job A to finish before the data calculated by job A can be used in B. Let's say that one core can do job A in 1 second and job B in 2 seconds. With one core, both jobs finish in 1+2=3 secs. With 2 cores, where core1 does job A and core2 does job B, core2 has to wait 1 sec before it can start job B, because the data calculated by core1 is not available yet, and both jobs again finish in 1+2=3 secs. And this is the case where the latency of the bus that connects the two cores is 0, which is simply impossible. So with Reverse Hyperthreading only slower performance can be achieved.
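The arithmetic in that argument can be written out as a small sketch. The function names and the `bus_latency` parameter are hypothetical, introduced only for illustration, and the model covers just the case described here, where job B cannot start until job A's result is available.

```python
def one_core_time(job_a: float, job_b: float) -> float:
    """One core runs job A, then job B, back to back."""
    return job_a + job_b


def two_core_time(job_a: float, job_b: float, bus_latency: float = 0.0) -> float:
    """Core 1 runs job A; core 2 runs job B, which needs A's result first.

    `bus_latency` is the assumed time to move A's result from core 1 to core 2.
    """
    if bus_latency < 0:
        raise ValueError("bus_latency cannot be negative")
    b_can_start = job_a + bus_latency  # core 2 sits idle until the data arrives
    return b_can_start + job_b


if __name__ == "__main__":
    print(one_core_time(1.0, 2.0))       # 3.0
    print(two_core_time(1.0, 2.0))       # 3.0 with an ideal zero-latency bus
    print(two_core_time(1.0, 2.0, 0.5))  # 3.5 once the transfer costs anything
```

The sketch only says something about strictly dependent work; it is silent on what happens when parts of the work do not depend on each other.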

May 2, 2006 11:42:26 PM

But you're assuming that everything in a single process is inter-related. This may not be the case. There are many cases where a single kernel-level thread (process), which the OS is aware of, encapsulates multiple user-level threads, which the OS is unaware of as separate threads. This is done to save overhead, since switching threads of execution within a process does not require an OS-level task swap, which is relatively expensive.

Now, I have no idea how AMD could identify user-level threads on chip and distribute them across cores automatically, but if it can be done, or something like it, it should theoretically yield a substantial performance increase. Though, as you point out, communication between cores is going to be relatively expensive compared to keeping everything in one core.
But to outright proclaim that it can't be done without a performance hit is extreme, to say the least.
May 2, 2006 11:52:19 PM

Quote:
But you're assuming that everything in a single process is inter-related. This may not be the case. There are many cases where a single kernel level (i.e. the OS is aware of it) thread (process) encapsulates multiple user level threads (the OS is unaware of them as separate threads). This is done to save overhead since switching threads of execution with in a process does not require a OS level task swap, which is relatively expensive.

Now, I have no idea how AMD could id user level threads on chip and distribute them across cores automatically, but if it can be done, or something like it, it should theoretically yield a substantial performance increase. Thouch as you point out, communication between cores is going to be relatively expensive compared to keeping everything in one core.
But to outright proclaim that it can't be done with a performance hit is extreme to say the least.

You are misunderstanding the point.
We are talking about ONE SINGLE THREAD that "ReverseHyperThreading" is supposed to divide into many threads and process in parallel!?!?
It is simply impossible; the explanation from the participants here goes like this:
you are in city A and you have to go to city B, which is 100km away. 1 man with 1 car will travel for 1.6 hours, but if 4 men travel with 4 cars they will travel for 0.4 hours.....stupid, you agree?
May 3, 2006 12:24:17 AM

hmmm i'm 27 and i remember some k6-II and III kicking some pentium II asses.. the same with thunderbirds, but when i said amd got the lead with athlon XP was when they changed from the "wtf is this?" company to a well recognized company.. i mean, to the average user who mostly knows only intels... even nowadays there are some people that never heard of amd, but that's another story..

Quote:
Yes, what about Reverse Hyperthreading mentioned and some-how explained more as wish than as idea by some disscutants on this thread.

actually, like someone already said, if you google it you'll see a lot of news from different sites mentioning that amd is already researching it, and it will come to production, so i guess this is not a wish...

Quote:
It is very impossible, the explanation of the disscutants is in style:
you are in city A and you have to go to city B, that is 100km away. 1 man with 1 car will travel 1.6 hours, but if 4 man travel with 4 cars they will travel 0.4 hours.....stupid, you agree?

i dont know how amd will do it, but i wouldnt say it's impossible...
i dont think your example is too good...
but i think a task.. let's suppose, pile up 10 boxes. 1 person can pile 1 box in 1 second... 1 person would take 10 seconds, 10 people would do it in 1 second...

and i think a single thread can be divided into separate threads internally. that's what caching, buffering and pre-fetching are good for... if you can load a whole instruction set in cache, why can't you set a different core to process a different part of the instruction set?

as my old math teacher used to say: "never say never"
May 3, 2006 12:44:07 AM

First of all, no K6 beat a P2 at the same frequency, and during the K6 era (before K7) there was already the P3. Because of MMX, AMD invented 3DNow!, which was not that bad but went unsupported by software developers. It was maybe the first thing that AMD made by themselves and before Intel. Afterwards Intel implemented SSE, which was much better than 3DNow! and had software support. With the K7 there was competition, and they won the battle with the Thunderbird and the following AthlonXP chips.

Again, point to one link where someone qualified talks about, explains, gives a clue, or gives a reasonable idea about that reverse HT.

The given example is for parallel processing, not for reverse HT. Explain how 10 men can pile 1 box in 0.1 sec if 1 man can pile 1 box in 1 sec.
May 3, 2006 12:45:17 AM

I agree it's a bit strong to say Reverse Hyperthreading is impossible, but I'd love to see how AMD is going to overcome the many inherent problems. Yes, it is possible to divide a thread up into parts and distribute it to processors, but then you would have to work out dependencies, which could get very complicated. It would appear that you would need some type of logic on the processor to divide the thread and, more importantly, put it back together. That all adds processing time and coordination between the cores, which reduces the speedup. Then there's the problem of what happens if one of the cores stalls. All the cores will then have to wait for the other core to finish, making the whole system stall. You could argue the other cores could load their completed parts to cache and do something else, but then everything has to be loaded back for reassembly once the last core is finished, which takes time. It'll also reduce available cache space for other operations.

As well, the reason why I highlighted that the logic must be mostly on the processor is because if it is software based I highly doubt the technology will go anywhere. There's no way that software companies will go the opposite direction and code for reverse hyperthreading when the industry is concentrating its efforts on increasing multithreading.
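To make the dependency problem concrete, here is a minimal sketch that treats the pieces of a thread as tasks in a dependency graph and computes the longest chain (the critical path), which bounds how fast any number of cores could finish. The task names, durations and dependency table are made up for illustration, the graph is assumed to have no cycles, and coordination overhead is ignored entirely.

```python
from functools import lru_cache
from typing import Dict, List


def critical_path(durations: Dict[str, float], deps: Dict[str, List[str]]) -> float:
    """Earliest possible finish time with unlimited cores and free communication."""

    @lru_cache(maxsize=None)
    def finish(task: str) -> float:
        # A task can only start once everything it depends on has finished.
        start = max((finish(d) for d in deps.get(task, [])), default=0.0)
        return start + durations[task]

    return max(finish(task) for task in durations)


if __name__ == "__main__":
    durations = {"a": 1.0, "b": 1.0, "c": 1.0, "d": 1.0}
    # Hypothetical piece of a thread: c needs a and b, d needs c.
    deps = {"c": ["a", "b"], "d": ["c"]}
    serial = sum(durations.values())
    parallel = critical_path(durations, deps)
    print(f"serial {serial}, best case {parallel}, speedup {serial / parallel:.2f}x")
```

In this toy graph only `a` and `b` can overlap, so four units of work still take three, and that is before adding any cost for splitting, reassembly or stalls.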

Like I said before, it's not impossible, and if AMD can get it to work I'll have to applaud their ingenuity.
May 3, 2006 1:24:32 AM

Quote:
hmmm i'm 27 and i remember some k6-II and III kicking some pentium II asses.. the same with thunderbirds, but when i said amd got the lead with athlon XP was when they changed from the "wtf is this?" company to a well recognized company.. i mean, to the average user who mostly knows only intels... even nowadays there are some people that never heard of amd, but that's another story..

Yes, what about Reverse Hyperthreading mentioned and some-how explained more as wish than as idea by some disscutants on this thread.

actually, like somene already said, if you google about you'll see a lot of news from different sites mentioning that amd is already researching about it, and it will come to production, so i guess this is not a wish...

Quote:
It is very impossible, the explanation of the disscutants is in style:
you are in city A and you have to go to city B, that is 100km away. 1 man with 1 car will travel 1.6 hours, but if 4 man travel with 4 cars they will travel 0.4 hours.....stupid, you agree?

i dont know how amd will do it, but i wouldnt say it's impossible...
i dont think your example is too good...
but i think a task.. let's suppose, pile up 10 boxes. 1 person can pile 1 box in 1 second... 1 person would take 10 seconds, 10 people would do it in 1 second...

and i think a single thread can be divided into separated threads internally. that's why caching, buffering, pre-fetching is good for... if you can load a whole instruction set in cache, why cant you set different core to process a different part from an instruction set?

as my old math teacher used to say: "never say never"

So let me get this straight they are researching a reverse hyper threading; now to me that clearly states they are attempting to take 2 threads and combine them into one thread for executions. Since as we all know hyper threading takes 1 thread and splits it into 2 threads to better utilize the execution pipeline.

So to what ends would taking 2 threads and combining them do for a multi-core machine?

Um the entire instruction set in cache?
May 3, 2006 1:24:54 AM

well, the first link i've seen about reverse hyperthread was this one:

http://techreport.com/onearticle.x/9788

i dont know what you think about techreport, but i personally think he's much better than anandtech or x-bit labs for example.

about 10 men piling up 1 box... well, that's the same thing that running a non-threaded application on a smp system :)  you must know that a software must be thread optimized to take full advantage of smp, dont you?
May 3, 2006 1:30:03 AM

Quote:
Since as we all know hyperthreading takes 1 thread and splits it into 2 threads to better utilize the execution pipeline.

hmmm no. i'm sorry but you're wrong... hyperthreading runs 2 threads on one cpu... "reverse hyperthreading" is supposed to run 1 thread on multiple cpus

Quote:
Um the entire instruction set in cache?

oops, i meant all the data to be processed by the instruction set :p  anyway, it's the same thing as that conroe and super pi controversy... conroe can perform pretty well in super pi because it can hold all the data to be processed in cache... if this can be done, why can't the data be split into different pieces, with each core assigned one piece to process?
May 3, 2006 1:31:11 AM

Yes, that's just a news report pointing to X86-secret.com, who was the first to mention reverse hyperthreading. All it says is what it's supposed to do rather than how it actually does it, which is what is puzzling most people.
May 3, 2006 1:32:57 AM

Well, it is all gravy to me. I am about performance. However, I have become a bit partial to AMD because of that crap that Intel pulled with the P4. I mean, think of how many unsuspecting people they screwed with the "look at how big our numbers are" scam. It is just ridiculous. Not everyone out there is as informed as the people who frequent these forums. I hope AMD comes back. I am not a fanboy, but I don't like the underhanded business that some corporations pull, and Intel certainly came out with some misleading marketing. I mean, perfect example: the late PIII would outperform the early P4.........how backwards is that? Think about what that means for a second. We are going to release a 2006 car, but the 2005 model has better features and better ratings........doesn't make much sense. Same underhanded marketing that Microsoft pulled with the release of ME.
May 3, 2006 1:33:10 AM

Quote:

The given example is for parallel processing, not for reverse HT. Explain how 10 men can pile 1 box in 0.1 sec. if 1 man can pile 1 box for 1 sec.


The analogy is flawed because a box can't be broken up and moved in pieces; a thread can. In fact that is exactly how a thread is always handled: it's taken one set of instructions at a time and processed.
Now consider this:
Instruction sets come from outside the chip into some sort of controller.
The controller assigns the instruction sets to a core of its choosing.
The core processes the instructions and writes to an L1 cache which all cores share.

Now the obvious problem with this is sharing L1 cache among several cores, but Intel is supposedly sharing L2 cache in the Conroe architecture, so this sort of thing is not out of reach.
The slightly less obvious problem is concurrency control among cores for who is using what from L1 cache. This is potentially a severe performance limiter.
There are a couple of other bottlenecks in this system, but it would effectively abstract multiple cores into a unit indistinguishable from a single core to external processes.
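The controller-plus-shared-cache idea above can be mocked up in software. This is only a toy model under loud assumptions: Python threads stand in for cores, a dictionary guarded by a lock stands in for the shared L1 cache, and `SharedCache`, `core` and `controller` are hypothetical names that say nothing about how real hardware would do it.

```python
import threading


class SharedCache:
    """Stand-in for a cache all cores write to; the lock is the concurrency control."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}

    def write(self, key, value):
        with self._lock:  # only one core may touch the cache at a time
            self._data[key] = value

    def snapshot(self):
        with self._lock:
            return dict(self._data)


def core(core_id, work, cache):
    """A 'core' processes its assigned items and writes the results to the shared cache."""
    for index, value in work:
        cache.write(index, (core_id, value * 2))


def controller(values, n_cores=2):
    """Hand work items to cores round-robin, then wait for all of them to finish."""
    cache = SharedCache()
    assignments = [[] for _ in range(n_cores)]
    for index, value in enumerate(values):
        assignments[index % n_cores].append((index, value))
    threads = [
        threading.Thread(target=core, args=(i, assignments[i], cache))
        for i in range(n_cores)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return cache.snapshot()


if __name__ == "__main__":
    print(controller([1, 2, 3, 4]))
```

From the outside the caller sees a single `controller` call and one result table, which is the abstraction being argued for; every `write` queuing on the same lock is the concurrency-control bottleneck being warned about.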

Do I think this is what AMD is doing? Not at all. I think the engineers that have dominated the PC performance market for the past 5 years are probably doing something a lot more efficient. The point is, if it's possible for me to come up with an inefficient design to do something like this inside of an hour I'm quite sure the folks at AMD could come up with a very marketable product based on the concept here.
May 3, 2006 1:36:51 AM

Quote:
gives a rasonable idea about that reverse ht

i got the idea.. if you dont, i dont know what your problem is...
if you want to know more, why dont you wait and see? or not
May 3, 2006 1:59:41 AM

Quote:
Since as we all know hyperthreading takes 1 thread and splits it into 2 threads to better utilize the execution pipeline.

hmmm no. i'm sorry but you're wrong... hyperthread runs 2 threads in one cpu... "reverse hyperthread" is supposed to runs 1 thread on multiple cpus

Quote:
Um the entire instruction set in cache?

oops, the whole data to be processed by the instruction set :p  anyway, the same thing about that conroe and super pi controversy... conroe can perform pretty well in super pi because it can cache all the data to be processed in cache... if this can be done, why splitting this data in different pieces and assigning each core to process cant be?

No, Windows XP is coded for HT; there is only one thread being produced. If it were a case of 2 real threads being executed by one machine, then perhaps Windows 2K would work with HT correctly and not suffer a performance hit, unless you think Windows 2K doesn't support 2 processors.

Now, to keep things in perspective, Windows XP is coded to understand that the one thread it produces can be split internally by the CPU, or more specifically by the load balancing and resource balancing in the CPU’s logic. In essence all it's doing is filling the empty spots in the P4 pipeline as best it can.

Unless of course you believe Intel's techno spin doctors.

No, you don't understand what that cache does for the Conroe, and neither do you understand how SuperPi is being executed. Conroe’s cache is necessary to give the prefetch engines the needed resources to allow continuous execution to occur. It certainly isn’t storing 1 million outputs from a SuperPi execution.

Pi is 3.14159265... and so on and so forth; the machine is calculating the ratio of a circle's circumference to its diameter, which it will continue to do, in this case, out to 1 million digits. Nowhere do these output numbers stay in cache; they are immediately written to memory, and nowhere does prefetching or the cache assist the calculation of Pi. This is all machine math done internally, which while in execution will not leave L1 other than to output.
May 3, 2006 2:21:28 AM

hmmmmm now i know you dont know what a thread is :) 
windows 2k and xp are smp capable, that means: they can recognize 2 cpu's and take advantage of that. the problem with hyperthreading is that there are no 2 physical cpu's, that explains performance loss with HT enabled.

for a thread, we can simplify as a process running. a single processor running two threads would have to process a bit of a thread, stop, process a bit of the other, go back to the first and so on. it processes two real processes in real time, it does not split a single thread into separate thread pieces. This is supposed to be done by amd's technology.
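The interleaving described above (run a bit of one thread, stop, run a bit of the other) can be sketched with Python generators standing in for threads. This models plain time-slicing on one core, which is a simplification of what Hyper-Threading hardware actually does; `worker` and `round_robin` are hypothetical names used only for this illustration.

```python
from collections import deque


def worker(name, steps):
    """A stand-in 'thread': each next() call runs one small slice of its work."""
    for i in range(steps):
        yield f"{name}{i}"


def round_robin(threads):
    """Run one slice of each thread in turn until every thread is finished."""
    queue = deque(threads)
    trace = []
    while queue:
        thread = queue.popleft()
        try:
            trace.append(next(thread))
        except StopIteration:
            continue  # this thread is done; do not requeue it
        queue.append(thread)
    return trace


if __name__ == "__main__":
    print(round_robin([worker("A", 3), worker("B", 2)]))
    # ['A0', 'B0', 'A1', 'B1', 'A2']
```

Neither thread is split into pieces here; the two are only interleaved, which is the distinction the post is drawing.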

Quote:
It certainly isn’t storing 1 million outputs from a SuperPi execution

well, conroe has 4MB of L2... i was talking about super pi 1M... as far as i know, 1MB fits into 4MB
May 3, 2006 3:20:48 AM

Quote:
hmmmmm now i know you dont know what a thread is :) 
windows 2k and xp are smp capable, that means: they can recognize 2 cpu's and take advantage of that. the problem with hyperthreading is that there are no 2 physical cpu's, that explains performance loss with HT enabled.

for a thread, we can simplify as a process running. a single processor running two threads would have to process a bit of a thread, stop, process a bit of the other, go back to the first and so on. it processes two real processes in real time, it does not split a single thread into separate thread pieces. This is supposed to be done by amd's technology.

It certainly isn’t storing 1 million outputs from a SuperPi execution

well, conroe has 4MB of L2... i was talking about super pi 1M... as long as i know, 1MB fits into 4MB

I don't think you quite grasp what a thread is and how it is being executed, but that's fine. I am not going to sit here and have a back-and-forth match with someone who honestly believes they are correct.

Pi 1M is 1,000,000 digits of Pi, which is approximately 1MB worth of data, but what you fail to understand is that the software is outputting the results straight to main memory; they are not being stored in the cache. It is all internal division, to be specific, nothing more and nothing less.
May 3, 2006 3:32:56 AM

Believe it or not, "reverse hyperthreading" is what parallel processing is all about. It has been going on for decades and is only a new idea in the PC world. It's quite easy to understand if you know what most computers do.

Most processor time is taken up by numerous repetitive calculations, especially when you're talking graphics. This is why processors have pipelines: they speed up these calculations by overlapping them. A 9-stage pipe means 9 steps to each calculation, and you keep shifting data in. After the first 9 clocks you start getting 1 calculation completed every clock, rather than one every 9 clocks. This saves a great deal of time when you have a few billion calcs to do.
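
To put rough numbers on that, here is a minimal Python sketch. It assumes an idealised pipeline with no stalls or hazards, and the function names are made up for illustration:

```python
def unpipelined_cycles(n_calcs, stages):
    """Each calculation occupies the whole unit for `stages` clocks."""
    return n_calcs * stages


def pipelined_cycles(n_calcs, stages):
    """The first result appears after `stages` clocks, then one per clock."""
    if n_calcs == 0:
        return 0
    return stages + (n_calcs - 1)


if __name__ == "__main__":
    n, depth = 1000, 9  # 1,000 calculations on a 9-stage pipe
    print(unpipelined_cycles(n, depth), "clocks without pipelining")
    print(pipelined_cycles(n, depth), "clocks with pipelining")
```

For 1,000 calculations on a 9-stage pipe that works out to 9,000 clocks versus 1,008, so the speedup approaches 9x as the number of calculations grows.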

GPUs work in parallel too: 8 pixel shaders work on 8 parallel calculations, even though they are for the same thread and even the same frame. The programmer for a game doesn't have to know how many shaders there are, just that they exist; the GPU does the load sharing.

Now with "reverse hyperthreading" you simply have 2 cores that can talk to each other and combine to make 1 larger virtual pipeline unit when there's no need to have 2 separate cores. Processor 1 works out stages 1-9, then shuffles the data into the second processor for stages 10-18. --OR-- it can work like that GPU and do two parallel 9-stage pipes. Since there might be 40 billion identical calculations to be processed for that single thread... it's trivial for the processors to anticipate this and divide up the workload without the OS even knowing.

There are several good parallel processing textbooks out there that go through conceptual design and mathematical theory. Like I said, it isn't new and is well documented; AMD is just rebranding it for better marketing. For the sake of single threads I hope they produce it.
May 3, 2006 3:35:01 AM

No one is saying that it exists, but to say that it never will is silly. Maybe the form it takes will be altered, but that doesn't mean something along this line of thought won't happen. If Intel can micro-manage cache between two cores, how far are we from having one core that can split into multiple mini cores when that would be more efficient? From there, it does not seem like much of a step to apply that to more cores/sockets/CPUs. Sure, the use of the technology might be VERY limited. Heck, how many situations does HT actually help in? But it is a logical step to take to get better performance with less heat/power consumption.

But honestly folks, all this talk about CPU performance is getting kinda silly, too. I am a bit of a power nut myself, but things need to get focused back on fun factor, not GPU/CPU performance. The cheapest budget CPUs out there are more than up to what normal people need a computer for, and $100-150 GPUs are quickly becoming overkill for most games. The graphics and speed are great, but bring back more fun! I still like old games with terrible graphics if they are fun.
May 3, 2006 5:04:25 AM

While it's nice to mention GPUs, my understanding is that they practice extreme parallelism akin to multithreading rather than reverse hyperthreading. Shaders are individual programs designed to work on either one pixel or one vertex at a time, which makes me think they are more similar to a thread than to a part of a thread.

Quote:
Since there might be 40 billion identical calculations to be processed for that single thread... it's trivial for the processors to anticipate this and divide up the workload without the OS even knowing.

I'm unsure whether the processor will actually be able to anticipate how to divide up a thread without sacrificing so much performance that it isn't worthwhile. I would think that dependencies and branching would make the code difficult to anticipate. There's also the matter that not all the parts will finish at the same time, which makes recombining more difficult. I guess by K10 quad cores will be prevalent, so 1 core could be dedicated to analysing and dividing threads, 2 cores to processing, and the last core to recombining. However, given that K8L will be released in 2007, K10 won't likely be out until 2008, when most programs will already be multithreaded, reducing the need for reverse hyperthreading.
May 3, 2006 7:59:25 AM

Quote:
well, conroe has 4MB of L2... i was talking about super pi 1M... as long as i know, 1MB fits into 4MB

There was a thread from the same AMD fanboy blogger about the SuperPI 1M calc fitting in 4MB cache. First of all, both the blogger and you, jap0nes, have no idea what you are talking about, and what you are thinking is absolutely wrong.
SuperPI uses 3MB of RAM when idle, and it needs an extra 8MB of RAM for calculating the 1M-digit value of Pi and 256MB of RAM for the 32M. That is RAM, not L1 or L2 cache, which is on-die memory with a very different purpose: caching already allocated data from RAM that will be needed (or maybe not) for processing later. When you start the calculation, the program gives a message about how many iterations will be made, how much RAM is allocated, and how much time each iteration takes. You have to be blind, or you have never used SuperPI, if you don't know this.

Quote:
The analogy is flawed because a box can't be broken up and moved in pieces, a thread can. In fact that is exactly how a thread is always handled, it's taken one set of instructions at a time and processed.

OK, if you say so, how are you going to break up this thread:
a=gettickcount();
b=a*a*3.14;
c=(a^b)-(a/b);
I say there is no way, because all the calculations depend on each other. In the first step, the current RTC value is assigned to variable a. There is no way to calculate var b before that, because you don't know the value of a, so it can't be passed to another core and processed in the meantime. The same goes for var c: you can't calculate it before you have the values of var a and var b.
Well, it is possible to calculate each value on a different core, but with each core waiting for the others to finish before it can start calculating. That is not reverse HT, and that is not parallel processing.
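
The point can be sketched in Python by writing the three statements down as a dependency graph and measuring the longest chain. This is only an illustration: the graph representation and the `critical_path` helper are assumptions of this sketch, not anything AMD has described.

```python
from functools import lru_cache


def critical_path(deps):
    """Length, in steps, of the longest dependency chain in `deps`.

    `deps` maps each value to the list of values it needs first.
    """
    @lru_cache(maxsize=None)
    def depth(node):
        # A value is ready one step after the last of its inputs is ready.
        return 1 + max((depth(d) for d in deps[node]), default=0)

    return max((depth(n) for n in deps), default=0)


# The snippet above: b needs a, and c needs both a and b.
chain = {"a": [], "b": ["a"], "c": ["a", "b"]}

# Three unrelated values, for contrast.
independent = {"x": [], "y": [], "z": []}

if __name__ == "__main__":
    print(critical_path(chain))        # steps needed however many cores exist
    print(critical_path(independent))  # steps needed given three cores
```

The chain has 3 statements and a critical path of 3 steps, so extra cores cannot shorten it. The independent set also has 3 statements, but its critical path is 1 step, so three cores could finish it in the time of one statement.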

Quote:
GPUs work in parallel too: 8 pixel shaders work on 8 parallel calculations, even though they are for the same thread and even the same frame. The programmer for a game doesn't have to know how many shaders there are, just that they exist; the GPU does the load sharing.

Now with "reverse hyperthreading" you simply have 2 cores that can talk to each other and combine to make 1 larger virtual pipeline unit when there's no need to have 2 separate cores. Processor 1 works out stages 1-9, then shuffles the data into the second processor for stages 10-18. --OR-- it can work like that GPU and do two parallel 9-stage pipes. Since there might be 40 billion identical calculations to be processed for that single thread... it's trivial for the processors to anticipate this and divide up the workload without the OS even knowing.

Well, 8 shaders are working on 8 pixels at once, not on 1. All multimedia can be processed in parallel, because the data that needs to be processed is independent, and that is many threads, not one. Streaming SIMD (Single Instruction, Multiple Data) Extensions: the name of SSE explains everything. Now imagine a Streaming Multiple Instruction, Single Data Extension; that is the kind of thing "reverse HT" would have to be.
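
Here is a minimal Python sketch of why independent data is easy to split. The `brighten` operation and the two pretend "lanes" are made up for illustration; real SSE or shader hardware does this in silicon rather than in a loop.

```python
def brighten(pixel):
    """One operation applied to one pixel value, clamped at 255."""
    return min(pixel + 40, 255)


def process(pixels):
    """Apply the same operation to every pixel in the list."""
    return [brighten(p) for p in pixels]


pixels = [0, 30, 60, 90, 120, 200, 230, 255]

# All 8 pixels handled by one unit.
whole = process(pixels)

# The same 8 pixels split across two pretend lanes, then joined.
half = len(pixels) // 2
split = process(pixels[:half]) + process(pixels[half:])

if __name__ == "__main__":
    print(whole == split)
```

Because no pixel's result depends on any other pixel, the split version gives exactly the same answer as the single pass, which is what makes this kind of work trivial to share out.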
May 3, 2006 10:37:06 AM

Quote:
Well, as long as reverse hyperthreading doesn't come in K9, because it's a pit bull, I'll be reverse hyperthreading in the opposite direction


K9 has been cancelled and replaced by K8L. go to aswers.com if you don't believe me
May 3, 2006 11:25:11 AM

Quote:
Pi 1M is 1,000,000 digits of Pi, which is approximately 1MB worth of data, but what you fail to understand is that the software is outputting the results straight to main memory; they are not being stored in the cache. To be specific, it is all internal division, nothing more and nothing less.

well, what i was saying is that the program can load all the data necessary for its calculation into cache, then output its results to memory

oh, and your vision of HT is still wrong... what you are saying is that HT can split a non-threaded application into separate threads internally. what HT does allow is for you to run 2 different non-threaded programs at the same time, or to launch 2 or more threads from a threaded program at the same time


Quote:
OK, if you say so, how are you going to break up this thread:
a=gettickcount();
b=a*a*3.14;
c=(a^b)-(a/b);
I say there is no way, because all the calculations depend on each other. In the first step, the current RTC value is assigned to variable a. There is no way to calculate var b before that, because you don't know the value of a, so it can't be passed to another core and processed in the meantime. The same goes for var c: you can't calculate it before you have the values of var a and var b.
Well, it is possible to calculate each value on a different core, but with each core waiting for the others to finish before it can start calculating. That is not reverse HT, and that is not parallel processing.

again, i say: a program must be thread optimized to run better on an smp system. i don't know how amd will do that, but if they do, i guess not all programs will take advantage of this technology. probably most will, some won't.
May 3, 2006 11:44:38 AM

"AMD was always behind intel until athlon XP"

AMD had the lead back in the Slot-A / Athlon "classic" (500, 550, 600) vs. the Intel (pre-Coppermine) P3 series as well, as I recall, although Coppermine snatched it right back for most applications. (I still have my mid-1999 Slot A 650 doing heavy-duty MS Word, printing, and Solitaire duties at my ex-wife's house! :-)
May 3, 2006 12:30:53 PM

hey, i already answered that, but i'm too lazy to look for it :p 
my mother's computer is a k6-III 500 and she uses it to browse the internet, run ms word and print, with no problems at all