
why not increase die size?

Tags:
  • CPUs
  • Processors
October 3, 2003 5:33:16 PM

Nowadays transistor sizes are shrinking toward their limit, and yet the processor die remains very small. Why can't the size of the processor be increased?

If this question seems stupid, sorry for that.


October 3, 2003 6:24:43 PM

# of chips / wafer.

Shadus
October 3, 2003 6:29:40 PM

Please explain a little bit more
October 3, 2003 7:01:04 PM

If a wafer costs 3000 bucks, they want to squeeze as many chips out of it as possible.

-Jeremy
Unofficial Intel PR Spokesman. (nVidia fill in rep for CoolSquirtle)

October 3, 2003 7:32:02 PM

The way they make processors is by cutting them out of things called wafers. I'm not sure what they're made of, but you have to use them for various reasons. They're expensive, so they try to fit as many dies on one wafer as possible; that's where the real cost of making a CPU comes from, besides the equipment it takes to print circuits on those things. That's how I think it works. I'm probably wrong on a few things, but you get the gist of it.

wpdclan.com cs game server - 69.12.5.119:27015
October 3, 2003 8:02:01 PM

Like others mentioned, the size of the wafer is the reason why they want to keep the die size small.

One more reason is that yield will be affected if you try to put too many transistors on a single die.

KG

"Artificial intelligence is no match for natural stupidity." - Sarah Chambers
October 3, 2003 8:38:33 PM

I thought that gate lengths had something to do with it too. Make electrons travel too far and electron migration weakens the signal beyond the usable level. Up the voltage to accommodate for this and you get too much wasted energy in the form of heat.

I could be wrong, but that was the reasoning I understood for why they don't just spread the same circuitry over a larger area.

People don't understand how hard being a dark god can be. - Hastur
October 3, 2003 8:59:17 PM

Chips are made out of silicon. Silicon is everywhere, even in beach sand, so you might think it is cheap. However, the problem is purity. Silicon has a crystal structure, and to make transistors out of silicon, the crystal has to be nearly perfect. I don't know the exact number, but it's something like 1 defect per 1 billion atoms. Nearly nothing in the world has this purity, and certainly not natural crystals.

To get this incredible purity, several processes are required. First they make a long sausage of it by melting chemically purified silicon and letting it crystallize very slowly. After that it still needs to be purified further. They do this by passing the silicon sausage through a high-current electrical loop. This induces a current in the silicon itself, which melts it very locally in the section inside the loop. A nice property of the impurities is that they run away from the molten section. So what they do is, again really slowly, slide the sausage through the loop so that the impurities are transported to one end of it. This isn't done once but several times!

These are the basic processes I've heard of, but I believe there are more tricks. Intel currently focuses on 12-inch-diameter sausages. From these they saw the 12-inch wafers, and then polish them slowly till the surface is flat and the bumps on it are no bigger than a few atoms. In these processes they waste about two-thirds of their precious pure silicon. And then there's also waste because of yield. Plus I haven't even started talking about the other costs directly related to die size...

So, given the incredible demand for silicon in every branch of electronics, I think it's a miracle current CPUs are 1 cm² in size and cost 'only' 100 euros.

(I'm not sure about some details so don't quote me on it.)
October 3, 2003 9:11:01 PM

Almost forgot to mention...

CPUs run at light speed. Let me explain. The speed at which a signal travels through silicon is a fraction of the speed of light. Again I don't know the exact number, but let's say 1/3. So that's 100,000 km/s. Near-future chips will run at 10 GHz, so in one clock cycle the signal travels 1 cm. This means the signal can hardly cross a 1 x 1 cm die from side to side! And that's just the time the signal needs to travel the distance; you also have to add the time required for the signal to actually open/close transistors. And the speed of electrons is much lower than the speed of the signal they carry.

So, even if silicon weren't that expensive, increasing die size wouldn't be beneficial, because then you'd need to lower the clock speed. The Pentium 4 already loses several clock cycles just to allow data to travel from one side of the chip to the other. More advanced asynchronous clock distributions help a bit, but they can't do miracles...
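
A quick sanity check of that arithmetic, using the post's assumed signal speed of one third of c:

```python
# How far can a signal travel in one clock cycle?
# Assumes signal speed ~ c/3, per the post above.
c = 299_792_458          # speed of light in vacuum, m/s
signal_speed = c / 3     # assumed propagation speed in the chip

for freq_ghz in (1, 3, 10):
    cycle_time = 1 / (freq_ghz * 1e9)            # seconds per clock cycle
    distance_cm = signal_speed * cycle_time * 100
    print(f"{freq_ghz:>2} GHz: signal travels ~{distance_cm:.1f} cm per cycle")
```

At 10 GHz the distance per cycle indeed comes out to about 1 cm, matching the post's figure.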
October 3, 2003 9:41:37 PM

Actually, neither Intel nor AMD makes their wafers. They're purchased from third parties. And as I understand it, the cost of the wafers isn't a big factor in the final production cost (although SOI wafers cost about 35% more than prior types).

The big cost and bottleneck is lithography - imaging the layers of the chip on the wafer. You want to keep the dies small so you can image several at once. Newer lithography tools are somewhere on the order of $15,000,000 *each*, so no one buys a lot of extras to have on hand just for kicks.
October 3, 2003 10:14:40 PM

Yes, but, correct me if I'm wrong, I don't think the cost of the masks is any higher when producing bigger dies. The wafer is simply always the same size, so it doesn't matter. And the topic was about die size, not the fixed costs...
October 4, 2003 3:02:10 AM

Quote:
The speed at which a signal travels silicon is a fraction of the light speed. Again I don't know the exact number, but let's say 1/3. So that's 100.000 km/s. Near-future chips will run at 10 GHz, so in this time, the signal travels 1 cm.

Uh... the signal, no matter how you look at it, has to move at the speed of electrons.

10 GHz is just how many pulses get sent from the clock signal, not how fast the signal is sent or travels through.

If electrons moved at the speed of light... those would be some pretty amazing electrons; I would think everyone would want to find some of those electrons...
October 4, 2003 4:54:10 AM

Depends on what you mean by "signal". The electron itself isn't the signal; it's simply the carrier. The electric field it generates is the signal. When a transistor is open and allows flow of electrons, what it really allows is a change in the electric field. That change in the electric field then propagates at the speed of light.
As for the time it takes for a signal to cross the die, that doesn't happen often. With modern pipelining architectures, most signals only go from one part of the die to a part directly next to it. That counts as one "clock". If you pipeline multiple signals, then while it takes one signal 20 cycles to go from start to end, once you fill up the pipeline you still get a throughput of 1 signal every cycle.

"We are Microsoft, resistance is futile." - Bill Gates, 2015.
October 4, 2003 3:06:46 PM

Sorry, but you are very wrong...

Let me use an analogy everybody knows: sound. It travels at roughly 1000 km/h through air. However, this doesn't mean that the air being pushed by your speaker travels 1000 km in an hour. Only the sound wave travels this fast. Not even the air molecules immediately leaving the speaker have this speed, since the coil doesn't move that fast, and the speed of sound is independent of the frequency!

Electrical signals in CPUs can best be compared to shockwaves. In front of and behind the wave, the electron density is slightly different. The charge required to create 1 V on the minuscule wires in a CPU is unimaginably small and requires only a slight displacement of the electrons. This happens at nearly the speed of light, and thus this is the speed at which the signal travels. Again, it's not the electrons that are displaced at the speed of light; it's the displacement itself that propagates at this speed, just like the sound wave.
October 4, 2003 3:30:16 PM

Let me reply to everybody in one post.
Slvr_pheonix, shadus, spud, kemche and sonoran explained why these manufacturers keep the die size small. Now I understand there are many factors, such as cost, power consumption and heat, behind the small die size. But what will happen when these dies accommodate the maximum number of transistors (up to the physical limit)? I was looking at that situation, and I wanted to know whether, at that point, there is an option of increasing the die size.

And you, c0d1f1ed, got the point. If your "speed of electrons" theory is right, that is the actual reason, because you pointed out that even if we increase the die size there is no performance advantage (or it will hamper performance). That might be the real reason, but our dear friend imgod2u pointed out that efficient pipelining can overcome that problem. I am really confused and cannot comment now, but at first observation that pipelining theory is not valid, because as the entire processor is synchronized with the same clock (is there any new technology which is asynchronous?), that clock signal itself takes some time to reach all the parts. However, I need to study the subject from various angles. Dear friends, if you can locate any study material about this processor manufacturing, please let me know. And if anybody's opinion is different about the "real" problem with increasing the die size, prove it.
October 4, 2003 3:44:22 PM

I thought electrons moved like 1 meter every 7 hours or something like that. The actual electricity is just an electron exchange from each atom, which is pretty close to the speed of light in theory.

In the case of semiconductors, it's a passing of electrons from one layer of doped silicon to the other.

But wafers are crazy expensive and take a hell of a long time to make, since they have to do the "spinning" of the ultra-pure silicon. Then they polish like 80% of it away. Then they toss it in an oven that oxidates it or something like that, causing a thin layer of silicon oxide, I think, that acts like an electrical terminal, like a +/- deal.

It's crazy complicated; there are other steps, but it's been a while since I had to worry about it, so I kinda forgot.

Then there is lithography, which is its own beast.

~Jeremy
Unofficial Intel PR spokesman.(nVidia fill in rep for CoolSquirtle)

October 4, 2003 4:15:03 PM

Quote:
Uh... the signal, no matter how you look at it, has to move at the speed of electrons.

10 GHz is just how many pulses get sent from the clock signal, not how fast the signal is sent or travels through.

If electrons moved at the speed of light... those would be some pretty amazing electrons; I would think everyone would want to find some of those electrons...

Uh, sorry, but you're wrong. Electricity propagates as an electron-movement signal, much like how the information that water is flowing in a hose gets transported much faster than the water itself. Simply put, each water element just "pushes" the next.

Mephistopheles
October 4, 2003 4:27:37 PM

Quote:
Depends on what you mean by "signal". The electron itself isn't the signal; it's simply the carrier. The electric field it generates is the signal. When a transistor is open and allows flow of electrons, what it really allows is a change in the electric field. That change in the electric field then propagates at the speed of light.

Beautifully said, imgod2u. Really. Exactly. Very accurate. *congrats*

Mephistopheles
October 4, 2003 4:33:04 PM

Quote:
Not even the air molecules immediately leaving the speaker have this speed,

Erm, actually, yes, there are plenty of air molecules travelling at speeds in excess of 1000 km/h. However, the mean free path (the length they can actually travel before hitting another molecule) is very short, so air is just a gazillion molecules flying completely randomly and very quickly... and hitting each other. The speed at which a molecule can effectively travel through air is, of course, nowhere near 1000 km/h. Think about farts... if they travelled at those speeds, we'd be in trouble...

Mephistopheles
October 4, 2003 4:44:04 PM

Quote:
I thought electrons moved like 1 meter every 7 hours or something like that. The actual electricity is just an electron exchange from each atom, which is pretty close to the speed of light in theory.

That's exactly right. Electrons travel at speeds of a few dozen centimeters every hour... The electrical signal inside silicon, however, propagates at the speed of light inside silicon (actually, calling it the speed of electromagnetic radiation would be more accurate). This speed is only a fraction - for silicon, typically 1/4-1/3 - of the speed of light in vacuum. Light speed is so important for the electrical signal because photons - light particles - are the carriers of the electric/magnetic field. Determining the exact speed at which the electric signal travels is probably tricky business, because the index of refraction (http://www.ioffe.rssi.ru/SVA/NSM/Semicond/Si/Figs/141.g...) - the property that relates the speed of light in vacuum to the speed of light inside silicon - changes with photon wavelength... So, who knows?... I don't know anyone who works that closely with silicon...

Mephistopheles
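
That "centimeters per hour" figure is easy to sanity-check with the standard drift-velocity formula v = I/(n·q·A). The wire cross-section and current below are illustrative assumptions; only the order of magnitude matters:

```python
# Electron drift velocity in a copper wire: v = I / (n * q * A).
# Current and wire size are illustrative assumptions.
n = 8.5e28               # free electrons per m^3 in copper
q = 1.602e-19            # elementary charge, coulombs
area = 1.5e-6            # wire cross-section, m^2 (a 1.5 mm^2 household wire)
current = 5.0            # amperes

v = current / (n * q * area)                 # meters per second
print(f"drift velocity: {v:.2e} m/s = {v * 3600 * 100:.0f} cm/hour")
```

This comes out to well under a millimeter per second - centimeters to a meter per hour - while the field that the electrons carry propagates at a large fraction of c.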
October 4, 2003 5:27:47 PM

Dear Mephistopheles and Spud,
So what I am understanding from your posts is that a signal will take some time to reach from one end of the silicon to the other. The processor clock period should be larger than the propagation delay of the signal. As the propagation delay (one end to the other) increases with the die size, there is definitely some limit on the die size for a given clock frequency. So if you want to attain high clock rates, your die sizes should be small. Hence, even if we are not considering factors like heat, power consumption and cost, we cannot increase the die size, because even though increasing the die size increases the number of transistors, it will limit the processor clock rate and hence hamper the overall performance.
Tell me your opinions about the above theory. Is it the real thing?
October 4, 2003 5:49:52 PM

As far as I know there is no real limit to die size. Just a machine (it's the correct term to describe a CPU) tick issue will arise, plus clock-generation issues, along with timing issues in getting all the silicon to work as "one" machine.

As far as I know, the P7 has, I think, 3 clock generators on board. One for the 2x-pumped ALUs and the FPU - not FPUs but FPU, since there are just extra issue ports being used (it holds up pretty well compared to the Athlon's 3 FPUs). Another for the external core frequency, and another to manage the cache subsystems. I'll have to double-check for you guys to make sure I am right... pretty sure I am though.

But as far as I understand, a "clock signal" is just a power on/off sequence for the gates to work; could be wrong though.

But increasing transistors will not hamper the clock speed so much, as a timing system has to be built into the logic of the machine. Look at the Alpha and Itanium series of machines; they are monstrous. The Itanium is like 420 million transistors (most of it cache), and the Alpha was always 3x to 4x the size of any comparable x86 machine for the longest time.

~Jeremy
Unofficial Intel PR spokesman.(nVidia fill in rep for CoolSquirtle)

October 4, 2003 7:00:46 PM

Quote:
Uh, sorry, but you're wrong. Electricity propagates as an electron-movement signal, much like how the information that water is flowing in a hose gets transported much faster than the water itself. Simply put, each water element just "pushes" the next.

Sorry if the explanation was not clear. I am neither a physicist nor an electrical engineer, therefore my conclusions are simple at best. I do, however, have a basic understanding of transistors and digital circuits.

It is my understanding that a transistor works like a gate. If we compare it to water, it is exactly like a gate. For example, when the transistor is 'on', water is allowed to flow through, but when it is 'off', water is not allowed to flow through. So my reasoning is this: when the transistor switches its state, there is a delay for the water to pass, since there is no water on the other side of the gate. Thus, the speed at which the water will reach the next gate depends on how fast the water flows (if we assume that air does not stop the flow of water).

How this compares to electricity and current transistors, I am not sure.
October 4, 2003 7:40:36 PM

Quote:
It is my understanding that a transistor works like a gate. If we compare it to water, it is exactly like a gate. For example, when the transistor is 'on', water is allowed to flow through, but when it is 'off', water is not allowed to flow through. So my reasoning is this: when the transistor switches its state, there is a delay for the water to pass, since there is no water on the other side of the gate. Thus, the speed at which the water will reach the next gate depends on how fast the water flows (if we assume that air does not stop the flow of water).


There is a delay, but you have to consider that the water starts moving almost instantaneously. As soon as the transistor gate is open, water can *begin* moving. Now, as it moves, it pushes against the electron next to it, and so on and so on. That push is the signal. The water particle itself only has to move an infinitesimally small amount of space, but it still pushes the water particle next to it, and so on. The question is how fast that "push" propagates.
Think of a line of dominoes. If you push one domino, it'll knock down the next one, and then the next, then the next. The domino itself moves a very small distance, but that "pushing" signal still propagates. For dominoes, the force used to push is a physical force (although at its most basic, it's still electric interactions). That propagation is limited to whatever speed the dominoes fall at. Electrons, however, use electric fields to "push" the next electron. That electric field transfers the "pushing" force at the speed of light. So as one electron begins to move and starts pushing, another electron that is 1 km away (assuming no dissipation) will feel that push (1 km)/(300,000 km/s) seconds later.

"We are Microsoft, resistance is futile." - Bill Gates, 2015.
October 5, 2003 12:02:44 PM

You are 100% correct on that one. You have to up the voltage the more transistors you have. If you increase die size, you won't be able to increase the transistor count very well until your transistor size decreases... more transistors = more components = more heat. Hence the reasoning behind 0.15µ and the leaf-blower heat sinks you need for them (aka Volcano 6+).

----------
<b>I'm not normally a religious man, but if you're up there, save me, Superman! </b> <i>Homer Simpson</i>
October 5, 2003 3:03:17 PM

Quote:
since there is no water on the other side of the gate.

There is water on the other side of the gate alright; it's just not moving.

Mephistopheles
October 5, 2003 6:54:47 PM

OK, now the explanation is clearer, at least to me and after reading a couple of documents.

Quote:
infinitesimally small amount of space

That would have helped the explanation. Maybe it was assumed, I don't know, but if you line up dominoes with about an inch of space between them, it's quite obvious the delay is there; if you line them up with very little space, the dominoes begin to act more like a single block.

Quote:
CPUs run at light speed.

That is the original line that "bugs" me. But maybe he was referring to the electrical signal rather than the processor's computational capability, which is what I incorrectly addressed.

Quote:
There is water on the other side of the gate alright; it's just not moving.

That makes more sense.



As for the topic at hand (die size), I found a good indirect answer/explanation relating clock speed, die size, and transistor capabilities:

http://www.physlink.com/Education/AskExperts/ae391.cfm

In short: assuming perfect transistors and perfect processor design, the clock speed is limited by the speed of electrical signals and the wire length. However, the first and second answers suggest that processor circuit design is obviously hard and there isn't a perfect transistor (yet).
October 5, 2003 9:26:20 PM

mmv
October 6, 2003 2:00:32 PM

Quote:
I am really confused and cannot comment now, but at first observation that pipelining theory is not valid, because as the entire processor is synchronized with the same clock (is there any new technology which is asynchronous?), that clock signal itself takes some time to reach all the parts.

The synchrony is far less strict today than it was years ago. Intel's CPUs are perfect examples of this. The Pentium 4 is full of buffers and caches that allow the components of the CPU to run very asynchronously, and yet also allow the CPU itself to put the answers, as they come in, back into a synchronous ordering.

As the clock speed of CPUs increases, it becomes more and more necessary for the sections of the CPU to perform independently, and for a component in the CPU to organize the results that come in all willy-nilly into sequential responses. So CPUs are moving more and more from centralized clock-based execution to clockless asynchronous execution that is then stacked into a pseudo-clocked response.

Consequently, this also solves the problems associated with die sizes. Because CPUs are becoming more and more oriented towards turning asynchronous processing into a pseudo-synchronized response, this also allows more die space to be used. Hence why in the future we will be having multi-core CPUs. That way you get double the execution potential in a pseudo-synchronous controlled environment. (Because it's better to have 'two' CPUs each running independently on the same die and synchronized by software than it is to have 'one' CPU with twice as many execution units, where the delay in response of the execution units increases the further away the components are.)

People don't understand how hard being a dark god can be. - Hastur
October 7, 2003 5:23:41 AM

That was new information for me and it sounds interesting. Can you point me to any material covering this "asynchronous processor" subject?
October 8, 2003 2:36:38 AM

I find it sad AMD does not go to such lengths on their site with PDFs on the K7 core. It would be damn nice.

October 8, 2003 3:29:57 AM

Monkeys can't get Adobe user licences; you have to be some sort of citizen, not a "group".

~Jeremy
Unofficial Intel PR Spokesman.(nVidia fill in rep for CoolSquirtle)

October 8, 2003 5:35:03 PM

Quote:
I find it sad AMD does not go to such lengths on their site with PDFs on the K7 core. It would be damn nice.

You mean like these for the K7 (http://www.amd.com/us-en/Processors/TechnicalResources/...) and for the K8 (http://www.amd.com/us-en/Processors/TechnicalResources/...)?

People don't understand how hard being a dark god can be. - Hastur
October 9, 2003 3:19:34 PM

So, what is the conclusion? There is no limit on die size? To be more specific: imagine you are using perfect liquid cooling to beat the heat, and just leave the power consumption issue aside - is there then no limit on die size? As I read the posts, asynchronous technology will make it possible to have any die size (but I guess there is serious performance degradation compared to synchronous ones for a given transistor count).
October 9, 2003 4:27:19 PM

Quote:
So, what is the conclusion? There is no limit on die size? To be more specific: imagine you are using perfect liquid cooling to beat the heat, and just leave the power consumption issue aside - is there then no limit on die size? As I read the posts, asynchronous technology will make it possible to have any die size (but I guess there is serious performance degradation compared to synchronous ones for a given transistor count).

The conclusion is that technically there is no limit on die size *if* you design the CPU around asynchronous operation. *However*, the benefits from increasing the die size fall off rapidly as the die size increases, because of the cycles wasted waiting for far-away units to give their response.

The much more efficient solution is to use the die space to mount two (or more) independent cores which can each operate at peak efficiency. The only flaw with this is that you would also have to use multi-threaded software in order to harness that processing power.

People don't understand how hard being a dark god can be. - Hastur
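
As a rough illustration of what "multi-threaded software" means in practice, here is a minimal Python sketch that splits one computation across two cores (the task and the chunking are invented for the example):

```python
# Minimal work splitting across two cores.
# Uses processes so each chunk can run on its own core.
from concurrent.futures import ProcessPoolExecutor

def partial_sum(bounds):
    """Sum of squares over a half-open range; one chunk per core."""
    lo, hi = bounds
    return sum(i * i for i in range(lo, hi))

if __name__ == "__main__":
    n = 10_000_000
    chunks = [(0, n // 2), (n // 2, n)]      # split the work in two
    with ProcessPoolExecutor(max_workers=2) as pool:
        total = sum(pool.map(partial_sum, chunks))
    print(total)
```

The catch the post mentions is exactly this: the software has to be written so the work *can* be split; a second core does nothing for a program written as one sequential stream.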
October 9, 2003 4:44:18 PM

What I don't easily understand or fathom is: how can you design clockless CPUs that manage everything right?

Suppose you had a pipelined ALU, and the second stage is clockless. Suppose it is slower. Well, how can it manage to take in what the faster stage is doing?

I'm just wondering how they cooperate, or even time themselves to work right. Also, how DO you make clockless CPUs? Or does the clock not exist as a unity, but rather each part is clocked at something, and you just can't state one real clock representing the entire core?

October 9, 2003 5:06:40 PM

Quote:
You mean like these for the K7 and for the K8?


Well I'll be damned, thanks!
I checked the x86 optimization guide thing. Indeed this is what I wanted to see, especially the latter part, which discusses the microarchitecture in depth. Totally cool!

Thanks again Slvr, I shoulda searched a bit there heh.

Anonymous
October 9, 2003 5:44:55 PM

>So,what is the conclusion?There is no limit in die size?To
>be more specific, imagine you are using perfect liquid
>cooling to beat the heat and just leave the power
>consumption issue-so there is no limit in die size?

Technically, I guess there is hardly any limit to how big a CPU you could make. Still, technically it would be a challenge to clock such an immense piece of silicon high enough to get any benefit from the increased transistor count.

The more important issue, though, is economic. While in theory you could design a CPU the size of a 300mm wafer, the reality is that such a CPU would be completely unaffordable. Let me explain:

My brother works at a company that designs CMOS-based image sensors for aerospace and medical applications, but also digital cameras (they designed the (worthless) 14 Mpixel sensor for Kodak's latest prosumer camera). Anyway, production cost goes up exponentially with "die" size. They are currently working on a sensor for medical radioscopy which is as big as a single wafer (200mm). I'm not sure what that beast will cost, but I doubt it's below $10,000 per chip, even using comparably ancient process technologies (.50 or .35, I'm not too sure) and with no real clock-scaling issues. No wonder, if you realize they may need as many as 4 wafer runs to get just one (partially) working sensor. And with an image sensor you can live with and work around a few pixels that don't work (like you can live with a CPU with a few defective cache transistors), but once a defect (due to wafer impurity and other things) hits a logic part, your sensor or chip simply won't work.

Now, since performance doesn't exactly scale linearly with die (cache) size (just look at the P4 versus the P4EE), but cost does scale exponentially with die size, there is a point of diminishing returns above which it just doesn't make sense to increase the die size any further. Especially as an even bigger transistor count will not only backfire on cost, but on clock scaling and power consumption as well.

= The views stated herein are my personal views, and not necessarily the views of my wife. =
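
That exponential-cost claim can be made concrete with the classic Poisson yield model, Y = e^(-D·A): the fraction of working dies falls off exponentially with die area, so the cost per *working* die grows much faster than the area itself. The defect density and per-area cost below are invented for illustration:

```python
# Poisson yield model: the fraction of working dies falls off
# exponentially with die area, so cost per *good* die explodes.
import math

defect_density = 0.005   # defects per mm^2 (illustrative assumption)
cost_per_mm2 = 0.10      # dollars of wafer area per mm^2 (assumed)

for area in (100, 200, 400, 800):
    wafer_yield = math.exp(-defect_density * area)
    cost_good_die = area * cost_per_mm2 / wafer_yield
    print(f"{area:>4} mm2: yield {wafer_yield:.1%}, "
          f"~${cost_good_die:.2f} per working die")
```

With these numbers, going from 100 mm² to 800 mm² takes yield from roughly 61% to under 2%, and the cost per working die up by a factor of a few hundred - the wafer-sized sensor story above in miniature.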
October 9, 2003 6:27:57 PM

Quote:

you would also have to use multi-threaded software in order to harness that processing power.

That is the simpler way, and there's nothing new in it, as the work is done by the software. But I will only be impressed if there is a completely new architecture where this is done in the processor core itself, irrespective of the software.
October 9, 2003 6:32:23 PM

An example from experience is worth a lot. Thank you.
October 9, 2003 7:58:28 PM

Quote:
I'm just wondering how they cooperate, or even time themselves to work right. Also, how DO you make clockless CPUs? Or does the clock not exist as a unity, but rather each part is clocked at something, and you just can't state one real clock representing the entire core?

I think the idea of a clockless CPU is just a bunch of independent processing units and a lot of queues. Say it's got three ALUs. The ALU queue gets loaded up with ten commands, and each time an ALU finishes one, it just pops the next from the queue and drops its answer back. And the FPUs do the same with their own queue. And somewhere some central management takes those answers from the queues, re-linearizes them, and spits out the results as fast as it can. And if there's nothing in the queue, you just idle until something is there.

If done right, the synchronization is just handled by units that look at the queues for opportunities and put their answers back into that queue (or maybe into an answer queue) when they're done. And each queue item/answer would obviously have to have an identifier on it, of course. :)  And the central management that keeps things in order has its own queue of commands and responses, of course, which is how dependent actions are handled safely.

Obviously commands would have to be broken down into lots of really basic microcode to ensure dependency order is observed, but it's really not that different from how a P4 or an Itanium already operates. :o 

Who needs synchronization to a periodic clock when you have queues? That's how we write most of our multi-threaded software here. Actually... we write a lot of single-threaded software like that too, come to think of it. Queues rock.
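
As a loose software illustration of that queue idea (not how any real CPU is built): tagged operations go into a shared queue, independent "units" finish them in whatever order they happen to complete, and the answers are re-linearized by tag afterwards.

```python
# Toy "clockless" execution: tagged ops go into a queue, independent
# units finish them in any order, answers are re-linearized by tag.
import queue
import random
import threading
import time

ops = queue.Queue()
results = {}

def unit():
    """One independent execution unit: pop work until told to stop."""
    while True:
        item = ops.get()
        if item is None:                     # sentinel: shut this unit down
            break
        tag, a, b = item
        time.sleep(random.random() * 0.01)   # simulate variable completion time
        results[tag] = a + b                 # drop the tagged answer back

# Load ten tagged adds, then one shutdown sentinel per unit.
for tag in range(10):
    ops.put((tag, tag, tag * 2))
units = [threading.Thread(target=unit) for _ in range(3)]
for _ in units:
    ops.put(None)
for t in units:
    t.start()
for t in units:
    t.join()

# Central management: re-linearize the out-of-order answers by tag.
print([results[tag] for tag in sorted(results)])
```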

The biggest problem really is twofold:
1) When everything runs at full steam, that's going to generate a crapload of heat. At least on a clocked cycle, components get time to cool off between cycle hits.

2) How do you rate the CPU? In a truly clockless CPU, each chip could have very different performance depending on how well each individual unit runs, all from the same design. You'd have to come up with some way to benchmark and categorize them, but even then it'd be vague at best, whereas clocks are very accurate for determining consistency in performance.

Quote:
Well I'll be damned, thanks!
I checked the x86 optimization guide thing. Indeed this is what I wanted to see, especially the latter part, which discusses the microarchitecture in depth. Totally cool!

Thanks again Slvr, I shoulda searched a bit there heh.

Intel makes it a *lot* easier to find than AMD, and I find that Intel's documentation is usually easier to read too. But AMD does do it. They pretty much have to, since you can't do much tailoring without that kind of information. Heh heh.

People don't understand how hard being a dark god can be. - Hastur
October 9, 2003 8:22:49 PM

Quote:
Now, since performance doesn't exactly scale linearly with die (cache) size (just look at the P4 versus the P4EE), but cost does scale exponentially with die size, there is a point of diminishing returns above which it just doesn't make sense to increase the die size any further. Especially as an even bigger transistor count will not only backfire on cost, but on clock scaling and power consumption as well.

I am pretty sure the EE and Xeons use off-die cache, which would not increase the die size.



If it isn't a P6 then it isn't a procesor
110% BX fanboy
Anonymous
October 9, 2003 10:16:36 PM

No, it's on-die in both cases; the Itanium L3 is also on-die. Off-die is very rare these days (I think just about only IBM's Power4+ uses off-die, on-MCM cache, but I could be wrong). Maybe you are confused by the fact that those are L3 caches, not L2?

Either way, this doesn't change my point much. The P4EE has over 3x the amount of cache of a Northwood, yet manages roughly 10-15% better performance. Northwood is 131 mm²; the P4EE is around 220 mm². That is 70% bigger for 10% better performance. And bear in mind the next 2 megabytes would be even less effective, and far more expensive. Diminishing returns, that was my point :) 

= The views stated herein are my personal views, and not necessarily the views of my wife. =
October 10, 2003 12:40:53 AM

Yes, I am talking about the L3 cache. I do not think it is on-die... I believe it is off the CPU core but still on the CPU's PCB.


If it isn't a P6 then it isn't a procesor
110% BX fanboy
October 10, 2003 3:03:36 AM

There is no "PCB" per se on modern socket-type MPU's. The packaging contains the pins, some regulators, but that's about it. Everything's on-die. The only exception would be the Itanium Merced or the IBM Power4+, which uses packaging and keeps its cache in the same package but not on the processor die.

"We are Microsoft, resistance is futile." - Bill Gates, 2015.
Anonymous
October 10, 2003 11:30:30 AM

>Yes i am talking about the l3 cache, i do not think it is
>on die...i beleive that it is off of the cpu core but still
>on the cpus pcb.

Believe what you want; that doesn't make it true. I can't access Intel's website at the moment to give you an official link, but there is always sandpile.org: http://www.sandpile.org/impl/p4.htm

= The views stated herein are my personal views, and not necessarily the views of my wife. =
October 10, 2003 2:00:44 PM

Die Size:
  217 mm² (0.18 µm with 128 KB L2 Cache)
  217 mm² (0.18 µm with 256 KB L2 Cache)
  ??? mm² (same as above + 512 KB L3 Cache)
  ??? mm² (same as above + 1024 KB L3 Cache)
  131 mm² (0.13 µm with 256 KB L2 Cache)
  146 mm² (0.13 µm with 512 KB L2 Cache), then 131 mm²
  ??? mm² (same as above + 1024 KB L3 Cache)
  ??? mm² (same as above + 2048 KB L3 Cache)

The L3 cache is not actually part of the core; that's what I mean.


If it isn't a P6 then it isn't a procesor
110% BX fanboy
October 10, 2003 2:07:24 PM

Well, aren't the pins connected to the core via copper traces? Wouldn't that pretty much be considered a PCB? There are also resistors mounted on the bottom of the CPU, along with the pins and regulators...

What is the difference between the die and the core?


If it isn't a P6 then it isn't a procesor
110% BX fanboy
Anonymous
October 10, 2003 2:19:24 PM

WTF are you saying now? It's *on-die*, just like any L1 or L2 cache these days, and unlike older Pentium IIs and Thunderbirds that had off-die L2 caches (separate chips on a common PCB module). It's one and the same chip, a single piece of silicon.

You can define the word 'core' any way you like, even to exclude the L1 or L2 caches, in which case you are right that the L3 is not part of the "core", but you cannot separate the cache from the rest of the chip. The L3 has an impact on the die size (and a rather big impact, I might add), just like the 1 MB L2 cache impacts the A64 die size and the 2 MB L2 cache impacts Prescott's die size.

The only CPU I am aware of that still uses off-die (L3) caches is the Power4, which uses a huge 128 MB L3 cache shared by 4 chips in a single MCM. AFAIK, every other CPU only has on-die cache, be it level 1, 2 or 3. Even the Itanium with 6 MB, and soon the 9M, has its cache entirely on-die (making one huge MF die).

= The views stated herein are my personal views, and not necessarily the views of my wife. =