Intel's glued-together dual-cores

Guest · Dec 15, 2004

Archived from groups: comp.sys.ibm.pc.hardware.chips, comp.sys.intel (More info?)

SiliconStrategies.com - Intel 'dual-core' could be two die glued
together, says report
http://www.siliconstrategies.com/article/showArticle.jhtml?articleId=55301654

LONDON - The planned dual-core processor from Intel Corp.
known as Smithfield could start out as two Pentium 4 chips in a single
package, according to a report that appeared online Tuesday (Dec 14.).

According to the report in The Register, which appeared as Intel held a
telephone press conference to discuss its dual-core processor which is
expected to ship mid-2005, a company executive did not deny the
suggestion that Smithfield would be based on two Pentium 4 processors
glued together in a single package.

Smithfield would initially be fabbed using a 90-nanometer manufacturing
process, but would migrate to a 65-nm process in 2006, the report
quoted Steve Smith, vice president for Intel's desktop platforms group,
as saying.

By the end of 2006 Intel expects over 70 per cent of its desktop CPU
production to be dual-core chips, Smith also said, according to the
report.

The report said Smith declined to comment on whether Smithfield is one
or more chips in a single package and would only say that Smithfield
contains two execution cores. Smithfield is expected to operate at a
lower clock frequency than a single P4.

Guest · Dec 16, 2004

Archived from groups: comp.sys.ibm.pc.hardware.chips,comp.sys.intel (More info?)

On 15 Dec 2004 08:28:44 -0800, "YKhan" <yjkhan@gmail.com> wrote:

>SiliconStrategies.com - Intel 'dual-core' could be two die glued
>together, says report
>http://www.siliconstrategies.com/article/showArticle.jhtml?articleId=55301654
>
>LONDON - The planned dual-core processor from Intel Corp.
>known as Smithfield could start out as two Pentium 4 chips in a single
>package, according to a report that appeared online Tuesday (Dec 14.).
>
>According to the report in The Register, which appeared as Intel held a
>telephone press conference to discuss its dual-core processor which is
>expected to ship mid-2005, a company executive did not deny the
>suggestion that Smithfield would be based on two Pentium 4 processors
>glued together in a single package.
>
>Smithfield would initially be fabbed using a 90-nanometer manufacturing
>process, but would migrate to a 65-nm process in 2006, the report
>quoted Steve Smith, vice president for Intel's desktop platforms group,
>as saying.
>
>By the end of 2006 Intel expects over 70 per cent of its desktop CPU
>production to be dual-core chips, Smith also said, according to the
>report.
>
>The report said Smith declined to comment on whether Smithfield is one
>or more chips in a single package and would only say that Smithfield
>contains two execution cores. Smithfield is expected to operate at a
>lower clock frequency than a single P4.
>
>

Hmmm, VIA talked along similar lines a month or so ago... calling it "twin
core" IIRC.

Rgds, George Macdonald

"Just because they're paranoid doesn't mean you're not psychotic" - Who, me??

Guest · Dec 16, 2004

Archived from groups: comp.sys.ibm.pc.hardware.chips,comp.sys.intel (More info?)

George Macdonald wrote:
> Hmmm, VIA talked along similar lines a month or so ago... calling it "twin
> core" IIRC.

Yup, Intel is racing to keep up against VIA. 🙂

Yousuf Khan

keith · Dec 17, 2004

Archived from groups: comp.sys.ibm.pc.hardware.chips,comp.sys.intel (More info?)

On Thu, 16 Dec 2004 08:33:20 -0500, Yousuf Khan wrote:

> George Macdonald wrote:
>> Hmmm, VIA talked along similar lines a month or so ago... calling it "twin
>> core" IIRC.
>
> Yup, Intel is racing to keep up against VIA. 🙂

Ouch! You're cruel! ;-)

--
Keith

Guest · Dec 17, 2004

Archived from groups: comp.sys.ibm.pc.hardware.chips,comp.sys.intel (More info?)

YKhan wrote:
> SiliconStrategies.com - Intel 'dual-core' could be two die glued
> together, says report
> http://www.siliconstrategies.com/article/showArticle.jhtml?articleId=55301654
>
> LONDON - The planned dual-core processor from Intel Corp.
> known as Smithfield could start out as two Pentium 4 chips in a single
> package, according to a report that appeared online Tuesday (Dec 14.).

I believe the original PentiumPro was two chips in a single die carrier,
the CPU and the cache.
>
> According to the report in The Register, which appeared as Intel held a
> telephone press conference to discuss its dual-core processor which is
> expected to ship mid-2005, a company executive did not deny the
> suggestion that Smithfield would be based on two Pentium 4 processors
> glued together in a single package.

To the point, are these current production compatible P4 (ie. HT
enabled)? And do they share L2 (or L3) cache?
>
> Smithfield would initially be fabbed using a 90-nanometer manufacturing
> process, but would migrate to a 65-nm process in 2006, the report
> quoted Steve Smith, vice president for Intel's desktop platforms group,
> as saying.
>
> By the end of 2006 Intel expects over 70 per cent of its desktop CPU
> production to be dual-core chips, Smith also said, according to the
> report.
>
> The report said Smith declined to comment on whether Smithfield is one
> or more chips in a single package and would only say that Smithfield
> contains two execution cores. Smithfield is expected to operate at a
> lower clock frequency than a single P4.
>
>

There are a lot of interesting questions about this coming technology,
it could be really neat or it could be a true cob job.

--
bill davidsen (davidsen@darkstar.prodigy.com)
SBC/Prodigy Yorktown Heights NY data center
Project Leader, USENET news
http://newsgroups.news.prodigy.com

mygarbage2000 · Dec 17, 2004

Archived from groups: comp.sys.ibm.pc.hardware.chips,comp.sys.intel (More info?)

On 15 Dec 2004 08:28:44 -0800, "YKhan" <yjkhan@gmail.com> wrote:

>SiliconStrategies.com - Intel 'dual-core' could be two die glued
>together, says report
>http://www.siliconstrategies.com/article/showArticle.jhtml?articleId=55301654
>
>LONDON - The planned dual-core processor from Intel Corp.
>known as Smithfield could start out as two Pentium 4 chips in a single
>package, according to a report that appeared online Tuesday (Dec 14.).
>
>According to the report in The Register, which appeared as Intel held a
>telephone press conference to discuss its dual-core processor which is
>expected to ship mid-2005, a company executive did not deny the
>suggestion that Smithfield would be based on two Pentium 4 processors
>glued together in a single package.
....snip...
I already have an oil-filled electric heater that has dual (600 W and
900W) core. The cores can be turned on separately or together, thus
providing 3 heating levels. Is Intel-branded dual-core P4 space
heater going to have the same feature, i.e. could one of the cores be
turned off when it gets too hot in the room?
;-)

Guest · Dec 18, 2004

Archived from groups: comp.sys.ibm.pc.hardware.chips,comp.sys.intel (More info?)

Bill Davidsen wrote:
> I believe the original PentiumPro was two chips in a single die carrier,
> the CPU and the cache.

Yes, and I would guess that the current production Xeons with L3 caches
are also similar, with the L3 being a separate chip?

> To the point, are these current production compatible P4 (ie. HT
> enabled)? And do they share L2 (or L3) cache?

No, they don't share any of their caches with each other. Actually, the
AMD dual-cores are going to be similar to this too, with no shared
cache. You lose a lot of cost savings at the very least, by not
integrating the L2 caches. But you might get slightly better performance
by having the dedicated L2's.

> There are a lot of interesting questions about this coming technology,
> it could be really neat or it could be a true cob job.

I think the main question is whether the internal CPU-CPU communications
mechanism is properly designed or just cobbled together. A properly
designed one would reduce if not eliminate entirely the amount of
cache-snoop traffic going over the FSB.

Yousuf Khan

Guest · Dec 18, 2004

Archived from groups: comp.sys.ibm.pc.hardware.chips,comp.sys.intel (More info?)

"Yousuf Khan" <bbbl67@ezrs.com> wrote in message
news:3YadnSfwH88y61ncRVn-1g@rogers.com...
> No, they don't share any of their caches with each other. Actually, the
> AMD dual-cores are going to be similar to this too, with no shared cache.
> You lose a lot of cost savings at the very least, by not integrating the
> L2 caches. But you might get slightly better performance by having the
> dedicated L2's.
>
> Yousuf Khan

Interesting, I thought that the DC Opterons were going to share their L2. I
could have sworn I saw that in one of their presentations.

Carlo

Guest · Dec 19, 2004

Archived from groups: comp.sys.ibm.pc.hardware.chips,comp.sys.intel (More info?)

On Sat, 18 Dec 2004 12:20:53 -0500, Yousuf Khan <bbbl67@ezrs.com>
wrote:

>Bill Davidsen wrote:
>> I believe the original PentiumPro was two chips in a single die carrier,
>> the CPU and the cache.
>
>Yes, and I would guess that the current production Xeons with L3 caches
>are also similar, with the L3 being a separate chip?

Actually no, all integrated on-die. The L3 just has a narrower
(64-bit vs. 256-bit) connection to the processor core and higher
latency when compared to the L2 cache. Same goes for Itaniums.

>> To the point, are these current production compatible P4 (ie. HT
>> enabled)? And do they share L2 (or L3) cache?
>
>No, they don't share any of their caches with each other. Actually, the
>AMD dual-cores are going to be similar to this too, with no shared
>cache. You lose a lot of cost savings at the very least, by not
>integrating the L2 caches. But you might get slightly better performance
>by having the dedicated L2's.

It probably also simplifies design by a fair bit. A shared cache is
going to be trickier to design than a separate one. By no means an
insurmountable problem, but it would probably just add to the
performance hit, making it not worthwhile.

Besides which we seem to be quickly getting to a point where designers
have more transistors than they can figure out what to do with.

-------------
Tony Hill
hilla <underscore> 20 <at> yahoo <dot> ca

Guest · Dec 19, 2004

Archived from groups: comp.sys.ibm.pc.hardware.chips,comp.sys.intel (More info?)

Carlo Razzeto wrote:
> Interesting, I thought that the DC Opterons were going to share their L2. I
> could have sworn I saw that in one of their presentations.

Nope, and you'll notice that DC Opterons are almost exactly twice the
size of their SC versions. That's cause they not only add an extra core,
they also added the whole L2 cache too.

From what I've heard, AMD did indeed make their Opterons DC-capable
right from the beginning, but what that actually meant was that they had
simply designed the core so that if they cut two cores side-to-side,
they would see communications channels directly aligned up on each die.
So they were actually never planning on sharing caches with each other.

Yousuf Khan

Guest · Dec 20, 2004

Archived from groups: comp.sys.ibm.pc.hardware.chips,comp.sys.intel (More info?)

"Yousuf Khan" <bbbl67@ezrs.com> wrote in message
news:M8CdnYb3RcXqbFjcRVn-iA@rogers.com...
>
> Nope, and you'll notice that DC Opterons are almost exactly twice the size
> of their SC versions. That's cause they not only add an extra core, they
> also added the whole L2 cache too.
>
> From what I've heard, AMD did indeed make their Opterons DC-capable right
> from the beginning, but what that actually meant was that they had simply
> designed the core so that if they cut two cores side-to-side, they would
> see communications channels directly aligned up on each die. So they were
> actually never planning on sharing caches with each other.
>
> Yousuf Khan

Very interesting... I guess in the end it would make sense to have separate
caches for each core. Simpler to design, minimal tweaking required to fab
these chips v. single core, and presumably a small performance boost.

Carlo

keith · Dec 20, 2004

Archived from groups: comp.sys.ibm.pc.hardware.chips,comp.sys.intel (More info?)

On Sun, 19 Dec 2004 14:56:32 -0500, Yousuf Khan wrote:

> Carlo Razzeto wrote:
>> Interesting, I thought that the DC Opterons were going to share their L2. I
>> could have sworn I saw that in one of their presentations.
>
> Nope, and you'll notice that DC Opterons are almost exactly twice the
> size of their SC versions. That's cause they not only add an extra core,
> they also added the whole L2 cache too.

Which isn't surprising, considering the architecture. The second/spare
port is into the HT controller, not the L2.

> From what I've heard, AMD did indeed make their Opterons DC-capable
> right from the beginning, but what that actually meant was that they had
> simply designed the core so that if they cut two cores side-to-side,
> they would see communications channels directly aligned up on each die.
> So they were actually never planning on sharing caches with each other.

I heard the same, but I'd like to see some more detail. I'm quite sure
it's not all that "simple". There is a left-right issue and all sorts of
other trivia as well.

--
Keith

keith · Dec 20, 2004

Archived from groups: comp.sys.ibm.pc.hardware.chips,comp.sys.intel (More info?)

On Sun, 19 Dec 2004 18:29:57 -0500, Carlo Razzeto wrote:

> "Yousuf Khan" <bbbl67@ezrs.com> wrote in message
> news:M8CdnYb3RcXqbFjcRVn-iA@rogers.com...
>>
>> Nope, and you'll notice that DC Opterons are almost exactly twice the size
>> of their SC versions. That's cause they not only add an extra core, they
>> also added the whole L2 cache too.
>>
>> From what I've heard, AMD did indeed make their Opterons DC-capable right
>> from the beginning, but what that actually meant was that they had simply
>> designed the core so that if they cut two cores side-to-side, they would
>> see communications channels directly aligned up on each die. So they were
>> actually never planning on sharing caches with each other.
>>
>> Yousuf Khan
>
> Very interesting... I guess in the end it would make sense to have separate
> caches for each core. Simpler to design, minimal tweaking required to fab
> these chips v. single core, and presumably a small performance boost.

....or loss. Smaller caches and fewer ports might be faster, but
data duplication and cross-snooping might cause it to be slower. This
isn't so clear-cut.
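
To put a rough number on the data-duplication side of that trade-off, here is a toy sketch. It is not a model of any real chip: the `LRUCache` class, the `count_misses` function, the four-lines-per-core size and the access trace are all made up for illustration. It only counts read misses when two cores loop over the same small working set, and it ignores latency, port count and snoop traffic entirely, so it cannot show the cases where private caches win.

```python
from collections import OrderedDict

class LRUCache:
    """Tiny fully associative cache with least-recently-used eviction."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()
        self.misses = 0

    def access(self, addr):
        if addr in self.lines:
            self.lines.move_to_end(addr)        # hit: mark as most recent
        else:
            self.misses += 1                    # miss: bring the line in
            self.lines[addr] = True
            if len(self.lines) > self.capacity:
                self.lines.popitem(last=False)  # evict least recently used

def count_misses(trace, private, lines_per_core=4):
    """Total read misses for a trace of (core, addr) pairs on two cores."""
    if private:
        caches = [LRUCache(lines_per_core), LRUCache(lines_per_core)]
    else:
        shared = LRUCache(2 * lines_per_core)   # same total capacity, pooled
        caches = [shared, shared]
    for core, addr in trace:
        caches[core].access(addr)
    # Count each distinct cache object once (the shared one appears twice).
    return sum({id(c): c.misses for c in caches}.values())

# Both cores read the same six lines, three times over.
trace = [(core, addr) for _ in range(3) for addr in range(6) for core in (0, 1)]

print("private L2s:", count_misses(trace, private=True))
print("shared L2:  ", count_misses(trace, private=False))
```

With this particular trace the six lines fit in the pooled cache but not in either private one, so the private setup keeps evicting and refetching data the other core also holds. A trace where each core touches its own disjoint lines gives the same miss count both ways, which is the "it depends" part.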

--
Keith

Guest · Dec 20, 2004

Archived from groups: comp.sys.ibm.pc.hardware.chips,comp.sys.intel (More info?)

Tony Hill wrote:
> The L3 just has a narrower
> (64-bit vs. 256-bit) connection to the processor core and higher
> latency when compared to the L2 cache. Same goes for Itaniums.

Am I misreading you? It sounds like you are saying Itanium's L3 has a
narrower connection to the core than the L2. This is absolutely untrue.
L3 sends data to L2 before L2 sends it on. At worst it is "the same"
because data must take the same path. At best it is "twice as wide"
since the L2 can be filled faster than it can be sent on to the core.
Of course I assume "Itanium" means Itanium 2 family chips since the
original Itanium was a joke and basing any arguments about design
choices of modern processors on it is insulting.

Alex
--
My words are my own. They represent no other; they belong to no other.
Don't read anything into them or you may be required to compensate me
for violation of copyright. (I do not speak for my employer.)

Guest · Dec 21, 2004

Archived from groups: comp.sys.ibm.pc.hardware.chips,comp.sys.intel (More info?)

Alex Johnson wrote:
> Am I misreading you? It sounds like you are saying Itanium's L3 has a
> narrower connection to the core than the L2. This is absolutely untrue.
> L3 sends data to L2 before L2 sends it on. At worst it is "the same"
> because data must take the same path. At best it is "twice as wide"
> since the L2 can be filled faster than it can be sent on to the core. Of
> course I assume "Itanium" means Itanium 2 family chips since the
> original Itanium was a joke and basing any arguments about design
> choices of modern processors is insulting.

Were you involved in the project when the Alpha guys designed Tukwila?
Why did the PA-RISC guys not like their design?

Yousuf Khan

Guest · Dec 22, 2004

Archived from groups: comp.sys.ibm.pc.hardware.chips,comp.sys.intel (More info?)

On Mon, 20 Dec 2004 09:11:35 -0500, Alex Johnson <compuwiz@jhu.edu>
wrote:

>Tony Hill wrote:
>> The L3 just has a narrower
>> (64-bit vs. 256-bit) connection to the processor core and higher
>> latency when compared to the L2 cache. Same goes for Itaniums.
>
>Am I misreading you?

Err.. I think you are.

> It sounds like you are saying Itanium's L3 has a
>narrower connection to the core than the L2.

No, I was saying the exact opposite.

> This is absolutely untrue.
> L3 sends data to L2 before L2 sends it on. At worst it is "the same"
>because data must take the same path. At best it is "twice as wide"
>since the L2 can be filled faster than it can be sent on to the core.
>Of course I assume "Itanium" means Itanium 2 family chips since the
>original Itanium was a joke and basing any arguments about design
>choices of modern processors is insulting.

I don't have any numbers for Itanium, the bit I was quoting was for
the P4EE/Xeon (256-bit wide L2 cache port, 64-bit wide L3). I would
guess that the Itanium is at least a similar ratio if not the same
numbers.

Probably more important than the bandwidth is the latency. The
P4EE/Xeon chips have something like a 10 cycle L2 latency and about a
40 cycle L3 latency. With Itanium my guess is that the spread is even
wider (ie the very small 256K of L2 cache in the Itanium2 probably has
very low latency while the huge 3-9MB of L3 cache probably has rather
high latency).

-------------
Tony Hill
hilla <underscore> 20 <at> yahoo <dot> ca

Guest · Dec 22, 2004

Archived from groups: comp.sys.ibm.pc.hardware.chips,comp.sys.intel (More info?)

Yousuf Khan wrote:
> Bill Davidsen wrote:
>
>> I believe the original PentiumPro was two chips in a single die
>> carrier, the CPU and the cache.
>
>
> Yes, and I would guess that the current production Xeons with L3 caches
> are also similar, with the L3 being a separate chip?
>
>> To the point, are these current production compatible P4 (ie. HT
>> enabled)? And do they share L2 (or L3) cache?
>
>
> No, they don't share any of their caches with each other. Actually, the
> AMD dual-cores are going to be similar to this too, with no shared
> cache. You lose a lot of cost savings at the very least, by not
> integrating the L2 caches. But you might get slightly better performance
> by having the dedicated L2's.

One of those "it depends" cases, you have to do snooping if you do SMP,
the only question is where.
>
>> There are a lot of interesting questions about this coming technology,
>> it could be really neat or it could be a true cob job.
>
>
> I think the main question is whether the internal CPU-CPU communications
> mechanism is properly designed or just cobbled together. A properly
> designed one would reduce if not eliminate entirely the amount of
> cache-snoop traffic going over the FSB.

Totally agree.
>
> Yousuf Khan

--
bill davidsen (davidsen@darkstar.prodigy.com)
SBC/Prodigy Yorktown Heights NY data center
Project Leader, USENET news
http://newsgroups.news.prodigy.com

Guest · Dec 22, 2004

Archived from groups: comp.sys.ibm.pc.hardware.chips,comp.sys.intel (More info?)

Tony Hill wrote:
>> It sounds like you are saying Itanium's L3 has a
>>narrower connection to the core than the L2.
>
> No, I was saying the exact opposite.
>
> 256-bit wide L2 cache port, 64-bit wide L3

You just said you meant the opposite of what I thought you said, but
then provided numbers to back up what I thought you said. I find the
Xeon to be very strange if it has 256-bit width from L2 and 64-bit width
from L3. That's 32-bytes vs 8-bytes.

Itanium 2 returns data from the L2 256-bits at a time to either the L1D
or the L1I. It fills the L2 256-bits at a time.

> P4EE/Xeon chips have something like a 10 cycle L2 latency and about a
> 40 cycle L3 latency. With Itanium my guess is that the spread is even
> wider (ie the very small 256K of L2 cache in the Itanium2 probably has
> very low latency while the huge 3-9MB of L3 cache probably has rather
> high latency).

Itanium 2 latency is 5 cycles from L2 and 12 cycles from L3. Much
better than Xeon. Xeon has a ratio of 4:1 while Itanium 2 has a ratio
of 2.4:1. Those numbers are for McKinley (the 1GHz version). I believe
the Madison (1.5GHz version) raised the latency to L3 by 2 cycles, so 5
and 14 (2.8:1). Which corresponds to 3.33ns and 9.33ns total time for
the Itanium 2 at 1.5GHz vs (since I don't know what speed Xeon your
numbers are for I'll assume the 3.0GHz Xeon MP with 4M cache) 3.33ns and
13.33ns total times. So the L2 caches have the same access time, but
the Itanium 2 is faster to reach its larger cache. I'm curious to see
what the timings will be on the Montecito, which ups the L3 ante to 12MB.
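
The cycle-to-nanosecond conversion above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the cycle counts and clock speeds are the ones quoted in this thread (the Xeon figures are rough estimates and the 3.0GHz clock is an assumption), and the function name `latency_ns` is made up for illustration.

```python
def latency_ns(cycles, clock_ghz):
    """Convert a latency in core clock cycles to nanoseconds."""
    # One cycle at f GHz lasts 1/f nanoseconds.
    return cycles / clock_ghz

# (name, clock in GHz, L2 cycles, L3 cycles), figures as quoted in the thread
chips = [
    ("Itanium 2 (Madison, 1.5GHz)", 1.5, 5, 14),
    ("Xeon MP (assumed 3.0GHz)", 3.0, 10, 40),
]

for name, ghz, l2, l3 in chips:
    print(f"{name}: L2 {latency_ns(l2, ghz):.2f}ns, "
          f"L3 {latency_ns(l3, ghz):.2f}ns, ratio {l3 / l2:.1f}:1")
```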

Alex
--
My words are my own. They represent no other; they belong to no other.
Don't read anything into them or you may be required to compensate me
for violation of copyright. (I do not speak for my employer.)

Guest · Dec 27, 2004

Archived from groups: comp.sys.ibm.pc.hardware.chips,comp.sys.intel (More info?)

On Wed, 22 Dec 2004 08:31:36 -0500, Alex Johnson <compuwiz@jhu.edu>
wrote:

>Tony Hill wrote:
>>> It sounds like you are saying Itanium's L3 has a
>>>narrower connection to the core than the L2.
>>
>> No, I was saying the exact opposite.
>>
>> 256-bit wide L2 cache port, 64-bit wide L3
>
>You just said you meant the opposite of what I thought you said, but
>then provided numbers to back up what I thought you said. I find the
>Xeon to be very strange if it has 256-bit width from L2 and 64-bit width
>from L3. That's 32-bytes vs 8-bytes.
>
>Itanium 2 returns data from the L2 256-bits at a time to either the L1D
>or the L1I. It fills the L2 256-bits at a time.

I believe the same is true for the Xeon, it just takes 4 clock cycles
to do a fill from L3 cache.
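
The "4 clock cycles" figure is just the ratio of the two widths. A minimal sketch of that arithmetic, assuming one transfer per clock and no other overhead (the function name `fill_cycles` is hypothetical, and the widths are the ones quoted in this thread rather than documented figures):

```python
def fill_cycles(transfer_bits, port_bits):
    """Clock cycles needed to move transfer_bits over a port_bits-wide port,
    assuming one transfer per clock and no other overhead."""
    return -(-transfer_bits // port_bits)  # ceiling division

print("256 bits over a 256-bit port:", fill_cycles(256, 256), "cycle(s)")
print("256 bits over a 64-bit port: ", fill_cycles(256, 64), "cycle(s)")
```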

Perhaps someone else in this newsgroup has a bit more precise
knowledge of how it works though, I know a while back there was some
big discussion going on here about cache lines vs. cache segments and
how they all fit into getting data into and out of the processor. In
the end all I took out of the discussion was that everyone seemed to
have a different definition for everything and none of it made much
sense to me! :>

>> P4EE/Xeon chips have something like a 10 cycle L2 latency and about a
>> 40 cycle L3 latency. With Itanium my guess is that the spread is even
>> wider (ie the very small 256K of L2 cache in the Itanium2 probably has
>> very low latency while the huge 3-9MB of L3 cache probably has rather
>> high latency).
>
>Itanium 2 latency is 5 cycles from L2 and 12 cycles from L3. Much
>better than Xeon. Xeon has a ratio of 4:1 while Itanium 2 has a ratio
>of 2.4:1.

Don't quote me on those numbers being exact, just rough estimates of
what I remember them being. I'm not sure if Intel has documented the
exact latency timings for the Xeon, but if they have, I'm not sure
where to find it.

-------------
Tony Hill
hilla <underscore> 20 <at> yahoo <dot> ca
