AMD Beyond Brisbane-"Stars" Processors

Parrot

Distinguished
Feb 13, 2005
226
0
18,680
After Brisbane come Agena FX, Agena, Kuma, Rana and Spica cores.

In addition to HyperTransport 3.0, Stars family processors feature a 128-bit floating point unit for each CPU core, DDR2-1066 support, SSE4A instructions and a split power plane. Split power planes allow the processor and internal north bridge to operate at different voltages and speeds. The advantage of split power planes is that the north bridge speed and voltage never have to change during Cool’n’Quiet power saving measures. With split power planes the Stars processors require separate PLLs for the processor and internal north bridge.
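To get a rough feel for why a separate plane matters, here is a minimal sketch using the usual rule of thumb that dynamic power scales with V^2 * f. The voltages and clocks are made-up placeholders for illustration, not AMD specs, and the helper name is hypothetical.

```python
# Rough illustration only: dynamic power scales roughly with V^2 * f.
# All voltages and clocks below are made-up placeholders, not AMD specs.

def relative_dynamic_power(voltage, freq_mhz, ref_voltage, ref_freq_mhz):
    """Dynamic power relative to a reference operating point (P ~ V^2 * f)."""
    return (voltage / ref_voltage) ** 2 * (freq_mhz / ref_freq_mhz)

# Hypothetical full-speed and Cool'n'Quiet operating points for the cores.
CORE_FULL = (1.30, 2800)  # (volts, MHz)
CORE_IDLE = (1.00, 1000)

# Hypothetical north bridge point; on its own plane it does not have to move.
NB_POINT = (1.20, 2000)

core_idle_power = relative_dynamic_power(*CORE_IDLE, *CORE_FULL)
nb_idle_power = relative_dynamic_power(*NB_POINT, *NB_POINT)

print(f"cores at idle: {core_idle_power:.2f}x of full-speed dynamic power")
print(f"north bridge at idle: {nb_idle_power:.2f}x (unchanged)")
```

With a single shared plane, the north bridge would have to follow the cores down to the lower voltage; with split planes only the core term shrinks.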
[Image: AMD roadmap, November 2006]
 

joset

Distinguished
Dec 18, 2005
890
0
18,980
Agena FX and Agena based processors offer identical features. New to the Agena FX and Agena cores is a shared L3 cache. 2MB of L3 cache will be shared between all four processor cores. The L2 cache will be 2MB as well. Clock frequencies of 2.7 GHz to 2.9 GHz are initially expected. The HyperTransport 3.0 frequency for Agena FX and Agena cores is expected to be clocked at 4000 MHz. Agena FX and Agena core processors will be manufactured using a 65nm process and carry 125W TDPs. The first Agena FX and Agena based processors are expected to arrive in Q3’2007.

Strange 1: Despite the dual-channel DDR2 IMC, why not a shared L2 cache, instead?

Strange 2: 2*125W= 250W in a 4x4 layout! Why push the Agena FX/Skt F into a tremendously expensive (if fully loaded), darn hot & power hungry platform (or is it platformance?!), for the enthusiast space, precisely?!

Strange 3: 2.9GHz @65nm? Is that all?! (We all know that, for a given transition node, there's always a beginning; hence, so much for "initially".)

(So when do we get a programmable IMC? Just wondering...)


Cheers!
 

Pippero

Distinguished
May 26, 2006
594
0
18,980
Strange 1: Despite the dual-channel DDR2 IMC, why not a shared L2 cache, instead?
Because of the shared L3.


Strange 2: 2*125W= 250W in a 4x4 layout! Why push the Agena FX/Skt F into a tremendously expensive (if fully loaded), darn hot & power hungry platform (or is it platformance?!), for the enthusiast space, precisely?!
Why not?
Once the platform is there, if someone wants to have a "non-server" 8-core, can have it...
It's gonna be a tiny niche, obviously.
 

mr_fnord

Distinguished
Dec 20, 2005
207
0
18,680
Higher latencies and more complex silicon for little gain with a shared L2 is my guess. The shared L3 helps alleviate MC load more than anything, most likely. If different cores are working on different threads or contexts sharing L2 will have no gains.
 
I like AMD's simple naming scheme.

Athlon 64
Athlon 64 X2
Athlon 64 X4?


Much less confusing (for the general public) than the "Core 2" line. Many people will be confused: "Why does a Core 2 processor have 4 cores!?"

They won't understand that the two stands for the second generation of the "Core" line.
 

zarooch

Distinguished
Apr 28, 2006
350
0
18,780
I like AMD's simple naming scheme.

Athlon 64
Athlon 64 X2
Athlon 64 X4?


Much less confusing (for the general public) than the "Core 2" line. Many people will be confused: "Why does a Core 2 processor have 4 cores!?"

They won't understand that the two stands for the second generation of the "Core" line.

It's Core 2 Duo
 

Alyarbank

Distinguished
Jul 12, 2006
189
0
18,680
I don't think it's a typo:

duad - two items of the same kind
couplet, distich, duet, duo, dyad, twain, twosome, brace, pair, span, yoke, couple
2, II, two, deuce - the cardinal number that is the sum of one and one or a numeral representing this number
doubleton - (bridge) a pair of playing cards that are the only cards in their suit in the hand dealt to a player


I think it's a 2x2
 

Pippero

Distinguished
May 26, 2006
594
0
18,980
Why not 4MB of shared L2 instead of 4x512kB of exclusive L2 + 2MB of shared L3 cache?
Interesting question.
As always with CPU design, any given choice is a compromise, and a lot depends on the actual implementation (e.g. latencies along the L1-L2-L3 hierarchy, bandwidth, number of data ports, etc.).
For example, the shared L3 cache might mean more latency (in some cases) than a shared L2, but on the other hand it allows smaller, faster L2s, which might more than compensate for it (when you don't need to go fetch from L3).
A shared L2 is a much more complicated design IMO, because you need a much higher-performing interface: both cores are going to access it quite often, while the same is not true for the L3 cache (so I guess AMD could make it single-ported, for example).
But the biggest advantage of the shared L3 has to do with AMD's new "modular architecture" approach to multicore CPU design: if you have 2 CPUs your shared L2 cache needs fewer ports and less bandwidth than if you have 4 CPUs, not to mention that managing the allocation of space etc. becomes more complicated.
So you'd have to completely redesign the cache.
AMD instead can just slap in more cores (like 4 or 8) and keep the same L3 cache (except maybe scaling the size), or even remove it completely for a 2-core design.
That's just my 2 cents anyway.
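To put rough numbers on that trade-off, here is a back-of-the-envelope sketch of average memory access time (AMAT) for the two layouts. Every latency and hit rate below is a made-up placeholder, not a measurement of any AMD part, and the `amat` helper is just an illustration of the standard formula.

```python
# Back-of-the-envelope average memory access time (AMAT) in core cycles.
# Every latency and hit rate here is a made-up placeholder, not a measurement.

def amat(levels, memory_latency):
    """levels: list of (hit_latency, hit_rate) pairs, from L1 outward.

    An access pays a level's latency only if it gets as far as that level;
    whatever misses in the last cache level pays the memory latency too.
    """
    total = 0.0
    reach = 1.0  # fraction of accesses that get this far down the hierarchy
    for latency, hit_rate in levels:
        total += reach * latency
        reach *= 1.0 - hit_rate
    return total + reach * memory_latency

MEMORY = 200  # hypothetical DRAM latency in cycles

# Option A: one big shared L2 (assumed slower due to sharing), no L3.
shared_l2 = [(3, 0.95), (20, 0.80)]

# Option B: small, fast private L2 per core plus a shared L3.
private_l2_shared_l3 = [(3, 0.95), (12, 0.60), (40, 0.60)]

a = amat(shared_l2, MEMORY)
b = amat(private_l2_shared_l3, MEMORY)
print(f"shared L2:              {a:.2f} cycles")
print(f"private L2 + shared L3: {b:.2f} cycles")
```

With these particular placeholders the two layouts come out about even, and nudging any latency or hit rate tips it one way or the other, which is really the point: the answer depends on the implementation details.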
 

joset

Distinguished
Dec 18, 2005
890
0
18,980
Strange 1: Despite the dual-channel DDR2 IMC, why not a shared L2 cache, instead?
Because of the shared L3.

Nope. 'Instead'.


Strange 2: 2*125W= 250W in a 4x4 layout! Why push the Agena FX/Skt F into a tremendously expensive (if fully loaded), darn hot & power hungry platform (or is it platformance?!), for the enthusiast space, precisely?!
Why not?
Once the platform is there, if someone wants to have a "non-server" 8-core, can have it...
It's gonna be a tiny niche, obviously.

Well, a 'non-server' 8-core will still be a 2*125W TDP small niche; and, the plain vanilla Agena will not be compatible w/Skt F, so the story goes.

Of course, my point was to question AMD's choice to put its top-performing 125W TDP part into the 4x4 layout, leaving no other option available, it seems. But you already know that...


Cheers!
 

joset

Distinguished
Dec 18, 2005
890
0
18,980
Why not 4MB of shared L2 instead of 4x512kB of exclusive L2 + 2MB of shared L3 cache?
Interesting question.
As always with CPU design, any given choice is a compromise, and a lot depends on the actual implementation (e.g. latencies along the L1-L2-L3 hierarchy, bandwidth, number of data ports, etc.).
For example, the shared L3 cache might mean more latency (in some cases) than a shared L2, but on the other hand it allows smaller, faster L2s, which might more than compensate for it (when you don't need to go fetch from L3).
A shared L2 is a much more complicated design IMO, because you need a much higher-performing interface: both cores are going to access it quite often, while the same is not true for the L3 cache (so I guess AMD could make it single-ported, for example).
But the biggest advantage of the shared L3 has to do with AMD's new "modular architecture" approach to multicore CPU design: if you have 2 CPUs your shared L2 cache needs fewer ports and less bandwidth than if you have 4 CPUs, not to mention that managing the allocation of space etc. becomes more complicated.
So you'd have to completely redesign the cache.
AMD instead can just slap in more cores (like 4 or 8) and keep the same L3 cache (except maybe scaling the size), or even remove it completely for a 2-core design.
That's just my 2 cents anyway.

This was my point, precisely.
At first, it seemed odd that AMD would implement a shared L3 cache, increasing latency; your two cents do seem to make sense (to me) when scaling up to more cores... since the IMC would partly compensate for a smaller, core-exclusive L2 cache.


Cheers!