Intel's FB-DIMM, any kind of RAM will work for your contro..

Anonymous
April 18, 2004 2:48:44 PM

Archived from groups: comp.sys.ibm.pc.hardware.chips,comp.sys.intel (More info?)

Intel is introducing a type of DRAM called FB-DIMMs (fully buffered).
Apparently the idea is to be able to put any kind of DRAM technology (e.g.
DDR1 vs. DDR2) behind a buffer without having to worry about redesigning
your memory controller. Of course this intermediate step will add some
latency to the performance of the DRAM.

It is assumed that this is Intel's way of finally acknowledging that it has
to start integrating DRAM controllers onboard its CPUs, like AMD does
already. Of course adding latency to the interfaces is exactly the opposite
of what is the main advantage of integrating the DRAM controllers in the
first place.

http://arstechnica.com/news/posts/1082164553.html

Yousuf Khan

--
Humans: contact me at ykhan at rogers dot com
Spambots: just reply to this email address ;-)
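
A rough way to picture the trade-off Yousuf describes, as a back-of-envelope sketch in Python (the access times and the AMB_HOP_NS hop delay are illustrative assumptions, not figures from Intel's spec):

# Back-of-envelope model of the FB-DIMM idea: the controller always speaks the
# same serial protocol to the on-DIMM buffer, and only the buffer knows which
# DRAM technology sits behind it.  All numbers are made up for illustration.

DRAM_ACCESS_NS = {"DDR1-400": 45.0, "DDR2-533": 42.0}  # assumed core access times
AMB_HOP_NS = 5.0  # assumed extra delay through the on-DIMM buffer, each direction

def direct_latency(dram):
    # controller wired straight to the DRAM technology it was designed for
    return DRAM_ACCESS_NS[dram]

def fbdimm_latency(dram):
    # same controller works with any technology, but request and reply both
    # cross the buffer, so every access pays the hop twice
    return DRAM_ACCESS_NS[dram] + 2 * AMB_HOP_NS

for tech in DRAM_ACCESS_NS:
    print(tech, direct_latency(tech), fbdimm_latency(tech))

The buffered path works with either DRAM technology behind it, but every access pays the extra hop, which is the latency cost being discussed.
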
Anonymous
April 18, 2004 6:28:25 PM

A buffer is meant to reduce overall latency, not to increase it AFAIK.


On Sun, 18 Apr 2004 10:48:44 GMT, "Yousuf Khan" <news.tally.bbbl67@spamgourmet.com> wrote:

>Intel is introducing a type of DRAM called FB-DIMMs (fully buffered).
>Apparently the idea is to be able to put any kind of DRAM technology (e.g.
>DDR1 vs. DDR2) behind a buffer without having to worry about redesigning
>your memory controller. Of course this intermediate step will add some
>latency to the performance of the DRAM.
>
>It is assumed that this is Intel's way of finally acknowledging that it has
>to start integrating DRAM controllers onboard its CPUs, like AMD does
>already. Of course adding latency to the interfaces is exactly the opposite
>of what is the main advantage of integrating the DRAM controllers in the
>first place.
>
>http://arstechnica.com/news/posts/1082164553.html
>
> Yousuf Khan
Anonymous
April 18, 2004 9:37:36 PM

<geno_cyber@tin.it> wrote in message
news:u34580ltlccpd5p5e47mjv9j2c4lk4b4d9@4ax.com...
> A buffer is meant to reduce overall latency, not to increase it AFAIK.

Not necessarily, a buffer is also meant to increase overall bandwidth, which
may be done at the expense of latency.

Yousuf Khan
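
A minimal sketch of how a buffer can trade latency for bandwidth, assuming a simple pipelined channel; the cycle time and stage counts below are made-up illustrative values:

# Toy pipelined-channel model: a buffered channel accepts a new request every
# cycle even though each request now passes through one more stage.
# Cycle time and stage counts are illustrative assumptions.

CYCLE_NS = 5.0
UNBUFFERED_STAGES = 4   # request occupies the channel for its whole duration
BUFFERED_STAGES = 5     # one extra stage for the buffer

def unbuffered(n_requests):
    latency = UNBUFFERED_STAGES * CYCLE_NS            # per-access latency
    total_ns = n_requests * latency                   # requests serialize
    return latency, n_requests / total_ns             # latency, throughput (req/ns)

def buffered(n_requests):
    latency = BUFFERED_STAGES * CYCLE_NS              # each access is a bit slower...
    total_ns = latency + (n_requests - 1) * CYCLE_NS  # ...but accesses overlap
    return latency, n_requests / total_ns

print("unbuffered:", unbuffered(100))   # lower latency, lower throughput
print("buffered:  ", buffered(100))     # higher latency, roughly 4x the throughput

Each access finishes a cycle later, but because accesses overlap, the channel moves several times as many requests per unit time.
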
Anonymous
April 19, 2004 1:43:19 AM

On Sun, 18 Apr 2004 17:37:36 GMT, "Yousuf Khan" <news.tally.bbbl67@spamgourmet.com> wrote:

><geno_cyber@tin.it> wrote in message
>news:u34580ltlccpd5p5e47mjv9j2c4lk4b4d9@4ax.com...
>> A buffer is meant to reduce overall latency, not to increase it AFAIK.
>
>Not necessarily, a buffer is also meant to increase overall bandwidth, which
>may be done at the expense of latency.
>

Cache on a CPU is not meant to increase bandwidth but to decrease the overall latency of retrieving data
from slower RAM. More cache-like buffers in the path through the memory controller can only improve
latency, unless there are some serious design flaws.
I've never seen a CPU that gets slower at accessing data when it can cache and has a good hit/miss
ratio.
Anonymous
April 19, 2004 1:43:20 AM

<geno_cyber@tin.it> wrote in message
news:lft5801qjivarf2mhfoiko04riq02srkp5@4ax.com...
> On Sun, 18 Apr 2004 17:37:36 GMT, "Yousuf Khan"
> <news.tally.bbbl67@spamgourmet.com> wrote:
>
>><geno_cyber@tin.it> wrote in message
>>news:u34580ltlccpd5p5e47mjv9j2c4lk4b4d9@4ax.com...
>>> A buffer is meant to reduce overall latency, not to increase it AFAIK.
>>
>>Not necessarily, a buffer is also meant to increase overall bandwidth,
>>which
>>may be done at the expense of latency.

> Cache on CPU is not meant to increase bandwidth but to decrease overall
> latency to retrieve data
> from slower RAM.

Yes, but not by making the RAM any faster, but by avoiding RAM accesses.
We add cache to the CPU because we admit our RAM is slow.

> More cache-like buffers in the path thru the memory controller can only
> improve
> latency, unless there's some serious design flaws.

That makes no sense. Everything between the CPU and the memory will
increase latency. Even caches increase worst case latency because some time
is spent searching the cache before we start the memory access. I think
you're confused.

> I never seen a CPU that gets slower in accessing data when it can cache
> and has a good hit/miss
> ratio.

Except that we're talking about memory latency due to buffers. And by
memory latency we mean the most time it will take between when we ask the
CPU to read a byte of memory and when we get that byte.

DS
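
A quick worked example of the average vs. worst-case point, assuming a serial (look-through) lookup where the memory access starts only after a miss; T_CACHE and T_MEM are illustrative values, not measurements:

# Worked example of average vs. worst-case latency with a cache in the path.
# T_CACHE is the lookup time, T_MEM the DRAM access time; both are
# illustrative assumptions.

T_CACHE = 2.0
T_MEM = 50.0

def avg_latency(hit_rate):
    # hit: pay only the lookup; miss: pay the lookup and then the memory access
    return hit_rate * T_CACHE + (1.0 - hit_rate) * (T_CACHE + T_MEM)

def worst_case():
    # a miss always costs the serial lookup plus the memory access
    return T_CACHE + T_MEM

for h in (0.0, 0.5, 0.9, 0.99):
    print("hit rate %.2f: average %5.1f ns, worst case %5.1f ns"
          % (h, avg_latency(h), worst_case()))

With a good hit rate the average is far better than memory alone, but the worst case (a miss) is always a little worse, which is the distinction being drawn here.
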
Anonymous
April 19, 2004 2:32:32 AM

On Sun, 18 Apr 2004 21:43:19 GMT, geno_cyber@tin.it wrote:

>On Sun, 18 Apr 2004 17:37:36 GMT, "Yousuf Khan" <news.tally.bbbl67@spamgourmet.com> wrote:
>
>><geno_cyber@tin.it> wrote in message
>>news:u34580ltlccpd5p5e47mjv9j2c4lk4b4d9@4ax.com...
>>> A buffer is meant to reduce overall latency, not to increase it AFAIK.
>>
>>Not necessarily, a buffer is also meant to increase overall bandwidth, which
>>may be done at the expense of latency.
>>
>
>Cache on CPU is not meant to increase bandwidth but to decrease overall latency to retrieve data
>from slower RAM. More cache-like buffers in the path thru the memory controller can only improve
>latency, unless there's some serious design flaws.
>I never seen a CPU that gets slower in accessing data when it can cache and has a good hit/miss
>ratio.

You're using "buffer" interchangeably with "cache" - a mistake our Yousuf
would never, ever make. Caches and their effects aren't pertinent to a
discussion of the buffering technique found on Fully Buffered DIMMs and their
effects on latency and bandwidth...

/daytripper (hth ;-)
Anonymous
April 19, 2004 4:38:16 AM

On Sun, 18 Apr 2004 22:32:32 GMT, daytripper <day_trippr@REMOVEyahoo.com> wrote:

>On Sun, 18 Apr 2004 21:43:19 GMT, geno_cyber@tin.it wrote:
>
>>On Sun, 18 Apr 2004 17:37:36 GMT, "Yousuf Khan" <news.tally.bbbl67@spamgourmet.com> wrote:
>>
>>><geno_cyber@tin.it> wrote in message
>>>news:u34580ltlccpd5p5e47mjv9j2c4lk4b4d9@4ax.com...
>>>> A buffer is meant to reduce overall latency, not to increase it AFAIK.
>>>
>>>Not necessarily, a buffer is also meant to increase overall bandwidth, which
>>>may be done at the expense of latency.
>>>
>>
>>Cache on CPU is not meant to increase bandwidth but to decrease overall latency to retrieve data
>>from slower RAM. More cache-like buffers in the path thru the memory controller can only improve
>>latency, unless there's some serious design flaws.
>>I never seen a CPU that gets slower in accessing data when it can cache and has a good hit/miss
>>ratio.
>
>You're using "buffer" interchangeably with "cache" - a mistake our Yousuf
>would never, ever make. Caches and their effects aren't pertinent to a
>discussion of the buffering technique found on Fully Buffered DIMMs and their
>effects on latency and bandwidth...

FB-DIMMs are supposed to work with an added cheap CPU or DSP with some fast RAM. I doubt embedded
DRAM on-chip, simply due to higher costs, but you never know how cheap they could make a product
if they really want to, and no expensive DSP or CPU is needed there anyway for the FB-DIMM to work.
I know how both caches and buffers work (circular buffering, FIFO buffering and so on), and because
they're sometimes used to achieve similar results (like on DSP architectures, where buffering is
key to performance with proper assembly code), it's not that wrong to refer to a cache as a
buffer: even if its mechanism is quite different, the goal is almost the same. The truth is that
both ways of making data faster to retrieve are useful, and a proper combination of these
techniques can achieve higher performance at both the bandwidth and latency levels.
Anonymous
April 19, 2004 7:46:04 AM

On Mon, 19 Apr 2004 00:38:16 GMT, geno_cyber@tin.it wrote:

>On Sun, 18 Apr 2004 22:32:32 GMT, daytripper <day_trippr@REMOVEyahoo.com> wrote:
>
>>On Sun, 18 Apr 2004 21:43:19 GMT, geno_cyber@tin.it wrote:
>>
>>>On Sun, 18 Apr 2004 17:37:36 GMT, "Yousuf Khan" <news.tally.bbbl67@spamgourmet.com> wrote:
>>>
>>>><geno_cyber@tin.it> wrote in message
>>>>news:u34580ltlccpd5p5e47mjv9j2c4lk4b4d9@4ax.com...
>>>>> A buffer is meant to reduce overall latency, not to increase it AFAIK.
>>>>
>>>>Not necessarily, a buffer is also meant to increase overall bandwidth, which
>>>>may be done at the expense of latency.
>>>>
>>>
>>>Cache on CPU is not meant to increase bandwidth but to decrease overall latency to retrieve data
>>>from slower RAM. More cache-like buffers in the path thru the memory controller can only improve
>>>latency, unless there's some serious design flaws.
>>>I never seen a CPU that gets slower in accessing data when it can cache and has a good hit/miss
>>>ratio.
>>
>>You're using "buffer" interchangeably with "cache" - a mistake our Yousuf
>>would never, ever make. Caches and their effects aren't pertinent to a
>>discussion of the buffering technique found on Fully Buffered DIMMs and their
>>effects on latency and bandwidth...
>
>FB-DIMMs are supposed to work with an added cheap CPU or DSP with some fast RAM, I doubt embedded
>DRAM on-chip simply due to higher costs but you never know how much they could make a product cheap
>if they really want to and no expensive DSP or CPU is needed there anyway for the FB-DIMM to work.
>I know how both caches and buffers work (circular buffering, FIFO buffering and so on) and because
>they're used to achieve similar results sometimes (like on DSPs architectures where buffering is a
>key to performance with proper assembly code...) , it's not that wrong to refer to a cache as a
>buffer even if its mechanism it's quite different the goal it's almost the same. The truth is that
>both ways of making bits data faster to be retrieved are useful and a proper combination of these
>techniques can achieve higher performance both at the bandwidth and latency levels.

Ummm.....no. You're still missing the gist of the discussion, and confusing
various forms of caching with the up and down-sides of using buffers in a
point-to-point interconnect.

Maybe going back and starting over might help...

/daytripper
Anonymous
April 19, 2004 11:33:46 AM

geno_cyber@tin.it wrote:

>FB-DIMMs are supposed to work...

Do you ever get it right, Geno? I don't think I've seen it...
Anonymous
April 19, 2004 2:28:39 PM

"Yousuf Khan" <news.tally.bbbl67@spamgourmet.com> wrote in message
news:A1zgc.114205$2oI1.47233@twister01.bloor.is.net.cable.rogers.com..
..
> <geno_cyber@tin.it> wrote in message
> news:u34580ltlccpd5p5e47mjv9j2c4lk4b4d9@4ax.com...
> > A buffer is meant to reduce overall latency, not to increase it
AFAIK.
>
> Not necessarily, a buffer is also meant to increase overall
bandwidth, which
> may be done at the expense of latency.

This particular buffer reduces the DRAM interface pinout by a factor
of 3 for CPU chips having the memory interface on-chip (such as
Opteron, the late and unlamented Timna, and future Intel CPUs). This
reduces the cost of the CPU chip while increasing the cost of the DIMM
(because of the added buffer chip).

And yes, the presence of the buffer does increase the latency.

There are other tradeoffs, the main one being the ability to add lots
more DRAM into a server. Not important for desktops. YMMV.
Anonymous
April 19, 2004 4:50:43 PM

On Mon, 19 Apr 2004 07:33:46 -0500, chrisv <chrisv@nospam.invalid> wrote:

>geno_cyber@tin.it wrote:
>
>>FB-DIMMs are supposed to work...
>
>Do you ever get it right, Geno? I don't think I've seen it...



http://cva.stanford.edu/ee482a/scribed/lect07.pdf
Anonymous
April 19, 2004 4:58:11 PM

On Mon, 19 Apr 2004 07:33:46 -0500, chrisv <chrisv@nospam.invalid> wrote:

>geno_cyber@tin.it wrote:
>
>>FB-DIMMs are supposed to work...
>
>Do you ever get it right, Geno? I don't think I've seen it...


-------

http://www.faqs.org/docs/artu/ch12s04.html

Caching Operation Results
Sometimes you can get the best of both worlds (low latency and good throughput) by computing
expensive results as needed and caching them for later use. Earlier we mentioned that named reduces
latency by batching; it also reduces latency by caching the results of previous network transactions
with other DNS servers.

------
Anonymous
April 19, 2004 5:01:08 PM

On Mon, 19 Apr 2004 07:33:46 -0500, chrisv <chrisv@nospam.invalid> wrote:

>geno_cyber@tin.it wrote:
>
>>FB-DIMMs are supposed to work...
>
>Do you ever get it right, Geno? I don't think I've seen it...


http://camars.kaist.ac.kr/~maeng/cs610/03note/03lect26....
Anonymous
April 19, 2004 5:05:43 PM

On Sun, 18 Apr 2004 17:37:36 GMT, "Yousuf Khan" <news.tally.bbbl67@spamgourmet.com> wrote:

><geno_cyber@tin.it> wrote in message
>news:u34580ltlccpd5p5e47mjv9j2c4lk4b4d9@4ax.com...
>> A buffer is meant to reduce overall latency, not to increase it AFAIK.
>
>Not necessarily, a buffer is also meant to increase overall bandwidth, which
>may be done at the expense of latency.
>
> Yousuf Khan
>

http://www.analog.com/UploadedFiles/Application_Notes/1...


As you can see, this Analog Devices DSP uses a mixed technique of buffering/caching to improve
latency in the best-case scenario. Obviously, if the caching doesn't work and the data is not
locally available, then the latency has to be higher because you have to get the data from slower memory,
but when the data is locally available the latency can be reduced to approximately zero in some cases.
Anonymous
April 19, 2004 7:44:03 PM

On Mon, 19 Apr 2004 07:33:46 -0500, chrisv <chrisv@nospam.invalid> wrote:

>geno_cyber@tin.it wrote:
>
>>FB-DIMMs are supposed to work...
>
>Do you ever get it right, Geno? I don't think I've seen it...

It's a lost cause...
April 19, 2004 8:29:49 PM

geno_cyber@tin.it wrote :

> FB-DIMMs are supposed to work with an added cheap CPU or DSP with
> some fast RAM, I doubt embedded DRAM on-chip simply due to higher
> costs but you never know how much they could make a product cheap
> if they really want to and no expensive DSP or CPU is needed there
> anyway for the FB-DIMM to work. I know how both caches and buffers
> work (circular buffering, FIFO buffering and so on) and because
> they're used to achieve similar results sometimes (like on DSPs
> architectures where buffering is a key to performance with proper
> assembly code...) , it's not that wrong to refer to a cache as a
> buffer even if its mechanism it's quite different the goal it's
> almost the same. The truth is that both ways of making bits data
> faster to be retrieved are useful and a proper combination of
> these techniques can achieve higher performance both at the
> bandwidth and latency levels.

cache is a form of a buffer
buffer is not necessarily a cache, imagine a one byte buffer, would you
call it a cache?

Pozdrawiam.
--
RusH //
http://pulse.pdi.net/~rush/qv30/
Like ninjas, true hackers are shrouded in secrecy and mystery.
You may never know -- UNTIL IT'S TOO LATE.
Anonymous
April 19, 2004 8:29:50 PM

RusH wrote:

> geno_cyber@tin.it wrote :
>
>
>>FB-DIMMs are supposed to work with an added cheap CPU or DSP with
>>some fast RAM, I doubt embedded DRAM on-chip simply due to higher
>>costs but you never know how much they could make a product cheap
>>if they really want to and no expensive DSP or CPU is needed there
>>anyway for the FB-DIMM to work. I know how both caches and buffers
>>work (circular buffering, FIFO buffering and so on) and because
>>they're used to achieve similar results sometimes (like on DSPs
>>architectures where buffering is a key to performance with proper
>>assembly code...) , it's not that wrong to refer to a cache as a
>>buffer even if its mechanism it's quite different the goal it's
>>almost the same. The truth is that both ways of making bits data
>>faster to be retrieved are useful and a proper combination of
>>these techniques can achieve higher performance both at the
>>bandwidth and latency levels.
>
>
> cache is a form of a buffer
> buffer is not necesarly a cache, imagine one byte buffer, would you
> call it a cache ?

Sure; you can think of it as a *really* small cache, which will
therefore have a terrible hit ratio, thus (most likely) increasing latency.

--
Mike Smith
Anonymous
April 19, 2004 9:39:18 PM

On Sun, 18 Apr 2004 22:32:32 GMT, daytripper
<day_trippr@REMOVEyahoo.com> wrote:

>You're using "buffer" interchangeably with "cache" - a mistake our Yousuf
>would never, ever make. Caches and their effects aren't pertinent to a
>discussion of the buffering technique found on Fully Buffered DIMMs and their
>effects on latency and bandwidth...

Ah! I was getting quite confused by his statement about the buffer &
cache until you said this. Makes it perfectly clear now! :p ppP

--
L.Angel: I'm looking for web design work.
If you need basic to med complexity webpages at affordable rates, email me :) 
Standard HTML, SHTML, MySQL + PHP or ASP, Javascript.
If you really want, FrontPage & DreamWeaver too.
But keep in mind you pay extra bandwidth for their bloated code
Anonymous
April 19, 2004 9:58:52 PM

<geno_cyber@tin.it> wrote in message
news:udj7801kk4mg1ba4sdsh2fcuga90knoc8f@4ax.com...
> On Sun, 18 Apr 2004 17:37:36 GMT, "Yousuf Khan"
<news.tally.bbbl67@spamgourmet.com> wrote:
> As you can see this Analog Devices DSP uses a mixed technique of
buffering/caching to improve
> latency in the best case scenario. Obviously if the caching doesn't work
and the data it's not
> locally available then the latency has to be higher because you've to get
data from slower memory
> but when the data is locally available the latency can be reduced down to
zero approx in some cases.

In this case the buffer is used to eliminate DRAM interface differences when
going from one technology to a new one.

Yousuf Khan
Anonymous
April 20, 2004 1:55:29 AM

On Mon, 19 Apr 2004 17:58:52 GMT, "Yousuf Khan"
<news.tally.bbbl67@spamgourmet.com> wrote:

><geno_cyber@tin.it> wrote in message
>news:udj7801kk4mg1ba4sdsh2fcuga90knoc8f@4ax.com...
>> On Sun, 18 Apr 2004 17:37:36 GMT, "Yousuf Khan"
><news.tally.bbbl67@spamgourmet.com> wrote:
>> As you can see this Analog Devices DSP uses a mixed technique of
>buffering/caching to improve
>> latency in the best case scenario. Obviously if the caching doesn't work
>and the data it's not
>> locally available then the latency has to be higher because you've to get
>data from slower memory
>> but when the data is locally available the latency can be reduced down to
>zero approx in some cases.
>
>In this case the buffer is used to eliminate DRAM interface differences when
>going from one technology to a new one.

"But wait! There's more!"

The "FB" buffer on an FBdimm is also a bus repeater (aka "buffer") for the
"next" FBdimm in the chain of FBdimms that comprise a channel. The presence of
this buffer feature allows the channel to run at the advertised frequencies in
the face of LOTS of FBdimms on a single channel - frequencies that could not
be achieved if all those dimms were on the typical multi drop memory
interconnect (ala most multi-dimm SDR/DDR/DDR2 implementations).

Anyway...

I thought I knew the answer to this, but I haven't found it documented either
way: is the FB bus repeater simply a stateless signal buffer, thus adding its
lane-to-lane skew to the next device in the chain (which would imply some huge
de-skewing tasks for the nth FBdimm in - say - an 8 FBdimm implementation). Or
does the buffer de-skew lanes before passing the transaction on to the next
node?

/daytripper
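
A toy model of the daisy-chain behaviour described above; HOP_NS and DRAM_NS are illustrative guesses, not numbers from the FB-DIMM spec:

# Toy model of a daisy-chained FB-DIMM channel: each DIMM's buffer repeats the
# signal to the next DIMM, so far-away DIMMs pay extra hops, while the
# point-to-point links keep their clock rate no matter how many DIMMs hang on
# the channel.  HOP_NS and DRAM_NS are illustrative assumptions.

HOP_NS = 3.0     # assumed repeater delay per buffer, each direction
DRAM_NS = 45.0   # assumed DRAM core access time

def fbdimm_read_latency(dimm_index):
    # request and returning data each cross every buffer up to and including
    # the addressed DIMM
    hops = dimm_index + 1
    return DRAM_NS + 2 * hops * HOP_NS

for n in range(8):
    print("DIMM %d: %.1f ns" % (n, fbdimm_read_latency(n)))

Far DIMMs pay more hops, but each link stays point-to-point, which is what lets the channel keep its advertised clock rate with many DIMMs where a multi-drop bus could not.
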
Anonymous
April 20, 2004 9:21:19 AM

On Mon, 19 Apr 2004 21:55:29 GMT, daytripper
<day_trippr@REMOVEyahoo.com> wrote:

>The "FB" buffer on an FBdimm is also a bus repeater (aka "buffer") for the
>"next" FBdimm in the chain of FBdimms that comprise a channel. The presence of
>this buffer feature allows the channel to run at the advertised frequencies in
>the face of LOTS of FBdimms on a single channel - frequencies that could not
>be achieved if all those dimms were on the typical multi drop memory
>interconnect (ala most multi-dimm SDR/DDR/DDR2 implementations).

Does this also mean that I could in theory put a very fast, say 1.6GHz,
buffer on the FBDIMM and sell it as, say, DDR3-1.6GHz because of that,
even though the actual RAM chips are only capable of say 200MHz?
:p PpPpP

--
L.Angel: I'm looking for web design work.
If you need basic to med complexity webpages at affordable rates, email me :) 
Standard HTML, SHTML, MySQL + PHP or ASP, Javascript.
If you really want, FrontPage & DreamWeaver too.
But keep in mind you pay extra bandwidth for their bloated code
Anonymous
April 20, 2004 6:28:18 PM

"The little lost angel" <a?n?g?e?l@lovergirl.lrigrevol.moc.com> wrote in
message news:4084b2f1.41363671@news.pacific.net.sg...
> Does this also mean that I could in theory put a very fast say 1.6Ghz
> buffer on the FBDIMM and sell it as say DDR3-1.6Ghz because of that.
> Even though the actual ram chips are only capable of say 200Mhz?

Wasn't there also some talk back in the early days of the K7 Athlon about
Micron coming out with an AMD chipset with a huge buffer built into its own
silicon. Micron went so far as to give it a cool codename, Samurai or Mamba
or something. But nothing else came of it after that.

Yousuf Khan
Anonymous
April 20, 2004 9:21:16 PM

On Tue, 20 Apr 2004 05:21:19 GMT, a?n?g?e?l@lovergirl.lrigrevol.moc.com (The
little lost angel) wrote:

>On Mon, 19 Apr 2004 21:55:29 GMT, daytripper
><day_trippr@REMOVEyahoo.com> wrote:
>
>>The "FB" buffer on an FBdimm is also a bus repeater (aka "buffer") for the
>>"next" FBdimm in the chain of FBdimms that comprise a channel. The presence of
>>this buffer feature allows the channel to run at the advertised frequencies in
>>the face of LOTS of FBdimms on a single channel - frequencies that could not
>>be achieved if all those dimms were on the typical multi drop memory
>>interconnect (ala most multi-dimm SDR/DDR/DDR2 implementations).
>
>Does this also mean that I could in theory put a very fast say 1.6Ghz
>buffer on the FBDIMM and sell it as say DDR3-1.6Ghz because of that.
>Even though the actual ram chips are only capable of say 200Mhz?
>:p PpPpP

The short answer is: certainly.

The longer answer is: this is *exactly* the whole point of this technology: to
make heaps of s l o w but cheap (read: "commodity") drams look fast when
viewed at the memory channel, in order to accommodate large memory capacities
for server platforms (ie: I doubt you'll be seeing FBdimms on conventional
desktop machines anytime soon).

Like the similar schemes that have gone before this one, it sacrifices some
latency at the transaction level for beaucoup bandwidth at the channel level.

No doubt everyone will have their favorite benchmark to bang against this to
see if the net effect is positive...

/daytripper (Mine would use rather nasty strides ;-)
Anonymous
April 21, 2004 1:06:15 AM

On Tue, 20 Apr 2004 14:28:18 GMT, "Yousuf Khan"
<news.tally.bbbl67@spamgourmet.com> wrote:

>Wasn't there also some talk back in the early days of the K7 Athlon about
>Micron coming out with an AMD chipset with a huge buffer built into its own
>silicon. Micron went so far as to give it a cool codename, Samurai or Mamba
>or something. But nothing else came of it after that.

Hmm, don't remember that much. Only remember for sure what you forgot,
it was Samurai :p 

--
L.Angel: I'm looking for web design work.
If you need basic to med complexity webpages at affordable rates, email me :) 
Standard HTML, SHTML, MySQL + PHP or ASP, Javascript.
If you really want, FrontPage & DreamWeaver too.
But keep in mind you pay extra bandwidth for their bloated code
Anonymous
April 21, 2004 9:49:03 PM

On Tue, 20 Apr 2004 14:28:18 GMT, "Yousuf Khan"
<news.tally.bbbl67@spamgourmet.com> wrote:
>"The little lost angel" <a?n?g?e?l@lovergirl.lrigrevol.moc.com> wrote in
>message news:4084b2f1.41363671@news.pacific.net.sg...
>> Does this also mean that I could in theory put a very fast say 1.6Ghz
>> buffer on the FBDIMM and sell it as say DDR3-1.6Ghz because of that.
>> Even though the actual ram chips are only capable of say 200Mhz?
>
>Wasn't there also some talk back in the early days of the K7 Athlon about
>Micron coming out with an AMD chipset with a huge buffer built into its own
>silicon. Micron went so far as to give it a cool codename, Samurai or Mamba
>or something. But nothing else came of it after that.

I believe they even built a prototype. Never made it to market
though. Either way, the chipset in question just had an L3 cache (8MB
of eDRAM if my memory serves) on the chipset, nothing really to do
with the buffers in Fully Buffered DIMMs. Buffer != cache.

-------------
Tony Hill
hilla <underscore> 20 <at> yahoo <dot> ca
Anonymous
April 22, 2004 1:45:31 AM

In article <A1zgc.114205$2oI1.47233
@twister01.bloor.is.net.cable.rogers.com>, news.tally.bbbl67
@spamgourmet.com says...
> <geno_cyber@tin.it> wrote in message
> news:u34580ltlccpd5p5e47mjv9j2c4lk4b4d9@4ax.com...
> > A buffer is meant to reduce overall latency, not to increase it AFAIK.
>
> Not necessarily, a buffer is also meant to increase overall bandwidth, which
> may be done at the expense of latency.

Jeez Yousuf, a "buffer" may be used simply to increase drive
(current, if you will). An INVERTER can be a "buffer"
(though most buffers are non-inverting to avoid confusion). Then
again, there are unbuffered inverters (74xxU)... ;-)

The point is that there are *many* uses of the term "buffer" and
most have nothing to do with any kind of a "cache". A "cache"
implies an addressed (usually with a directory) storage element.
A "buffer" implies no such thing.

--
Keith
Anonymous
April 22, 2004 1:46:52 AM

In article <cnh78054mr3tgufn00l5rf6qcrdoekvapu@4ax.com>,
chrisv@nospam.invalid says...
> geno_cyber@tin.it wrote:
>
> >FB-DIMMs are supposed to work...
>
> Do you ever get it right, Geno? I don't think I've seen it...

Bingo!

--
Keith
Anonymous
April 22, 2004 1:50:42 AM

In article <10883o5j7upgf5a@news.supernews.com>,
mike_UNDERSCORE_smith@acm.DOT.org says...
> RusH wrote:
>
> > geno_cyber@tin.it wrote :
> >
> >
> >>FB-DIMMs are supposed to work with an added cheap CPU or DSP with
> >>some fast RAM, I doubt embedded DRAM on-chip simply due to higher
> >>costs but you never know how much they could make a product cheap
> >>if they really want to and no expensive DSP or CPU is needed there
> >>anyway for the FB-DIMM to work. I know how both caches and buffers
> >>work (circular buffering, FIFO buffering and so on) and because
> >>they're used to achieve similar results sometimes (like on DSPs
> >>architectures where buffering is a key to performance with proper
> >>assembly code...) , it's not that wrong to refer to a cache as a
> >>buffer even if its mechanism it's quite different the goal it's
> >>almost the same. The truth is that both ways of making bits data
> >>faster to be retrieved are useful and a proper combination of
> >>these techniques can achieve higher performance both at the
> >>bandwidth and latency levels.
> >
> >
> > cache is a form of a buffer
> > buffer is not necesarly a cache, imagine one byte buffer, would you
> > call it a cache ?
>
> Sure; you can think of it as a *really* small cache, which will
> therefore have a terrible hit ratio, thus (most likely) increasing latency.

Ok, how about a *Zero* byte buffer (a.k.a. amplifier)? Is that a
cache?? You're wrong. A cache is a buffer of sorts, but the term
"buffer" in no way implies a cache. It may simply be an
amplifier, which I believe is the case here. 'tripper has his
ear closer to this ground than anyone else here.

--
Keith
Anonymous
April 22, 2004 1:57:39 AM

In article <c5vbe0$fjk$1@nntp.webmaster.com>,
davids@webmaster.com says...
>
> <geno_cyber@tin.it> wrote in message
> news:lft5801qjivarf2mhfoiko04riq02srkp5@4ax.com...
> > On Sun, 18 Apr 2004 17:37:36 GMT, "Yousuf Khan"
> > <news.tally.bbbl67@spamgourmet.com> wrote:
> >
> >><geno_cyber@tin.it> wrote in message
> >>news:u34580ltlccpd5p5e47mjv9j2c4lk4b4d9@4ax.com...
> >>> A buffer is meant to reduce overall latency, not to increase it AFAIK.
> >>
> >>Not necessarily, a buffer is also meant to increase overall bandwidth,
> >>which
> >>may be done at the expense of latency.
>
> > Cache on CPU is not meant to increase bandwidth but to decrease overall
> > latency to retrieve data
> > from slower RAM.
>
> Yes, but not by making the RAM any faster, but by avoiding RAM accesses.
> We add cache to the CPU because we admit our RAM is slow.

We "admit"?? Hell it's a known issue that RAM is *SLOW*. Caches
are there to improve apparent latency, sure.
>
> > More cache-like buffers in the path thru the memory controller can only
> > improve
> > latency, unless there's some serious design flaws.
>
> That makes no sense. Everything between the CPU and the memory will
> increase latency. Even caches increase worst case latency because some time
> is spent searching the cache before we start the memory access. I think
> you're confused.

Not necessarily. Addresses can be broadcast to the entire memory
hierarchy simultaneously. The first to answer wins. If it's
cached, it's fast. If not, there is no penalty in asking the
cache if it's there and being answered in the negative.

> > I never seen a CPU that gets slower in accessing data when it can cache
> > and has a good hit/miss
> > ratio.

> Except that we're talking about memory latency due to buffers. And by
> memory latency we mean the most time it will take between when we ask the
> CPU to read a byte of memory and when we get that byte.

Buffers <> caches. IIRC, the issue here was about buffers.

--
Keith
Anonymous
April 22, 2004 1:57:40 AM

"KR Williams" <krw@att.biz> wrote in message
news:MPG.1af0f0784b9af0ad98976a@news1.news.adelphia.net...

>> That makes no sense. Everything between the CPU and the memory will
>> increase latency. Even caches increase worst case latency because some
>> time
>> is spent searching the cache before we start the memory access. I think
>> you're confused.

> No necessarily. Addresses can be broadcast to the entire memory
> hierarchy simultaneously. The first to answer wins. If it's
> cached, it's fast. If not, there is no penalty in asking the
> cach if it's there and being answered in the negative.

Consider two back-to-back addresses. We start broadcasting the first
address on the memory bus but the cache answers first. Now we can't
broadcast the second address onto the memory bus until we can quiesce the
address bus from the first address, can we?

>> > I never seen a CPU that gets slower in accessing data when it can cache
>> > and has a good hit/miss
>> > ratio.

>> Except that we're talking about memory latency due to buffers. And by
>> memory latency we mean the most time it will take between when we ask the
>> CPU to read a byte of memory and when we get that byte.

> Buffers <> caches. IIRC, the issue here was about buffers.

Buffers must increase latency. Caches generally increase worst case
latency; however, unless you have a pathological load, they should improve
average latency.

DS
Anonymous
April 22, 2004 2:02:00 AM

In article <rRNgc.195$eZ5.136@newsread1.news.pas.earthlink.net>,
fmsfnf@jfoops.net says...
> "Yousuf Khan" <news.tally.bbbl67@spamgourmet.com> wrote in message
> news:A1zgc.114205$2oI1.47233@twister01.bloor.is.net.cable.rogers.com..
> .
> > <geno_cyber@tin.it> wrote in message
> > news:u34580ltlccpd5p5e47mjv9j2c4lk4b4d9@4ax.com...
> > > A buffer is meant to reduce overall latency, not to increase it
> AFAIK.
> >
> > Not necessarily, a buffer is also meant to increase overall
> bandwidth, which
> > may be done at the expense of latency.
>
> This particular buffer reduces the DRAM interface pinout by a factor
> of 3 for CPU chips having the memory interface on-chip (such as
> Opteron, the late and unlamented Timna, and future Intel CPUs). This
> reduces the cost of the CPU chip while increasing the cost of the DIMM
> (because of the added buffer chip).
>
> And yes, the presence of the buffer does increase the latency.

It may reduce it too! ;-) On-chip delay goes up with the square
of the length of the wire. Adding a *buffer* in the wire drops
this to 2x half the length squared (plus buffer delay). "Buffer"
has many meanings. Me thinks CG doesn't "get it".
>
> There are other tradeoffs, the main one being the ability to add lots
> more DRAM into a server. Not important for desktops. YMMV.

In this specific instance, perhaps not. Memory is good though.
More is better, and an upgrade path is also goodness. ...at
least for the folks in this group. ;-)

--
Keith
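
A quick check of the repeater arithmetic, assuming delay grows as the square of wire length; K and BUFFER_DELAY are arbitrary illustrative constants:

# If an unrepeated wire's delay grows as the square of its length, splitting
# it in half with a buffer gives 2 * (L/2)^2 = L^2 / 2, i.e. about half the
# delay plus the buffer's own delay.  K and BUFFER_DELAY are arbitrary.

K = 1.0            # delay = K * length**2 for an unrepeated wire
BUFFER_DELAY = 0.1

def unrepeated(length):
    return K * length ** 2

def repeated(length):
    half = length / 2.0
    return 2 * (K * half ** 2) + BUFFER_DELAY

for length in (1.0, 2.0, 4.0):
    print(length, unrepeated(length), repeated(length))
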
Anonymous
April 22, 2004 9:07:45 AM

On Wed, 21 Apr 2004 19:42:35 -0700, "David Schwartz" <davids@webmaster.com>
wrote:

>
>"KR Williams" <krw@att.biz> wrote in message
>news:MPG.1af0f0784b9af0ad98976a@news1.news.adelphia.net...
>
>>> That makes no sense. Everything between the CPU and the memory will
>>> increase latency. Even caches increase worst case latency because some
>>> time
>>> is spent searching the cache before we start the memory access. I think
>>> you're confused.
>
>> No necessarily. Addresses can be broadcast to the entire memory
>> hierarchy simultaneously. The first to answer wins. If it's
>> cached, it's fast. If not, there is no penalty in asking the
>> cach if it's there and being answered in the negative.
>
> Consider two back-to-back addresses. We start broadcasting the first
>address on the memory bus but the cache answers first. Now we can't
>broadcast the second address onto the memory bus until we can quiesce the
>address bus from the first address, can we?

The look aside vs. look through cache. It depends... on all the relative
timings. First a cache does not have to be "searched" - from the lookup
you can get a hit/miss answer in one cycle. Assuming look aside cache, if
the memory requests are queued to the memory controller, there's the
question of whether you can get a Burst Terminate command through to the
memory chips past, or before, the 2nd memory access.

>>> > I never seen a CPU that gets slower in accessing data when it can cache
>>> > and has a good hit/miss
>>> > ratio.
>
>>> Except that we're talking about memory latency due to buffers. And by
>>> memory latency we mean the most time it will take between when we ask the
>>> CPU to read a byte of memory and when we get that byte.
>
>> Buffers <> caches. IIRC, the issue here was about buffers.
>
> Buffers must increase latency. Caches generally increase worst case
>latency; however, unless you have a pathological load, they should improve
>average latency.

Two points here. I don't think we're talking about data buffering - more
like "electrical" buffering, as in registered modules. If you have 4 (or
more) ranks of memory modules (per channel) operating at current speeds,
you need the registering/buffering somewhere. It makes sense to move it
closer to the channel interface of the collective DIMMs than to have it
working independently on each DIMM. I'm not sure there's necessarily any
increased latency for that situation.

Rgds, George Macdonald

"Just because they're paranoid doesn't mean you're not psychotic" - Who, me??
Anonymous
April 22, 2004 11:33:16 AM

"George Macdonald" <fammacd=!SPAM^nothanks@tellurian.com> wrote in message
news:ii0f80dqbek4nhtiiduk9fpk3308l051aa@4ax.com...
> On Wed, 21 Apr 2004 19:42:35 -0700, "David Schwartz"
> <davids@webmaster.com>
> wrote:

>>> No necessarily. Addresses can be broadcast to the entire memory
>>> hierarchy simultaneously. The first to answer wins. If it's
>>> cached, it's fast. If not, there is no penalty in asking the
>>> cach if it's there and being answered in the negative.

>> Consider two back-to-back addresses. We start broadcasting the first
>>address on the memory bus but the cache answers first. Now we can't
>>broadcast the second address onto the memory bus until we can quiesce the
>>address bus from the first address, can we?

> The look aside vs. look through cache. It depends... on all the relative
> timings. First a cache does not have to be "searched" - from the lookup
> you can get a hit/miss answer in one cycle.

That would still be a one cycle delay while the cache was searched,
whether or not you found anything in it.

> Assuming look aside cache, if
> the memory requests are queued to the memory controller, there's the
> question of whether you can get a Burst Terminate command through to the
> memory chips past, or before, the 2nd memory access.

Even so, it takes some time to terminate the burst.

>>>> > I never seen a CPU that gets slower in accessing data when it can
>>>> > cache
>>>> > and has a good hit/miss
>>>> > ratio.

>>>> Except that we're talking about memory latency due to buffers. And
>>>> by
>>>> memory latency we mean the most time it will take between when we ask
>>>> the
>>>> CPU to read a byte of memory and when we get that byte.

>>> Buffers <> caches. IIRC, the issue here was about buffers.

>> Buffers must increase latency. Caches generally increase worst case
>>latency; however, unless you have a pathological load, they should improve
>>average latency.

> Two points here. I don't think we're talking about data buffering - more
> like "electrical" buffering, as in registered modules.

No difference. There is not a data buffer in the world whose output
transitions before or at the same time as its input. They all add some delay
to the signals.

> If you have 4 (or
> more) ranks of memory modules (per channel) operating at current speeds,
> you need the registering/buffering somewhere. It makes sense to move it
> closer to the channel interface of the collective DIMMs than to have it
> working independently on each DIMM. I'm not sure there's necessarily any
> increased latency for that situation.

If you have so many modules per channel that you need buffering, then
you suffer a buffering penalty. That's my point. Whether that means you need
faster memory chips to keep the same cycle speed or you cycle more slowly,
you have a buffering delay.

I'm really not saying anything controversial. Buffers and caches
increase latency, at least in the worst case access.

DS
Anonymous
April 22, 2004 10:29:08 PM

"George Macdonald" <fammacd=!SPAM^nothanks@tellurian.com> wrote in
message news:ii0f80dqbek4nhtiiduk9fpk3308l051aa@4ax.com...
>
> Two points here. I don't think we're talking about data buffering -
more
> like "electrical" buffering, as in registered modules. If you have
4 (or
> more) ranks of memory modules (per channel) operating at current
speeds,
> you need the registering/buffering somewhere. It makes sense to
move it
> closer to the channel interface of the collective DIMMs than to have
it
> working independently on each DIMM. I'm not sure there's
necessarily any
> increased latency for that situation.

I think the "increased latency" is with respect to the usual (in PCs)
one or two unbuffered DIMMs. In this case, the FB-DIMMs do indeed
have a greater latency.

Keep in mind that there will really be no choice once you've bought
your mobo. The CPU socket will either be for a CPU to use traditional
DIMMs or (with 66% fewer memory pins) to use FB-DIMMs. You will never
ever stand there with both types of memory modules in hand and have to
decide which to plug in.
Anonymous
April 22, 2004 10:29:09 PM

"Felger Carbon" <fmsfnf@jfoops.net> wrote:

>> you need the registering/buffering somewhere. It makes sense to
>move it
>> closer to the channel interface of the collective DIMMs than to have
>it
>> working independently on each DIMM. I'm not sure there's
>necessarily any
>> increased latency for that situation.
>
>I think the "increased latency" is with respect to the usual (in PCs)
>one or two unbuffered DIMMs. In this case, the FB-DIMMs do indeed
>have a greater latency.
>
>Keep in mind that there will really be no choice once you've bought
>your mobo. The CPU socket will either be for a CPU to use traditional
>DIMMs or (with 66% fewer memory pins) to use FB-DIMMs. You will never
>ever stand there with both types of memory modules in hand and have to
>decide which to plug in.

"OE quotefix", dude.
Anonymous
April 23, 2004 2:33:33 AM

In article <c67beq$656$1@nntp.webmaster.com>,
davids@webmaster.com says...
>
> "KR Williams" <krw@att.biz> wrote in message
> news:MPG.1af0f0784b9af0ad98976a@news1.news.adelphia.net...
>
> >> That makes no sense. Everything between the CPU and the memory will
> >> increase latency. Even caches increase worst case latency because some
> >> time
> >> is spent searching the cache before we start the memory access. I think
> >> you're confused.
>
> > No necessarily. Addresses can be broadcast to the entire memory
> > hierarchy simultaneously. The first to answer wins. If it's
> > cached, it's fast. If not, there is no penalty in asking the
> > cach if it's there and being answered in the negative.
>
> Consider two back-to-back addresses. We start broadcasting the first
> address on the memory bus but the cache answers first. Now we can't
> broadcast the second address onto the memory bus until we can quiesce the
> address bus from the first address, can we?

You're assuming the time to access the caches is a significant
fraction of the time required to access main memory. It's
certainly not. Cache results are known *long* before the address
is broadcast to the mass memory. By the time the memory request
gets near the chip's I/O, the caches know whether they can deliver
the data. If so, the memory request is killed. There is no
additional latency here.
>
> >> > I never seen a CPU that gets slower in accessing data when it can cache
> >> > and has a good hit/miss
> >> > ratio.
>
> >> Except that we're talking about memory latency due to buffers. And by
> >> memory latency we mean the most time it will take between when we ask the
> >> CPU to read a byte of memory and when we get that byte.
>
> > Buffers <> caches. IIRC, the issue here was about buffers.
>
> Buffers must increase latency.

Ok. You're right. Except that things don't work without
buffers. Does that mean they increase latency, or does it mean
that they allow things to *work*?

> Caches generally increase worst case
> latency; however, unless you have a pathological load, they should improve
> average latency.

Again, BUFFERS <> CACHES! A buffer can be a simple amplifier
(thus no storage element at all). It's naive to say that a
buffer increases latency (particularly since many here don't seem
to understand the term).

--
Keith
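
A sketch of the look-through vs. broadcast (look-aside) argument Keith is making, assuming the memory request can be launched in parallel with the cache lookup and killed on a hit; T_CACHE and T_MEM are illustrative values:

# Serial (look-through) vs. broadcast (look-aside) lookup: if the memory
# request is launched in parallel with the cache lookup and cancelled on a
# hit, a miss costs no more than memory alone would.  Timings are assumptions.

T_CACHE = 2.0   # assumed cache lookup time
T_MEM = 50.0    # assumed main memory access time

def look_through(hit):
    # memory access starts only after the cache has answered "miss"
    return T_CACHE if hit else T_CACHE + T_MEM

def look_aside(hit):
    # cache and memory are asked at the same time; the first to answer wins
    return T_CACHE if hit else T_MEM

for hit in (True, False):
    print("hit=%s: look-through %.1f ns, look-aside %.1f ns"
          % (hit, look_through(hit), look_aside(hit)))

The cost of broadcasting is not latency but the extra traffic David points to: a request that turns out to be a hit still briefly occupies the memory path.
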
Anonymous
April 23, 2004 2:37:13 AM

In article <ii0f80dqbek4nhtiiduk9fpk3308l051aa@4ax.com>,
fammacd=!SPAM^nothanks@tellurian.com says...
> On Wed, 21 Apr 2004 19:42:35 -0700, "David Schwartz" <davids@webmaster.com>
> wrote:
>
> >
> >"KR Williams" <krw@att.biz> wrote in message
> >news:MPG.1af0f0784b9af0ad98976a@news1.news.adelphia.net...
> >
> >>> That makes no sense. Everything between the CPU and the memory will
> >>> increase latency. Even caches increase worst case latency because some
> >>> time
> >>> is spent searching the cache before we start the memory access. I think
> >>> you're confused.
> >
> >> No necessarily. Addresses can be broadcast to the entire memory
> >> hierarchy simultaneously. The first to answer wins. If it's
> >> cached, it's fast. If not, there is no penalty in asking the
> >> cach if it's there and being answered in the negative.
> >
> > Consider two back-to-back addresses. We start broadcasting the first
> >address on the memory bus but the cache answers first. Now we can't
> >broadcast the second address onto the memory bus until we can quiesce the
> >address bus from the first address, can we?
>
> The look aside vs. look through cache. It depends... on all the relative
> timings. First a cache does not have to be "searched" - from the lookup
> you can get a hit/miss answer in one cycle. Assuming look aside cache, if
> the memory requests are queued to the memory controller, there's the
> question of whether you can get a Burst Terminate command through to the
> memory chips past, or before, the 2nd memory access.

Sure. The cache is "searched" in less time than it takes the request
to get to the I/O. If it's satisfied by the caches, the storage
request can be canceled with no overhead. If not, the storage
request is allowed to continue.
>
> >>> > I never seen a CPU that gets slower in accessing data when it can cache
> >>> > and has a good hit/miss
> >>> > ratio.
> >
> >>> Except that we're talking about memory latency due to buffers. And by
> >>> memory latency we mean the most time it will take between when we ask the
> >>> CPU to read a byte of memory and when we get that byte.
> >
> >> Buffers <> caches. IIRC, the issue here was about buffers.
> >
> > Buffers must increase latency. Caches generally increase worst case
> >latency; however, unless you have a pathological load, they should improve
> >average latency.
>
> Two points here. I don't think we're talking about data buffering - more
> like "electrical" buffering, as in registered modules.

Bingo! ...though I thought this was clear.

> If you have 4 (or
> more) ranks of memory modules (per channel) operating at current speeds,
> you need the registering/buffering somewhere. It makes sense to move it
> closer to the channel interface of the collective DIMMs than to have it
> working independently on each DIMM. I'm not sure there's necessarily any
> increased latency for that situation.

I'm not either. It works one way, and not the other. Does that
mean the way it *works* is slower?

--
Keith
Anonymous
April 23, 2004 2:45:40 AM

In article <c68l3m$s6i$1@nntp.webmaster.com>,
davids@webmaster.com says...
>
> "George Macdonald" <fammacd=!SPAM^nothanks@tellurian.com> wrote in message
> news:ii0f80dqbek4nhtiiduk9fpk3308l051aa@4ax.com...
> > On Wed, 21 Apr 2004 19:42:35 -0700, "David Schwartz"
> > <davids@webmaster.com>
> > wrote:
>
> >>> No necessarily. Addresses can be broadcast to the entire memory
> >>> hierarchy simultaneously. The first to answer wins. If it's
> >>> cached, it's fast. If not, there is no penalty in asking the
> >>> cach if it's there and being answered in the negative.
>
> >> Consider two back-to-back addresses. We start broadcasting the first
> >>address on the memory bus but the cache answers first. Now we can't
> >>broadcast the second address onto the memory bus until we can quiesce the
> >>address bus from the first address, can we?
>
> > The look aside vs. look through cache. It depends... on all the relative
> > timings. First a cache does not have to be "searched" - from the lookup
> > you can get a hit/miss answer in one cycle.
>
> That would still be a one cycle delay while the cache was searched,
> whether or not you found anything in it.

Oh, NO! The cache is *not* "searched". The answer is yes/no and
that answer is quick. In addition, the request can be sent in
parallel to the next level of the hierarchy and canceled if satisfied
at a lower level. The load/store queues must be coherent for
other reasons; this is a minor architectural complication.

> > Assuming look aside cache, if
> > the memory requests are queued to the memory controller, there's the
> > question of whether you can get a Burst Terminate command through to the
> > memory chips past, or before, the 2nd memory access.
>
> Even so, it takes some time to terminate the burst.

The burst hasn't even started. Sheesh!
>
> >>>> > I never seen a CPU that gets slower in accessing data when it can
> >>>> > cache
> >>>> > and has a good hit/miss
> >>>> > ratio.
>
> >>>> Except that we're talking about memory latency due to buffers. And
> >>>> by
> >>>> memory latency we mean the most time it will take between when we ask
> >>>> the
> >>>> CPU to read a byte of memory and when we get that byte.
>
> >>> Buffers <> caches. IIRC, the issue here was about buffers.
>
> >> Buffers must increase latency. Caches generally increase worst case
> >>latency; however, unless you have a pathological load, they should improve
> >>average latency.
>
> > Two points here. I don't think we're talking about data buffering - more
> > like "electrical" buffering, as in registered modules.
>
> No difference. There is not a data buffer in the world whose output
> transitions before or at the same time as its input. They all add some delay
> to the signals.

Sure. If the signals don't get there they're hardly useful
though.
>
> > If you have 4 (or
> > more) ranks of memory modules (per channel) operating at current speeds,
> > you need the registering/buffering somewhere. It makes sense to move it
> > closer to the channel interface of the collective DIMMs than to have it
> > working independently on each DIMM. I'm not sure there's necessarily any
> > increased latency for that situation.
>
> If you have so many modules per channel that you need buffering, then
> you suffer a buffering penalty. That's my point. Whether that means you need
> faster memory chips to keep the same cycle speed or you cycle more slowly,
> you have a buffering delay.
>
> I'm really not saying anything controversial. Buffers and caches
> increase latency, at least in the worst case access.

Certainly anything that adds latency, adds latency (duh!), but
you're arguing that buffers == caches *and* that caches increase
latency. This is just not so!

--
Keith
Anonymous
April 23, 2004 6:05:06 PM

On Thu, 22 Apr 2004 07:33:16 -0700, "David Schwartz" <davids@webmaster.com>
wrote:

>
>"George Macdonald" <fammacd=!SPAM^nothanks@tellurian.com> wrote in message
>news:ii0f80dqbek4nhtiiduk9fpk3308l051aa@4ax.com...
>> On Wed, 21 Apr 2004 19:42:35 -0700, "David Schwartz"
>> <davids@webmaster.com>
>> wrote:
>
>>>> No necessarily. Addresses can be broadcast to the entire memory
>>>> hierarchy simultaneously. The first to answer wins. If it's
>>>> cached, it's fast. If not, there is no penalty in asking the
>>>> cach if it's there and being answered in the negative.
>
>>> Consider two back-to-back addresses. We start broadcasting the first
>>>address on the memory bus but the cache answers first. Now we can't
>>>broadcast the second address onto the memory bus until we can quiesce the
>>>address bus from the first address, can we?
>
>> The look aside vs. look through cache. It depends... on all the relative
>> timings. First a cache does not have to be "searched" - from the lookup
>> you can get a hit/miss answer in one cycle.
>
> That would still be a one cycle delay while the cache was searched,
>whether or not you found anything in it.

On current CPUs, a memory channel cycle is 10-15 or so cache cycles - get
things aligned right, call it a coupla cache clocks, and there's no need to
shove the address on the memory bus (AMD) or FSB (Intel). Accurate info is
elusive on this kind of thing now but I believe that look-aside caches are
just considered unnecessary now.

>> Assuming look aside cache, if
>> the memory requests are queued to the memory controller, there's the
>> question of whether you can get a Burst Terminate command through to the
>> memory chips past, or before, the 2nd memory access.
>
> Even so, it takes some time to terminate the burst.

I doubt that it's going to need to be terminated - IOW the 1st cache
hit/miss result (not necessarily the cache data) should be available before
the memory address has passed out of the CPU.

>>> Buffers must increase latency. Caches generally increase worst case
>>>latency; however, unless you have a pathological load, they should improve
>>>average latency.
>
>> Two points here. I don't think we're talking about data buffering - more
>> like "electrical" buffering, as in registered modules.
>
> No difference.

D'oh!

> There is not a data buffer in the world whose output
>transitions before or at the same time as its input. They all add some delay
>to the signals.

But it's not data that's being buffered - it's simply a (near)zero-gain
amplifier to keep all the modules talking in unison.

>> If you have 4 (or
>> more) ranks of memory modules (per channel) operating at current speeds,
>> you need the registering/buffering somewhere. It makes sense to move it
>> closer to the channel interface of the collective DIMMs than to have it
>> working independently on each DIMM. I'm not sure there's necessarily any
>> increased latency for that situation.
>
> If you have so many modules per channel that you need buffering, then
>you suffer a buffering penalty. That's my point. Whether that means you need
>faster memory chips to keep the same cycle speed or you cycle more slowly,
>you have a buffering delay.

That's what the damned thing is for - large memory systems. Currently you
put registered DIMMs, with their latency penalty, in such a system and even
there you run into problems with the multi-drop memory channel of DDR.
What I'm saying is that the buffering of FB-DIMMs is not necessarily any
worse and you get the DIMMs to "talk" consistently to the channel.

> I'm really not saying anything controversial. Buffers and caches
>increase latency, at least in the worst case access.

You seem to be stuck in data buffers!

Rgds, George Macdonald

"Just because they're paranoid doesn't mean you're not psychotic" - Who, me??
Anonymous
April 23, 2004 6:05:07 PM

On Thu, 22 Apr 2004 18:29:08 GMT, "Felger Carbon" <fmsfnf@jfoops.net>
wrote:

>"George Macdonald" <fammacd=!SPAM^nothanks@tellurian.com> wrote in
>message news:ii0f80dqbek4nhtiiduk9fpk3308l051aa@4ax.com...
>>
>> Two points here. I don't think we're talking about data buffering -
>more
>> like "electrical" buffering, as in registered modules. If you have
>4 (or
>> more) ranks of memory modules (per channel) operating at current
>speeds,
>> you need the registering/buffering somewhere. It makes sense to
>move it
>> closer to the channel interface of the collective DIMMs than to have
>it
>> working independently on each DIMM. I'm not sure there's
>necessarily any
>> increased latency for that situation.
>
>I think the "increased latency" is with respect to the usual (in PCs)
>one or two unbuffered DIMMs. In this case, the FB-DIMMs do indeed
>have a greater latency.

Sure, but compared with registering independently on every DIMM and hoping
they all talk on the same edges... or close enough that it works? It's not
clear to me whether, in a large memory system, say 8 ranks per channel,
accesses to the farthest DIMMs are going to have extra cycles of latency
added, but if the clock frequency can be jacked up significantly, does it
matter much?
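Just to see the shape of it, here's the kind of arithmetic I have in mind -
all of the per-hop and core-access figures are assumptions pulled out of the
air for illustration, not published FB-DIMM numbers:

# What-if: does daisy-chain hop latency swamp a faster channel?
# All figures are assumed; ignores the return trip and serialization overhead.
hop_delay_ns = 3.0    # assumed pass-through delay per buffered DIMM
dimms_in_chain = 8    # ranks/DIMMs per channel in a big memory system
dram_core_ns = 45.0   # assumed DRAM access time once the request arrives

nearest = dram_core_ns + hop_delay_ns
farthest = dram_core_ns + dimms_in_chain * hop_delay_ns

print(f"nearest DIMM : {nearest:.0f} ns")
print(f"farthest DIMM: {farthest:.0f} ns")
# The far end pays ~21 ns more than the near slot under these assumptions;
# the idea, anyway, is that in exchange you get a channel that can be clocked
# much higher than a multi-drop bus loaded with 8 ranks.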

>Keep in mind that there will really be no choice once you've bought
>your mobo. The CPU socket will either be for a CPU to use traditional
>DIMMs or (with 66% fewer memory pins) to use FB-DIMMs. You will never
>ever stand there with both types of memory modules in hand and have to
>decide which to plug in.

CPU socket? Oh we're talking AMD as the "standard" now?:-) Daytripper
mentioned he's not sure the FB-DIMM is going to make it to the desktop
anyway. Makes me wonder how the pricing is going to fall out with the
market fragmentation - to date we've all -- desktop through to server --
benefited from that common model. Could get awkward for CPU on-die
memory controllers too if we need different CPUs according to the
memory type.

Rgds, George Macdonald

"Just because they're paranoid doesn't mean you're not psychotic" - Who, me??
Anonymous
April 23, 2004 6:59:03 PM

Archived from groups: comp.sys.ibm.pc.hardware.chips,comp.sys.intel (More info?)

"chrisv" <chrisv@nospam.invalid> wrote in message
news:8t8g801v6krtjm5q7kc7gl8jn5dtrn4oen@4ax.com...
> "Felger Carbon" <fmsfnf@jfoops.net> wrote:
>
> >I think the "increased latency" is with respect to the usual (in
PCs)
> >one or two unbuffered DIMMs. In this case, the FB-DIMMs do indeed
> >have a greater latency.
> >
> >Keep in mind that there will really be no choice once you've bought
> >your mobo. The CPU socket will either be for a CPU to use
traditional
> >DIMMs or (with 66% fewer memory pins) to use FB-DIMMs. You will
never
> >ever stand there with both types of memory modules in hand and have
to
> >decide which to plug in.
>
> "OE quotefix", dude.

"Huh?" asks Felger, who is easily puzzled. ;-)
Anonymous
April 23, 2004 10:28:36 PM

Archived from groups: comp.sys.ibm.pc.hardware.chips,comp.sys.intel (More info?)

"George Macdonald" <fammacd=!SPAM^nothanks@tellurian.com> wrote in message
news:eujh80h59vq3ho5pdl4sr8uelpfrrcak3e@4ax.com...
> CPU socket? Oh we're talking AMD as the "standard" now?:-) Daytripper
> mentioned he's not sure the FB-DIMM is going to make it to the desktop
> anyway. Makes me wonder how the pricing is going to fall out with the
> market fragmentation - to date we've all -- desktop through to server --
> benefited from that model till now. Could get awkward for CPU on-die
> memory controllers too if we need to have different CPUs according to the
> memory type.

Oh, I think it's all much ado. We'll keep the desktop-to-server DRAM
interface commonality for a long time. After all, how many new types of DRAM
come out in a given amount of time? I'd say a new standard every 3 to 5
years? Hardly a breakneck frequency. People will continue to design new DRAM
controllers based on upcoming standards, and they will also put backwards
compatibility into these controllers for previous generations of DRAM.

Yousuf Khan
Anonymous
April 24, 2004 4:39:39 PM

Archived from groups: comp.sys.ibm.pc.hardware.chips,comp.sys.intel (More info?)

"George Macdonald" <fammacd=!SPAM^nothanks@tellurian.com> wrote in
message news:eujh80h59vq3ho5pdl4sr8uelpfrrcak3e@4ax.com...
>
> Daytripper
> mentioned he's not sure the FB-DIMM is going to make it to the
desktop
> anyway. Makes me wonder how the pricing is going to fall out with
the
> market fragmentation - to date we've all -- desktop through to
server --
> benefited from that model till now. Could get awkward for CPU
on-die
> memory controllers too if we need to have different CPUs according
to the
> memory type.

Intel may see this as an opportunity to increase the ASPs on "Xeon"
CPUs - in other words, on CPUs for servers. I think it'll be a cold
day in hell when this shows up on personal desktops.
Anonymous
April 25, 2004 7:34:48 AM

Archived from groups: comp.sys.ibm.pc.hardware.chips,comp.sys.intel (More info?)

On Fri, 23 Apr 2004 18:28:36 GMT, "Yousuf Khan"
<news.tally.bbbl67@spamgourmet.com> wrote:

>"George Macdonald" <fammacd=!SPAM^nothanks@tellurian.com> wrote in message
>news:eujh80h59vq3ho5pdl4sr8uelpfrrcak3e@4ax.com...
>> CPU socket? Oh we're talking AMD as the "standard" now?:-) Daytripper
>> mentioned he's not sure the FB-DIMM is going to make it to the desktop
>> anyway. Makes me wonder how the pricing is going to fall out with the
>> market fragmentation - to date we've all -- desktop through to server --
>> benefited from that model till now. Could get awkward for CPU on-die
>> memory controllers too if we need to have different CPUs according to the
>> memory type.
>
>Oh, I think it's all much ado. We'll keep the desktop to server DRAM
>interface commonality for a long time. Afterall, how many new types of DRAM
>come out in a given amount of time? I'd say a new standard every 3 to 5
>years? Hardly a breakneck frequency. People will continue to design new DRAM
>controllers based on upcoming standards, and they will also put backwards
>compatibility into these controllers for previous generations of DRAM.

On the "frequency", od standards, we had a close call with DRDRAM. Was it
a "standard" or not?... it came close at least. I don't see how backwards
compatibility is something they can even think of - different signalling is
just different.

Rgds, George Macdonald

"Just because they're paranoid doesn't mean you're not psychotic" - Who, me??
Anonymous
April 25, 2004 11:53:01 PM

Archived from groups: comp.sys.ibm.pc.hardware.chips,comp.sys.intel (More info?)

"George Macdonald" <fammacd=!SPAM^nothanks@tellurian.com> wrote in message
news:rnhl80lp7m98o5vofunh3orlpnclecoea1@4ax.com...
> >Oh, I think it's all much ado. We'll keep the desktop to server DRAM
> >interface commonality for a long time. Afterall, how many new types of
DRAM
> >come out in a given amount of time? I'd say a new standard every 3 to 5
> >years? Hardly a breakneck frequency. People will continue to design new
DRAM
> >controllers based on upcoming standards, and they will also put backwards
> >compatibility into these controllers for previous generations of DRAM.
>
> On the "frequency" of standards, we had a close call with DRDRAM. Was it
> a "standard" or not?... it came close at least. I don't see how backwards
> compatibility is something they can even think of - different signalling
is
> just different.

Well, they've had chipsets in the past which implemented compatibility with
both EDO and SDR rams. Then later we had chipsets which did both SDR and
DDR1 compatibility. Why should it be difficult to put in dual DDR1 and DDR2
capability? The controller detects which type of RAM it's connected to and
switches to the circuitry for that particular type.
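The detection side of that is at least straightforward - every DIMM carries
an SPD EEPROM that says what it is. A minimal sketch of the decision (the
byte-2 type codes are the JEDEC SPD values as I recall them, and the SMBus
read to fetch the SPD contents is hand-waved here):

# Sketch: pick which controller personality to enable from a DIMM's SPD data.
# SPD byte 2 is the "fundamental memory type"; codes as I recall them from
# the JEDEC SPD spec -- double-check before relying on them.
MEMORY_TYPES = {
    0x02: "EDO",
    0x04: "SDR SDRAM",
    0x07: "DDR SDRAM",
    0x08: "DDR2 SDRAM",
}

def controller_mode(spd: bytes) -> str:
    """Return the DRAM interface to drive, given a DIMM's raw SPD contents."""
    mem_type = spd[2]
    return MEMORY_TYPES.get(mem_type, f"unknown (0x{mem_type:02x})")

# e.g. a DDR2 module's SPD typically starts 0x80, 0x08, 0x08, ...
print(controller_mode(bytes([0x80, 0x08, 0x08])))   # -> "DDR2 SDRAM"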

Yousuf Khan
Anonymous
April 26, 2004 4:37:07 AM

Archived from groups: comp.sys.ibm.pc.hardware.chips,comp.sys.intel (More info?)

On Sun, 25 Apr 2004 19:53:01 GMT, "Yousuf Khan"
<news.20.bbbl67@spamgourmet.com> wrote:
>Well, they've had chipsets in the past which implemented compatibility with
>both EDO and SDR rams. Then later we had chipsets which did both SDR and
>DDR1 compatibility. Why should it be difficult to put dual DDR1 and DDR2
>capabilities? It detects which type of ram it's connected to and switches to
>the circuitry for that particular type of RAM.

Different voltage swings, different (I/O) voltages, and I don't think DDR1 used
ODT. And then you have the whole DIMM socket keying issue.

In a heavily commoditized market it'd unnecessarily drive up the chipset and
platform implementation costs to accommodate both technologies with a single
solution...

/daytripper
Anonymous
April 26, 2004 5:44:59 AM

Archived from groups: comp.sys.ibm.pc.hardware.chips,comp.sys.intel (More info?)

In article <Xaaic.8576$e4.3166@newsread2.news.pas.earthlink.net>,
"Felger Carbon" <fmsfnf@jfoops.net> wrote:
| "Huh?" asks Felger, who is easily puzzled. ;-)

For the love of all that is good and holy, immediately download
and install this:

http://home.in.tum.de/~jain/software/oe-quotefix/

before you mangle another quote on Usenet again!

I believe that is what chrisv meant.
Anonymous
April 26, 2004 6:16:20 AM

Archived from groups: comp.sys.ibm.pc.hardware.chips,comp.sys.intel (More info?)

ihoc-a-attbi-d-com wrote:
> In article <Xaaic.8576$e4.3166@newsread2.news.pas.earthlink.net>,
> "Felger Carbon" <fmsfnf@jfoops.net> wrote:
>> "Huh?" asks Felger, who is easily puzzled. ;-)
>
> For the love of all that is good and holy, immediately download
> and install this:
>
> http://home.in.tum.de/~jain/software/oe-quotefix/
>
> before you mangle another quote on Usenet again!
>
> I believe that is what chrisv meant.

Working okay, so far. Let's try this on some really complicated quotes. :-)

Yousuf Khan
Anonymous
April 26, 2004 8:23:39 AM

Archived from groups: comp.sys.ibm.pc.hardware.chips,comp.sys.intel (More info?)

On Sun, 25 Apr 2004 19:53:01 GMT, "Yousuf Khan"
<news.20.bbbl67@spamgourmet.com> wrote:

>"George Macdonald" <fammacd=!SPAM^nothanks@tellurian.com> wrote in message
>news:rnhl80lp7m98o5vofunh3orlpnclecoea1@4ax.com...
>> >Oh, I think it's all much ado. We'll keep the desktop to server DRAM
>> >interface commonality for a long time. Afterall, how many new types of
>DRAM
>> >come out in a given amount of time? I'd say a new standard every 3 to 5
>> >years? Hardly a breakneck frequency. People will continue to design new
>DRAM
>> >controllers based on upcoming standards, and they will also put backwards
>> >compatibility into these controllers for previous generations of DRAM.
>>
>> On the "frequency" of standards, we had a close call with DRDRAM. Was it
>> a "standard" or not?... it came close at least. I don't see how backwards
>> compatibility is something they can even think of - different signalling
>is
>> just different.
>
>Well, they've had chipsets in the past which implemented compatibility with
>both EDO and SDR rams. Then later we had chipsets which did both SDR and
>DDR1 compatibility. Why should it be difficult to put dual DDR1 and DDR2
>capabilities? It detects which type of ram it's connected to and switches to
>the circuitry for that particular type of RAM.

Obviously it depends on how big a jump there is between the technology of
the two memory channels - SDRAM and DDR-SDRAM are not too far apart in
terms of signalling - a few extra pins for source-synchronous clocking and a
few others which were used slightly differently. OTOH we never saw a dual
DRDRAM and SDRAM chipset - too different... it would certainly require
independent pins. I'm not up on the details of FB-DIMM interfacing but I'd
think it'd be different enough.

Rgds, George Macdonald

"Just because they're paranoid doesn't mean you're not psychotic" - Who, me??
Anonymous
April 26, 2004 3:38:13 PM

Archived from groups: comp.sys.ibm.pc.hardware.chips,comp.sys.intel (More info?)

George Macdonald wrote:
>> Well, they've had chipsets in the past which implemented
>> compatibility with both EDO and SDR rams. Then later we had chipsets
>> which did both SDR and DDR1 compatibility. Why should it be
>> difficult to put dual DDR1 and DDR2 capabilities? It detects which
>> type of ram it's connected to and switches to the circuitry for that
>> particular type of RAM.
>
> Obviously it depends onhow big a jump there is between the technology of
> the two memory channels - SDRAM and DDR-SDRAM are not too far apart in
> terms of signalling - a few extra pins for source synch clocking and
> a few others which were used slightly differently. OTOH we never saw
> a dual DRDRAM and SDRAM chipset - too different... would certainly
> require independent pins. I'm not up on the details of FB-DIMM
> interfacing but I'd think it'd be different enough.

Well, there were different voltages for SDR and DDR, so it wasn't exactly
the simple jump from SDR to DDR that you describe. Plus you needed
different sockets for each type. And in most cases you couldn't use both
types of RAM at the same time because of the voltage issue.

As for SDR and RDRAM together, must I remind you of the infamous Intel MTH?
Okay, I didn't say that it had to be a successful chipset (or even a good
chipset), but you did see the capability of using either type of memory at
one point in time. :-)

Yousuf Khan