Intel's FB-DIMM: any kind of RAM will work for your controller

Archived from groups: comp.sys.ibm.pc.hardware.chips,comp.sys.intel

Intel is introducing a new type of memory module called the FB-DIMM (Fully
Buffered DIMM). Apparently the idea is to be able to put any kind of DRAM
technology (e.g. DDR1 vs. DDR2) behind a buffer without having to worry about
redesigning your memory controller. Of course this intermediate step will add
some latency to DRAM accesses.
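
To get a feel for the cost, here is a toy C model of how per-hop buffer delay
would compound down a daisy chain of modules. The figures are made-up
assumptions for illustration, not FB-DIMM spec values:

#include <stdio.h>

int main(void) {
    double dram_ns = 45.0;  /* assumed raw DRAM access time, ns */
    double hop_ns  = 3.0;   /* assumed pass-through delay per buffer, ns */
    int dimms;

    for (dimms = 1; dimms <= 8; dimms++) {
        /* request and reply each traverse every buffer up to the target */
        double total = dram_ns + 2.0 * hop_ns * dimms;
        printf("DIMM %d: ~%.0f ns round trip\n", dimms, total);
    }
    return 0;
}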

Presumably this is Intel's way of finally acknowledging that it has to start
integrating DRAM controllers onto its CPUs, as AMD already does. Of course,
adding latency to the interface is exactly the opposite of the main advantage
of integrating the DRAM controller in the first place.

http://arstechnica.com/news/posts/1082164553.html

Yousuf Khan

--
Humans: contact me at ykhan at rogers dot com
Spambots: just reply to this email address ;-)
  1.

    A buffer is meant to reduce overall latency, not to increase it AFAIK.


    On Sun, 18 Apr 2004 10:48:44 GMT, "Yousuf Khan" <news.tally.bbbl67@spamgourmet.com> wrote:

    >Intel is introducing a type of DRAM called FB-DIMMs (fully buffered).
    >Apparently the idea is to be able to put any kind of DRAM technology (e.g.
    >DDR1 vs. DDR2) behind a buffer without having to worry about redesigning
    >your memory controller. Of course this intermediate step will add some
    >latency to the performance of the DRAM.
    >
    >It is assumed that this is Intel's way of finally acknowledging that it has
    >to start integrating DRAM controllers onboard its CPUs, like AMD does
    >already. Of course adding latency to the interfaces is exactly the opposite
    >of what is the main advantage of integrating the DRAM controllers in the
    >first place.
    >
    >http://arstechnica.com/news/posts/1082164553.html
    >
    > Yousuf Khan
  2.

    <geno_cyber@tin.it> wrote in message
    news:u34580ltlccpd5p5e47mjv9j2c4lk4b4d9@4ax.com...
    > A buffer is meant to reduce overall latency, not to increase it AFAIK.

    Not necessarily; a buffer is also meant to increase overall bandwidth, which
    may be done at the expense of latency.
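
    To make that tradeoff concrete, here's a toy store-and-forward model in C -
    a generic illustration of buffering for rate conversion, not a claim about
    how FB-DIMMs specifically work (all figures are assumptions):

    #include <stdio.h>

    int main(void) {
        double burst = 64.0;                      /* bytes per transfer */
        double slow_bw = 3.2e9, fast_bw = 6.4e9;  /* bytes/sec, assumed */
        double hop_ns = 5.0;                      /* buffer delay, assumed */

        double direct_ns   = burst / slow_bw * 1e9;
        /* store off the slow side, then forward on the fast side */
        double buffered_ns = direct_ns + hop_ns + burst / fast_bw * 1e9;

        printf("latency: %.0f ns direct, %.0f ns buffered\n",
               direct_ns, buffered_ns);
        /* the fast channel is busy for less time per burst, so more
           devices can share it - that's the bandwidth win */
        printf("fast-channel occupancy: %.0f ns per burst\n",
               burst / fast_bw * 1e9);
        return 0;
    }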

    Yousuf Khan
  3.

    On Sun, 18 Apr 2004 17:37:36 GMT, "Yousuf Khan" <news.tally.bbbl67@spamgourmet.com> wrote:

    ><geno_cyber@tin.it> wrote in message
    >news:u34580ltlccpd5p5e47mjv9j2c4lk4b4d9@4ax.com...
    >> A buffer is meant to reduce overall latency, not to increase it AFAIK.
    >
    >Not necessarily, a buffer is also meant to increase overall bandwidth, which
    >may be done at the expense of latency.
    >

    Cache on a CPU is not meant to increase bandwidth but to decrease the overall
    latency of retrieving data from slower RAM. More cache-like buffers in the
    path through the memory controller can only improve latency, unless there are
    some serious design flaws. I've never seen a CPU that gets slower at
    accessing data when it can cache and has a good hit/miss ratio.
  4.

    <geno_cyber@tin.it> wrote in message
    news:lft5801qjivarf2mhfoiko04riq02srkp5@4ax.com...
    > On Sun, 18 Apr 2004 17:37:36 GMT, "Yousuf Khan"
    > <news.tally.bbbl67@spamgourmet.com> wrote:
    >
    >><geno_cyber@tin.it> wrote in message
    >>news:u34580ltlccpd5p5e47mjv9j2c4lk4b4d9@4ax.com...
    >>> A buffer is meant to reduce overall latency, not to increase it AFAIK.
    >>
    >>Not necessarily, a buffer is also meant to increase overall bandwidth,
    >>which
    >>may be done at the expense of latency.

    > Cache on CPU is not meant to increase bandwidth but to decrease overall
    > latency to retrieve data
    > from slower RAM.

    Yes, but not by making the RAM any faster; rather, by avoiding RAM accesses.
    We add cache to the CPU because we admit our RAM is slow.

    > More cache-like buffers in the path thru the memory controller can only
    > improve
    > latency, unless there's some serious design flaws.

    That makes no sense. Everything between the CPU and the memory will
    increase latency. Even caches increase worst case latency because some time
    is spent searching the cache before we start the memory access. I think
    you're confused.

    > I never seen a CPU that gets slower in accessing data when it can cache
    > and has a good hit/miss
    > ratio.

    Except that we're talking about memory latency due to buffers. And by
    memory latency we mean the maximum time between when we ask the CPU to
    read a byte of memory and when we get that byte.
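
    In C terms, the average-vs-worst-case distinction looks like this (the
    timings are illustrative assumptions):

    #include <stdio.h>

    int main(void) {
        double hit_ns = 3.0, dram_ns = 60.0, lookup_ns = 1.0;
        int pct;

        for (pct = 80; pct <= 95; pct += 5) {
            double hit = pct / 100.0;
            double avg = hit * hit_ns + (1.0 - hit) * (lookup_ns + dram_ns);
            /* worst case pays the lookup on top of the full DRAM trip */
            printf("hit rate %.2f: avg %.1f ns, worst case %.1f ns\n",
                   hit, avg, lookup_ns + dram_ns);
        }
        return 0;
    }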

    DS
  5.

    On Sun, 18 Apr 2004 21:43:19 GMT, geno_cyber@tin.it wrote:

    >On Sun, 18 Apr 2004 17:37:36 GMT, "Yousuf Khan" <news.tally.bbbl67@spamgourmet.com> wrote:
    >
    >><geno_cyber@tin.it> wrote in message
    >>news:u34580ltlccpd5p5e47mjv9j2c4lk4b4d9@4ax.com...
    >>> A buffer is meant to reduce overall latency, not to increase it AFAIK.
    >>
    >>Not necessarily, a buffer is also meant to increase overall bandwidth, which
    >>may be done at the expense of latency.
    >>
    >
    >Cache on CPU is not meant to increase bandwidth but to decrease overall latency to retrieve data
    >from slower RAM. More cache-like buffers in the path thru the memory controller can only improve
    >latency, unless there's some serious design flaws.
    >I never seen a CPU that gets slower in accessing data when it can cache and has a good hit/miss
    >ratio.

    You're using "buffer" interchangeably with "cache" - a mistake our Yousuf
    would never, ever make. Caches and their effects aren't pertinent to a
    discussion of the buffering technique found on Fully Buffered DIMMs and their
    effects on latency and bandwidth...

    /daytripper (hth ;-)
  6.

    On Sun, 18 Apr 2004 22:32:32 GMT, daytripper <day_trippr@REMOVEyahoo.com> wrote:

    >On Sun, 18 Apr 2004 21:43:19 GMT, geno_cyber@tin.it wrote:
    >
    >>On Sun, 18 Apr 2004 17:37:36 GMT, "Yousuf Khan" <news.tally.bbbl67@spamgourmet.com> wrote:
    >>
    >>><geno_cyber@tin.it> wrote in message
    >>>news:u34580ltlccpd5p5e47mjv9j2c4lk4b4d9@4ax.com...
    >>>> A buffer is meant to reduce overall latency, not to increase it AFAIK.
    >>>
    >>>Not necessarily, a buffer is also meant to increase overall bandwidth, which
    >>>may be done at the expense of latency.
    >>>
    >>
    >>Cache on CPU is not meant to increase bandwidth but to decrease overall latency to retrieve data
    >>from slower RAM. More cache-like buffers in the path thru the memory controller can only improve
    >>latency, unless there's some serious design flaws.
    >>I never seen a CPU that gets slower in accessing data when it can cache and has a good hit/miss
    >>ratio.
    >
    >You're using "buffer" interchangeably with "cache" - a mistake our Yousuf
    >would never, ever make. Caches and their effects aren't pertinent to a
    >discussion of the buffering technique found on Fully Buffered DIMMs and their
    >effects on latency and bandwidth...

    FB-DIMMs are supposed to work with an added cheap CPU or DSP plus some fast
    RAM. I doubt embedded DRAM on-chip, simply due to higher costs, but you never
    know how cheap they could make a product if they really wanted to, and no
    expensive DSP or CPU is needed there anyway for the FB-DIMM to work.
    I know how both caches and buffers work (circular buffering, FIFO buffering
    and so on), and because they're sometimes used to achieve similar results
    (as on DSP architectures, where buffering is a key to performance with proper
    assembly code), it's not that wrong to refer to a cache as a buffer: even if
    the mechanism is quite different, the goal is almost the same. The truth is
    that both ways of making data faster to retrieve are useful, and a proper
    combination of these techniques can achieve higher performance at both the
    bandwidth and latency levels.
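
    (A minimal circular/FIFO buffer of the kind mentioned above, for reference:
    fixed storage, strict FIFO order, and no notion of addresses or hit/miss -
    which is exactly what separates plain buffering from caching.)

    #include <stdint.h>
    #include <stdio.h>

    #define RING 8  /* power of two, so indices wrap with a mask */

    struct ring {
        uint8_t slot[RING];
        unsigned head, tail;  /* head: next write, tail: next read */
    };

    static int ring_put(struct ring *r, uint8_t v) {
        if (r->head - r->tail == RING) return 0;   /* full */
        r->slot[r->head++ & (RING - 1)] = v;
        return 1;
    }

    static int ring_get(struct ring *r, uint8_t *v) {
        if (r->head == r->tail) return 0;          /* empty */
        *v = r->slot[r->tail++ & (RING - 1)];
        return 1;
    }

    int main(void) {
        struct ring r = {{0}, 0, 0};
        uint8_t i, v;
        for (i = 0; i < 5; i++) ring_put(&r, i);
        while (ring_get(&r, &v)) printf("%d ", v);
        putchar('\n');
        return 0;
    }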
  7.

    On Mon, 19 Apr 2004 00:38:16 GMT, geno_cyber@tin.it wrote:

    >On Sun, 18 Apr 2004 22:32:32 GMT, daytripper <day_trippr@REMOVEyahoo.com> wrote:
    >
    >>On Sun, 18 Apr 2004 21:43:19 GMT, geno_cyber@tin.it wrote:
    >>
    >>>On Sun, 18 Apr 2004 17:37:36 GMT, "Yousuf Khan" <news.tally.bbbl67@spamgourmet.com> wrote:
    >>>
    >>>><geno_cyber@tin.it> wrote in message
    >>>>news:u34580ltlccpd5p5e47mjv9j2c4lk4b4d9@4ax.com...
    >>>>> A buffer is meant to reduce overall latency, not to increase it AFAIK.
    >>>>
    >>>>Not necessarily, a buffer is also meant to increase overall bandwidth, which
    >>>>may be done at the expense of latency.
    >>>>
    >>>
    >>>Cache on CPU is not meant to increase bandwidth but to decrease overall latency to retrieve data
    >>>from slower RAM. More cache-like buffers in the path thru the memory controller can only improve
    >>>latency, unless there's some serious design flaws.
    >>>I never seen a CPU that gets slower in accessing data when it can cache and has a good hit/miss
    >>>ratio.
    >>
    >>You're using "buffer" interchangeably with "cache" - a mistake our Yousuf
    >>would never, ever make. Caches and their effects aren't pertinent to a
    >>discussion of the buffering technique found on Fully Buffered DIMMs and their
    >>effects on latency and bandwidth...
    >
    >FB-DIMMs are supposed to work with an added cheap CPU or DSP with some fast RAM, I doubt embedded
    >DRAM on-chip simply due to higher costs but you never know how much they could make a product cheap
    >if they really want to and no expensive DSP or CPU is needed there anyway for the FB-DIMM to work.
    >I know how both caches and buffers work (circular buffering, FIFO buffering and so on) and because
    >they're used to achieve similar results sometimes (like on DSPs architectures where buffering is a
    >key to performance with proper assembly code...) , it's not that wrong to refer to a cache as a
    >buffer even if its mechanism it's quite different the goal it's almost the same. The truth is that
    >both ways of making bits data faster to be retrieved are useful and a proper combination of these
    >techniques can achieve higher performance both at the bandwidth and latency levels.

    Ummm.....no. You're still missing the gist of the discussion, and confusing
    various forms of caching with the up and down-sides of using buffers in a
    point-to-point interconnect.

    Maybe going back and starting over might help...

    /daytripper
  8.

    geno_cyber@tin.it wrote:

    >FB-DIMMs are supposed to work...

    Do you ever get it right, Geno? I don't think I've seen it...
  9.

    "Yousuf Khan" <news.tally.bbbl67@spamgourmet.com> wrote in message
    news:A1zgc.114205$2oI1.47233@twister01.bloor.is.net.cable.rogers.com...
    > <geno_cyber@tin.it> wrote in message
    > news:u34580ltlccpd5p5e47mjv9j2c4lk4b4d9@4ax.com...
    > > A buffer is meant to reduce overall latency, not to increase it AFAIK.
    >
    > Not necessarily, a buffer is also meant to increase overall bandwidth, which
    > may be done at the expense of latency.

    This particular buffer reduces the DRAM interface pinout by a factor
    of 3 for CPU chips having the memory interface on-chip (such as
    Opteron, the late and unlamented Timna, and future Intel CPUs). This
    reduces the cost of the CPU chip while increasing the cost of the DIMM
    (because of the added buffer chip).

    And yes, the presence of the buffer does increase the latency.

    There are other tradeoffs, the main one being the ability to add lots
    more DRAM into a server. Not important for desktops. YMMV.
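
    The factor-of-3 claim is easy to sanity-check with rough pin counts (these
    are ballpark assumptions, not datasheet figures):

    #include <stdio.h>

    int main(void) {
        /* parallel DDR2 channel at the controller, roughly:
           72 data (with ECC) + ~16 address + ~30 command/control/clock */
        int ddr2 = 72 + 16 + 30;
        /* FB-DIMM channel: roughly 10 southbound + 14 northbound serial
           lanes, differential, so two pins per lane */
        int fbd = (10 + 14) * 2;
        printf("~%d pins vs ~%d pins: about %.1fx fewer\n",
               ddr2, fbd, (double)ddr2 / fbd);
        return 0;
    }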
  10.

    On Mon, 19 Apr 2004 07:33:46 -0500, chrisv <chrisv@nospam.invalid> wrote:

    >geno_cyber@tin.it wrote:
    >
    >>FB-DIMMs are supposed to work...
    >
    >Do you ever get it right, Geno? I don't think I've seen it...


    http://cva.stanford.edu/ee482a/scribed/lect07.pdf
  11.

    On Mon, 19 Apr 2004 07:33:46 -0500, chrisv <chrisv@nospam.invalid> wrote:

    >geno_cyber@tin.it wrote:
    >
    >>FB-DIMMs are supposed to work...
    >
    >Do you ever get it right, Geno? I don't think I've seen it...


    -------

    http://www.faqs.org/docs/artu/ch12s04.html

    Caching Operation Results
    Sometimes you can get the best of both worlds (low latency and good throughput) by computing
    expensive results as needed and caching them for later use. Earlier we mentioned that named reduces
    latency by batching; it also reduces latency by caching the results of previous network transactions
    with other DNS servers.

    ------
  12.

    On Mon, 19 Apr 2004 07:33:46 -0500, chrisv <chrisv@nospam.invalid> wrote:

    >geno_cyber@tin.it wrote:
    >
    >>FB-DIMMs are supposed to work...
    >
    >Do you ever get it right, Geno? I don't think I've seen it...


    http://camars.kaist.ac.kr/~maeng/cs610/03note/03lect26.ppt
  13.

    On Sun, 18 Apr 2004 17:37:36 GMT, "Yousuf Khan" <news.tally.bbbl67@spamgourmet.com> wrote:

    ><geno_cyber@tin.it> wrote in message
    >news:u34580ltlccpd5p5e47mjv9j2c4lk4b4d9@4ax.com...
    >> A buffer is meant to reduce overall latency, not to increase it AFAIK.
    >
    >Not necessarily, a buffer is also meant to increase overall bandwidth, which
    >may be done at the expense of latency.
    >
    > Yousuf Khan
    >

    http://www.analog.com/UploadedFiles/Application_Notes/144361534EE157.pdf


    As you can see, this Analog Devices DSP uses a mixed buffering/caching
    technique to improve latency in the best-case scenario. Obviously, if the
    caching doesn't work and the data is not locally available, then the latency
    has to be higher, because you have to get the data from slower memory; but
    when the data is locally available, the latency can be reduced to
    approximately zero in some cases.
  14.

    On Mon, 19 Apr 2004 07:33:46 -0500, chrisv <chrisv@nospam.invalid> wrote:

    >geno_cyber@tin.it wrote:
    >
    >>FB-DIMMs are supposed to work...
    >
    >Do you ever get it right, Geno? I don't think I've seen it...

    It's a lost cause...
  15.

    geno_cyber@tin.it wrote :

    > FB-DIMMs are supposed to work with an added cheap CPU or DSP with
    > some fast RAM, I doubt embedded DRAM on-chip simply due to higher
    > costs but you never know how much they could make a product cheap
    > if they really want to and no expensive DSP or CPU is needed there
    > anyway for the FB-DIMM to work. I know how both caches and buffers
    > work (circular buffering, FIFO buffering and so on) and because
    > they're used to achieve similar results sometimes (like on DSPs
    > architectures where buffering is a key to performance with proper
    > assembly code...) , it's not that wrong to refer to a cache as a
    > buffer even if its mechanism it's quite different the goal it's
    > almost the same. The truth is that both ways of making bits data
    > faster to be retrieved are useful and a proper combination of
    > these techniques can achieve higher performance both at the
    > bandwidth and latency levels.

    A cache is a form of a buffer.
    A buffer is not necessarily a cache: imagine a one-byte buffer. Would you
    call it a cache?
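
    A minimal C sketch of that distinction: the tag is what turns a byte of
    storage into a cache; strip the tag away and all that's left is a buffer.

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    struct buffer1 { uint8_t data; };                           /* just bits */
    struct cache1  { uint32_t tag; uint8_t data; bool valid; }; /* bits + address */

    static bool lookup(const struct cache1 *c, uint32_t addr, uint8_t *out) {
        if (c->valid && c->tag == addr) { *out = c->data; return true; }
        return false;  /* miss: caller goes to memory */
    }

    int main(void) {
        struct cache1 c = { 0x100, 42, true };
        uint8_t v;
        printf("0x100: %s\n", lookup(&c, 0x100, &v) ? "hit" : "miss");
        printf("0x200: %s\n", lookup(&c, 0x200, &v) ? "hit" : "miss");
        return 0;
    }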

    Regards.
    --
    RusH //
    http://pulse.pdi.net/~rush/qv30/
    Like ninjas, true hackers are shrouded in secrecy and mystery.
    You may never know -- UNTIL IT'S TOO LATE.
  16.

    RusH wrote:

    > geno_cyber@tin.it wrote :
    >
    >
    >>FB-DIMMs are supposed to work with an added cheap CPU or DSP with
    >>some fast RAM, I doubt embedded DRAM on-chip simply due to higher
    >>costs but you never know how much they could make a product cheap
    >>if they really want to and no expensive DSP or CPU is needed there
    >>anyway for the FB-DIMM to work. I know how both caches and buffers
    >>work (circular buffering, FIFO buffering and so on) and because
    >>they're used to achieve similar results sometimes (like on DSPs
    >>architectures where buffering is a key to performance with proper
    >>assembly code...) , it's not that wrong to refer to a cache as a
    >>buffer even if its mechanism it's quite different the goal it's
    >>almost the same. The truth is that both ways of making bits data
    >>faster to be retrieved are useful and a proper combination of
    >>these techniques can achieve higher performance both at the
    >>bandwidth and latency levels.
    >
    >
    > cache is a form of a buffer
    > buffer is not necesarly a cache, imagine one byte buffer, would you
    > call it a cache ?

    Sure; you can think of it as a *really* small cache, which will
    therefore have a terrible hit ratio, thus (most likely) increasing latency.

    --
    Mike Smith
  17.

    On Sun, 18 Apr 2004 22:32:32 GMT, daytripper
    <day_trippr@REMOVEyahoo.com> wrote:

    >You're using "buffer" interchangeably with "cache" - a mistake our Yousuf
    >would never, ever make. Caches and their effects aren't pertinent to a
    >discussion of the buffering technique found on Fully Buffered DIMMs and their
    >effects on latency and bandwidth...

    Ah! I was getting quite confused by his statement about the buffer &
    cache until you said this. Makes it perfectly clear now! :PppP

    --
    L.Angel: I'm looking for web design work.
    If you need basic to med complexity webpages at affordable rates, email me :)
    Standard HTML, SHTML, MySQL + PHP or ASP, Javascript.
    If you really want, FrontPage & DreamWeaver too.
    But keep in mind you pay extra bandwidth for their bloated code
  18.

    <geno_cyber@tin.it> wrote in message
    news:udj7801kk4mg1ba4sdsh2fcuga90knoc8f@4ax.com...
    > On Sun, 18 Apr 2004 17:37:36 GMT, "Yousuf Khan"
    > <news.tally.bbbl67@spamgourmet.com> wrote:
    > As you can see this Analog Devices DSP uses a mixed technique of
    > buffering/caching to improve latency in the best case scenario. Obviously
    > if the caching doesn't work and the data it's not locally available then
    > the latency has to be higher because you've to get data from slower memory
    > but when the data is locally available the latency can be reduced down to
    > zero approx in some cases.

    In this case the buffer is used to eliminate DRAM interface differences when
    going from one technology to a new one.

    Yousuf Khan
  19.

    On Mon, 19 Apr 2004 17:58:52 GMT, "Yousuf Khan"
    <news.tally.bbbl67@spamgourmet.com> wrote:

    ><geno_cyber@tin.it> wrote in message
    >news:udj7801kk4mg1ba4sdsh2fcuga90knoc8f@4ax.com...
    >> On Sun, 18 Apr 2004 17:37:36 GMT, "Yousuf Khan"
    >> <news.tally.bbbl67@spamgourmet.com> wrote:
    >> As you can see this Analog Devices DSP uses a mixed technique of
    >> buffering/caching to improve latency in the best case scenario. Obviously
    >> if the caching doesn't work and the data it's not locally available then
    >> the latency has to be higher because you've to get data from slower memory
    >> but when the data is locally available the latency can be reduced down to
    >> zero approx in some cases.
    >
    >In this case the buffer is used to eliminate DRAM interface differences when
    >going from one technology to a new one.

    "But wait! There's more!"

    The "FB" buffer on an FBdimm is also a bus repeater (aka "buffer") for the
    "next" FBdimm in the chain of FBdimms that comprise a channel. The presence of
    this buffer feature allows the channel to run at the advertised frequencies in
    the face of LOTS of FBdimms on a single channel - frequencies that could not
    be achieved if all those dimms were on the typical multi drop memory
    interconnect (ala most multi-dimm SDR/DDR/DDR2 implementations).

    Anyway...

    I thought I knew the answer to this, but I haven't found it documented either
    way: is the FB bus repeater simply a stateless signal buffer, thus adding its
    lane-to-lane skew to the next device in the chain (which would imply some huge
    de-skewing tasks for the nth FBdimm in - say - an 8 FBdimm implementation)? Or
    does the buffer de-skew lanes before passing the transaction on to the next
    node?

    /daytripper
  20.

    On Mon, 19 Apr 2004 21:55:29 GMT, daytripper
    <day_trippr@REMOVEyahoo.com> wrote:

    >The "FB" buffer on an FBdimm is also a bus repeater (aka "buffer") for the
    >"next" FBdimm in the chain of FBdimms that comprise a channel. The presence of
    >this buffer feature allows the channel to run at the advertised frequencies in
    >the face of LOTS of FBdimms on a single channel - frequencies that could not
    >be achieved if all those dimms were on the typical multi drop memory
    >interconnect (ala most multi-dimm SDR/DDR/DDR2 implementations).

    Does this also mean that I could in theory put a very fast, say 1.6GHz,
    buffer on the FBDIMM and sell it as, say, DDR3-1.6GHz because of that,
    even though the actual RAM chips are only capable of, say, 200MHz?
    :PPpPpP

    --
    L.Angel: I'm looking for web design work.
    If you need basic to med complexity webpages at affordable rates, email me :)
    Standard HTML, SHTML, MySQL + PHP or ASP, Javascript.
    If you really want, FrontPage & DreamWeaver too.
    But keep in mind you pay extra bandwidth for their bloated code
  21.

    "The little lost angel" <a?n?g?e?l@lovergirl.lrigrevol.moc.com> wrote in
    message news:4084b2f1.41363671@news.pacific.net.sg...
    > Does this also mean that I could in theory put a very fast say 1.6Ghz
    > buffer on the FBDIMM and sell it as say DDR3-1.6Ghz because of that.
    > Even though the actual ram chips are only capable of say 200Mhz?

    Wasn't there also some talk back in the early days of the K7 Athlon about
    Micron coming out with an AMD chipset with a huge buffer built into its own
    silicon? Micron went so far as to give it a cool codename, Samurai or Mamba
    or something. But nothing else came of it after that.

    Yousuf Khan
  22.

    On Tue, 20 Apr 2004 05:21:19 GMT, a?n?g?e?l@lovergirl.lrigrevol.moc.com (The
    little lost angel) wrote:

    >On Mon, 19 Apr 2004 21:55:29 GMT, daytripper
    ><day_trippr@REMOVEyahoo.com> wrote:
    >
    >>The "FB" buffer on an FBdimm is also a bus repeater (aka "buffer") for the
    >>"next" FBdimm in the chain of FBdimms that comprise a channel. The presence of
    >>this buffer feature allows the channel to run at the advertised frequencies in
    >>the face of LOTS of FBdimms on a single channel - frequencies that could not
    >>be achieved if all those dimms were on the typical multi drop memory
    >>interconnect (ala most multi-dimm SDR/DDR/DDR2 implementations).
    >
    >Does this also mean that I could in theory put a very fast say 1.6Ghz
    >buffer on the FBDIMM and sell it as say DDR3-1.6Ghz because of that.
    >Even though the actual ram chips are only capable of say 200Mhz?
    >:PPpPpP

    The short answer is: certainly.

    The longer answer is: this is *exactly* the whole point of this technology: to
    make heaps of s l o w but cheap (read: "commodity") drams look fast when
    viewed at the memory channel, in order to accommodate large memory capacities
    for server platforms (ie: I doubt you'll be seeing FBdimms on conventional
    desktop machines anytime soon).

    Like the similar schemes that have gone before this one, it sacrifices some
    latency at the transaction level for beaucoup bandwidth at the channel level.

    No doubt everyone will have their favorite benchmark to bang against this to
    see if the net effect is positive...

    /daytripper (Mine would use rather nasty strides ;-)
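
    A minimal version of such a strided-read benchmark (array size, strides and
    the clock() timing are arbitrary sketch choices; strides at or past the
    cache line size turn most reads into misses, exposing raw memory latency):

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    #define N (64 * 1024 * 1024)

    int main(void) {
        size_t stride, i;
        volatile char sink = 0;
        char *buf = malloc(N);
        if (!buf) return 1;
        for (i = 0; i < N; i++) buf[i] = (char)i;

        for (stride = 64; stride <= 4096; stride *= 2) {
            clock_t t0 = clock();
            for (i = 0; i < N; i += stride) sink += buf[i];
            printf("stride %4lu: %.3f s for %lu reads\n",
                   (unsigned long)stride,
                   (double)(clock() - t0) / CLOCKS_PER_SEC,
                   (unsigned long)(N / stride));
        }
        free(buf);
        return 0;
    }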
  23.

    On Tue, 20 Apr 2004 14:28:18 GMT, "Yousuf Khan"
    <news.tally.bbbl67@spamgourmet.com> wrote:

    >Wasn't there also some talk back in the early days of the K7 Athlon about
    >Micron coming out with an AMD chipset with a huge buffer built into its own
    >silicon. Micron went so far as to give it a cool codename, Samurai or Mamba
    >or something. But nothing else came of it after that.

    Hmm, I don't remember that much. I only remember for sure the part you
    forgot: it was Samurai :P

    --
    L.Angel: I'm looking for web design work.
    If you need basic to med complexity webpages at affordable rates, email me :)
    Standard HTML, SHTML, MySQL + PHP or ASP, Javascript.
    If you really want, FrontPage & DreamWeaver too.
    But keep in mind you pay extra bandwidth for their bloated code
  24.

    On Tue, 20 Apr 2004 14:28:18 GMT, "Yousuf Khan"
    <news.tally.bbbl67@spamgourmet.com> wrote:
    >"The little lost angel" <a?n?g?e?l@lovergirl.lrigrevol.moc.com> wrote in
    >message news:4084b2f1.41363671@news.pacific.net.sg...
    >> Does this also mean that I could in theory put a very fast say 1.6Ghz
    >> buffer on the FBDIMM and sell it as say DDR3-1.6Ghz because of that.
    >> Even though the actual ram chips are only capable of say 200Mhz?
    >
    >Wasn't there also some talk back in the early days of the K7 Athlon about
    >Micron coming out with an AMD chipset with a huge buffer built into its own
    >silicon. Micron went so far as to give it a cool codename, Samurai or Mamba
    >or something. But nothing else came of it after that.

    I believe they even built a prototype. Never made it to market
    though. Either way, the chipset in question just had an L3 cache (8MB
    of eDRAM if my memory serves) on the chipset, nothing really to do
    with the buffers in Fully Buffered DIMMs. Buffer != cache.

    -------------
    Tony Hill
    hilla <underscore> 20 <at> yahoo <dot> ca
  25.

    In article <A1zgc.114205$2oI1.47233@twister01.bloor.is.net.cable.rogers.com>,
    news.tally.bbbl67@spamgourmet.com says...
    > <geno_cyber@tin.it> wrote in message
    > news:u34580ltlccpd5p5e47mjv9j2c4lk4b4d9@4ax.com...
    > > A buffer is meant to reduce overall latency, not to increase it AFAIK.
    >
    > Not necessarily, a buffer is also meant to increase overall bandwidth, which
    > may be done at the expense of latency.

    Jeez Yousuf, a "buffer" may be used simply to increase drive
    (current, if you will). An INVERTER can be a "buffer" (though most
    buffers are non-inverting, to avoid confusion). Then again, there are
    unbuffered inverters (74xxU)... ;-)

    The point is that there are *many* uses of the term "buffer" and
    most have nothing to do with any kind of a "cache". A "cache"
    implies an addressed (usually with a directory) storage element.
    A "buffer" implies no such thing.

    --
    Keith
  26.

    In article <cnh78054mr3tgufn00l5rf6qcrdoekvapu@4ax.com>,
    chrisv@nospam.invalid says...
    > geno_cyber@tin.it wrote:
    >
    > >FB-DIMMs are supposed to work...
    >
    > Do you ever get it right, Geno? I don't think I've seen it...

    Bingo!

    --
    Keith
  27.

    In article <10883o5j7upgf5a@news.supernews.com>,
    mike_UNDERSCORE_smith@acm.DOT.org says...
    > RusH wrote:
    >
    > > geno_cyber@tin.it wrote :
    > >
    > >
    > >>FB-DIMMs are supposed to work with an added cheap CPU or DSP with
    > >>some fast RAM, I doubt embedded DRAM on-chip simply due to higher
    > >>costs but you never know how much they could make a product cheap
    > >>if they really want to and no expensive DSP or CPU is needed there
    > >>anyway for the FB-DIMM to work. I know how both caches and buffers
    > >>work (circular buffering, FIFO buffering and so on) and because
    > >>they're used to achieve similar results sometimes (like on DSPs
    > >>architectures where buffering is a key to performance with proper
    > >>assembly code...) , it's not that wrong to refer to a cache as a
    > >>buffer even if its mechanism it's quite different the goal it's
    > >>almost the same. The truth is that both ways of making bits data
    > >>faster to be retrieved are useful and a proper combination of
    > >>these techniques can achieve higher performance both at the
    > >>bandwidth and latency levels.
    > >
    > >
    > > cache is a form of a buffer
    > > buffer is not necesarly a cache, imagine one byte buffer, would you
    > > call it a cache ?
    >
    > Sure; you can think of it as a *really* small cache, which will
    > therefore have a terrible hit ratio, thus (most likely) increasing latency.

    Ok, how about a *Zero* byte buffer (a.k.a. amplifier)? Is that a
    cache?? You're wrong. A cache is a buffer of sorts, but the term
    "buffer" in no way implies a cache. It may simply be an
    amplifier, which I believe is the case here. 'tripper has his
    ear closer to this ground than anyone else here.

    --
    Keith
  28.

    In article <c5vbe0$fjk$1@nntp.webmaster.com>,
    davids@webmaster.com says...
    >
    > <geno_cyber@tin.it> wrote in message
    > news:lft5801qjivarf2mhfoiko04riq02srkp5@4ax.com...
    > > On Sun, 18 Apr 2004 17:37:36 GMT, "Yousuf Khan"
    > > <news.tally.bbbl67@spamgourmet.com> wrote:
    > >
    > >><geno_cyber@tin.it> wrote in message
    > >>news:u34580ltlccpd5p5e47mjv9j2c4lk4b4d9@4ax.com...
    > >>> A buffer is meant to reduce overall latency, not to increase it AFAIK.
    > >>
    > >>Not necessarily, a buffer is also meant to increase overall bandwidth,
    > >>which
    > >>may be done at the expense of latency.
    >
    > > Cache on CPU is not meant to increase bandwidth but to decrease overall
    > > latency to retrieve data
    > > from slower RAM.
    >
    > Yes, but not by making the RAM any faster, but by avoiding RAM accesses.
    > We add cache to the CPU because we admit our RAM is slow.

    We "admit"?? Hell, it's a known issue that RAM is *SLOW*. Caches
    are there to improve apparent latency, sure.
    >
    > > More cache-like buffers in the path thru the memory controller can only
    > > improve
    > > latency, unless there's some serious design flaws.
    >
    > That makes no sense. Everything between the CPU and the memory will
    > increase latency. Even caches increase worst case latency because some time
    > is spent searching the cache before we start the memory access. I think
    > you're confused.

    Not necessarily. Addresses can be broadcast to the entire memory
    hierarchy simultaneously. The first to answer wins. If it's
    cached, it's fast. If not, there is no penalty in asking the
    cache if it's there and being answered in the negative.
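
    A sketch of that look-aside timing argument with assumed numbers (nothing
    here comes from a real part):

    #include <stdio.h>

    int main(void) {
        double cache_answer_ns = 2.0;  /* assumed: hit/miss known this early */
        double mem_launch_ns   = 4.0;  /* assumed: request reaches the pins  */
        double mem_total_ns    = 60.0;
        int hit = 1;                   /* flip to 0 for the miss case */

        if (hit && cache_answer_ns < mem_launch_ns)
            printf("hit: data in %.1f ns; memory request squashed before launch\n",
                   cache_answer_ns);
        else
            printf("miss: data in %.1f ns; the lookup added nothing\n",
                   mem_total_ns);
        return 0;
    }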

    > > I never seen a CPU that gets slower in accessing data when it can cache
    > > and has a good hit/miss
    > > ratio.

    > Except that we're talking about memory latency due to buffers. And by
    > memory latency we mean the most time it will take between when we ask the
    > CPU to read a byte of memory and when we get that byte.

    Buffers <> caches. IIRC, the issue here was about buffers.

    --
    Keith
  29.

    "KR Williams" <krw@att.biz> wrote in message
    news:MPG.1af0f0784b9af0ad98976a@news1.news.adelphia.net...

    >> That makes no sense. Everything between the CPU and the memory will
    >> increase latency. Even caches increase worst case latency because some
    >> time
    >> is spent searching the cache before we start the memory access. I think
    >> you're confused.

    > No necessarily. Addresses can be broadcast to the entire memory
    > hierarchy simultaneously. The first to answer wins. If it's
    > cached, it's fast. If not, there is no penalty in asking the
    > cach if it's there and being answered in the negative.

    Consider two back-to-back addresses. We start broadcasting the first
    address on the memory bus but the cache answers first. Now we can't
    broadcast the second address onto the memory bus until we can quiesce the
    address bus from the first address, can we?

    >> > I never seen a CPU that gets slower in accessing data when it can cache
    >> > and has a good hit/miss
    >> > ratio.

    >> Except that we're talking about memory latency due to buffers. And by
    >> memory latency we mean the most time it will take between when we ask the
    >> CPU to read a byte of memory and when we get that byte.

    > Buffers <> caches. IIRC, the issue here was about buffers.

    Buffers must increase latency. Caches generally increase worst case
    latency; however, unless you have a pathological load, they should improve
    average latency.

    DS
  30.

    In article <rRNgc.195$eZ5.136@newsread1.news.pas.earthlink.net>,
    fmsfnf@jfoops.net says...
    > "Yousuf Khan" <news.tally.bbbl67@spamgourmet.com> wrote in message
    > news:A1zgc.114205$2oI1.47233@twister01.bloor.is.net.cable.rogers.com...
    > > <geno_cyber@tin.it> wrote in message
    > > news:u34580ltlccpd5p5e47mjv9j2c4lk4b4d9@4ax.com...
    > > > A buffer is meant to reduce overall latency, not to increase it AFAIK.
    > >
    > > Not necessarily, a buffer is also meant to increase overall bandwidth, which
    > > may be done at the expense of latency.
    >
    > This particular buffer reduces the DRAM interface pinout by a factor
    > of 3 for CPU chips having the memory interface on-chip (such as
    > Opteron, the late and unlamented Timna, and future Intel CPUs). This
    > reduces the cost of the CPU chip while increasing the cost of the DIMM
    > (because of the added buffer chip).
    >
    > And yes, the presence of the buffer does increase the latency.

    It may reduce it too! ;-) On-chip delay goes up with the square
    of the length of the wire. Adding a *buffer* in the wire drops
    this to 2x half the length squared (plus buffer delay). "Buffer"
    has many meanings. Me thinks CG doesn't "get it".
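
    The repeater arithmetic, spelled out with arbitrary constants (only the
    quadratic shape matters):

    #include <stdio.h>

    int main(void) {
        double k = 1.0;     /* arbitrary RC delay constant       */
        double len = 10.0;  /* wire length, arbitrary units      */
        double buf = 5.0;   /* the repeater's own delay, assumed */

        double unbuffered = k * len * len;                       /* ~k*L^2      */
        double repeated = 2.0 * k * (len / 2) * (len / 2) + buf; /* 2*k*(L/2)^2 */
        printf("unbuffered: %.0f, with one repeater: %.0f\n",
               unbuffered, repeated);
        return 0;
    }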
    >
    > There are other tradeoffs, the main one being the ability to add lots
    > more DRAM into a server. Not important for desktops. YMMV.

    In this specific instance, perhaps not. Memory is good though.
    More is better, and an upgrade path is also goodness. ...at
    least for the folks in this group. ;-)

    --
    Keith
  31.

    On Wed, 21 Apr 2004 19:42:35 -0700, "David Schwartz" <davids@webmaster.com>
    wrote:

    >
    >"KR Williams" <krw@att.biz> wrote in message
    >news:MPG.1af0f0784b9af0ad98976a@news1.news.adelphia.net...
    >
    >>> That makes no sense. Everything between the CPU and the memory will
    >>> increase latency. Even caches increase worst case latency because some
    >>> time
    >>> is spent searching the cache before we start the memory access. I think
    >>> you're confused.
    >
    >> No necessarily. Addresses can be broadcast to the entire memory
    >> hierarchy simultaneously. The first to answer wins. If it's
    >> cached, it's fast. If not, there is no penalty in asking the
    >> cach if it's there and being answered in the negative.
    >
    > Consider two back-to-back addresses. We start broadcasting the first
    >address on the memory bus but the cache answers first. Now we can't
    >broadcast the second address onto the memory bus until we can quiesce the
    >address bus from the first address, can we?

    The look-aside vs. look-through cache question. It depends... on all the
    relative timings. First, a cache does not have to be "searched" - from the
    lookup you can get a hit/miss answer in one cycle. Assuming a look-aside
    cache, if the memory requests are queued to the memory controller, there's
    the question of whether you can get a Burst Terminate command through to the
    memory chips past, or before, the 2nd memory access.

    >>> > I never seen a CPU that gets slower in accessing data when it can cache
    >>> > and has a good hit/miss
    >>> > ratio.
    >
    >>> Except that we're talking about memory latency due to buffers. And by
    >>> memory latency we mean the most time it will take between when we ask the
    >>> CPU to read a byte of memory and when we get that byte.
    >
    >> Buffers <> caches. IIRC, the issue here was about buffers.
    >
    > Buffers must increase latency. Caches generally increase worst case
    >latency; however, unless you have a pathological load, they should improve
    >average latency.

    Two points here. I don't think we're talking about data buffering - more
    like "electrical" buffering, as in registered modules. If you have 4 (or
    more) ranks of memory modules (per channel) operating at current speeds,
    you need the registering/buffering somewhere. It makes sense to move it
    closer to the channel interface of the collective DIMMs than to have it
    working independently on each DIMM. I'm not sure there's necessarily any
    increased latency for that situation.

    Rgds, George Macdonald

    "Just because they're paranoid doesn't mean you're not psychotic" - Who, me??
  32.

    "George Macdonald" <fammacd=!SPAM^nothanks@tellurian.com> wrote in message
    news:ii0f80dqbek4nhtiiduk9fpk3308l051aa@4ax.com...
    > On Wed, 21 Apr 2004 19:42:35 -0700, "David Schwartz"
    > <davids@webmaster.com>
    > wrote:

    >>> No necessarily. Addresses can be broadcast to the entire memory
    >>> hierarchy simultaneously. The first to answer wins. If it's
    >>> cached, it's fast. If not, there is no penalty in asking the
    >>> cach if it's there and being answered in the negative.

    >> Consider two back-to-back addresses. We start broadcasting the first
    >>address on the memory bus but the cache answers first. Now we can't
    >>broadcast the second address onto the memory bus until we can quiesce the
    >>address bus from the first address, can we?

    > The look aside vs. look through cache. It depends... on all the relative
    > timings. First a cache does not have to be "searched" - from the lookup
    > you can get a hit/miss answer in one cycle.

    That would still be a one cycle delay while the cache was searched,
    whether or not you found anything in it.

    > Assuming look aside cache, if
    > the memory requests are queued to the memory controller, there's the
    > question of whether you can get a Burst Terminate command through to the
    > memory chips past, or before, the 2nd memory access.

    Even so, it takes some time to terminate the burst.

    >>>> > I never seen a CPU that gets slower in accessing data when it can
    >>>> > cache
    >>>> > and has a good hit/miss
    >>>> > ratio.

    >>>> Except that we're talking about memory latency due to buffers. And
    >>>> by
    >>>> memory latency we mean the most time it will take between when we ask
    >>>> the
    >>>> CPU to read a byte of memory and when we get that byte.

    >>> Buffers <> caches. IIRC, the issue here was about buffers.

    >> Buffers must increase latency. Caches generally increase worst case
    >>latency; however, unless you have a pathological load, they should improve
    >>average latency.

    > Two points here. I don't think we're talking about data buffering - more
    > like "electrical" buffering, as in registered modules.

    No difference. There is not a data buffer in the world whose output
    transitions before or at the same time as its input. They all add some delay
    to the signals.

    > If you have 4 (or
    > more) ranks of memory modules (per channel) operating at current speeds,
    > you need the registering/buffering somewhere. It makes sense to move it
    > closer to the channel interface of the collective DIMMs than to have it
    > working independently on each DIMM. I'm not sure there's necessarily any
    > increased latency for that situation.

    If you have so many modules per channel that you need buffering, then
    you suffer a buffering penalty. That's my point. Whether that means you need
    faster memory chips to keep the same cycle speed or you cycle more slowly,
    you have a buffering delay.

    I'm really not saying anything controversial. Buffers and caches
    increase latency, at least in the worst case access.

    DS
  33.

    "George Macdonald" <fammacd=!SPAM^nothanks@tellurian.com> wrote in
    message news:ii0f80dqbek4nhtiiduk9fpk3308l051aa@4ax.com...
    >
    > Two points here. I don't think we're talking about data buffering - more
    > like "electrical" buffering, as in registered modules. If you have 4 (or
    > more) ranks of memory modules (per channel) operating at current speeds,
    > you need the registering/buffering somewhere. It makes sense to move it
    > closer to the channel interface of the collective DIMMs than to have it
    > working independently on each DIMM. I'm not sure there's necessarily any
    > increased latency for that situation.

    I think the "increased latency" is with respect to the usual (in PCs)
    one or two unbuffered DIMMs. In this case, the FB-DIMMs do indeed
    have a greater latency.

    Keep in mind that there will really be no choice once you've bought
    your mobo. The CPU socket will either be for a CPU to use traditional
    DIMMs or (with 66% fewer memory pins) to use FB-DIMMs. You will never
    ever stand there with both types of memory modules in hand and have to
    decide which to plug in.
  34.

    "Felger Carbon" <fmsfnf@jfoops.net> wrote:

    >> you need the registering/buffering somewhere. It makes sense to
    >move it
    >> closer to the channel interface of the collective DIMMs than to have
    >it
    >> working independently on each DIMM. I'm not sure there's
    >necessarily any
    >> increased latency for that situation.
    >
    >I think the "increased latency" is with respect to the usual (in PCs)
    >one or two unbuffered DIMMs. In this case, the FB-DIMMs do indeed
    >have a greater latency.
    >
    >Keep in mind that there will really be no choice once you've bought
    >your mobo. The CPU socket will either be for a CPU to use traditional
    >DIMMs or (with 66% fewer memory pins) to use FB-DIMMs. You will never
    >ever stand there with both types of memory modules in hand and have to
    >decide which to plug in.

    "OE quotefix", dude.
  35.

    In article <c67beq$656$1@nntp.webmaster.com>,
    davids@webmaster.com says...
    >
    > "KR Williams" <krw@att.biz> wrote in message
    > news:MPG.1af0f0784b9af0ad98976a@news1.news.adelphia.net...
    >
    > >> That makes no sense. Everything between the CPU and the memory will
    > >> increase latency. Even caches increase worst case latency because some
    > >> time
    > >> is spent searching the cache before we start the memory access. I think
    > >> you're confused.
    >
    > > No necessarily. Addresses can be broadcast to the entire memory
    > > hierarchy simultaneously. The first to answer wins. If it's
    > > cached, it's fast. If not, there is no penalty in asking the
    > > cach if it's there and being answered in the negative.
    >
    > Consider two back-to-back addresses. We start broadcasting the first
    > address on the memory bus but the cache answers first. Now we can't
    > broadcast the second address onto the memory bus until we can quiesce the
    > address bus from the first address, can we?

    You're assuming the time to access the caches is a significant
    fraction of the time required to access main memory. It's
    certainly not. Cache results are known *long* before the address
    is broadcast to the mass memory. By the time the memory request
    gets near the chip's I/O, the caches know whether they can deliver
    the data. If so, the memory request is killed. There is no
    additional latency here.
    >
    > >> > I never seen a CPU that gets slower in accessing data when it can cache
    > >> > and has a good hit/miss
    > >> > ratio.
    >
    > >> Except that we're talking about memory latency due to buffers. And by
    > >> memory latency we mean the most time it will take between when we ask the
    > >> CPU to read a byte of memory and when we get that byte.
    >
    > > Buffers <> caches. IIRC, the issue here was about buffers.
    >
    > Buffers must increase latency.

    Ok. You're right. Except that things don't work without
    buffers. Does that mean they increase latency, or does it mean
    that they allow things to *work*?

    > Caches generally increase worst case
    > latency; however, unless you have a pathological load, they should improve
    > average latency.

    Again, BUFFERS <> CACHES! A buffer can be a simple amplifier
    (thus no storage element at all). It's naive to say that a
    buffer increases latency (particularly since many here don't seem
    to understand the term).

    --
    Keith
  36.

    In article <ii0f80dqbek4nhtiiduk9fpk3308l051aa@4ax.com>,
    fammacd=!SPAM^nothanks@tellurian.com says...
    > On Wed, 21 Apr 2004 19:42:35 -0700, "David Schwartz" <davids@webmaster.com>
    > wrote:
    >
    > >
    > >"KR Williams" <krw@att.biz> wrote in message
    > >news:MPG.1af0f0784b9af0ad98976a@news1.news.adelphia.net...
    > >
    > >>> That makes no sense. Everything between the CPU and the memory will
    > >>> increase latency. Even caches increase worst case latency because some
    > >>> time
    > >>> is spent searching the cache before we start the memory access. I think
    > >>> you're confused.
    > >
    > >> No necessarily. Addresses can be broadcast to the entire memory
    > >> hierarchy simultaneously. The first to answer wins. If it's
    > >> cached, it's fast. If not, there is no penalty in asking the
    > >> cach if it's there and being answered in the negative.
    > >
    > > Consider two back-to-back addresses. We start broadcasting the first
    > >address on the memory bus but the cache answers first. Now we can't
    > >broadcast the second address onto the memory bus until we can quiesce the
    > >address bus from the first address, can we?
    >
    > The look aside vs. look through cache. It depends... on all the relative
    > timings. First a cache does not have to be "searched" - from the lookup
    > you can get a hit/miss answer in one cycle. Assuming look aside cache, if
    > the memory requests are queued to the memory controller, there's the
    > question of whether you can get a Burst Terminate command through to the
    > memory chips past, or before, the 2nd memory access.

    Sure. The cache is "searched" in less time than it takes the request
    to reach the I/O. If it's satisfied by the caches, the storage
    request can be canceled with no overhead. If not, the storage
    request is allowed to continue.
    >
    > >>> > I never seen a CPU that gets slower in accessing data when it can cache
    > >>> > and has a good hit/miss
    > >>> > ratio.
    > >
    > >>> Except that we're talking about memory latency due to buffers. And by
    > >>> memory latency we mean the most time it will take between when we ask the
    > >>> CPU to read a byte of memory and when we get that byte.
    > >
    > >> Buffers <> caches. IIRC, the issue here was about buffers.
    > >
    > > Buffers must increase latency. Caches generally increase worst case
    > >latency; however, unless you have a pathological load, they should improve
    > >average latency.
    >
    > Two points here. I don't think we're talking about data buffering - more
    > like "electrical" buffering, as in registered modules.

    Bingo! ...though I thought this was clear.

    > If you have 4 (or
    > more) ranks of memory modules (per channel) operating at current speeds,
    > you need the registering/buffering somewhere. It makes sense to move it
    > closer to the channel interface of the collective DIMMs than to have it
    > working independently on each DIMM. I'm not sure there's necessarily any
    > increased latency for that situation.

    I'm not either. It works one way, and not the other. Does that
    mean the way it *works* is slower?

    --
    Keith
  37.

    In article <c68l3m$s6i$1@nntp.webmaster.com>,
    davids@webmaster.com says...
    >
    > "George Macdonald" <fammacd=!SPAM^nothanks@tellurian.com> wrote in message
    > news:ii0f80dqbek4nhtiiduk9fpk3308l051aa@4ax.com...
    > > On Wed, 21 Apr 2004 19:42:35 -0700, "David Schwartz"
    > > <davids@webmaster.com>
    > > wrote:
    >
    > >>> No necessarily. Addresses can be broadcast to the entire memory
    > >>> hierarchy simultaneously. The first to answer wins. If it's
    > >>> cached, it's fast. If not, there is no penalty in asking the
    > >>> cach if it's there and being answered in the negative.
    >
    > >> Consider two back-to-back addresses. We start broadcasting the first
    > >>address on the memory bus but the cache answers first. Now we can't
    > >>broadcast the second address onto the memory bus until we can quiesce the
    > >>address bus from the first address, can we?
    >
    > > The look aside vs. look through cache. It depends... on all the relative
    > > timings. First a cache does not have to be "searched" - from the lookup
    > > you can get a hit/miss answer in one cycle.
    >
    > That would still be a one cycle delay while the cache was searched,
    > whether or not you found anything in it.

    Oh, NO! The cache is *not* "searched". The answer is yes/no and
    that answer is quick. In addition, the request can be sent in
    parallel to the next level of the hierarchy and canceled if satisfied
    at a lower level. The load/store queues must be coherent for
    other reasons; this is a minor architectural complication.

    > > Assuming look aside cache, if
    > > the memory requests are queued to the memory controller, there's the
    > > question of whether you can get a Burst Terminate command through to the
    > > memory chips past, or before, the 2nd memory access.
    >
    > Even so, it takes some time to terminate the burst.

    The burst hasn't even started. Sheesh!
    >
    > >>>> > I never seen a CPU that gets slower in accessing data when it can
    > >>>> > cache
    > >>>> > and has a good hit/miss
    > >>>> > ratio.
    >
    > >>>> Except that we're talking about memory latency due to buffers. And
    > >>>> by
    > >>>> memory latency we mean the most time it will take between when we ask
    > >>>> the
    > >>>> CPU to read a byte of memory and when we get that byte.
    >
    > >>> Buffers <> caches. IIRC, the issue here was about buffers.
    >
    > >> Buffers must increase latency. Caches generally increase worst case
    > >>latency; however, unless you have a pathological load, they should improve
    > >>average latency.
    >
    > > Two points here. I don't think we're talking about data buffering - more
    > > like "electrical" buffering, as in registered modules.
    >
    > No difference. There is not a data buffer in the world whose output
    > transitions before or at the same time as its input. They all add some delay
    > to the signals.

    Sure. If the signals don't get there they're hardly useful
    though.
    >
    > > If you have 4 (or
    > > more) ranks of memory modules (per channel) operating at current speeds,
    > > you need the registering/buffering somewhere. It makes sense to move it
    > > closer to the channel interface of the collective DIMMs than to have it
    > > working independently on each DIMM. I'm not sure there's necessarily any
    > > increased latency for that situation.
    >
    > If you have so many modules per channel that you need buffering, then
    > you suffer a buffering penalty. That's my point. Whether that means you need
    > faster memory chips to keep the same cycle speed or you cycle more slowly,
    > you have a buffering delay.
    >
    > I'm really not saying anything controversial. Buffers and caches
    > increase latency, at least in the worst case access.

    Certainly anything that adds latency, adds latency (duh!), but
    you're arguing that buffers == caches *and* that caches increase
    latency. This is just not so!

    --
    Keith
  38.

    On Thu, 22 Apr 2004 07:33:16 -0700, "David Schwartz" <davids@webmaster.com>
    wrote:

    >
    >"George Macdonald" <fammacd=!SPAM^nothanks@tellurian.com> wrote in message
    >news:ii0f80dqbek4nhtiiduk9fpk3308l051aa@4ax.com...
    >> On Wed, 21 Apr 2004 19:42:35 -0700, "David Schwartz"
    >> <davids@webmaster.com>
    >> wrote:
    >
    >>>> No necessarily. Addresses can be broadcast to the entire memory
    >>>> hierarchy simultaneously. The first to answer wins. If it's
    >>>> cached, it's fast. If not, there is no penalty in asking the
    >>>> cach if it's there and being answered in the negative.
    >
    >>> Consider two back-to-back addresses. We start broadcasting the first
    >>>address on the memory bus but the cache answers first. Now we can't
    >>>broadcast the second address onto the memory bus until we can quiesce the
    >>>address bus from the first address, can we?
    >
    >> The look aside vs. look through cache. It depends... on all the relative
    >> timings. First a cache does not have to be "searched" - from the lookup
    >> you can get a hit/miss answer in one cycle.
    >
    > That would still be a one cycle delay while the cache was searched,
    >whether or not you found anything in it.

    On current CPUs, a memory channel cycle is 10-15 or so cache cycles - get
    things aligned right, call it a coupla cache clocks, and there's no need to
    shove the address on the memory bus (AMD) or FSB (Intel). Accurate info is
    elusive on this kind of thing now but I believe that look-aside caches are
    just considered unnecessary now.
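
    Plugging in assumed 2004-era clocks shows where the "10-15 or so" comes
    from:

    #include <stdio.h>

    int main(void) {
        double core_mhz = 3200.0;  /* assumed P4-class core; cache at core clock */
        double mem_mhz  = 200.0;   /* DDR-400 command clock */
        printf("~%.0f cache cycles per memory clock\n", core_mhz / mem_mhz);
        return 0;
    }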

    >> Assuming look aside cache, if
    >> the memory requests are queued to the memory controller, there's the
    >> question of whether you can get a Burst Terminate command through to the
    >> memory chips past, or before, the 2nd memory access.
    >
    > Even so, it takes some time to terminate the burst.

    I doubt that it's going to need to be terminated - IOW the 1st cache
    hit/miss result (not necessarily the cache data) should be available before
    the memory address has passed out of the CPU.

    >>> Buffers must increase latency. Caches generally increase worst case
    >>>latency; however, unless you have a pathological load, they should improve
    >>>average latency.
    >
    >> Two points here. I don't think we're talking about data buffering - more
    >> like "electrical" buffering, as in registered modules.
    >
    > No difference.

    D'oh!

    > There is not a data buffer in the world whose output
    >transitions before or at the same time as its input. They all add some delay
    >to the signals.

    But it's not data that's being buffered - it's simply a (near)zero-gain
    amplifier to keep all the modules talking in unison.

    >> If you have 4 (or
    >> more) ranks of memory modules (per channel) operating at current speeds,
    >> you need the registering/buffering somewhere. It makes sense to move it
    >> closer to the channel interface of the collective DIMMs than to have it
    >> working independently on each DIMM. I'm not sure there's necessarily any
    >> increased latency for that situation.
    >
    > If you have so many modules per channel that you need buffering, then
    >you suffer a buffering penalty. That's my point. Whether that means you need
    >faster memory chips to keep the same cycle speed or you cycle more slowly,
    >you have a buffering delay.

    That's what the damned thing is for - large memory systems. Currently you
    put registered DIMMs, with their latency penalty, in such a system and even
    there you run into problems with the multi-drop memory channel of DDR.
    What I'm saying is that the buffering of FB-DIMMs is not necessarily any
    worse and you get the DIMMs to "talk" consistently to the channel.

    > I'm really not saying anything controversial. Buffers and caches
    >increase latency, at least in the worst case access.

    You seem to be stuck on data buffers!

    Rgds, George Macdonald

    "Just because they're paranoid doesn't mean you're not psychotic" - Who, me??
  39. Archived from groups: comp.sys.ibm.pc.hardware.chips,comp.sys.intel (More info?)

    On Thu, 22 Apr 2004 18:29:08 GMT, "Felger Carbon" <fmsfnf@jfoops.net>
    wrote:

    >"George Macdonald" <fammacd=!SPAM^nothanks@tellurian.com> wrote in
    >message news:ii0f80dqbek4nhtiiduk9fpk3308l051aa@4ax.com...
    >>
    >> Two points here. I don't think we're talking about data buffering -
    >more
    >> like "electrical" buffering, as in registered modules. If you have
    >4 (or
    >> more) ranks of memory modules (per channel) operating at current
    >speeds,
    >> you need the registering/buffering somewhere. It makes sense to
    >move it
    >> closer to the channel interface of the collective DIMMs than to have
    >it
    >> working independently on each DIMM. I'm not sure there's
    >necessarily any
    >> increased latency for that situation.
    >
    >I think the "increased latency" is with respect to the usual (in PCs)
    >one or two unbuffered DIMMs. In this case, the FB-DIMMs do indeed
    >have a greater latency.

    Sure but compared with registering independently on every DIMM and hoping
    that they all talk on the same edges... or close enough so that it works??
    It's not clear to me if, in a large memory system, say 8 ranks per channel,
    accesses to the farthest DIMMs are going to have extra cycles of latency
    added, but if the clock frequency can be jacked up significantly, does it
    matter much?
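
    One way to put rough numbers on it -- every figure below is an assumption
    for illustration, since per-buffer hop delays for FB-DIMM hadn't been
    published -- is to charge each buffer in the daisy chain a couple of
    channel clocks in each direction and see what the farthest rank pays:

        #include <stdio.h>

        /* Assumed figures for a daisy-chained channel; illustration only. */
        int main(void)
        {
            const double chained_mhz    = 800.0; /* assumed channel clock  */
            const double hop_clocks     = 2.0;   /* per buffer, each way   */
            const double core_access_ns = 45.0;  /* the DRAM array itself  */

            for (int rank = 1; rank <= 8; rank++) {
                /* a round trip passes through (rank - 1) buffers twice */
                double hop_ns = 2.0 * (rank - 1) * hop_clocks
                                / chained_mhz * 1e3;
                printf("rank %d: +%4.1f ns of hop delay on a %.0f ns "
                       "core access\n", rank, hop_ns, core_access_ns);
            }
            return 0;
        }

    On those made-up numbers the eighth rank pays about 35 ns extra round
    trip - not nothing against a ~45 ns core access - so the answer really
    does hinge on how small the per-buffer hop is and how high the chained
    clock can go.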

    >Keep in mind that there will really be no choice once you've bought
    >your mobo. The CPU socket will either be for a CPU to use traditional
    >DIMMs or (with 66% fewer memory pins) to use FB-DIMMs. You will never
    >ever stand there with both types of memory modules in hand and have to
    >decide which to plug in.

    CPU socket? Oh we're talking AMD as the "standard" now? :-) Daytripper
    mentioned he's not sure the FB-DIMM is going to make it to the desktop
    anyway. Makes me wonder how the pricing is going to fall out with the
    market fragmentation - to date we've all -- desktop through to server --
    benefited from that model. Could get awkward for CPU on-die memory
    controllers too if we need to have different CPUs according to the
    memory type.

    Rgds, George Macdonald

    "Just because they're paranoid doesn't mean you're not psychotic" - Who, me??
  40. Archived from groups: comp.sys.ibm.pc.hardware.chips,comp.sys.intel (More info?)

    "chrisv" <chrisv@nospam.invalid> wrote in message
    news:8t8g801v6krtjm5q7kc7gl8jn5dtrn4oen@4ax.com...
    > "Felger Carbon" <fmsfnf@jfoops.net> wrote:
    >
    > >I think the "increased latency" is with respect to the usual (in
    PCs)
    > >one or two unbuffered DIMMs. In this case, the FB-DIMMs do indeed
    > >have a greater latency.
    > >
    > >Keep in mind that there will really be no choice once you've bought
    > >your mobo. The CPU socket will either be for a CPU to use
    traditional
    > >DIMMs or (with 66% fewer memory pins) to use FB-DIMMs. You will
    never
    > >ever stand there with both types of memory modules in hand and have
    to
    > >decide which to plug in.
    >
    > "OE quotefix", dude.

    "Huh?" asks Felger, who is easily puzzled. ;-)
  41. Archived from groups: comp.sys.ibm.pc.hardware.chips,comp.sys.intel (More info?)

    "George Macdonald" <fammacd=!SPAM^nothanks@tellurian.com> wrote in message
    news:eujh80h59vq3ho5pdl4sr8uelpfrrcak3e@4ax.com...
    > CPU socket? Oh we're talking AMD as the "standard" now? :-) Daytripper
    > mentioned he's not sure the FB-DIMM is going to make it to the desktop
    > anyway. Makes me wonder how the pricing is going to fall out with the
    > market fragmentation - to date we've all -- desktop through to server --
    > benefited from that model. Could get awkward for CPU on-die
    > memory controllers too if we need to have different CPUs according to the
    > memory type.

    Oh, I think it's all much ado. We'll keep the desktop to server DRAM
    interface commonality for a long time. After all, how many new types of DRAM
    come out in a given amount of time? I'd say a new standard every 3 to 5
    years? Hardly a breakneck frequency. People will continue to design new DRAM
    controllers based on upcoming standards, and they will also put backwards
    compatibility into these controllers for previous generations of DRAM.

    Yousuf Khan
  42. Archived from groups: comp.sys.ibm.pc.hardware.chips,comp.sys.intel (More info?)

    "George Macdonald" <fammacd=!SPAM^nothanks@tellurian.com> wrote in
    message news:eujh80h59vq3ho5pdl4sr8uelpfrrcak3e@4ax.com...
    >
    > Daytripper
    > mentioned he's not sure the FB-DIMM is going to make it to the desktop
    > anyway. Makes me wonder how the pricing is going to fall out with the
    > market fragmentation - to date we've all -- desktop through to server --
    > benefited from that model. Could get awkward for CPU on-die
    > memory controllers too if we need to have different CPUs according to the
    > memory type.

    Intel may see this as an opportunity to increase the ASPs on "Xeon"
    CPUs - in other words, on CPUs for servers. I think it'll be a cold
    day in hell before this shows up on personal desktops.
  43. Archived from groups: comp.sys.ibm.pc.hardware.chips,comp.sys.intel (More info?)

    On Fri, 23 Apr 2004 18:28:36 GMT, "Yousuf Khan"
    <news.tally.bbbl67@spamgourmet.com> wrote:

    >"George Macdonald" <fammacd=!SPAM^nothanks@tellurian.com> wrote in message
    >news:eujh80h59vq3ho5pdl4sr8uelpfrrcak3e@4ax.com...
    >> CPU socket? Oh we're talking AMD as the "standard" now? :-) Daytripper
    >> mentioned he's not sure the FB-DIMM is going to make it to the desktop
    >> anyway. Makes me wonder how the pricing is going to fall out with the
    >> market fragmentation - to date we've all -- desktop through to server --
    >> benefited from that model. Could get awkward for CPU on-die
    >> memory controllers too if we need to have different CPUs according to the
    >> memory type.
    >
    >Oh, I think it's all much ado. We'll keep the desktop to server DRAM
    >interface commonality for a long time. After all, how many new types of DRAM
    >come out in a given amount of time? I'd say a new standard every 3 to 5
    >years? Hardly a breakneck frequency. People will continue to design new DRAM
    >controllers based on upcoming standards, and they will also put backwards
    >compatibility into these controllers for previous generations of DRAM.

    On the "frequency", od standards, we had a close call with DRDRAM. Was it
    a "standard" or not?... it came close at least. I don't see how backwards
    compatibility is something they can even think of - different signalling is
    just different.

    Rgds, George Macdonald

    "Just because they're paranoid doesn't mean you're not psychotic" - Who, me??
  44. Archived from groups: comp.sys.ibm.pc.hardware.chips,comp.sys.intel (More info?)

    "George Macdonald" <fammacd=!SPAM^nothanks@tellurian.com> wrote in message
    news:rnhl80lp7m98o5vofunh3orlpnclecoea1@4ax.com...
    > >Oh, I think it's all much ado. We'll keep the desktop to server DRAM
    > >interface commonality for a long time. After all, how many new types of DRAM
    > >come out in a given amount of time? I'd say a new standard every 3 to 5
    > >years? Hardly a breakneck frequency. People will continue to design new DRAM
    > >controllers based on upcoming standards, and they will also put backwards
    > >compatibility into these controllers for previous generations of DRAM.
    >
    > On the "frequency" of standards, we had a close call with DRDRAM. Was it
    > a "standard" or not?... it came close at least. I don't see how backwards
    > compatibility is something they can even think of - different signalling is
    > just different.

    Well, they've had chipsets in the past which implemented compatibility with
    both EDO and SDR rams. Then later we had chipsets which did both SDR and
    DDR1 compatibility. Why should it be difficult to put dual DDR1 and DDR2
    capabilities? It detects which type of ram it's connected to and switches to
    the circuitry for that particular type of RAM.
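
    The detection half of that is mundane: every DIMM carries a little SPD
    EEPROM, and byte 2 of the SPD encodes the fundamental memory type per
    JEDEC. Here's a sketch of the idea in C -- spd_read_byte() is a stub
    standing in for whatever SMBus access a real BIOS would use, and the
    strings are placeholders, not any vendor's actual init code:

        #include <stdint.h>
        #include <stdio.h>

        /* Stub for the platform's SMBus SPD read; returns a canned DDR
           type byte purely for illustration. */
        static uint8_t spd_read_byte(int slot, int offset)
        {
            (void)slot;
            return (offset == 2) ? 0x07 : 0x00;
        }

        /* JEDEC SPD byte 2: fundamental memory type. */
        enum { SPD_TYPE_OFFSET = 2,
               SDR_SDRAM = 0x04, DDR_SDRAM = 0x07, DDR2_SDRAM = 0x08 };

        static const char *select_memory_path(int slot)
        {
            switch (spd_read_byte(slot, SPD_TYPE_OFFSET)) {
            case SDR_SDRAM:  return "program the SDR front-end";
            case DDR_SDRAM:  return "program the DDR1 front-end";
            case DDR2_SDRAM: return "program the DDR2 front-end";
            default:         return "unknown type - leave channel disabled";
            }
        }

        int main(void)
        {
            printf("slot 0: %s\n", select_memory_path(0));
            return 0;
        }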

    Yousuf Khan
  45. Archived from groups: comp.sys.ibm.pc.hardware.chips,comp.sys.intel (More info?)

    On Sun, 25 Apr 2004 19:53:01 GMT, "Yousuf Khan"
    <news.20.bbbl67@spamgourmet.com> wrote:
    >Well, they've had chipsets in the past which implemented compatibility with
    >both EDO and SDR rams. Then later we had chipsets which did both SDR and
    >DDR1 compatibility. Why should it be difficult to put dual DDR1 and DDR2
    >capabilities? It detects which type of ram it's connected to and switches to
    >the circuitry for that particular type of RAM.

    Different voltage swings, different (IO) voltages, and I don't think DDR1 used
    ODT. And then you have the whole dimm socket keying issue.

    In a heavily commoditized market it'd unnecessarily drive up the chipset and
    platform implementation costs to accommodate both technologies with a single
    solution...

    /daytripper
  46. Archived from groups: comp.sys.ibm.pc.hardware.chips,comp.sys.intel (More info?)

    In article <Xaaic.8576$e4.3166@newsread2.news.pas.earthlink.net>,
    "Felger Carbon" <fmsfnf@jfoops.net> wrote:
    | "Huh?" asks Felger, who is easily puzzled. ;-)

    For the love of all that is good and holy, immediately download
    and install this:

    http://home.in.tum.de/~jain/software/oe-quotefix/

    before you mangle another quote on Usenet again!

    I believe that is what chrisv meant.
  47. Archived from groups: comp.sys.ibm.pc.hardware.chips,comp.sys.intel (More info?)

    ihoc-a-attbi-d-com wrote:
    > In article <Xaaic.8576$e4.3166@newsread2.news.pas.earthlink.net>,
    > "Felger Carbon" <fmsfnf@jfoops.net> wrote:
    >> "Huh?" asks Felger, who is easily puzzled. ;-)
    >
    > For the love of all that is good and holy, immediately download
    > and install this:
    >
    > http://home.in.tum.de/~jain/software/oe-quotefix/
    >
    > before you mangle another quote on Usenet again!
    >
    > I believe that is what chrisv meant.

    Working okay, so far. Let's try this on some really complicated quotes. :-)

    Yousuf Khan
  48. Archived from groups: comp.sys.ibm.pc.hardware.chips,comp.sys.intel (More info?)

    On Sun, 25 Apr 2004 19:53:01 GMT, "Yousuf Khan"
    <news.20.bbbl67@spamgourmet.com> wrote:

    >"George Macdonald" <fammacd=!SPAM^nothanks@tellurian.com> wrote in message
    >news:rnhl80lp7m98o5vofunh3orlpnclecoea1@4ax.com...
    >> >Oh, I think it's all much ado. We'll keep the desktop to server DRAM
    >> >interface commonality for a long time. After all, how many new types of DRAM
    >> >come out in a given amount of time? I'd say a new standard every 3 to 5
    >> >years? Hardly a breakneck frequency. People will continue to design new DRAM
    >> >controllers based on upcoming standards, and they will also put backwards
    >> >compatibility into these controllers for previous generations of DRAM.
    >>
    >> On the "frequency" of standards, we had a close call with DRDRAM. Was it
    >> a "standard" or not?... it came close at least. I don't see how backwards
    >> compatibility is something they can even think of - different signalling is
    >> just different.
    >
    >Well, they've had chipsets in the past which implemented compatibility with
    >both EDO and SDR rams. Then later we had chipsets which did both SDR and
    >DDR1 compatibility. Why should it be difficult to put dual DDR1 and DDR2
    >capabilities? It detects which type of ram it's connected to and switches to
    >the circuitry for that particular type of RAM.

    Obviously it depends on how big a jump there is between the technology of
    the two memory channels - SDRAM and DDR-SDRAM are not too far apart in
    terms of signalling - a few extra pins for source synch clocking and a few
    others which were used slightly differently. OTOH we never saw a dual
    DRDRAM and SDRAM chipset - too different... would certainly require
    independent pins. I'm not up on the details of FB-DIMM interfacing but I'd
    think it'd be different enough.

    Rgds, George Macdonald

    "Just because they're paranoid doesn't mean you're not psychotic" - Who, me??
  49. Archived from groups: comp.sys.ibm.pc.hardware.chips,comp.sys.intel (More info?)

    George Macdonald wrote:
    >> Well, they've had chipsets in the past which implemented
    >> compatibility with both EDO and SDR rams. Then later we had chipsets
    >> which did both SDR and DDR1 compatibility. Why should it be
    >> difficult to put dual DDR1 and DDR2 capabilities? It detects which
    >> type of ram it's connected to and switches to the circuitry for that
    >> particular type of RAM.
    >
    > Obviously it depends on how big a jump there is between the technology of
    > the two memory channels - SDRAM and DDR-SDRAM are not too far apart in
    > terms of signalling - a few extra pins for source synch clocking and
    > a few others which were used slightly differently. OTOH we never saw
    > a dual DRDRAM and SDRAM chipset - too different... would certainly
    > require independent pins. I'm not up on the details of FB-DIMM
    > interfacing but I'd think it'd be different enough.

    Well, there were different voltages for SDR and DDR, so it wasn't exactly
    the simple jump from SDR to DDR that you describe. Plus you needed
    different sockets for each type. And in most cases you couldn't use both
    types of RAM at the same time because of the voltage issue.

    As for SDR and RDR together, must I remind you of the infamous Intel MTH?
    Okay, I didn't say that it had to be a successful chipset (or even a good
    chipset), but you did see the capability of using either type of memory at
    one point in time. :-)

    Yousuf Khan