xcom_cheetah

Distinguished
Oct 24, 2001
71
0
18,630
from the inquirer
our own sources have also done some memory benchmarking on the upcoming Intel 7200 (Granite Bay) dual-channel PC2100 DDR chipset for the Pentium 4 and, guess what, its initial bandwidth results are not any better than the i850E with dual-channel PC1066.

remember that debate..??
 

eden

Champion
Yeah and see, that was PC 2100.
2.1GB x 2 equals what? The same exact thing as Rambus.

Part of the reason it didn't do better at the same bandwidth is that RDRAM latency is now fairly similar. I didn't think we were going to use PC2100 for dual-channel anyway. I think it's DDR333 or DDR400 at CAS2 that would definitely kick ass.
The current results from the Inquirer are no surprise, and I just explained why.
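
For anyone who wants to check the "2.1GB x 2" arithmetic, here is a rough sketch. It assumes the nominal ratings only (DDR on a 64-bit channel, RDRAM on a 16-bit channel) and says nothing about measured results; the function name is made up for illustration.

```python
# Peak theoretical bandwidth = transfers per second * bytes per transfer * channels.
# The figures are nominal module ratings, not benchmark results.

def peak_bandwidth_gb_s(mega_transfers_per_s, bus_width_bits, channels=1):
    """Return peak bandwidth in GB/s, taking 1 GB as 10**9 bytes."""
    bytes_per_transfer = bus_width_bits / 8
    return mega_transfers_per_s * 1e6 * bytes_per_transfer * channels / 1e9

# (transfer rate in MT/s, channel width in bits, number of channels)
configs = {
    "dual PC2100 DDR": (266, 64, 2),
    "dual PC1066 RDRAM": (1066, 16, 2),
    "dual PC2700 DDR": (333, 64, 2),
}

for name, (rate, width, channels) in configs.items():
    print(f"{name}: {peak_bandwidth_gb_s(rate, width, channels):.2f} GB/s")
```

On paper the first two land within a rounding error of each other, which is the point being made above.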

--
Meow
 

Matisaro

Splendid
Mar 23, 2001
6,737
0
25,780
I actually don't remember the debate, but 2x PC2100 is the same as PC1066 RDRAM, and since the P4 is designed to minimise the impact of latency, of course dual-channel DDR will perform no better, or only slightly better, than dual-channel RDRAM.

HOWEVER, that having been said, from a sterile, non-platform-specific view, dual-channel DDR has much better latency figures than RDRAM, and on a platform that is not specifically designed to hide latency (as the P4 is, because it was designed for high-latency memory systems), dual DDR would beat out competing dual RDRAM systems.

Also, price is still a factor, as IIRC a stick of PC2100 is significantly cheaper than a stick of PC1066.

I have always said RDRAM was the better memory for the P4; my stance for DDR is about the RAM technology itself.

I feel dual-channel PC2700 would be easier to implement and better performing than dual-channel PC1200, with the caveat that dual DDR would require more trace space and slightly more expensive motherboards.

However, the price disparity between those RAM technologies (as it stands today) would more than make up for the additional motherboard cost.

Latency is a critical factor in memory performance. When your system is designed to stream data, thus hiding latency (as in the P4 with its prefetch etc.), it is not as critical; however, in a system with many random accesses and fewer cache hits, memory latency is a big deal.
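
The cache-hit point can be put into numbers with the textbook average-access-time formula. This is only a sketch: every timing below is a hypothetical round number, not a measurement of any DDR or RDRAM system, and the model ignores prefetching and overlapped accesses entirely.

```python
# Average memory access time = cache hit time + miss rate * miss penalty.
# All timings are hypothetical round numbers chosen for illustration.

def average_access_ns(cache_hit_rate, cache_ns, memory_ns):
    """Return the average access time in nanoseconds for a single cache level."""
    miss_rate = 1.0 - cache_hit_rate
    return cache_ns + miss_rate * memory_ns

CACHE_NS = 2.0                        # assumed cache hit time
MEMORY_LATENCIES_NS = (100.0, 130.0)  # assumed lower- and higher-latency memory

for hit_rate in (0.99, 0.90):
    low, high = (average_access_ns(hit_rate, CACHE_NS, m) for m in MEMORY_LATENCIES_NS)
    print(f"hit rate {hit_rate:.0%}: {low:.1f} ns vs {high:.1f} ns (gap {high - low:.1f} ns)")
```

With these made-up numbers the gap between the two memories grows tenfold when the hit rate drops from 99% to 90%, which is the sense in which latency matters more once accesses stop hitting the cache.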


A better test of dual-channel DDR would be in a system where the bus is not the limiting factor. Dual PC2100 is the same as dual PC1066 RDRAM IIRC, and remember this blurb is only talking about bandwidth, whereas latency is also important. Wait for benchmarks: you will find that, all else being equal, dual-channel DDR (if both systems have the same bandwidth) will perform slightly better due to its latency.

:wink: The Cash Left In My Pocket,The BEST Benchmark :wink:
 

xcom_cheetah

Distinguished
from Matisaro:
Wait for benchmarks: you will find that, all else being equal, dual-channel DDR (if both systems have the same bandwidth) will perform slightly better due to its latency.

I believe this, but my point was that Intel dumping RDRAM is not a good thing. The technology itself was pretty good, and secondly, at least it keeps options open for the consumer.

I read this somewhere (I don't know if it's right, but it is interesting):
RDRAM has a more granular subsystem providing true read/write concurrency, thus yielding better scalability.

If this statement is true, then this means that on a multithreaded Prescott, dual-channel RDRAM will provide better efficiency and performance.
 

eden

Champion
On a side note, that was Intel's chipset. I say wait for SiS; they have been a very good maker of high-performing P4 DDR chipsets, and I trust they won't fail us.

I believe this, but my point was that Intel dumping RDRAM is not a good thing. The technology itself was pretty good, and secondly, at least it keeps options open for the consumer.

Intel should not dump RDRAM, but it is purely and entirely (FatBurger agrees with this entirely) Rambus' fault for dragging their asses on RDRAM. We've had the exact same RDRAM technology for over 2 years; there has been no improvement, and PC1066 is not yet mainstream at all. When should we expect PC1200 to come out then? When are we to get dual-channel-functional 32-bit RIMM modules and higher? Blame Rambus, I say!

If this statement is true, then this means that on a multithreaded Prescott, dual-channel RDRAM will provide better efficiency and performance.
True, I remember seeing a bench where read performance was very strong for PC800 RDRAM: http://www.x86-secret.com/popups/articleswindow.php?id=41 Skip the benches if you want, until the RDRAM Read/Write part. You will see RDRAM is much more powerful in both write and read, by a fairly good margin against DDR400. I suppose it has some specialty of its own beyond just having narrow datapaths and being labelled a different technology.
I suppose the scalability is also thanks to the narrower bit paths.

BTW, I was reading up on the Athlon architecture earlier, and though I may be wrong on this, its EV6 bus architecture is not limited to 2.1GB/s but to 3.2GB/s. If that holds true, the Athlon is not designed to stop at 2.1GB/s and can go further. I could be wrong of course, but anyone who's up for some old THG article reading, go ahead.
The point of what I said here is that, if it is true, then the P4 is not designed to stop at 3.2GB/s of bandwidth either, unless Intel used a crappy bus architecture compared to the EV6, which scales from DDR200 to DDR400.



--
:smile: Intel and AMD sitting under a tree, P-R-O-C-E-S-S-I-N-G! :smile:
 

eden

Champion
It's even more of a shame Ray was debating you on how PC800 pricing was the same or better than PC2100, when DDR is now over 30% cheaper while RDRAM has stayed the same price for a year!

--
:smile: Intel and AMD sitting under a tree, P-R-O-C-E-S-S-I-N-G! :smile:
 

Matisaro

Splendid
If this statement is true, then this means that on a multithreaded Prescott, dual-channel RDRAM will provide better efficiency and performance.

Multithreaded processing increases latency dependency: it is much more difficult for a processor's prefetch to work accurately and efficiently for two processes than for a single process. Thus less of the needed data will be in the cache (not to mention that the cache is shared between two threads), and you get more unpredicted trips to system memory, which will magnify RDRAM's latency deficiency.

It's too early to tell what will happen in the coming months/years.

:wink: The Cash Left In My Pocket,The BEST Benchmark :wink:
 

lhgpoobaa

Illustrious
Dec 31, 2007
14,462
1
40,780
Wouldn't dual-channel PC2700 be overkill though, as the latest P4 bus speed is 533, not 666?

2 x PC2100 would fit the 533 system bus exactly.
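
A quick way to see the "overkill" point is to cap memory bandwidth at what the front-side bus can carry. This sketch assumes a 64-bit bus and nominal transfer rates; the helper names are made up for illustration, and real systems will fall short of these peaks.

```python
# Usable peak bandwidth is limited by the slower of the front-side bus and the memory.
# Rates are nominal MT/s figures on 64-bit buses; nothing here is measured.

def bus_bandwidth_gb_s(mega_transfers_per_s, width_bits=64):
    """Return peak bandwidth in GB/s, taking 1 GB as 10**9 bytes."""
    return mega_transfers_per_s * 1e6 * (width_bits / 8) / 1e9

def usable_bandwidth_gb_s(fsb_gb_s, memory_gb_s):
    """The CPU cannot pull data faster than its bus, whatever the memory offers."""
    return min(fsb_gb_s, memory_gb_s)

fsb = bus_bandwidth_gb_s(533)  # P4 bus at 533 MT/s
for name, rate in (("dual PC2100", 266), ("dual PC2700", 333)):
    memory = 2 * bus_bandwidth_gb_s(rate)  # two 64-bit DDR channels
    usable = usable_bandwidth_gb_s(fsb, memory)
    print(f"{name}: memory {memory:.2f} GB/s, bus {fsb:.2f} GB/s, usable {usable:.2f} GB/s")
```

On these peak figures the extra bandwidth of dual PC2700 has nowhere to go on a 533 bus, although any latency benefit from the faster memory is a separate question.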


All advice I offer has been
Audited by Arthur Anderson.
 

Matisaro

Splendid
Yep it would, so at a given bus speed the equivalent dual-channel DDR will always slightly outperform dual-channel RDRAM due to its latency advantage.



:wink: The Cash Left In My Pocket,The BEST Benchmark :wink:
 

xcom_cheetah

Distinguished
from Matisaro:
Latency is a critical factor in memory performance. When your system is designed to stream data, thus hiding latency (as in the P4 with its prefetch etc.), it is not as critical; however, in a system with many random accesses and fewer cache hits, memory latency is a big deal.

If I remember correctly, I provided a link where they showed that PC2100 has higher latency (Abit BD7 board + DDR266 vs Iwill P4R533 + PC1066), and I don't think its latency will decrease in a dual setup.
http://www.vr-zone.com/reviews/Iwill/P4R533/page6.htm

So if the bandwidth is the same and so is the latency, then I wonder whether there will be much of a performance difference.

Secondly, with Prescott we do need good RAM capable of concurrent reads and writes, and if RDRAM is better at that, then IMHO RDRAM should perform better.

P.S. Now let's keep this discussion limited to dual PC2100 DDR and dual PC1066 RDRAM.
 

FatBurger

Illustrious
Matisaro says:
I feel dual-channel PC2700 would be easier to implement and better performing than dual-channel PC1200

Have you been sniffing glue again? Why would PC2700 be easier to implement than PC2100?

Matisaro says:
Latency is a critical factor in memory performance. When your system is designed to stream data, thus hiding latency (as in the P4 with its prefetch etc.)

Hammer's on-die MCH should decrease latency significantly, meaning RDRAM might be the better technology for it? Hmm...

Eden says:
PC800 pricing was the same or better than PC2100

When Northwood came out, PC800 was cheaper than PC2700, the competing memory. That was around the time and the nature of the debate, I believe.

Matisaro says:
Yep it would, so at a given bus speed the equivalent dual-channel DDR will always slightly outperform dual-channel RDRAM due to its latency advantage.

I disagree, at least for the P4. RDRAM's latency goes down for sequential reads, so when everything is set up well (part programming, part MCH design, part luck), RDRAM would have much lower latency than SDRAM could ever hope for. It of course wouldn't be that way 100% of the time, or even close, but on a well designed platform, RDRAM's latency can be lowered quite a bit.

Hi mom!
 

Matisaro

Splendid
If I remember correctly, I provided a link where they showed that PC2100 has higher latency (Abit BD7 board + DDR266 vs Iwill P4R533 + PC1066), and I don't think its latency will decrease in a dual setup.
http://www.vr-zone.com/reviews/Iwill/P4R533/page6.htm

Latency on the best DDR chipset (KT266A) is better than on PC1066's best chipset. You could say that PC1066 is better when the DDR is on an ALiMAGiK DDR chipset.

As for the naked technical truth, check out "Truth": http://arstechnica.com/paedia/r/ram_guide/ram_guide.part3-1.html

On an architectural level RDRAM has huge latency penalties compared to DDR. Also, both systems' latency decreases on relatively the same scale, so the gain from PC800 to PC1066 (RDRAM) is roughly the same as from PC1600 to PC2100 (DDR).

Also, RDRAM's latency is hidden quite a bit, hugely in fact, by its dual-channel setup, and the same is true for dual-channel DDR: with dual channel you have twice the pages open at the same time, lowering average access times.

The benefits of dual channel affect DDR as well as RDRAM.


Secondly, with Prescott we do need good RAM capable of concurrent reads and writes, and if RDRAM is better at that, then IMHO RDRAM should perform better.

I think the much lower latency will help more than that, but again, you cannot predict what will happen in a year's time with any certainty.

:wink: The Cash Left In My Pocket,The BEST Benchmark :wink:
 

Matisaro

Splendid
Hammer's on-die MCH should decrease latency significantly, meaning RDRAM might be the better technology for it? Hmm...

Most of RDRAM's latency comes from the memory itself; most of DDR's comes from the memory controller. RDRAM would perform badly on the Hammer, and the gains from the on-die controller would be lost waiting for the RAM to send the data.


I disagree, at least for the P4. RDRAM's latency goes down for sequential reads, so when everything is set up well (part programming, part MCH design, part luck), RDRAM would have much lower latency than SDRAM could ever hope for. It of course wouldn't be that way 100% of the time, or even close, but on a well designed platform, RDRAM's latency can be lowered quite a bit.


Sure, and if we put SSE2 into every app the world will be brighter and cleaner. Making an app do all sequential reads is very difficult, I would imagine, and furthermore the same app would run faster on DDR as well, because DDR also hides latency on sequential reads.

:wink: The Cash Left In My Pocket,The BEST Benchmark :wink:
 

FatBurger

Illustrious
I didn't mention PC2100, I said PC1200, RDRAM. Put the glue down and step away from the keyboard.

....

:redface:

My bad

Most of RDRAM's latency comes from the memory itself; most of DDR's comes from the memory controller. RDRAM would perform badly on the Hammer, and the gains from the on-die controller would be lost waiting for the RAM to send the data.

My point is that putting the MCH on-die would be like greatly improving the data prefetch, it should make the MCH more intelligent. That would lower RDRAM's latency.

Making an app do all sequential reads is very difficult, I would imagine

Most likely, which is why I'd like to hear more about it from a programmer's point of view.

furthermore the same app would run faster on ddr as well because ddr also hides latency on sequential reads.

But nowhere near as much. RDRAM's latency goes down much more for sequential reads.

Hi mom!
 

xcom_cheetah

Distinguished
Latency on the best DDR chipset (KT266A) is better than on PC1066's best chipset. You could say that PC1066 is better when the DDR is on an ALiMAGiK DDR chipset.

What should I say to this? I think the first basic principle is to keep the platform the same as much as possible.
Anyway, I'm surprised that VIA suddenly lost its ability to lower latency in the P4X266 and P4X266A chipsets. This link has latencies for all the chipsets except the P4X333, and if you put PC1066 in the equation it will have the lowest latency:
http://www.xbitlabs.com/mainboards/p4-chipsets-comparison/index2.html

Anyway, if you want to back your argument, you should at least have cited a DDR chipset for the P4.

[joking]
VIA KT266A = lowest latency (for Athlon)
VIA P4X266A = higher latency than RDRAM (Intel i850) (for P4)
From this we can deduce that if Intel develops an RDRAM chipset for the Athlon XP, it should have much lower latency than the KT266A... wow, what a chipset it will be :)
Secondly, why not check the latency of the Sony PS2 then...
[/joking]

I have just gone through the Ars Technica article (not read completely; I will do that later)
and I didn't find anything which states that RDRAM latency is high and will remain high.
What I found interesting is:

The system latency issues that surround the RAMBUS channel and that I've pointed out here in the RAMBUS channel discussion are by no means the whole story when it comes to latency and/or overall performance. In particular, system latency, especially in a RAMBUS system, is a complex issue that's affected by numerous factors, some of which we've covered in earlier parts of this article and some of which we'll cover in the next section. For instance, as we discussed at the beginning of this piece, RAMBUS' high bank count can reduce system read latencies significantly because more rows can remain open at a time. Also, system read latencies will be reduced in some upcoming systems that include RAMBUS memory controllers integrated onto the CPU die.
In summation, I've tried to show how different parts of a RAMBUS system affect read latencies, for good or for ill, as we examine each individual part of the RAMBUS technology. I've done this in order to give you a feel for the complexity of the issue and the number and nature of the factors that must be taken into account when discussing it.

To me this looks like RDRAM latency can be reduced handsomely if properly implemented. Anyway, that's my initial assessment; it is bound to change when I dig deeper into the article. But I would appreciate it if you could guide me to a particular spot in the article where it states something along the lines that RDRAM latency cannot become lower than DDR's. I would really appreciate it.
 

Matisaro

Splendid
To me this looks like RDRAM latency can be reduced handsomely if properly implemented. Anyway, that's my initial assessment; it is bound to change when I dig deeper into the article. But I would appreciate it if you could guide me to a particular spot in the article where it states something along the lines that RDRAM latency cannot become lower than DDR's. I would really appreciate it.
No one said cannot, I said IS. A baseline comparison of PC2100 to PC800 shows PC2100 with lower latency (this is against a dual-channel RDRAM system; there are no readily available dual DDR benchmarks at this time).

Each speed bump lowers latency by a roughly equivalent percentage for both RDRAM and DDR, thus you have RDRAM never overtaking DDR.
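
The logic of that claim (not its premise) can be sketched in a few lines: if two latencies shrink by the same percentage at every step, their ratio never changes, so the one that starts lower stays lower. The starting latencies and the 10% step are hypothetical placeholders, not figures for any real DDR or RDRAM part.

```python
# If both technologies cut latency by the same percentage per speed bump,
# the ratio between them stays constant. All numbers are hypothetical.

def latency_after_bumps(start_ns, reduction_pct, bumps):
    """Apply the same percentage reduction once per speed bump."""
    return start_ns * (1 - reduction_pct / 100) ** bumps

DDR_START_NS = 100.0    # assumed starting latency
RDRAM_START_NS = 120.0  # assumed starting latency
REDUCTION_PCT = 10.0    # assumed gain per speed bump, same for both

for bumps in range(4):
    ddr = latency_after_bumps(DDR_START_NS, REDUCTION_PCT, bumps)
    rdram = latency_after_bumps(RDRAM_START_NS, REDUCTION_PCT, bumps)
    print(f"bump {bumps}: DDR {ddr:.1f} ns, RDRAM {rdram:.1f} ns, ratio {rdram / ddr:.2f}")
```

Whether the two technologies really do gain the same percentage per speed grade is exactly what is in dispute here; the sketch only shows what follows if they do.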

You are fond of comparing pc1066 with pc2100, and of comparing dual rdram with single channel ddr, both of which are invalid comparisons.


In short, I won't do your homework for you. The technical paper is clear about RDRAM's latency issues; it must be taken as a whole. In fact you should start with the first part (SRAM versus SDRAM) for a better grasp of memory technology in general.


What should I say to this? I think the first basic principle is to keep the platform the same as much as possible.
Anyway, I'm surprised that VIA suddenly lost its ability to lower latency in the P4X266 and P4X266A chipsets. This link has latencies for all the chipsets except the P4X333, and if you put PC1066 in the equation it will have the lowest latency:
Not when we are looking at the memory technology itself. The P4 was DESIGNED for RDRAM. I am not advocating comparing RDRAM and DDR in a real-world test environment, but when figures like latency are tossed about, one must not limit oneself to a single platform, especially when the platform was designed for RDRAM. By showing latency figures on a DDR chipset that beat the figures you have for a P4 chipset, I have shown that DDR the TECHNOLOGY is capable of far lower latency than you give it credit for. Likewise, when asked about RDRAM I will not give i820 (the P3 RDRAM chipset) results, as it is single-channel but, more importantly, was for a processor which DID NOT use RDRAM's advantages and was designed with expectations that put RDRAM at a disadvantage.


This debate will continue. I ask you to remember I am not debating RDRAM versus DDR on the P4; I am debating RDRAM versus DDR as a technological whole, and platform consistency has nothing to do with that. It leaves us open to prediction and interpretation, but that's the best part of a theoretical technological discussion, isn't it?

:wink: The Cash Left In My Pocket,The BEST Benchmark :wink:
 

Matisaro

Splendid
But I would appreciate it if you could guide me to a particular spot in the article where it states something along the lines that RDRAM latency cannot become lower than DDR's. I would really appreciate it.



Here are some select quotes for everyone, not just you.


For FatBurger, on my comments about PC1200 implementation difficulties and quad-channel RDRAM issues:


"While the long, thin RAMBUS channel pumps a lot of bandwidth over a small number of traces, it's nonetheless one of RAMBUS' most controversial features. Operating at up to 400MHz, it's very fast, and since it makes for a minimal number of signal traces that have to be etched into the motherboard, it's simpler overall than SDRAM's interleaving of data buses. However, it still carries with it some drawbacks. One problem with the long, fast bus is its effect on cost. Some of the savings in cost that RAMBUS gets from using fewer traces are cancelled out by the fact that the RAMBUS channel is a long series of wires that have to run at a whopping 400MHz. To get the bus speed up that high, the board has to be manufactured to a very high standard of quality in order to reduce noise, stray capacitance, variations in line impedance, and other problems associated with rising bus speeds. In some cases, you may even have to add more layers to the motherboard just to be able to provide a clean enough signal."


Because of the need to be able to delay the output of read requests so that reads from different RDRAM chips can arrive at the chipset together and in the right order, a RAMBUS system has to go through an elaborate initialization ritual on boot-up in order to determine the amount of delay that needs to be inserted into each RDRAM. The read delay value for each individual RDRAM chip is programmed via the control pins into one of those control registers that we met in the previous section. These read delays effectively slow down the entire system so that each device has the same latency as the outermost RDRAM. As you add more devices to a RAMBUS system, the entire system has higher and higher read latency. So, while individual RDRAM chips might have a read latency (access time) of 20ns, which is about the same read latency as some SDRAMs, once you stick them in a system with three full RIMMs the overall system latency (which is the total amount of time from when the CPU sends out the read command and the data arrives back at it) will be either slightly better or significantly worse than the system latency for an SDRAM system, depending on a myriad of factors. (More on these factors in a second.)

Further aggravating the read latency situation is the fact that RAMBUS doesn't support critical word first bursting. When the CPU asks for 8 bytes of data from a conventional SDRAM, the memory system sends it back 16 bytes data along with under the presumption that it'll probably need those extra 8 bytes shortly. Nevertheless, the 8 bytes that were specifically asked for-- the critical word--arrive at the CPU first, with the other freebie bytes coming next. RDRAM doesn't do this. It just sends you a whole 16 byte train of data, and if the 8 bytes you asked for are at the end of that train, then you'll just have to wait until they get there.

Finally, since the bus is so long and passes through so many devices, the capacitances added in by the loads of all of the attached devices significantly increase bus signal propagation time. So again, the more devices you stick on the RAMBUS channel, the worse the latency gets. However, RAMBUS' signaling layer, high quality packaging, and strict specifications for producing RIMMs are aimed at reducing these types of unwanted electrical effects.
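
The critical-word-first point in the quoted passage can be illustrated with a toy model. It assumes a burst arrives as equal-sized words in fixed address order, one word per beat, and counts how many beats pass before the requested word shows up; the function names are made up, and real controllers are considerably more involved.

```python
# Toy model of critical-word-first bursting.
# Assumption: a burst is delivered one word per beat, in address order,
# unless critical-word-first reorders it so the requested word leads.

def beats_until_requested_word(requested_index, words_per_burst, critical_word_first):
    """Return how many beats elapse before the requested word has arrived."""
    if not 0 <= requested_index < words_per_burst:
        raise ValueError("requested_index must lie inside the burst")
    if critical_word_first:
        return 1  # the requested word is sent first
    return requested_index + 1  # wait for every word ahead of it

def average_beats(words_per_burst, critical_word_first):
    """Average wait, assuming each word in the burst is equally likely to be requested."""
    waits = [
        beats_until_requested_word(i, words_per_burst, critical_word_first)
        for i in range(words_per_burst)
    ]
    return sum(waits) / words_per_burst

# A 16-byte burst made of two 8-byte words, as in the quoted example.
print("with critical word first:", average_beats(2, True))
print("without critical word first:", average_beats(2, False))
```

In this two-word case the average wait without critical word first is half a beat longer, and the penalty grows with the number of words per burst.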


on latency


:wink: The Cash Left In My Pocket,The BEST Benchmark :wink:
 

FatBurger

Illustrious
You are fond of comparing pc1066 with pc2100, and of comparing dual rdram with single channel ddr, both of which are invalid comparisons.

The second is a valid comparison in that it's what is available right now. It is not a valid comparison in that SDRAM's datapath is 64-bit, compared to RDRAM's 32-bit.

Hi mom!
 

xcom_cheetah

Distinguished
First of all, if you have read my first post then you must have seen that this thread is about the performance of dual-channel PC2100 and dual-channel PC1066, both of which will provide the same 4.2GB/s of bandwidth for the P4 bus.

Secondly, PC1066 boards and RAM are available (although at a slightly higher price), and dual-channel boards and RAM will be available within a couple of months. In this I was at least trying to make an educated guess (which has every possibility of being way off from the debuting dual-channel DDR chipset).

Thirdly, nowhere did I condemn or bash DDR technology. I was just trying to make the point that, with reference to the P4, a dual-channel DDR board does not give a clear-cut advantage and the dual-channel RDRAM chipset should have been continued. I nowhere argued purely about the technology unless it directly related to P4 performance, and my aim was never to win this debate.

You feel it's not right to bring up dual-channel RDRAM chipsets vs the single-channel throughput of DDR chipsets, yet for you it's right to bring up KT266A latency vs P4 RDRAM chipset latency, where these are totally different platforms, while skipping my question of why they have not been able to bring this low level of latency even to their P4X333 chipset?

I repeat, this discussion was meant to speculate about the performance of these two chipsets (dual-channel DDR vs dual-channel RDRAM). One can be wrong, but I don't think there is any harm in making an educated guess about what the future holds, as long as it is educated. In a post above I even mentioned that IMHO a dual-channel DDR chipset will not have any lower latency than a single-channel chipset unless Intel uses some of the advanced methods used in the nForce chipset.
 

Matisaro

Splendid
First of all, if you have read my first post then you must have seen that this thread is about the performance of dual-channel PC2100 and dual-channel PC1066, both of which will provide the same 4.2GB/s of bandwidth for the P4 bus.

You made a thread calling me in specifically, and you asked me if I remember the debate. I have never debated RDRAM versus DDR for the P4 specifically; all of the debates I have been involved in were about the technology itself, so if you wouldn't mind changing the title of the post, that would be good.

If you wish to debate under strict rules, such as only real-world PC2100 versus PC1066 etc., then don't call me out specifically, because I don't debate memory technology on specific platforms, and never have.

:wink: The Cash Left In My Pocket,The BEST Benchmark :wink: