Latency - what everybody has and nobody wants

FatBurger

Illustrious
This is a thread for discussion of memory latency. I will be posting different results tomorrow (too tired tonight), but until then, as many people as possible please download Latency2 from this link: http://personalwebs.myriad.net/roelof/setispy/latency2.exe and post your results. It should be interesting to see what some real-world tests turn up.

Also, does anyone have any information on the validity of this program? I've never heard of it before, so I don't know how accurate it is.
Thanks

If you don't buy Windows, then the terrorists have already won! - Microsoft
 

lagger

Distinguished
Soyo P4S Dragon Ultra (SiS 645 chipset)
Stock clocks, 3:5 divider, Fast setting
P4 Northwood 2.0A
Kingmax BGA PC2700 DDR333 @ 333MHz, 2 sticks @ 256MB each

Array size (KB)   4-byte stride (ticks)   64-byte stride (ticks)
1 2.0 2.0
2 2.0 2.0
3 2.0 2.0
4 2.0 2.0
6 2.0 2.0
8 2.1 2.2
10 2.9 18.6
12 2.6 16.6
14 2.6 18.8
16 2.7 18.7
18 2.7 18.3
20 2.7 18.8
22 2.7 18.9
24 2.8 18.8
26 2.7 18.8
28 2.8 18.3
30 2.8 18.9
32 2.8 18.9
36 2.8 18.9
40 2.8 18.9
44 2.8 18.9
48 2.8 18.9
52 2.8 18.9
56 2.8 18.4
60 2.8 18.9
64 2.8 19.1
72 2.8 19.2
80 2.8 19.1
88 2.8 19.0
96 2.8 19.0
104 2.8 19.1
112 2.8 19.1
120 2.8 19.1
128 2.8 18.6
144 2.8 18.8
160 3.6 32.0
176 3.7 30.8
192 3.5 30.2
208 3.5 28.9
224 3.5 30.4
240 3.7 30.1
256 3.4 30.3
288 3.4 29.4
320 3.4 30.2
352 3.6 34.0
384 3.8 35.7
416 3.7 36.6
448 4.3 39.5
480 4.6 50.7
512 4.7 54.3
576 5.2 63.7
640 6.0 78.8
704 6.0 80.2
768 6.1 83.0
832 6.2 84.3
896 6.4 86.9
960 6.4 86.7
1024 6.3 86.9
1536 6.4 86.8
2048 6.2 86.0
2560 6.3 86.0
3072 6.3 84.4
3584 6.3 86.6
4096 6.3 86.1


Checking under my North AND South bridges for Trolls
 
Guest

Guest
Dell Dimension 8100 (ugh)
Dell custom motherboard, i850 chipset
1st channel: 2x 128MB PC800 Toshiba 8-device RDRAM
2nd channel: 2x 128MB PC800 Kingston (shows up as Samsung in SiSoft) 4-device RDRAM
P4 @ 1.7GHz

Array size (KB)   4-byte stride (ticks)   64-byte stride (ticks)
1 2.0 2.0
2 2.0 2.0
3 2.0 2.0
4 2.0 2.0
6 2.0 2.0
8 2.6 3.4
10 4.7 18.6
12 3.8 16.7
14 3.5 18.3
16 3.8 18.3
18 4.1 18.3
20 3.9 18.4
22 4.2 18.4
24 4.2 18.4
26 4.1 18.4
28 4.3 18.4
30 4.5 18.4
32 4.4 18.4
36 4.5 18.5
40 4.4 18.5
44 4.2 25.5
48 4.4 28.5
52 4.5 27.9
56 4.4 26.9
60 4.6 26.3
64 4.7 25.9
72 4.6 25.1
80 4.5 24.3
88 4.6 23.7
96 4.6 23.6
104 4.6 23.1
112 4.6 22.9
120 4.5 22.6
128 4.5 22.2
144 4.5 21.8
160 4.5 21.5
176 4.2 21.3
192 4.4 21.1
208 4.4 20.8
224 4.4 20.5
240 4.5 23.6
256 4.9 28.9
288 6.0 69.1
320 6.3 76.9
352 6.2 77.1
384 6.3 77.0
416 6.2 77.0
448 6.2 77.9
480 6.2 77.1
512 6.2 77.1
576 6.3 77.2
640 6.3 76.9
704 6.2 76.8
768 6.2 76.9
832 6.3 76.7
896 6.2 77.0
960 6.2 77.8
1024 6.2 77.3
1536 6.3 76.8
2048 6.2 76.8
2560 6.2 76.9
3072 6.2 76.8
3584 6.2 77.8
4096 6.3 77.0
 

jclw

Distinguished
Roelof developed the program to help benchmark processors running SETI@Home. He wrote SETIspy (http://personalwebs.myriad.net/roelof/setispy/) and, I believe, has a fair bit to do with the Team Lamb Chop SETI benchmarking pages (http://www.teamlambchop.com/bench/index.htm).

- JW
 

Jake75

Distinguished
Pentium III @ 500MHz, 100MHz FSB
128MB RAM (brand unknown)
(This is my computer at work)

Array size (KB)   4-byte stride (ticks)   64-byte stride (ticks)
1 3.0 3.0
2 3.0 3.0
3 3.0 3.0
4 3.0 3.0
6 3.0 3.0
8 3.0 3.0
10 3.0 3.0
12 3.0 3.0
14 3.0 3.0
16 3.0 3.0
18 4.3 13.6
20 5.3 22.0
22 5.3 22.0
24 5.3 22.0
26 5.3 22.0
28 5.3 22.0
30 5.3 22.0
32 5.3 22.0
36 5.3 22.0
40 5.3 22.0
44 5.3 22.0
48 5.3 22.0
52 5.3 22.0
56 5.3 22.0
60 5.3 22.0
64 5.3 22.0
72 5.3 22.0
80 5.3 22.0
88 5.3 22.0
96 5.3 22.0
104 5.3 22.0
112 5.3 22.0
120 5.3 22.0
128 5.3 22.0
144 5.3 22.0
160 5.3 22.0
176 5.3 22.0
192 5.3 22.0
208 5.3 22.0
224 5.3 22.0
240 5.3 22.0
256 5.3 22.0
288 5.3 22.1
320 5.3 22.1
352 5.3 22.1
384 5.3 22.1
416 5.7 24.8
448 5.6 24.6
480 6.2 28.3
512 7.1 34.5
576 9.7 52.7
640 11.7 64.9
704 12.0 67.3
768 11.9 67.6
832 12.3 69.0
896 12.3 69.2
960 12.5 70.2
1024 12.5 70.3
1536 12.6 71.1
2048 12.6 71.1
2560 12.6 71.3
3072 12.6 71.3
3584 12.6 71.3
4096 12.6 71.3

...STOP EVERYTHING...
Edited by Jake75 on 04/02/02 08:08 AM.
 

jclw

Distinguished
My results are here: http://forumz.tomshardware.com/hardware/modules.php?name=Forums&file=viewtopic&p=508269#508269

To compare mine against Jake75's (because both are PIII/SDRAM):

Jake75 has a latency of 25.2 (4 byte) / 142.6 (64 byte) ns when accessing main memory (arrays above 512k, his L2 cache size).
I have a much lower latency (16/92 ns).

If you do the math you'll see that my latency is almost exactly 2/3 of his.

I'm running my memory at 2-2-2, so I'm going to go out on a limb and guess that Jake75 has an i440BX board (same as me) with his memory running at 3-3-3.
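
The arithmetic behind this comparison can be sketched roughly as follows. This is only an illustration: `ticks_to_ns` is a made-up helper name, the tick counts are the large-array rows of Jake75's PIII table, the 500MHz clock comes from his post, and the 16/92 ns figures are taken as quoted rather than recomputed.

```python
def ticks_to_ns(ticks, cpu_mhz):
    """Convert CPU clock ticks to nanoseconds (hypothetical helper)."""
    return ticks * 1000.0 / cpu_mhz


# Jake75's PIII: 12.6 / 71.3 ticks for the largest arrays, CPU at 500 MHz.
jake_ns = (ticks_to_ns(12.6, 500), ticks_to_ns(71.3, 500))

# jclw's own figures, quoted directly in ns in the post above.
jclw_ns = (16.0, 92.0)

# Ratio of jclw's latency to Jake75's, for each stride.
ratios = tuple(mine / his for mine, his in zip(jclw_ns, jake_ns))

print("Jake75: %.1f / %.1f ns" % jake_ns)
print("jclw/Jake75: %.2f / %.2f (2/3 = %.2f)" % (ratios + (2 / 3,)))
```

Both ratios land a little under 2/3, which is in line with the 2-2-2 versus 3-3-3 guess but does not prove it.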

- JW
 

FatBurger

Illustrious
Well, I emailed all my benchmarks to myself, and now I can't unRAR them :frown:
Guess I'll post them tonight.

 

Jake75

Distinguished
Here are the results for my own computer.
XP1600-266 FSB*, Epox Kha+*, 256MB CAS2 (Normal/Safe settings)*

Isn't lower better? In that case, these numbers don't look that good.

Array size (KB)   4-byte stride (ticks)   64-byte stride (ticks)
1 3.0 3.0
2 3.0 3.0
3 3.0 3.0
4 3.0 3.0
6 3.0 3.0
8 3.0 3.0
10 3.0 3.0
12 3.0 3.0
14 3.0 3.0
16 3.0 3.0
18 3.0 3.0
20 3.0 3.0
22 3.0 3.0
24 3.0 3.0
26 3.0 3.0
28 3.0 3.0
30 3.0 3.0
32 3.0 3.0
36 3.0 3.0
40 3.0 3.0
44 3.0 3.0
48 3.0 3.0
52 3.0 3.0
56 3.0 3.0
60 3.0 3.0
64 3.0 3.0
72 3.4 8.6
80 3.6 13.2
88 3.9 16.9
96 4.1 20.0
104 4.1 20.0
112 4.1 20.0
120 4.1 20.0
128 4.1 20.0
144 4.1 20.0
160 4.1 20.0
176 4.1 20.0
192 4.1 20.0
208 4.1 20.0
224 4.1 20.0
240 4.1 20.0
256 4.1 20.0
288 4.1 20.0
320 5.6 44.7
352 13.3 170.7
384 13.3 170.9
416 13.3 170.6
448 13.3 170.7
480 13.3 170.7
512 13.3 171.0
576 13.3 170.9
640 13.3 170.7
704 13.3 170.8
768 13.3 171.2
832 13.3 170.8
896 13.3 171.3
960 13.3 171.0
1024 13.3 171.3
1536 13.3 171.6
2048 13.3 171.7
2560 13.3 171.8
3072 13.3 171.8
3584 13.3 172.2
4096 13.3 171.5

* NEW ADDITIONS

Edited by Jake75 on 04/04/02 01:39 AM.
 

FatBurger

Illustrious
I tried emailing them to myself last night and my email was down. Then I forgot the CD I burned them to, and Geocities wouldn't let me upload them. I'm getting screwed over everywhere I turn.

 

jclw

Distinguished
!!!THINGS YOU MUST POST FOR VALID COMPARISONS!!!

#1 - CPU speed (NOT PR rating) at time of test
#2 - Chipset
#3 - Memory type (SDRAM, DDR, RDRAM)
#4 - Memory speed at time of test
#5 - Memory settings at time of test
#6 - The results
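
One way to keep posted results comparable is to record them in a fixed structure and check what is missing. The sketch below is only a suggestion: the class name `LatencyReport`, its field names, and the helper `missing_items` are all invented for illustration, and the sample values are a deliberately incomplete report.

```python
from dataclasses import dataclass, fields
from typing import List, Optional, Tuple


@dataclass
class LatencyReport:
    """One posted result; each field mirrors an item in the checklist."""
    cpu_mhz: Optional[float] = None        # 1: actual clock, not PR rating
    chipset: Optional[str] = None          # 2
    memory_type: Optional[str] = None      # 3: SDRAM, DDR or RDRAM
    memory_mhz: Optional[float] = None     # 4
    memory_settings: Optional[str] = None  # 5: e.g. CAS or 2-2-2 timings
    # 6: rows of (array size in KB, 4-byte ticks, 64-byte ticks)
    results: Optional[List[Tuple[int, float, float]]] = None


def missing_items(report):
    """Return the names of checklist items that were left out."""
    return [
        f.name
        for f in fields(report)
        if getattr(report, f.name) in (None, "", [])
    ]


# Illustrative, incomplete report (values chosen only for the example).
report = LatencyReport(
    cpu_mhz=1400.0,
    memory_settings="CAS2",
    results=[(4096, 13.3, 171.5)],
)
print(missing_items(report))
```

For this sample the helper reports `chipset`, `memory_type` and `memory_mhz` as missing.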

- JW
Edited by JCLW on 04/03/02 09:41 PM.
 

jclw

Distinguished
Jake: Remember the results are given in processor ticks.

Here are your results in ns (assuming a 1400MHz processor speed):

Latency in L1 cache = 2.1/2.1 ns
Latency in L2 cache = 2.9/14.3 ns
Latency in main memory = 9.5/122.5 ns
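
The three levels can also be picked out of a results table mechanically. The sketch below is one possible approach, not how Latency2 itself works: the function `find_plateaus`, its 5% tolerance and its three-row minimum are all assumptions, and the rows are a hand-picked subset of the 64-byte stride column from Jake75's Athlon table. The 1400MHz clock is the same assumption as above.

```python
def find_plateaus(rows, tolerance=0.05, min_len=3):
    """Group consecutive rows whose latency stays within `tolerance`
    (relative) of the first row in the group, and keep groups of at
    least `min_len` rows. Returns (first_kb, last_kb, mean_ticks)."""
    runs, run = [], [rows[0]]
    for row in rows[1:]:
        base = run[0][1]
        if abs(row[1] - base) <= tolerance * base:
            run.append(row)
        else:
            runs.append(run)
            run = [row]
    runs.append(run)
    return [
        (r[0][0], r[-1][0], sum(t for _, t in r) / len(r))
        for r in runs
        if len(r) >= min_len
    ]


# (array size in KB, 64-byte stride latency in ticks), subset of the table.
rows = [
    (16, 3.0), (32, 3.0), (64, 3.0),
    (72, 8.6), (80, 13.2), (88, 16.9),
    (96, 20.0), (128, 20.0), (256, 20.0), (288, 20.0),
    (320, 44.7),
    (352, 170.7), (512, 171.0), (1024, 171.3), (4096, 171.5),
]

CPU_MHZ = 1400  # assumed clock, as in the post above

plateaus = find_plateaus(rows)
for first_kb, last_kb, ticks in plateaus:
    ns = ticks * 1000.0 / CPU_MHZ
    print("%4d-%4d KB: %6.1f ticks = %6.1f ns" % (first_kb, last_kb, ticks, ns))
```

With these rows the function finds three plateaus (16-64 KB, 96-288 KB and 352-4096 KB), which correspond to the L1, L2 and main memory figures quoted above; the transition rows in between are dropped because they never settle.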

- JW
 

jclw

Distinguished
Lagger's P4-2.0A/DDR333: 3.1/43 ns

LosingStreak's P4-1.7/PC800: 3.7/45 ns

Remember that the P4s have data prefetch, which is probably one reason why these scores are so good.

- JW
 

phsstpok

Splendid
I don't understand what I am getting. Can you explain what the Latency2 numbers mean? I thought they were clock ticks but if that were true I wouldn't think the latency would change with higher FSB speeds. The amount of time would change, yes, but not the clock ticks.

I noticed that, for the larger array sizes, the number of ticks goes down with higher FSB speed. Does this mean SDRAM gets more efficient at higher speeds?

Here are some of my results (abbreviated for simplicity; the columns are array size in KB, then 4-byte and 64-byte stride latency in ticks).

Tbird 1.2GHz (for testing), KT133A chipset, SDRAM, CAS2, aggressive settings

100MHz*12
1 3.0 3.0
72 3.4 8.6
4096 15.2 195.3

133MHz*9
1 3.0 3.0
72 3.4 8.6
4096 17.0 166.5

150MHz*8
1 3.0 3.0
72 3.4 8.6
4096 15.1 163.2


I have so many cookies I now have a FAT problem!
Edited by phsstpok on 04/03/02 11:16 PM.
 

jclw

Distinguished
I know the KT133x chipset can run the FSB and memory bus at different speeds (i.e., 100MHz FSB and 133MHz memory). I'm assuming that for your test you had the memory bus running at the same speed as the FSB.

[CPU]---FrontSideBus---[NorthBridge]---MemoryBus---[Memory]

You're keeping the processor speed the same in all three tests and raising the memory speed. As the memory speed increases, the latency should be reduced. In theory the latency at 150MHz should be 2/3 of the latency at 100MHz, because the memory is running at 3/2 the speed. If you had instead kept the multiplier on your CPU constant while you changed bus speeds, the latency (measured in clock ticks) would stay the same, because the CPU clock would rise in step with the memory clock.

For example: say you had your 1200MHz CPU, ran your memory at 100MHz, and came up with 150 clock ticks as your latency. Then, if you doubled your memory speed to 200MHz while keeping your processor at 1200MHz, the memory would appear to be twice as fast from your CPU's point of view, and the latency should drop (theoretically) to 75 clock ticks.
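
The example above can be written down as a small model. It is an idealized sketch: it assumes the whole latency is a fixed number of memory-bus cycles (12.5 here, which is simply what 150 ticks at 1200MHz/100MHz works out to), and the function name `latency_ticks` is made up for illustration.

```python
def latency_ticks(cpu_mhz, mem_mhz, mem_cycles):
    """CPU ticks that pass while the memory spends `mem_cycles` of its
    own clock cycles on an access (idealized: nothing else adds delay)."""
    return mem_cycles * cpu_mhz / mem_mhz


# 150 ticks at a 1200 MHz CPU and 100 MHz memory means 12.5 memory cycles.
MEM_CYCLES = 150 * 100 / 1200

# Case 1: CPU held at 1200 MHz while the memory clock rises.
fixed_cpu = {mem: latency_ticks(1200, mem, MEM_CYCLES) for mem in (100, 150, 200)}

# Case 2: multiplier held at 12, so the CPU clock rises with the bus.
fixed_multiplier = {
    bus: latency_ticks(12 * bus, bus, MEM_CYCLES) for bus in (100, 133, 150)
}

print(fixed_cpu)
print(fixed_multiplier)
```

In the first case the tick count falls to 100 at 150MHz (2/3 of 150) and to 75 at 200MHz; in the second it stays at 150 for every bus speed. Measured numbers will not follow the model exactly, since not every part of a real memory access scales with the memory clock.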

-JW
 

FatBurger

Illustrious
P4 1.6A at stock and at 155MHz FSB
Abit TH7II-RAID with i850 chipset
2x 256MB sticks of double-sided Samsung RDRAM
BIOS version 77

Stock (100MHz):
<pre>Array size (KB)   4-byte stride (ticks)   64-byte stride (ticks)
1 2.0 2.0
2 2.0 2.0
3 2.0 2.0
4 2.0 2.0
6 2.0 2.0
8 2.1 2.2
10 2.9 18.4
12 2.6 15.5
14 2.6 17.6
16 2.7 18.1
18 2.7 18.1
20 2.7 18.2
22 2.7 18.2
24 2.7 18.3
26 2.7 18.2
28 2.8 18.3
30 2.8 18.4
32 2.8 18.4
36 2.8 18.3
40 2.8 18.4
44 2.8 18.3
48 2.8 18.4
52 2.8 18.4
56 2.8 18.3
60 2.8 18.5
64 2.8 18.4
72 2.8 18.5
80 2.8 18.3
88 2.8 18.3
96 2.8 18.4
104 2.8 18.3
112 2.8 18.3
120 2.8 18.3
128 2.8 18.3
144 2.8 18.3
160 2.8 18.4
176 2.8 18.3
192 3.1 18.3
208 2.8 18.3
224 2.8 18.3
240 2.8 18.3
256 3.1 18.8
288 3.1 19.1
320 3.2 19.0
352 3.2 19.0
384 3.2 19.0
416 3.2 18.9
448 3.3 19.0
480 3.3 21.2
512 3.4 23.5
576 5.9 71.2
640 6.0 72.6
704 6.0 72.6
768 6.0 72.8
832 6.0 72.5
896 6.0 72.7
960 6.0 72.8
1024 5.9 72.7
1536 6.0 72.4
2048 6.0 72.7
2560 6.0 72.6
3072 6.0 72.8
3584 6.1 72.6
4096 6.0 72.4
</pre>
OCed to 155MHz, 3/4 RAM:
<pre>Array size (KB)   4-byte stride (ticks)   64-byte stride (ticks)
1 2.0 2.0
2 2.0 2.0
3 2.0 2.0
4 2.0 2.0
6 2.0 2.0
8 2.1 2.2
10 2.9 18.4
12 2.6 16.6
14 2.6 17.2
16 2.7 18.1
18 2.7 18.1
20 2.7 18.3
22 2.7 18.2
24 2.7 18.2
26 2.7 18.2
28 2.8 18.2
30 2.8 18.3
32 2.8 18.3
36 2.8 18.3
40 2.8 18.3
44 2.8 18.3
48 2.8 18.4
52 2.8 18.3
56 2.8 18.3
60 2.8 18.3
64 2.8 18.3
72 2.8 18.5
80 2.8 18.3
88 2.8 18.3
96 2.8 18.3
104 2.8 18.4
112 2.8 18.3
120 2.9 18.3
128 2.8 18.3
144 2.8 18.4
160 2.8 18.3
176 2.8 18.3
192 3.1 18.3
208 2.8 18.3
224 2.8 18.3
240 2.8 18.4
256 3.0 18.7
288 3.1 18.9
320 3.1 19.0
352 3.1 19.0
384 3.1 19.0
416 3.1 19.0
448 3.1 19.0
480 3.3 21.5
512 3.5 24.3
576 6.5 83.2
640 6.6 85.1
704 6.6 84.8
768 6.5 84.8
832 6.6 84.9
896 6.6 84.6
960 6.7 84.7
1024 6.6 84.9
1536 6.6 85.0
2048 6.6 84.4
2560 6.6 84.7
3072 6.7 83.9
3584 6.6 82.6
4096 6.6 84.7
</pre>
 

phsstpok

Splendid
Yes, I had the memory running synchronously with the bus.

OK, I see what you mean. I kept the CPU at a constant speed (by finagling both FSB and multiplier) while increasing the speed of the memory, and thus lowering the latency. The Latency2 numbers are in CPU clock ticks, not bus clock ticks. I get it!

Edited by phsstpok on 04/04/02 03:29 PM.
 

FatBurger

Illustrious
BTW, interesting that my RAM's relative latency went up when I increased the speed, which is the opposite of what Ray has said.

 

jclw

Distinguished
Remember those numbers are clock ticks.

In terms of actual time:
Stock = 3.8/45 ns
155FSB = 2.7/34 ns

So in real time your latency does go down.

Getting back to the latency increase in terms of clock ticks: this is most likely caused by the asynchronous FSB and memory bus speeds. When they run at the same speed you get a steady flow of data going back and forth, but when one runs at 155MHz and the other at 116.3MHz the timing is a little off and you actually gain latency. This was first seen in the VIA 133 series chipsets, which allowed a 100MHz FSB with a 133MHz memory bus. In many cases people ended up with faster systems running their memory at 100MHz.
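
For anyone who wants to check the conversion, here is a rough sketch. The multiplier of 16 is derived from the 1.6GHz part at its stock 100MHz FSB, the tick counts are the 4096 KB rows of the two tables, and the helper name `to_ns` is made up.

```python
MULTIPLIER = 16  # P4 1.6A: 1600 MHz at the stock 100 MHz FSB


def to_ns(ticks, fsb_mhz, multiplier=MULTIPLIER):
    """Convert CPU ticks to ns for a CPU running at fsb_mhz * multiplier."""
    cpu_mhz = fsb_mhz * multiplier
    return ticks * 1000.0 / cpu_mhz


# 4096 KB rows: (4-byte stride, 64-byte stride) in ticks.
stock_ns = (to_ns(6.0, 100), to_ns(72.4, 100))
oc_ns = (to_ns(6.6, 155), to_ns(84.7, 155))

# Memory bus at the 3/4 ratio when the FSB runs at 155 MHz.
mem_bus_mhz = 155 * 3 / 4

print("Stock:  %.2f / %.2f ns" % stock_ns)
print("155FSB: %.2f / %.2f ns" % oc_ns)
print("Memory bus: %.2f MHz" % mem_bus_mhz)
```

The ticks go up at 155MHz, but each tick is shorter (the CPU runs at 2480MHz instead of 1600MHz), so the time in ns still drops.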

- JW

[edit]
It's interesting to note that the L2 cache of the Northwood processors has considerably less latency than that of their 0.18-micron brothers.
[/edit]
Edited by JCLW on 04/04/02 03:40 PM.
 

Raystonn

Distinguished
Set your external clock at 133MHz and your RDRAM multiplier at 4x and check your latency.

-Raystonn


= The views stated herein are my personal views, and not necessarily the views of my employer. =
 

Raystonn

Distinguished
His latency actually is lower in terms of real time. It is slightly higher in clocks, but he is running at a much higher clock speed, so each clock takes much less time. The reason for the slight penalty is the asynchronous memory-to-FSB transfers. Occasionally the memory must wait for the next FSB cycle since they are out of sync.
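
The waiting described here can be illustrated with a toy model. It is a deliberate simplification and not a description of how the i850 actually hands data between clock domains: it assumes both clocks start aligned, that data becomes ready on a memory-clock edge, and that it can only be passed on at the next FSB-clock edge. The function name `average_wait` is made up.

```python
from fractions import Fraction
from math import ceil


def average_wait(mem_to_fsb_ratio):
    """Average wait, in FSB cycles, between a memory-clock edge and the
    next FSB-clock edge (toy model: clocks start aligned, no other delay)."""
    ratio = Fraction(mem_to_fsb_ratio)
    mem_period = 1 / ratio    # memory clock period, measured in FSB cycles
    cycles = ratio.numerator  # the edges realign after this many memory cycles
    waits = [ceil(k * mem_period) - k * mem_period for k in range(cycles)]
    return sum(waits, Fraction(0)) / cycles


sync_wait = average_wait(Fraction(1, 1))   # memory and FSB at the same clock
async_wait = average_wait(Fraction(3, 4))  # memory at 3/4 of the FSB clock

fsb_period_ns = 1000 / 155  # one FSB cycle at 155 MHz
print("synchronous: %s FSB cycles" % sync_wait)
print("3/4 ratio:   %s FSB cycles (about %.2f ns)" % (async_wait, float(async_wait) * fsb_period_ns))
```

In this model the synchronous case never waits, while the 3/4 ratio waits a third of an FSB cycle on average. The real penalty depends on details the model ignores, so the number should not be expected to match the measurements.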

-Raystonn


 

FatBurger

Illustrious
"Set your external clock at 133MHz and your RDRAM multiplier at 4x and check your latency."

Will do. I didn't think about asynchronous busses, thanks JCLW.

Ray, is the lowered L2 latency only because of the increased size, or did Intel take the opportunity to tweak the cache as well?

 

FatBurger

Illustrious
As a side note, I posted a request for more results over at the HardOCP forums here: http://www.hardforum.com/showthread.php?s=&threadid=368917

 

Raystonn

Distinguished
Where is the lowered L2 latency being shown? I missed those numbers.

-Raystonn

