Performance vs Number of DIMMs

nikkopt

Distinguished
Aug 15, 2011
6
0
18,510
Hey guys, I've been googling for an answer: is it faster to use more DIMMs with less capacity each, or fewer DIMMs with more capacity each?
About 90% of the answers say it's best to use fewer DIMMs, but real-world benchmarks on my PC disagree with the theory.
I have 2x1GB OCZ DDR2 800MHz 4-4-4-14, and I thought of testing the performance in Windows by adding 2 more gigs I borrowed from my sister's PC (Kingston 2x1GB DDR2 667MHz 5-5-5-15). I was astonished by the performance gain, even with the lower frequency and higher latencies.
I did read somewhere that a system will be faster with more DIMMs if the memory controller can fully interleave all the slots (and my motherboard can, according to the manual), but most people don't agree with that.

The difference isn't really THAT big, but the faster results were also obtained at a slower clock speed and with higher latencies.
Am I doing something wrong, or does using more DIMMs really help performance?
Sorry for my English.

PICS:
http://i396.photobucket.com/albums/pp42/nikkopt/667vs800.png
http://i396.photobucket.com/albums/pp42/nikkopt/667vs800-2.png

 
Generally speaking, when it comes to performance, you're going to always hear/read varying opinions. Clearly you've established that your mobo has potential gains when using more capacity.

Timing/latency is arguably the most important spec of RAM. The timings are the number of clock cycles the RAM needs to complete an operation, such as returning data after a read command (the first number, CAS latency). Frequency comes in at a close second, but only because the frequency is relative to your base clock/FSB/reference clock (these terms all refer to the CPU bus).
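To put rough numbers on the timing point, here is a minimal sketch that converts CAS latency from clock cycles to nanoseconds for the two kits in this thread. It assumes the advertised 800 MHz and 667 MHz figures are DDR2 data rates (two transfers per clock), and the function name is just made up for the illustration.

```python
def cas_latency_ns(data_rate_mts: float, cas_cycles: int) -> float:
    """Convert CAS latency from clock cycles to nanoseconds.

    Assumption: DDR memory transfers twice per clock, so the I/O clock
    in MHz is half the advertised data rate in MT/s.
    """
    clock_mhz = data_rate_mts / 2
    # cycles / MHz gives microseconds; multiply by 1000 for nanoseconds
    return cas_cycles / clock_mhz * 1000


ocz = cas_latency_ns(800, 4)       # OCZ DDR2-800, CL4
kingston = cas_latency_ns(667, 5)  # Kingston DDR2-667, CL5
print(f"OCZ: {ocz:.1f} ns, Kingston: {kingston:.1f} ns")
```

So on paper the OCZ kit answers a read noticeably sooner, which is why a faster result with the slower kit installed is surprising at first glance.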

Your current RAM speed is the bus speed multiplied by the RAM multiplier (or ratio). For example, AMD chips have a reference clock of 200 MHz by default. With the 800 MHz RAM, if set to run at 800 MHz in the BIOS, the ratio would be 4:1 (RAM:CPU). So, if the CPU is left at stock (no OC), the RAM would be running at 4 x 200 MHz = 800 MHz. As you can see, the RAM-to-CPU ratio is what really determines the frequency (not necessarily the speed) of your RAM and how well it can communicate with the CPU.
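The multiplier arithmetic can be sketched like this. The 4:1 ratio and the 200 MHz base clock come from the example above; the 10:3 ratio and the 220 MHz overclock are hypothetical values added only to show how the result moves.

```python
from fractions import Fraction


def effective_ram_mhz(base_clock_mhz: int, ratio: Fraction) -> Fraction:
    """Effective (advertised) RAM speed = base clock x RAM:CPU ratio."""
    return base_clock_mhz * ratio


stock = effective_ram_mhz(200, Fraction(4, 1))        # the 4:1 example above
slower = effective_ram_mhz(200, Fraction(10, 3))      # hypothetical ratio for 667 MHz RAM
overclocked = effective_ram_mhz(220, Fraction(4, 1))  # hypothetical base clock overclock

for name, mhz in [("stock", stock), ("slower", slower), ("overclocked", overclocked)]:
    print(f"{name}: {float(mhz):.0f} MHz")
```

Raising the base clock drags the RAM frequency up with it unless the ratio is lowered, which is why the ratio matters more than the number printed on the module.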

In relation to the capacity, think of the mobo as an open freeway (no traffic). Think of the DIMMs as the cars; think of the max supported frequency as the speed limit; and finally, think of the data that needs to be processed as cargo. Now, if you had to transfer 100 3 ft² boxes, you could do this with multiple trips using a Ferrari (in this case the Ferrari would be the 800 MHz DIMM), or you could do the same job with more cars at slower speeds (let's say a Mazda3, so as not to offend sensitive Honda drivers). You see, in this example, although the Mazda3 is clearly slower than the Ferrari (any model), the job was to transport 100 boxes. The problem here was that the Ferrari didn't have enough capacity to do the job alone. I know, this is a crude example in which I don't really reference RAM timing, but it kinda explains why lower frequencies with larger capacities can mean better performance.

In short, larger capacity typically trumps lower latency.
 

nikkopt

Thanks for your reply.
I always thought differently about the way they work. I thought bus width was what determined the amount of data sent at the same time (apart from other things like DDR signaling, etc.), in this case 2 buses, each 64 bits wide. Using an analogy, let's think of the buses as highways with 64 lanes each. One lane can only have one car at a time (2 cars in the case of DDR, but let's use one as an example) and the speed limit is 800 miles per hour. Each lane also has tolls at the beginning and at the end (these are the latency). The cars are electricity (or bits), and the DIMM capacity would be the size of a big parking lot at the end of the highway. Also, once inside the parking lot, the cars cannot all park at the same time (this will depend on the size of the storage cells on the chips): although they arrive together, they have to park in groups (let's call this latency too, and let's say that 32 cars can park at the same time for this example).
In the case of single channel, if the goal was to make 256 cars go to the parking lot and park, they would have to be sent in 4 groups of 64, and each group would have to wait until the previous one had parked (for the tolls to open and for the two groups of 32 cars to park).
In the case of dual channel, they would also go in 4 groups of 64, but 2 groups at the same time. Similar to RAID 0, they would not be parked in the same parking lot but across the 2 DIMMs.
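A toy sketch of that counting, assuming 256 cars (exactly 4 groups of 64) and ignoring the tolls and the parking delay. The function and parameter names are made up for the analogy and do not model a real memory controller.

```python
import math


def rounds_needed(cars: int, lanes_per_highway: int = 64, highways: int = 1) -> int:
    """Number of back-to-back rounds needed to move all cars.

    Each round, every highway carries one full group of cars
    (one car per lane).
    """
    groups = math.ceil(cars / lanes_per_highway)
    return math.ceil(groups / highways)


single_channel = rounds_needed(256, highways=1)  # one 64-lane highway
dual_channel = rounds_needed(256, highways=2)    # two 64-lane highways side by side
print(single_channel, dual_channel)
```

The number of groups stays the same; only the number of rounds is halved, and the size of the parking lot never enters the calculation.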
Thinking like this, it doesn't make sense that adding more DIMMs (adding more parking lots) would make any difference in performance. This is where I think interleaved (or dynamic paging) vs linear (or normal dual channel) makes the difference.
If the motherboard can fully interleave all the slots, you would gain performance because of the way it works. In the same way that dual channel works like RAID 0, interleaving DIMMs of the same bank would spread the stored bits in a similar way. In this case, the paging sequence doesn't necessarily need to be 0-1-2-3.
But I could be totally wrong :)
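Here is a rough sketch of what I mean by linear vs interleaved. The block size, the number of ranks, and both mapping functions are simplifying assumptions for illustration; real memory controllers use their own address mappings.

```python
def linear_rank(block: int, blocks_per_rank: int) -> int:
    """Linear mapping: fill one rank completely before moving to the next."""
    return block // blocks_per_rank


def interleaved_rank(block: int, num_ranks: int) -> int:
    """Interleaved mapping: consecutive blocks rotate across the ranks."""
    return block % num_ranks


blocks = range(8)  # eight consecutive blocks of a sequential access
linear = [linear_rank(b, blocks_per_rank=1024) for b in blocks]
interleaved = [interleaved_rank(b, num_ranks=4) for b in blocks]
print("linear:     ", linear)
print("interleaved:", interleaved)
```

With the linear mapping, a sequential access keeps hitting the same rank, while the interleaved mapping spreads it over all four, so one rank can be preparing its next block while another is transferring.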

Also, before clicking "submit" I googled a bit again and found this:

"Dynamic mode is enabled in all cases by default as long as you have matching pairs or 4 identical DIMMs, only if you are pairing single and double-sided DIMMs with each other, it will fall back into "normal mode" which is the standard 4 bank interleaving without extended page boundaries through combining the page size on the two "linear" Ranks."

I do have 2 DIMMs now and my mobo says interleaved mode, so I don't know...

The same guy who posted the quote above also posted this analogy:

"You have one book and by default you have two pages open (the one on the left and the other one on the right)
If you have two identical books, you can open them at consecutive pages like page 2+3 and 4+5 and put them one above the other, in this case, your reading pattern will be 2-3-4-5 in a "Z" pattern.
You can also rip the two books apart and combine pages 2 (from book #1) with page 3 (from book #2) by glueing page 3 to the bottom of page 2 and then do the same for pages 4 and 5.
In that case, your "page size" increases and you don't have to switch that often. So you are going to be reading pages 2+3 and the 4+5. That's all there is to that "
 
I think you're on the right track here, but your analogy of tolls and parking lots describes the bottleneck at the memory controller. Dual channel was designed to alleviate the bottleneck that occurs in the memory controller when the bus speed of the CPU is faster than the RAM. Dual channel simply doubles the theoretical bandwidth.

Here's a complicated explanation of RAM:

Each RAM module is typically made up of 8 DRAM chips per side (for non-ECC; 9 chips for ECC). Each of these DRAM chips contributes its share of the data bus. Collectively, the 8/9 DRAM chips on each side of the module are called a rank. The data width of a rank on all DDR generations is 64 bits. With dual channel DDR, the combined data path is now 128 bits wide.

Let's clear up your analogy a little.

With DDR, we have a 64-bit data bus. If you translate this into your highway, then sure, we have 64 lanes. If DDR can only accommodate one car per lane, then DDR2 would allow two cars per lane; four cars per lane with DDR3.

Now, if you add a dual channel configuration to your analogy, then the highway becomes 128 lanes wide, because dual channel is supposed to double the bandwidth.
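As a back-of-the-envelope sketch, theoretical peak bandwidth is the data rate times the bus width times the number of channels. This assumes the advertised 800 and 667 figures are data rates in MT/s and a 64-bit bus per channel; the function name is made up for the example, and real-world throughput will be lower than these ceilings.

```python
def peak_bandwidth_mb_s(data_rate_mts: int, channels: int = 1,
                        bus_width_bits: int = 64) -> int:
    """Theoretical peak bandwidth in MB/s (1 MB = 10**6 bytes)."""
    bytes_per_transfer = bus_width_bits // 8
    return data_rate_mts * bytes_per_transfer * channels


single_800 = peak_bandwidth_mb_s(800, channels=1)
dual_800 = peak_bandwidth_mb_s(800, channels=2)
dual_667 = peak_bandwidth_mb_s(667, channels=2)
print(single_800, dual_800, dual_667)
```

Note that capacity does not appear anywhere in this formula, so any gain from adding DIMMs has to come from somewhere else, such as interleaving or simply having more memory available.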

As for the parking lot problem and DIMM capacity, if the cars have to be sorted by groups, wouldn't a larger lot still be more efficient, as opposed to a faster parking system? We eventually come to the same problem that even though the process is fast to fill capacity, capacity is still the issue.


 

nikkopt

Like you said in your first post, opinions vary a lot, and since it's not really my area of expertise I can't agree or disagree.
What I meant by 2 cars in one lane for DDR memory was that it transfers twice per clock cycle, just like DDR2 and DDR3.
Thx for your replies ;)