DDR3-1333 Speed and Latency Shootout

Test Settings: Lowest Stable Latencies

Because of the previously mentioned "Boot Strap" limitations, we had to select different FSB speeds to test DDR3-1333 and DDR3-1600 data rates. But how could we do that without throwing the rest of our speeds off?

Lacking the 5:3 DRAM-to-FSB clock ratio required to test DDR3-1333 with an FSB-1600 processor, we instead had to compare DDR3-1333 to DDR3-1066 using FSB-1333, and DDR3-1600 to DDR3-1066 using FSB-1600.
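The ratio arithmetic behind this choice can be checked with a short sketch. It assumes a quad-pumped FSB (FSB-1600 is a 400 MHz clock) and two DDR3 transfers per memory clock. The function name and the denominator limit of 10 are illustrative choices, not part of the test setup.

```python
from fractions import Fraction

def dram_to_fsb_ratio(ddr_rate, fsb_rate):
    """Return the DRAM:FSB clock ratio as a small fraction.

    DRAM clock = ddr_rate / 2 and FSB clock = fsb_rate / 4,
    so the ratio is 2 * ddr_rate / fsb_rate. Marketing numbers such as
    1333 and 1066 are rounded, so snap to the nearest simple fraction.
    """
    return Fraction(2 * ddr_rate, fsb_rate).limit_denominator(10)

# (FSB rating, DDR3 data rate) pairs discussed in the text
for fsb, ddr in [(1600, 1333), (1333, 1333), (1333, 1066),
                 (1600, 1600), (1600, 1066)]:
    r = dram_to_fsb_ratio(ddr, fsb)
    print(f"DDR3-{ddr} on FSB-{fsb}: {r.numerator}:{r.denominator}")
```

Under these assumptions, DDR3-1333 on FSB-1600 is the only pairing in the list that needs 5:3.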

Only two CPU speeds correspond to both FSB-1333 and FSB-1600: 2.00 GHz and 4.00 GHz. Since this started out as an overclocking article, the 4.00 GHz speed was selected. The CPU settings needed to reach 4.00 GHz at FSB-1333 and FSB-1600 are 12 x 333 MHz and 10 x 400 MHz, respectively.
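A quick way to see where the two shared speeds come from is to enumerate multipliers on both bus clocks. This sketch assumes whole-number multipliers only and a hypothetical multiplier ceiling of 14. The function name is illustrative, and the bus clocks are written as exact fractions (1000/3 MHz and 400 MHz) rather than the rounded 333 MHz figure.

```python
from fractions import Fraction

def shared_cpu_speeds(clock_a, clock_b, max_multiplier=14):
    """List CPU speeds reachable on both bus clocks with integer multipliers.

    Returns tuples of (speed in GHz, multiplier on clock_a, multiplier on clock_b).
    """
    # Map every speed reachable on the first bus clock to its multiplier
    speeds_a = {m * clock_a: m for m in range(1, max_multiplier + 1)}
    shared = []
    for m_b in range(1, max_multiplier + 1):
        speed = m_b * clock_b
        if speed in speeds_a:
            shared.append((float(speed) / 1000, speeds_a[speed], m_b))
    return shared

fsb_1333_clock = Fraction(1000, 3)  # FSB-1333 is a quad-pumped 333.33 MHz clock
fsb_1600_clock = Fraction(400)      # FSB-1600 is a quad-pumped 400 MHz clock

for ghz, mult_a, mult_b in shared_cpu_speeds(fsb_1333_clock, fsb_1600_clock):
    print(f"{ghz:.2f} GHz = {mult_a} x 333 MHz = {mult_b} x 400 MHz")
```

Allowing half-step multipliers or a higher ceiling would add further matches, so the result depends on those assumptions.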

Latency Test System Hardware
Motherboard: Asus Maximus Extreme Rev. 2.01G, Intel X38, BIOS 0501 (10/30/2007)
Socket 775 Processor: Intel Core 2 Extreme QX9770 "Yorkfield" (FSB-1600, 45 nm, 3.20 GHz, 12 MB L2 Cache)
Hard Drive: Western Digital WD1500ADFD-00NLR1, Firmware: 20.07P20, 150 GB, 10,000 RPM, 16 MB cache, SATA/150
Graphics Card: Foxconn GeForce 8800GTX, P/N: FV-N88XMAD2-OD, NVIDIA GeForce 8800GTX - 768 MB
Power Supply: OCZ GameXStream OCZ700GXSSLI - 700W

System Software & Drivers
OS: Windows XP Professional 5.10.2600, Service Pack 2
DirectX Version: 9.0c (4.09.0000.0904)
Platform Drivers: Intel INF
Graphics Driver: NVIDIA Forceware 163.75

Since the Asus Maximus Extreme proved instrumental in diagnosing the boot strap issue, it was retained for memory latency testing.

Four-core processors use memory a little more effectively than dual cores, and our highest latency-test speed of DDR3-1600 matches the highest memory ratio afforded to FSB-1600 processors. We used the only FSB-1600 processor available, Intel's Yorkfield-based Core 2 Extreme QX9770.

Game benchmarks are significantly limited by graphics performance, so we included a powerful GeForce 8800GTX from Foxconn.

Thomas Soderstrom
Thomas Soderstrom is a Senior Staff Editor at Tom's Hardware US. He tests and reviews cases, cooling, memory and motherboards.
  • dv8silencer
    I have a question: on your page 3 where you discuss the memory myth you do some calculations:

    "Because cycle time is the inverse of clock speed (1/2 of DDR data rates), the DDR-333 reference clock cycled every six nanoseconds, DDR2-667 every three nanoseconds and DDR3-1333 every 1.5 nanoseconds. Latency is measured in clock cycles, and two 6ns cycles occur in the same time as four 3ns cycles or eight 1.5ns cycles. If you still have your doubts, do the math!"

    Based on the cycle-based latencies of DDR-333 (CAS 2), DDR2-667 (CAS 4), and DDR3-1333 (CAS 8), and their frequencies, you come to the conclusion that each of the memory types will retrieve data in the same amount of time. The higher CAS values are offset by the higher frequencies of the newer technologies, so even though DDR2 and DDR3 take more cycles, they also go through more cycles per unit of time than DDR. How is it, then, that DDR2 and DDR3 are "better" and provide more bandwidth if they provide data in the same amount of time? I do not know much about the technical details of how RAM works, and I have always had this question in mind.
  • Latency = How fast you can get to the "goodies"
    Bandwidth = Rate at which you can get the "goodies"
  • So, I have OCZ memory I can run stable at
    7-7-6-24-2T at 1333 MHz or
    9-9-9-24-2T at 1600 MHz
    This is with the FSB at 1600 MHz, unlinked. Is there a method to calculate the best setting without running hours of benchmarks?
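A rough first estimate for a comparison like this can come from arithmetic alone. The sketch below converts CAS latency to nanoseconds and computes theoretical peak bandwidth per channel. It assumes a 64-bit (8-byte) memory channel and looks only at the first timing value, so it is a starting point rather than a substitute for benchmarks. All function names are illustrative.

```python
def cycle_time_ns(data_rate):
    """Clock period in ns; the memory clock is half the DDR data rate (MT/s)."""
    return 2000.0 / data_rate

def cas_latency_ns(data_rate, cas_cycles):
    """CAS latency converted from clock cycles to nanoseconds."""
    return cas_cycles * cycle_time_ns(data_rate)

def peak_bandwidth_gbs(data_rate, bus_bytes=8):
    """Theoretical peak bandwidth in GB/s for one channel (assumed 8 bytes wide)."""
    return data_rate * bus_bytes / 1000.0

# The two settings from the comment: (data rate, CAS cycles)
for rate, cas in [(1333, 7), (1600, 9)]:
    print(f"DDR3-{rate} CAS {cas}: "
          f"{cas_latency_ns(rate, cas):.2f} ns, "
          f"{peak_bandwidth_gbs(rate):.1f} GB/s peak")
```

By this estimate the 1333 MHz setting has the slightly shorter CAS delay and the 1600 MHz setting has the higher peak bandwidth; which one wins in practice depends on the workload.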
  • Sorry dude, but you are underestimating the ReaperX modules.
    I would like to see what temperatures the other modules reached at
    a voltage of ~2.1 V. That does not mean the Platinum series is a poor performer, but I saw a ReaperX that easily reached 940 MHz at 1.9 V (EVP), which is nearly DDR 1900, and that is something. When it comes to stability and temperature over hours of operation, ReaperX beats them all.
  • All SDRAM (including DDR variants) works more or less the same way: it is divided into banks, banks are divided into rows, and rows contain the data (as columns).
    First you issue a command to open a row (this is your latency), then within that row you can access any data you want at a rate of one word per cycle, with latency depending on pipelining.

    So, for instance, if you want to read 1 word at address 0, it will take your CAS latency + 1 cycle.

    And if you want to read 8 words starting at address 0, it will take your CAS latency + 8 cycles.

    Since CPUs like to fill their cache lines with the next data that will probably be accessed they always read more than what you wanted anyway, so the extra throughput provided by higher clock speed helps.

    But if the CPU stalls waiting for RAM it is the latency that matters.
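The model described in this comment can be sketched in a few lines. The code follows the commenter's simplification of one word per clock cycle (real DDR memory transfers two words per clock) and uses the three CAS examples quoted from the article. The function name is illustrative.

```python
def read_time_ns(data_rate, cas_cycles, words):
    """Simplified model from the comment: CAS latency plus one cycle per word."""
    cycle_ns = 2000.0 / data_rate  # memory clock is half the DDR data rate
    return (cas_cycles + words) * cycle_ns

# (name, data rate in MT/s, CAS latency in cycles)
modules = [("DDR-333", 333, 2), ("DDR2-667", 667, 4), ("DDR3-1333", 1333, 8)]

for name, rate, cas in modules:
    wait = read_time_ns(rate, cas, 0)   # time before the first word arrives
    burst = read_time_ns(rate, cas, 8)  # time to finish an 8-word read
    print(f"{name}: wait {wait:.1f} ns, 8-word read done after {burst:.1f} ns")
```

In this model the initial wait is about the same for all three generations, while the 8-word read finishes much sooner on the faster memory, which is the difference between latency and bandwidth.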
  • What does the "s" in PC3-10600s mean?