9x333, 8x375, or 7x428 on a Q6600 - Which is faster?

graysky

Distinguished
Jan 22, 2006
546
0
18,980
What is a better overclock?

Good question. Most people believe that a higher FSB and lower multiplier are better since this maximizes the bandwidth on the FSB. Or is a low bus rate and higher multiplier better? Or is there no difference? I looked at three different settings on my Q6600:

9x333 = 3.0 GHz (DRAM was 667 MHz)
8x375 = 3.0 GHz (DRAM was 750 MHz)
7x428 = 3.0 GHz (DRAM was 856 MHz)

The DRAM:CPU ratio was 1:1 for each test and the voltage and timings were held constant; voltage was 2.25V and timings were 4-4-4-12-4-20-10-10-10-11.
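
For anyone who wants to check the arithmetic behind these three settings, here is a minimal sketch. The function names are made up for this example, and it assumes the 1:1 ratio described above, with DDR2 transferring data twice per memory clock.

```python
def core_clock_mhz(multiplier, fsb_mhz):
    """Core clock is the CPU multiplier times the FSB base clock."""
    return multiplier * fsb_mhz


def ddr2_rate_at_1to1(fsb_mhz):
    """At a 1:1 ratio the DRAM clock equals the FSB clock; DDR2 moves data twice per clock."""
    return 2 * fsb_mhz


for multiplier, fsb in [(9, 333), (8, 375), (7, 428)]:
    print(f"{multiplier}x{fsb}: core {core_clock_mhz(multiplier, fsb)} MHz, "
          f"DRAM DDR2-{ddr2_rate_at_1to1(fsb)}")
```

The 9x333 case comes out a hair under 3000 MHz and DDR2-667 here only because the FSB is entered as a rounded 333 rather than 333.33 MHz.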

After running the same experiments at each of these settings, I concluded that there is no difference for real-world applications. If you use a synthetic benchmark, like Sandra, you will see faster memory reads/writes, etc. with the higher FSB values -- so what. These high FSB settings are great if all you do with your machine is run synthetic benchmarks. But the higher FSB values come at the cost of higher voltages for the board, which equate to higher temps.

I think that FSB bandwidth is simply not the bottleneck in a modern system... at least when starting at 333. Perhaps you would see a difference if starting slower. In other words, a 333 MHz FSB quad pumped to 1333 MHz is more than sufficient for today’s applications; when I increased it to 375 MHz (1500 MHz quad pumped) I saw no real-world change; same result when I pushed it up to 428 MHz (1712 MHz quad pumped). Don’t believe me? Read this thread wherein x264.exe (a video encoder) is used at different FSB and multiplier values. Have a close look at the 3rd table in that thread and note that the FPS (frames per second) numbers are nearly identical for a chip clocked at the same clock rate with different FSB speeds. This was found to be true of C2Q as well as C2D chips.
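
To put rough numbers on the quad-pumped figures above, here is a small sketch of the theoretical peak. It assumes a 64-bit (8-byte) FSB data path, which is not stated in the post, and the function names are illustrative only. These are paper maximums, not measured throughput.

```python
FSB_BUS_BYTES = 8  # assumption: 64-bit FSB data path


def quad_pumped_mts(fsb_mhz):
    """Effective transfer rate: four transfers per FSB clock."""
    return 4 * fsb_mhz


def peak_bandwidth_gbs(fsb_mhz):
    """Theoretical peak in GB/s (10^9 bytes per second)."""
    return quad_pumped_mts(fsb_mhz) * 1e6 * FSB_BUS_BYTES / 1e9


for fsb in (333, 375, 428):
    print(f"{fsb} MHz -> {quad_pumped_mts(fsb)} MT/s, "
          f"~{peak_bandwidth_gbs(fsb):.1f} GB/s peak")
```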

You can do a similar test for yourself with applications you commonly use on your machine. Time them with a stop watch if the application doesn’t report its own benchmarks like x264 does.
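
If a stop watch feels too coarse, a short script can do the timing instead. The following is only a sketch: the helper name `time_command` and the placeholder workload are assumptions for illustration, and you would substitute the command line of the application you actually want to test.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, runs=3):
    """Run cmd `runs` times and return the wall-clock seconds of each run."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return timings


if __name__ == "__main__":
    # Placeholder workload; replace with your own application's command line.
    cmd = [sys.executable, "-c", "sum(range(10**6))"]
    timings = time_command(cmd)
    print(f"mean {statistics.mean(timings):.3f} s, "
          f"stdev {statistics.stdev(timings):.3f} s")
```

Reporting the spread alongside the mean makes it easier to tell whether a difference between two FSB settings is larger than the run-to-run noise.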

Some "Real-World" Application Based Tests

Three different 3.0 GHz settings on a Q6600 system were tested with some apps including: lameenc, super pi, x264, winrar, and the trial version of photoshop. Here are the details:

Test O/C 1: 9x333 = 3.0 GHz
ooey0.gif


Test O/C 2: 8x375 = 3.0 GHz
8x375vv7.gif


Test O/C 3: 7x428 = 3.0 GHz
7x428de3.gif


Result: I could not measure a difference between an FSB of 333 MHz, 375 MHz, or 428 MHz using these application-based, "real-world" benchmarks.

Since 428 MHz is about 28 % faster than 333 MHz, you’d think that if the FSB was indeed the bottleneck, the higher values would have given faster results. I believe that the bottleneck for most apps is the hard drive.
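
The size of each FSB increase is easy to work out, which gives a yardstick for how big a benchmark change would have to be if the FSB were the limiting factor. A minimal sketch (the helper name is made up for this example):

```python
def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100


for fsb in (375, 428):
    print(f"333 -> {fsb} MHz: +{percent_change(333, fsb):.1f} %")
```

The same helper can be applied to two benchmark times to see whether the measured change is anywhere near the FSB change.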

Description of Experiments and Raw Data

Lame version 3.97 – Encoded the same test file (about 60 MB wav) with these commandline options: [code:1:b72aeb1e47]lame -V 2 --vbr-new test.wav[/code:1:b72aeb1e47] (which is equivalent to the old --alt-preset fast standard) a total of 10 times and averaged play/CPU data as the benchmark.

Super Pi version 1.1 – Ran both the 1M and 2M tests and compared the reported total number of seconds to calculate as the benchmark.

x264 version 0.54.620 – Ran a 2-pass encode on the same MPEG-2 (480x480 DVD source) file twice and averaged the FPS1 and FPS2 numbers as the benchmark. In case you’re wondering, here are the commandline options for this encode, pass1: [code:1:b72aeb1e47]x264 --pass 1 --bitrate 1000 --stats "C:\work\test-NEW.stats" --bframes 3 --b-pyramid --direct auto --subme 1 --analyse none --vbv-maxrate 25000 --me dia --merange 12 --threads auto --thread-input --progress --no-psnr --no-ssim --output NUL "C:\work\test-NEW.avs"[/code:1:b72aeb1e47]

And for pass2:[code:1:b72aeb1e47]x264 --pass 2 --bitrate 1000 --stats "C:\work\test-NEW.stats" --ref 3 --bframes 3 --b-pyramid --weightb --direct auto --subme 6 --trellis 1 --analyse all --8x8dct --vbv-maxrate 25000 --me umh --merange 12 --threads auto --thread-input --progress --no-psnr --no-ssim --output "C:\work\test-NEW.264" "C:\work\test-NEW.avs"[/code:1:b72aeb1e47]

The input avisynth script was:[code:1:b72aeb1e47]global MeGUI_darx = 4
global MeGUI_dary = 3
DGDecode_mpeg2source("C:\work\test-new.d2v")
AssumeTFF()
Telecide(guide=1,post=2,vthresh=35) # IVTC
Decimate(quality=3) # remove dup. frames
crop( 2, 0, -10, -4)
Spline36Resize(640,480) # Spline36 (Neutral)[/code:1:b72aeb1e47]

RAR version 2.63 – Had rar run my standard backup batch file, which generated about 0.98 GB of rars (1,896 files in total). Here is the commandline I used: [code:1:b72aeb1e47]rar a -u -m0 -md2048 -v51200 -rv5 -msjpg;mp3;tif;avi;zip;rar;gpg;jpg "e:\Backups\Backup.rar" @list.txt[/code:1:b72aeb1e47] where list.txt is a list of all the dirs I want it to back up. I timed how long it took to complete with a stop watch. I ran the backup twice and averaged it as the benchmark.

Trial of Photoshop CS3 – I used the batch function in PSCS3 to batch bicubic resize 10.1 MP to 0.7 MP (3872x2592 --> 1024x685), then applied an unsharp mask (60 %, 0.8 px radius, threshold 12), and finally saved as quality 8 jpg. In total, 57 jpg files were used in the batch. I timed how long it took to complete two runs, and averaged them together as the benchmark.

Here are the raw data if you care to see them:
datarawuv7.gif
 

Labs23

Distinguished
Feb 17, 2007
48
0
18,530
Definitely, I would choose the one using the 8 multi... The higher the FSB, the higher the frequency, thus resulting in much faster data execution. But those are not that tangible, maybe mostly noticed in benchmark numbers...
 

morerevs

Distinguished
May 19, 2007
373
0
18,780
Hi graysky, nice write-up, but I was wondering: what memory settings did you use? Did you link the memory to the FSB or run it asynchronously, and if so, did you have to alter timings? Also, what were the temps/voltage requirements for either overclock? Were they higher for the 375 FSB, and if so, by how much? These are things worth knowing for people interested in OC-ing their systems; if, as you say, the real-world speeds show no gain, it would be better to go for low voltage/temp. In your opinion, what would be easiest to get completely stable on the Q6600: 8x375 or 9x333?
TY
 

graysky

Definitely, I would choose the one using the 8 multi... The higher the FSB, the higher the frequency, thus resulting in much faster data execution. But those are not that tangible, maybe mostly noticed in benchmark numbers...

Well, if you run a synthetic benchmark like Sandra, you will get faster results with the higher bus, but as I demonstrated, using "real world" apps, there is no difference. I'd challenge you to find an app that isn't just measuring theoretical throughput that does show a difference.
 

graysky

Hi graysky, nice write-up, but I was wondering: what memory settings did you use? Did you link the memory to the FSB or run it asynchronously, and if so, did you have to alter timings? Also, what were the temps/voltage requirements for either overclock? Were they higher for the 375 FSB, and if so, by how much? These are things worth knowing for people interested in OC-ing their systems; if, as you say, the real-world speeds show no gain, it would be better to go for low voltage/temp. In your opinion, what would be easiest to get completely stable on the Q6600: 8x375 or 9x333?
TY

Memory timings were unchanged for each run (4-4-4-12-4-20-10-10-10-11). I ran the DRAM:CPU @ 1:1 for each run. So it was 667 for the first, and 750 for the 2nd.

To answer your last question: my Q6600 is completely stable at either. I have to up the vcore to run the 8x375, as you can see in the screenshot of CPU-Z. I didn't try to minimize it, but it doesn't run at the same level that my 9x333 O/C runs at. Also, I had to up the other voltages (NB, SB, and ICH) to support it as well.

You can see some screenshots of the 9x333 BIOS settings in this thread if you want more.
 

morerevs

Thanks for the info. It's strange though; I would have imagined that running a higher FSB at the same timings would have some positive influence on applications. In theory this should increase memory throughput, right? Or does the CPU multiplier have an influence on that as well? This does seem to support your theory that applications, other than benchmarking programs, are not bandwidth limited at this time.
Are you going to do some more testing with other apps like games and such? It would be interesting to see if there's any profit to be had there.
Also, maybe I overlooked this, but what were the temps at load with either overclock? Did they rise with the higher FSB?
Thanks
 

Labs23

But those are not that tangible, maybe mostly noticed in benchmark numbers...

That's why I have these lines... High FSBs definitely bring an increase in frequency, but those increases are only tangible in benchmark numbers. And I hope future platforms will highlight the effect of a higher FSB.
 

graysky

In the interest of overkill, I just completed the same benchmark @ 7x428 (edited first post in thread). Results are the same: no benefit of an even higher FSB.
 

sadness20

Distinguished
Jun 22, 2007
97
0
18,630
I have some questions about the Q6600 and RAM.

http://www.newegg.com/Product/Product.aspx?Item=N82E16820231065

http://www.newegg.com/Product/Product.aspx?Item=N82E16820145043

I want to overclock my Q6600 CPU to 3.0 GHz. Someone told me that DDR2-1100 (PC2-8800) RAM is going to be better for overclocking the CPU and getting more performance, and that decreasing the DDR2-1100 speed to 800 is going to give me a lot of performance. Is this true?

From my perspective, there is basically no advantage to the DDR2-1100 (PC2-8800) model. Am I right or wrong? You can't get better performance by running 1100 RAM at 800, because you would be better off using a regular DDR2-800 (PC2-6400) model run at 667 = 333 MHz FSB. I believe there is no point in increasing the FSB to 400 MHz or higher, because you won't get any better performance on the Q6600.
 

evilr00t

Distinguished
Aug 15, 2006
882
0
18,980
@sadness - I answered your questions in your other thread.

I don't want to sound picky but...

a) There are versions of SuperPI that report down to millisecond precision - use them!
http://www.xtremesystems.com/pi/

b) RARing lots of small files can be disk bound. Can you try RARing a single large, defragmented file? There is also a built-in benchmark in WinRAR; according to other benchmarks, WinRAR is extremely memory sensitive and I was not expecting no change in results with different FSB/MEM frequencies.

c) To get CPU use you can use something like Sysinternals Process Explorer which counts how much CPU time was spent per process.
The CPU use, which will be millisecond-precise, will be in a column.

d) Run the benchmark processes in High or Realtime priority so that background processes can't steal CPU cycles.
WARNING: On single core machines with singlethreaded code, Realtime priority can make your system appear to lock up; this may or may not happen with multithreaded code on multicore systems.
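
Suggestion d) can also be scripted rather than set by hand in Task Manager. The sketch below is only an illustration under stated assumptions: it requests High (not Realtime) priority through Python's `subprocess.HIGH_PRIORITY_CLASS` flag, which exists on Windows only, and the helper names and placeholder command are made up for this example.

```python
import subprocess
import sys


def high_priority_kwargs():
    """Extra subprocess arguments that request High priority where Python exposes it."""
    if sys.platform == "win32":
        # HIGH_PRIORITY_CLASS is provided by the subprocess module on Windows (Python 3.7+).
        return {"creationflags": subprocess.HIGH_PRIORITY_CLASS}
    # Elsewhere, raising priority usually needs elevated rights, so this sketch leaves it alone.
    return {}


def run_benchmark(cmd):
    """Run cmd to completion (at High priority on Windows) and return its exit code."""
    return subprocess.run(cmd, **high_priority_kwargs()).returncode


if __name__ == "__main__":
    # Placeholder command; substitute the real benchmark command line.
    print(run_benchmark([sys.executable, "-c", "print('benchmark placeholder')"]))
```

High priority is used instead of Realtime here because of the lock-up risk described in the warning above.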
 

graysky

a) Yeah, I learned about them after I did the testing and didn't wanna go back. I stand behind the data. Remember 333 --> 375 is about 13 % and 333 --> 428 is about 29 %. If they differ by a fraction of a second, to me that's within error and certainly not near the same magnitude as the FSB increase.

b) Agreed about the internal benchmark. Wish I knew about it when I tried the test. Also agreed that it can be disk bound and many small files should slow it down. That said, it's a pretty real-world measure in my opinion.

c) True, but I think that real-world benchmarks are the right data to look at here since most of us literally sit in our chair while the thing chugs away.

d) Yeah, if repeated this is a good suggestion.
 

Hatman

Distinguished
Aug 8, 2004
2,024
0
19,780
When I get mine, I'm going to try for a 400 bus and 8 multi; if it doesn't work, I'll go with a 350 bus and 9 multi.

Gotta keep that RAM 1:1 :D Higher FSB also gives a teeny-weeny bit of extra performance.
 

Twinsen

Distinguished
Jul 11, 2007
14
0
18,510
Very good article, I've just been debating this with some mates.

Though, on an unlocked chip, the general motto is to OC by small increments and then test and repeat until the norm is found.

Now I pose this question: an increment of 1 to the multiplier on a default 266 FSB, say on the QX6700, adds another 266 MHz to it. Is this too large a step to jump into right away?

I was planning on taking the FSB to circa 300 and then taking the multiplier up one, thus giving an overall clock of somewhere between 3.0 and 3.6 GHz.

So, which one would be better to change first to get the most stable speed: increase the multiplier first, then start hacking away at the FSB by 5 MHz increments?
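
To compare the size of the two kinds of step, here is a small sketch. It assumes stock QX6700 settings of 10 x 266 MHz, and the function name is made up for this example.

```python
def clock_step_mhz(multiplier, fsb_mhz, d_multiplier=0, d_fsb_mhz=0):
    """Change in core clock when the multiplier and/or the FSB are bumped."""
    return (multiplier + d_multiplier) * (fsb_mhz + d_fsb_mhz) - multiplier * fsb_mhz


# Assumed stock QX6700 settings: 10 x 266 MHz.
print(clock_step_mhz(10, 266, d_multiplier=1))  # one multiplier step
print(clock_step_mhz(10, 266, d_fsb_mhz=5))     # one 5 MHz FSB step
```

At these settings, one multiplier step moves the core clock about as far as five 5 MHz FSB steps.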
 

graysky

I just read the FSB1333 Intel Processors & New 2007 CPU Charts article over at TH.com and am happy to see that the testers over there have drawn the same conclusion that I have about fixed final core speeds with higher and higher FSB speeds: faster FSB speeds w/ a C2Q/C2D don't equate to faster real-world benchmarks.

Have a look at page 8 from their article comparing the "old" 1066 MHz FSB to the "new" 1333 MHz FSB chips: average gain <1 %.
 

Hatman

I imagine for the Core 2 Quads the increased FSB speeds will increase the connectivity speed between the 2x dual cores on there, which may improve certain things not listed in those benchmarks.

May be wrong, of course.
 

graysky

@hatman: it may, but I don't think that connection is the bottleneck for performance at all... although AMD fanboys out there are always quick to point out this perceived "weakness" of C2D/C2Q (I'm not referring to you by the way)!
 

leadbottom

Distinguished
Aug 1, 2007
13
0
18,510
graysky

I have a question regarding the temp on the CPU with the different multipliers. Was it cooler running at 8 than at 9? And if it was, wouldn't it be better to have a faster FSB and use better cooling on that, better preserving the CPU?
 

graysky

@leadbottom: I didn't compare the temps in any documented fashion, but as I recall there weren't any differences. I will say that the NB temp was significantly higher @428 than @333.