Why is the scaling of my Core i7 920 so poor, even worse than the Q6600?

jsmith001

Distinguished
Nov 21, 2008
2
0
18,510
Update:

Some of you guys are pathetic. I don't give a damn who makes the CPU. In fact, I have built three Intel-CPU computers in the last two years. I posted my test results in some forums to share my info, hoping to find out why the scaling of my Core i7 computer is so poor. I really had high expectations for Core i7.

OK. I redid my tests on Windows XP, with and without HT. I used compiler flags similar to those used for SPEC 2006.
I also overclocked the Core i7 from 2.66G to 3.5G. Below are the setup and comparison.

Core i7 computer
CPU: core i7 920 at 2.66G
MB: ASUS P6T
memory: 6G DDR3 1600
OS: Fedora 10 64bit
Compiler: Intel Fortran Compiler Professional 11.0.069
Compiler flags: -r8 -xSSE4.2 -ip -openmp -opt-prefetch

OS: Windows XP 64bit
Compiler: Intel Visual Fortran Compiler Professional 11.0.066
Compiler flags: /Qautodouble /QxSSE4.2 /Qip /Qopenmp /Qopt-prefetch /F1000000000

Core 2 Quad computer
CPU: Core 2 Q6600 at 2.4G
MB: gigabyte p35
memory: 4G DDR2 800
OS: fedora 9 64bit
Compiler: Intel Fortran Compiler Professional 11.0.069
Compiler flags: -r8 -ip -openmp

Dual Xeon computer
CPU: dual Xeon E5410 at 2.33G
MB: Tyan S5396A2NRF
memory: 8G DDR2 667 ECC Fully Buffered
OS: fedora 9 64bit
Compiler: Intel Fortran Compiler Professional 11.0.069
Compiler flags: -r8 -ip -openmp

The test code is an OpenMP CFD (computational fluid dynamics) code. It uses about 500MB of memory.

[Attached image: file.php]


*For Core i7, Turbo mode is OFF.

For four threads, Core i7 920@3.55G is slower than Q6600@2.4G!
The scaling of Xeon E5410 is the best.



---------------------------------------------------------------------------------------------------------------------------------------
I read a lot of reviews and forum threads about the new Intel Core i7 CPU. The new Intel CPU looks great. For example, according to SPEC CFP2006,
http://www.amdzone.com/phpbb3/viewtopic.php?f=52&t=135802, Core i7 is about 100% faster than Core 2 in floating-point performance. Although someone suggested that Intel might have optimized the CPU for the benchmark codes, I still decided to upgrade my computer from Core 2 (Q6600) to Core i7 (920), because I thought the better bandwidth of Core i7 would help multithreaded computation. Well, I was wrong. My test results left me very disappointed with the new Core i7 CPU.

My new computer
CPU: core i7 920 at 2.66G
MB: asus p6t
memory: 6G DDR3 1600
OS: fedora 10 64bit

My old computer
CPU: core 2 Q6600 at 2.4G
MB: gigabyte p35
memory: 4G DDR2 800
OS: fedora 9 64bit

compiler: Intel ifort 11

The test code is an OpenMP CFD (computational fluid dynamics) code. It uses about 500MB of memory. The test results:

1 thread , 202.7s (core i7 920) , 213.1s (q6600), 5.13%(core i7 advantage)
2 threads, 109.0s (core i7 920) , 109.3s (q6600), 0.27%(core i7 advantage)
4 threads, 96.1s (core i7 920) , 68.3s (q6600), -28.93%(core i7 advantage)

(Core i7 HT is off; it is slower with HT on. The clock of the Core i7 920 is about 10% higher than that of the Core 2 Q6600.)

The scaling of Core i7 for 4 threads is really bad. Considering that the bandwidth of the Core i7 920 is about twice that of the Core 2 Q6600, this result is really strange. I don't know why; maybe the OS isn't using the full potential of the new CPU?
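To put numbers on "scaling", the timings above can be turned into speedup and parallel efficiency. The sketch below is only an illustration: the times are copied from the results listed above, while the function names and the use of the Karp-Flatt metric (an experimentally determined serial fraction) are additions, not part of the original test.

```python
# Wall-clock times in seconds, copied from the results listed above.
times = {
    "core i7 920": {1: 202.7, 2: 109.0, 4: 96.1},
    "q6600": {1: 213.1, 2: 109.3, 4: 68.3},
}


def speedup(t1, tp):
    """Speedup of a p-thread run relative to the 1-thread run."""
    return t1 / tp


def efficiency(t1, tp, p):
    """Parallel efficiency: speedup divided by thread count (1.0 is ideal)."""
    return speedup(t1, tp) / p


def karp_flatt(t1, tp, p):
    """Karp-Flatt metric: apparent serial fraction for p > 1 threads."""
    s = speedup(t1, tp)
    return (1.0 / s - 1.0 / p) / (1.0 - 1.0 / p)


def scaling_table(data):
    """Return (cpu, threads, speedup, efficiency, serial fraction) rows."""
    rows = []
    for cpu, runs in data.items():
        t1 = runs[1]
        for p in sorted(runs):
            if p == 1:
                continue  # the 1-thread run is the baseline
            tp = runs[p]
            rows.append((cpu, p, speedup(t1, tp),
                         efficiency(t1, tp, p), karp_flatt(t1, tp, p)))
    return rows


if __name__ == "__main__":
    for cpu, p, s, e, f in scaling_table(times):
        print(f"{cpu:12s} {p} threads: speedup {s:.2f}, "
              f"efficiency {e:.2f}, serial fraction {f:.2f}")
```

On these numbers the Core i7 reaches a speedup of roughly 2.1 at 4 threads (efficiency about 0.53), against roughly 3.1 (about 0.78) for the Q6600. The apparent serial fraction on the Core i7 also grows sharply from 2 to 4 threads, which usually points to overhead or contention rather than a fixed serial part of the code, though these timings alone cannot say which.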
 

WR

Distinguished
Jul 18, 2006
603
0
18,980
Or it could be the processor overheating; did you check temperatures for the 4-core run?

Turbo mode would not explain a 25%+ discrepancy.

I am curious what 2-core and 4-core run times would be if you took out the last stick of 2G RAM, to force the i7 into dual-channel. It could be that the CFD code you have does not need the bandwidth.

The AMDZone benchmarks you linked to are of bandwidth, not core computation. They are the same benchmarks AMD used to advertise when K10 had a weak core but strong bandwidth. That worked well for servers, but flopped for DT/workstation.
 

kg4icg

Distinguished
Mar 29, 2006
506
0
19,010
He asked the same question on the Anand forum and got the same answer: it's because of the two different versions of Fedora. By the way, he is an AMDZoner. He's been forum hopping.
 

werxen

Distinguished
Sep 26, 2008
1,331
0
19,310



That's exactly what it is. I ran Ubuntu and it was noticeably slower than Windows with my E8500.


But I'll give Fedora credit and just say it's still a silly quad core in a single-threaded world :kaola:
 

yipsl

Distinguished
Jul 8, 2006
1,666
0
19,780
Well, from what we can see in benchmarks, i7 isn't as good as Core 2 quad overall. It's good in some apps only. It's a stepping stone and the second version of the architecture will probably get things right.

It's like how B3 Phenom fixed the B2 errata issues and added a bit more clock speed, and how a die shrink with Deneb will improve things even further. It's not just AMD fanboys; it's the nature of a new architecture. Core 2 wasn't as new as people thought: it built upon the Core architecture found in notebooks. It was highly successful because Netburst was so lame in its last days.

Anyone who expected i7 to be to Core 2 as Core 2 was to Netburst was in for a disappointment. There's nothing wrong with i7 that the next modification won't fix. As is, I don't see it as a good upgrade for anyone with a Q6600 or better Intel quad.
 


What benchies are you talking about? All the ones I've seen show an improvement over Penryn in most areas, esp. encoding and gaming where the system is not GPU bound.
 

roofus

Distinguished
Jul 4, 2008
1,392
0
19,290
considering guys that went from the Qx series are seeing gains, that makes this guy look even less genuine. either he is a troll or he really screwed up something when he built his system.
 
Plus the fact that he turned simultaneous multithreading off, despite the fact that his app is multithreaded. Something is seriously wrong if that degrades performance, because that's one of i7's biggest advantages over core 2.
 


That's the problem with open-source software, really. I mean, it's great because it's free and normally much more stable. But the drivers needed to run everything at its best normally take much longer to arrive, because the projects don't have giant buildings' worth of people working on them or the update support that Windows gets.
 

spud

Distinguished
Feb 17, 2001
3,406
0
20,780


So XP uses 4 cores now? When did MS add that?

Word, Playa.
 

WR

Distinguished
Jul 18, 2006
603
0
18,980
Can you use the same non-SSE 4.2 non-prefetch binary code to bench both the Q6600 and i7 920? As well, could you show a 4-thread i7 run in Task Manager with the cores on individual graphs? And as I mentioned before, a CoreTemp shot somewhere around the middle of the run, to rule out load-induced throttling.

As usual I would like to rule out compiler optimizations that turn into speed bumps for specific source code, bottlenecks outside the CPU, and heat-induced throttling or hardware misconfiguration.
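One way to run that kind of like-for-like comparison is to time the same binary at each thread count from a small script. The sketch below is only an illustration under assumptions: `OMP_NUM_THREADS` is the standard OpenMP environment variable for the thread count, but the helper name `time_runs` and the `./cfd_solver` path are hypothetical, and the timing covers the whole process including start-up.

```python
import os
import subprocess
import sys
import time


def time_runs(cmd, thread_counts=(1, 2, 4)):
    """Run cmd once per thread count and return {threads: wall-clock seconds}."""
    results = {}
    for n in thread_counts:
        env = dict(os.environ)
        # OMP_NUM_THREADS is read by the OpenMP runtime at program start.
        env["OMP_NUM_THREADS"] = str(n)
        start = time.perf_counter()
        subprocess.run(cmd, env=env, check=True)  # raises if the run fails
        results[n] = time.perf_counter() - start
    return results


if __name__ == "__main__":
    # Hypothetical usage: python bench.py ./cfd_solver
    if len(sys.argv) > 1:
        for threads, seconds in time_runs(sys.argv[1:]).items():
            print(f"{threads} thread(s): {seconds:.1f}s")
```

Copying the same executable to both machines and running it this way keeps compiler flags out of the comparison, so any remaining gap comes from the hardware or the OS.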
 


It has for quite a while. It just doesn't allocate programs across the cores as well as Vista does. It leaves that up to the program.

But Vista tries to allocate all processes across all the available cores to keep the load per core lighter and get the work done faster.

XP, if I remember correctly, will load the first core with processes until it is at about 100%, then start loading the next core, and so on.
 

spud

Distinguished
Feb 17, 2001
3,406
0
20,780


You are correct; I got the licensing mixed up. My bad.

Word, Playa.