
An analysis of the performance impact of HT speed

January 13, 2007 9:53:45 AM

I decided to post this in a separate thread so as not to go too far off topic in the thread where we have been discussing this issue.

First of all, I will explain my system setup and my analysis method. My system is an S939 X2-based system. I reset the bus speed to 200MHz for this experiment, which brought my processor back to its stock speed of 2.2GHz. Since we are looking at the performance of the HT bus, it's also important to point out the devices that are accessed over this bus:

Radeon X1950 Pro 256MB, PCIe 16x
1xSATA300 HD
2xSATA150 HDs in RAID 1
PCI Wireless Card

I decided to use two applications to run these tests. Sisoft Sandra was chosen because it gives a detailed analysis of individual components, and is often much more sensitive to changes than real-world benchmarks. Its weakness, however, is that it tests components in isolation, making it difficult to saturate the HT bus. Because of this, I also used 3DMark06. This seemed like a good choice because it stresses multiple components simultaneously, and because gamers are the most likely to be affected by lower HT speeds (the graphics card is the most bandwidth-heavy device using the bus).

I started out with the default multiplier of 5x, giving an HT speed of 1000MHz. I then dropped down to 2x and 1x, keeping all other settings constant.
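As a quick sanity check on the settings above, here is a minimal sketch of the HT speed at each multiplier, assuming the HT speed is simply the multiplier times the 200MHz bus speed (the function name is just illustrative):

```python
BASE_CLOCK_MHZ = 200  # bus speed used for this experiment

def ht_speed_mhz(multiplier, base_clock_mhz=BASE_CLOCK_MHZ):
    """HT speed in MHz, assuming speed = multiplier x bus speed."""
    return multiplier * base_clock_mhz

default_speed = ht_speed_mhz(5)
for mult in (5, 2, 1):
    speed = ht_speed_mhz(mult)
    # Show each setting as a fraction of the default 5x speed.
    print(f"{mult}x: {speed} MHz ({speed / default_speed:.0%} of default)")
```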

My results for the Sisoft Sandra benchmarks I ran showed no statistically significant drops in performance with lower HT speeds. This is perhaps understandable because it tests components in isolation. As such, I will not show these results here. The 3DMark06 results were more interesting:

5x: 4747
2x: 4693 (-1.1% vs. 5x)
1x: 4462 (-6.0% vs. 5x)
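For anyone who wants to check the percentages, a small sketch that recomputes the change of each score relative to the 5x score. The scores are the ones listed above; the helper name is only for illustration:

```python
# 3DMark06 scores from the runs above, keyed by HT multiplier.
scores = {"5x": 4747, "2x": 4693, "1x": 4462}

def percent_change(score, baseline):
    """Relative change of score against baseline, in percent."""
    return (score - baseline) / baseline * 100

baseline = scores["5x"]
changes = {label: percent_change(score, baseline) for label, score in scores.items()}

for label, score in scores.items():
    print(f"{label}: {score} ({changes[label]:+.1f}% vs. 5x)")
```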

What this shows is that, on my system, you have to reduce the HT bus to 40% of its normal speed before any performance loss is observed. Lowering it below this results in a significant loss in performance.

If we therefore assume that my system is utilising approximately 40% of its available bandwidth, then in the worst case an equivalent system with a K8 quad core instead would utilise 80% of the available bandwidth. Since this still leaves 20% headroom, I would have to be running a quad core CPU at above 2.64GHz (2.2 x 1.2) on my system, in the worst case, before I saw any significant loss in performance.
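The estimate can be written out as a short sketch. The first function follows the arithmetic in this post (20% headroom taken as a 1.2x clock increase). The second is an alternative assumption, not something measured here: if bus traffic scaled in direct proportion to clock speed, the bus would fill at the stock clock divided by the utilisation, which gives a slightly higher figure.

```python
STOCK_CLOCK_GHZ = 2.2
DUAL_CORE_UTILISATION = 0.40  # assumed from the 2x multiplier result

def quad_core_utilisation(dual_core_utilisation):
    """Worst case: twice the cores generate twice the bus traffic."""
    return dual_core_utilisation * 2

def clock_limit_additive(clock_ghz, utilisation):
    """The post's estimate: raise the clock by the remaining headroom."""
    headroom = 1 - utilisation
    return clock_ghz * (1 + headroom)

def clock_limit_proportional(clock_ghz, utilisation):
    """Alternative assumption: traffic scales linearly with clock speed."""
    return clock_ghz / utilisation

quad = quad_core_utilisation(DUAL_CORE_UTILISATION)
print(f"Quad core utilisation (worst case): {quad:.0%}")
print(f"Additive estimate: {clock_limit_additive(STOCK_CLOCK_GHZ, quad):.2f} GHz")
print(f"Proportional estimate: {clock_limit_proportional(STOCK_CLOCK_GHZ, quad):.2f} GHz")
```

Either way the limit lands in the same region as the clock speeds discussed here, so the conclusion does not hinge on which assumption is used.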

There are some caveats with these results when translating the performance across to K8L. Firstly, it is possible that the use of DDR2 memory would affect these results. I think this is unlikely, however, since memory access is independent of the HT bus on single-socket systems. More significantly, more demanding graphics cards than mine would place more strain on the PCIe bus. Therefore, a Crossfire or SLI system may well saturate the bus faster.

What this does show is that verndewd may indeed have a point that K8L quad cores may suffer a substantial performance hit on AM2. On my system, given that I would start to see performance loss on a 2.64GHz quad core, it is reasonable to assume that a 2.9GHz K8L would also show a loss, even if K8L has similar I/O characteristics to K8. This loss would be greater with more intensive graphics setups than mine.

In light of this evidence, I now agree with verndewd that quad core K8L may well have some significant performance limitations on AM2, in games. Further tests would have to be run to see if this also holds for other applications. I would hypothesise however, that dual core K8L based systems will not suffer the same problem.
January 13, 2007 10:13:31 AM

Interesting analysis, thanks for taking the time to do the testing. 8)

It pretty much confirmed my estimation that it would require a QC K8L to saturate the current HT 1.0 standard.

What we don't know is how the higher IPC on K8L will affect the bandwidth requirements compared to K8, but it does seem HT 3.0 will be required for QC K8L to perform to its full potential.
January 13, 2007 10:55:05 AM

Quote:
Interesting analysis, thanks for taking the time to do the testing. 8)

It pretty much confirmed my estimation that it would require a QC K8L to saturate the current HT 1.0 standard.

What we don't know is how the higher IPC on K8L will affect the bandwidth requirements compared to K8, but it does seem HT 3.0 will be required for QC K8L to perform to its full potential.


So that's why AMD will move to HTT 3.0 with K8L :wink:
But for dual-cores, the HTT 1.0 will not be the bottleneck :wink:
January 13, 2007 1:16:10 PM

Let me get this analysis straight:
- K8L quad will be bottlenecked by ht1.0
- But Core 2 Quad is NOT bottlenecked by its lesser 1066 FSB

If this is correct, then K8L might be much better than the general consensus around here.... We will have a new champion....
January 13, 2007 1:29:14 PM

Quote:
Let me get this analysis straight:
- K8L quad will be bottlenecked by ht1.0
- But Core 2 Quad is NOT bottlenecked by its lesser 1066 FSB

If this is correct, then K8L might be much better than the general consensus around here.... We will have a new champion....


K8L quad bottlenecking on HT 1.0 is just speculation at this point. I do think it will be slightly bottlenecked, but I repeat: that is SPECULATION only.

C2Q is slightly bottlenecked by the 1066MHz FSB IMO. 1333MHz would've been better, but not significantly.

I know what you're alluding to here, but these are vastly different architectures and you really can't 'estimate' performance by looking at the bandwidth requirements of the two.

FWIW, AMD had better hope K8L crushes the current QX6700, because it will be competing against 45nm quads at ~3.5GHz next year, not a 2.66GHz one. :wink:
January 13, 2007 1:31:55 PM

I'm only extrapolating based upon what I've observed on my own machine. Bear in mind that my calculations assumed the worst case - that is, twice as many cores will result in twice as many I/O operations. In reality, this may not be the case.

Intel and AMD's architectures are very different, and as I'm not an expert I'm not really in a position to comment on the differences between them. I do seem to remember reading on one of AMD's presentation slides though that HT1 has about equivalent bandwidth (but lower latency) to Intel's current FSB. HT3 will pull ahead of this again. If this is true then it isn't unreasonable to assume that FSB1066 is at the very edge of what it can handle with quad core, and if K8L has a higher IPC than C2, then that may push it over.

Also remember that at stock, C2Q runs at 2.66GHz (about the point at which performance would suffer with a quad core K8, I believe). I haven't seen any benchmarks that have increased the multiplier when overclocking past this, so it's hard to say what effect there would be beyond this point. If the performance increase from raising the multiplier is not linear, then that would show that the FSB is indeed bottlenecking the chip.
January 13, 2007 1:43:02 PM

Quote:
Let me get this analysis straight:
- K8L quad will be bottlenecked by ht1.0
- But Core 2 Quad is NOT bottlenecked by its lesser 1066 FSB

If this is correct, then K8L might be much better than the general consensus around here.... We will have a new champion....


K8L quad bottlenecking on HT 1.0 is just speculation at this point. I do think it will be slightly bottlenecked, but I repeat: that is SPECULATION only.

C2Q is slightly bottlenecked by the 1066MHz FSB IMO. 1333MHz would've been better, but not significantly.

I know what you're alluding to here, but these are vastly different architectures and you really can't 'estimate' performance by looking at the bandwidth requirements of the two.

FWIW, AMD had better hope K8L crushes the current QX6700, because it will be competing against 45nm quads at ~3.5GHz next year, not a 2.66GHz one. :wink:

These are definitely interesting times for CPUs.... I have been watching the progress since my Commodore 64 at 1MHz.
January 13, 2007 1:52:23 PM

Thanks for the analysis.... I can't wait until K8L benchmarks come out for AM2 vs. AM2+ to see how correct you are....!