AMD or Intel for the highest CPU-Memory Bandwidth?

May 19, 2007 1:44:31 PM

Hi,

I am looking for the highest possible native bandwidth between CPU and memory (latency is not important).

I need to read only about 250 MB of data from memory to the CPU, but it needs to churn in and out very rapidly. The data itself does not change. I continuously loop through the whole 250 MB data set doing a string of mathematical computations.

What is a good CPU/Mobo/RAM combination to use for this task without overclocking?

From my investigations - a Core2Duo would suit 533MHz DDR-2. I heard that AMD currently supports 800MHz DDR2 - is this correct?
To me - this would indicate that AMD is the 'faster' choice for my limited scenario. I don't do gaming.

Should I seriously consider overclocking - could I double the bandwidth?

I realise that I could wait for an AMD chip with HyperTransport 3.0 to be released - but when will these actually be released? Will HT3.0 work with DDR2 chips or will it require DDR3?

Thanks,

Joel
May 20, 2007 1:25:40 AM

It would help if you supplied more details. For example, how do you know your app is memory bandwidth-limited? If it's extremely specialized, it may be worth looking into running it on a graphics processor, as they tend to be specialized for high-bandwidth operations.
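One rough way to check whether memory bandwidth is even in the right ballpark is to time a plain buffer copy of the same size as the data set. The sketch below is only an illustration: the function name `copy_bandwidth_mb_s` and its parameters are made up here, it measures a copy (a read plus a write) rather than the read-only loop described above, and Python overhead means it will understate what tuned native code can reach.

```python
import time

def copy_bandwidth_mb_s(size_mb=250, repeats=5):
    """Return the best observed copy rate in MB/s for a buffer of size_mb megabytes.

    Hypothetical helper for illustration only; not a substitute for
    profiling the real application.
    """
    buf = bytearray(size_mb * 1024 * 1024)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        copy = bytes(buf)  # one pass: reads buf and writes a new buffer
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
        del copy  # release the copy before the next repetition
    # Guard against a zero reading on very small buffers.
    return size_mb / max(best, 1e-9)
```

Calling `copy_bandwidth_mb_s()` on each candidate machine gives a number to compare, but the real test is still the application's own run time.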
May 20, 2007 8:29:52 AM

Quote:
Hi,

I am looking for the highest possible native bandwidth between CPU and Memory (latency not important).


Why?

Quote:

From my investigations - a Core2Duo would suit 533MHz DDR-2. I heard that AMD currently supports 800MHz DDR2 - is this correct?
To me - this would indicate that AMD is the 'faster' choice for my limited scenario. I don't do gaming.


Doesn't matter what you do, buying an AMD CPU will give you horrible performance in any case.
And Core 2 Duo supports any memory; there is no limit.
If you want, you can buy DDR2-1265 memory and have 10,100 MB/s of bandwidth.
Quote:


Should I seriously consider overclocking - could I double the bandwidth?


Should you consider overclocking? Of course. You don't have to buy DDR2-1265 memory; you can buy DDR2-1000 memory and overclock it to DDR2-1265.

Quote:

I realise that I could wait for an AMD chip with HyperTransport 3.0 to be released - but when will these actually be released? Will HT3.0 work with DDR2 chips or will it require DDR3?


I wouldn't wait for HyperTransport 3.0 if I were you. AMD probably isn't going to release it any time soon, if at all.
May 20, 2007 7:56:06 PM

Quote:
...
Doesn't matter what you do, buying an AMD CPU will give you horrible performance in any case.

Now this is just silly. :wink:

Quote:
...
And Core 2 Duo supports any memory; there is no limit.
If you want, you can buy DDR2-1265 memory and have 10,100 MB/s of bandwidth.

Come now. Since the memory bus feeds into the FSB, the max useful memory bandwidth is pretty much limited to the FSB bandwidth, as the OP implied.

Quote:
Should I seriously consider overclocking - could I double the bandwidth?

There's a good chance you could increase the bandwidth by 50% (to a 400 MHz FSB). Newer C2Ds don't overclock as well as the originals, so YMMV.
May 20, 2007 8:26:31 PM

Due to HyperTransport and the integrated memory controller, AMD will give you the most bandwidth.
May 20, 2007 8:58:03 PM

Everything you post is trash; we should nickname you the forum garbage man. :lol: 
May 20, 2007 9:12:46 PM

lol, this thread is just lame. I usually don't say things like this, but someone who knows enough to say "I have such an amount of data and it runs in a loop" should know enough about CPUs and bandwidth to know what's going to work best for them...


Edit: Just a thought: if the data set isn't changing, then why bother running it? Just save it to a disk and be done with it... unless you're just interested in a benchmark program, in which case AMD would be the winner.
May 20, 2007 9:16:16 PM

You need to get some server chips then, like the quad-core Xeons.
May 20, 2007 10:18:40 PM

Quote:
You need to get some server chips then, like the quad-core Xeons.

AMD server chips have more bandwidth, and it goes up as you add CPUs.
Quad-core Xeons are slowed down by the shared bus, as each quad is two dual-cores.
May 20, 2007 10:24:46 PM

Well, I assume you are programming the app yourself and you have enough understanding of the caching scheme of both AMD and Intel systems, so you should decide based on the raw data:

Intel's FSB goes to 1333 MHz @ 64 bits = 10.6 GB/s
AMD's IMC @ DDR2-800 dual channel = 12.8 GB/s

Of course these are theoretical limits, and keep in mind the much lower latency of the AMD system. With the Intel system it makes no sense to use modules faster than DDR2-667 because the system is bottlenecked by the FSB, but you can buy faster-rated modules and run them at 667 with tighter timings to gain some speed while keeping a 1:1 FSB/RAM ratio (seems to work).
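The arithmetic behind those two figures can be sketched as follows. This is a minimal illustration, assuming a 64-bit (8-byte) data path per channel and decimal gigabytes; the function name `peak_bandwidth_gb_s` is invented for this example.

```python
def peak_bandwidth_gb_s(mega_transfers_per_s, bus_width_bits=64, channels=1):
    """Theoretical peak in GB/s: transfers per second x bytes per transfer x channels."""
    bytes_per_transfer = bus_width_bits // 8
    return mega_transfers_per_s * bytes_per_transfer * channels / 1000.0

# Figures quoted above (effective transfer rates in MT/s).
fsb_1333 = peak_bandwidth_gb_s(1333)                   # about 10.66 GB/s
ddr2_800_dual = peak_bandwidth_gb_s(800, channels=2)   # 12.8 GB/s
ddr2_667_dual = peak_bandwidth_gb_s(667, channels=2)   # about 10.67 GB/s

print(f"FSB 1333:           {fsb_1333:.2f} GB/s")
print(f"DDR2-800 dual chan: {ddr2_800_dual:.2f} GB/s")
print(f"DDR2-667 dual chan: {ddr2_667_dual:.2f} GB/s")
```

Dual-channel DDR2-667 already matches what a 1333 MHz FSB can carry, which is why faster modules do not help on the Intel side unless the FSB is raised too.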

With these facts you would say the AMD will win: maybe not...
If your app is only reading the data and not modifying it in an iterative way, it becomes important whether you are doing this with SSE2 or not. AMD K8 doesn't load 128 bits the same way as C2D does, and if you are using SSE2 assembly, the whole gain of the AMD system could get lost (bottlenecked) by AMD's inferior SSE2 performance compared to C2D.

More specifics about the algorithm would shed some light on that last point.

Salud!!!!
May 20, 2007 10:42:01 PM

Quote:
It would help if you supplied more details. For example, how do you know your app is memory bandwidth-limited? If it's extremely specialized, it may be worth looking into running it on a graphics processor, as they tend to be specialized for high-bandwidth operations.


In fact the graphics engine could give many times better performance, depending on whether the algorithm he is running can be implemented on a GPU. Viva GPGPU!!

Quote:
Edit: Just a thought: if the data set isn't changing, then why bother running it? Just save it to a disk and be done with it... unless you're just interested in a benchmark program, in which case AMD would be the winner.


Maybe the data (source) is not changing but the result does..... I guess he is iterating ala FFT or something similar.
May 21, 2007 1:10:51 AM

Hi,

thanks for the answers. As quartzlock says.. I am iterating through the same data with varying calculations to arrive at different results.

I am not the programmer... but if I were, I would spend time trying to implement the GPGPU solution.

In answer to jonathan - Just because I know what I want to do.. doesn't mean I know about the ins-and-outs of current processors. Henry Ford said he could find the answer to any question - and he did this by surrounding himself with experts in different fields. That's one of the great advantages of this forum.

I have decided to go the Intel route. I have not overclocked before - but from the information I read on the different sections of this forum - with a decent cooling fan, and slightly faster rated memory... I can experiment with increasing the FSB speed.


Thanks,

Joel
May 21, 2007 2:20:05 AM

AMD hands down has the highest-speed connection to the RAM. Thanks to the IMC, data has less distance to travel from the memory to the CPU. Intel may have an FSB of 1333, but it still can't compare to AMD's memory bandwidth.
May 21, 2007 2:22:00 AM

Quote:
Hi,

I am looking for the highest possible native bandwidth between CPU and Memory (latency not important).


Doesn't matter what you do, buying an AMD CPU will give you horrible performance in any case.
And Core 2 Duo supports any memory; there is no limit.
If you want, you can buy DDR2-1265 memory and have 10,100 MB/s of bandwidth.

I wouldn't wait for HyperTransport 3.0 if I were you. AMD probably isn't going to release it any time soon, if at all.

*sigh* :roll:
May 21, 2007 2:40:09 AM

Quote:
Everything you post is trash; we should nickname you the forum garbage man. :lol: 


No, the garbage man is the person who takes out the trash, not the one who brings it in.
Then again, you're still here, so I must not be doing a very good job. :lol:
May 21, 2007 2:45:05 AM

If you're interested, there is a new chipset in town: the Intel G35/P35. Although it is not new to this forum or other sites, the vendors are slow to stock motherboards based on it.
http://www.asus.com/products.aspx?l1=3&l2=11&l3=534&l4=...
I can't do the reviews justice, so I'll just link a site for now.
The bandwidth increases proportionally with the FSB, eliminating the walls and straps seen in its predecessors.
With DDR3 on the horizon, bandwidth bottlenecks will most definitely not be an issue.
http://www.xtremesystems.org/forums/showthread.php?t=14...
http://www.xtremesystems.org/forums/showthread.php?t=14...
May 21, 2007 8:29:42 AM

Quote:
...
Doesn't matter what you do, buying an AMD CPU will give you horrible performance in any case.

Now this is just silly. :wink:


Why is it silly? You're not going to get Core 2 Duo performance with an Athlon 64 X2 no matter how fast your memory is.
May 21, 2007 2:29:11 PM

I guess you should analyze the app deeply and find the real hotspot. For pure CPU performance at the moment, C2D is the answer, but if the hotspot is memory bandwidth you should go AMD without hesitation. Barcelona is around the corner anyway, which means you can invest in an AM2/AM2+ board in advance. Latency is really important here for getting system bandwidth closer to the theoretical limit.
May 21, 2007 7:00:58 PM

What you really need to do is run your application on a C2D system and an X64/X2 system. If you can do that and post some times here, we can then tell you which system it runs better on.

If you're not concerned about cost, the P35 would most likely give the best results (7000~8000 MB/s in Sandra). But if you want a good, cheap $50 mobo that will still perform admirably (6300 MB/s in Sandra), go with AMD. AMD's design is much more cost-effective.
May 21, 2007 9:33:44 PM

As far as I know, AMD still offers higher memory bandwidth than Intel, not on the theoretical side but on the practical side.
May 21, 2007 9:59:43 PM

Yes, that's why C2D runs faster than AMD, lol. C2Ds are better than any AMD right now, until AMD comes out with something new.
May 21, 2007 10:47:41 PM

Keep in mind that latency will affect bandwidth. Since the CPU can't hold 250 MB of data, the read is actually broken up into pieces. In between each transaction you take a latency penalty (even for a "sequential" read, I think, because RAM is a matrix, not a linear storage medium like a disk drive).

In very general terms the CPU will pull a few MB from RAM, process it, possibly write some results back to RAM, and then read the next few MB. If there is a write transaction, then you take a 2*latency penalty to bandwidth.

AMD has the fastest peak theoretical memory bandwidth, and it is separate from its HT bus, but it doesn't sound like the HT bus will affect your application. At stock, the AMD solution with some very nice DDR2-800 should provide the best memory performance. However, C2D systems have been known to clock up to a 500 MHz FSB and run with DDR2-1000, which would put a stock AMD system to shame, but with reliability and accuracy obviously being very important to you this might not be the best thing to bank on.
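A toy model of that latency penalty might look like the sketch below. It assumes every chunk pays the full latency before its transfer starts, with no prefetching or overlapping requests, which real memory controllers do use, so it overstates the penalty. The 60 ns latency and the chunk sizes are made-up illustrative values, and `effective_bandwidth_gb_s` is a name invented here.

```python
def effective_bandwidth_gb_s(peak_gb_s, latency_ns, chunk_bytes):
    """Bandwidth actually achieved when each chunk waits latency_ns before transferring.

    Uses the identity 1 GB/s == 1 byte/ns, so bytes / ns gives GB/s directly.
    """
    transfer_ns = chunk_bytes / peak_gb_s
    return chunk_bytes / (latency_ns + transfer_ns)

# Illustrative numbers only: 12.8 GB/s peak, 60 ns per request.
for chunk in (64, 4096, 1024 * 1024):
    rate = effective_bandwidth_gb_s(12.8, 60, chunk)
    print(f"{chunk:>8} bytes per request: {rate:.2f} GB/s")
```

The larger the piece fetched per request, the less the fixed latency matters, which is why long sequential reads get much closer to the peak figure than scattered small ones.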

Depending on how much pull your company has you should ask some OEMs to send you some evaluation units of different configurations and benchmark your application on them. If you could get Dell and HP to each send you 1 AMD and 1 Intel system then you could tell US the answer to your question and we would all appreciate it ;) 

AMD isn't in any hurry to roll out DDR3 and have no plans to implement it "until OEMs are asking for it". They will let Intel plow the way as they did with DDR2. The first round of DDR3 products will be expensive, have no performance advantage, and use less electricity. Think of it as a repeat of DDR2. Also, I don't think HT3 will benefit your application. It's more for multi-socket and/or IO intensive systems. So I wouldn't worry about it if I were you.

If you are sure that a particular AMD chip will provide enough crunching power and that memory bandwidth is the biggest determining factor of performance for your application AND you can't afford an OEM server-class rig, then I would get the AMD chip and slightly OC it if I were you. Outside of the latest (really expensive) Intel hardware and/or extreme OCing, it should be the fastest.
May 21, 2007 11:01:39 PM

Quote:
Yes, that's why C2D runs faster than AMD, lol. C2Ds are better than any AMD right now, until AMD comes out with something new.


I was only referring to memory bandwidth, not overall processing power. Right now for pure processing power C2 has the upper hand.
May 22, 2007 1:08:22 PM

Okay, then what would you do with memory bandwidth if your processor can't crunch the data?
May 22, 2007 2:06:34 PM

Wow!

Thanks for all the answers guys.
Thought I had it worked out - and now I don't!

This question wasn't intended to start a 'fanboy' war - it's about finding the fastest platform to perform a specific computational task.

I think I'll need to approach a supplier and ask them to lend me 2 machines.

Joel
May 22, 2007 3:28:20 PM

Quote:
As far as I know, AMD still offers higher memory bandwidth than Intel, not on the theoretical side but on the practical side.


And it is indeed that way. It's just that, for example, in a memcopy you just read 128-bit chunks and write them out, and you see the AMD system fly; no Intel system can match that at the moment. But if you load the data and then begin to make heavy calculations with it using SSE2, for example, you will lose the bandwidth gain in the ALU operations, because K8 doesn't work on 128 bits at once but divides it into two 64-bit ops. That's why I mentioned the importance of knowing what the algorithm is doing and understanding the real hotspot.

About the theoretical BW and the real one: AMD has better practical or effective BW due to the IMC, so an Intel system with FSB 1600 wouldn't beat an AMD K8 @ DDR2-800, because the latency of the Intel NB really hurts (P35 motherboards seem to be a big improvement, but it is still not an IMC).

The issue is: where is the time being spent? ALU? or memory Stall cycles?
Knowing the ratio between these two will tell you if the ALU or the MemBW is what you must use as criteria to choose the platform you will run on.
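As a back-of-the-envelope version of that ratio, one can compare the time needed to stream the data once against the time needed for the arithmetic on it. The sketch below is an assumption-laden illustration: it treats memory and ALU time as independent, and every number passed in (operations per pass, sustained GFLOPS, sustained GB/s) is a placeholder that would have to be measured for the real application. The name `bottleneck` is invented for this example.

```python
def bottleneck(bytes_per_pass, flops_per_pass, mem_gb_s, compute_gflops):
    """Return which resource dominates one pass, plus both time estimates in seconds."""
    mem_s = bytes_per_pass / (mem_gb_s * 1e9)       # time to stream the data once
    alu_s = flops_per_pass / (compute_gflops * 1e9)  # time for the arithmetic alone
    limit = "memory" if mem_s > alu_s else "ALU"
    return limit, mem_s, alu_s

# Placeholder inputs: 250 MB per pass, 12.8 GB/s memory, 10 GFLOPS sustained.
light = bottleneck(250e6, 1e8, mem_gb_s=12.8, compute_gflops=10)   # few ops per byte
heavy = bottleneck(250e6, 1e10, mem_gb_s=12.8, compute_gflops=10)  # many ops per byte
print(light[0], heavy[0])
```

With few operations per byte the memory side dominates and the higher-bandwidth platform wins; with many operations per byte the faster ALU wins regardless of memory speed.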
May 22, 2007 5:29:18 PM

Quote:
Yes, that's why C2D runs faster than AMD, lol. C2Ds are better than any AMD right now, until AMD comes out with something new.


I was only referring to memory bandwidth, not overall processing power. Right now for pure processing power C2 has the upper hand.

Yeah... except for float operations where AMD also still has the upper hand.

Don't let the C2D kiddies bully you like that. ;)  If all they understand is 3DMark scores, either correct them or just ignore them. Don't agree with them when they blather on about "C2D rulez all OMGWTF". It's simply not true.

quarzlock is on the right track. High-end computing is complicated business.

Joel, if you really do get two machines plz post back and let us know how it went.
May 22, 2007 5:35:34 PM

Quote:
okay then what would you do with memory bandwidth if your processor can't crunch the data


You are sort of missing the point of this thread. The OP asked which platform gives the highest practical memory bandwidth, and that would be AMD, regardless of its having less processing power than C2D. Whether he/she should go with an AMD or Intel platform depends on the particular app he/she needs to run, as has already been said in this thread.

Quote:
Yes, that's why C2D runs faster than AMD, lol. C2Ds are better than any AMD right now, until AMD comes out with something new.


I was only referring to memory bandwidth, not overall processing power. Right now for pure processing power C2 has the upper hand.

Yeah... except for float operations where AMD also still has the upper hand.

Don't let the C2D kiddies bully you like that. ;)  If all they understand is 3DMark scores, either correct them or just ignore them. Don't agree with them when they blather on about "C2D rulez all OMGWTF". It's simply not true.

quarzlock is on the right track. High-end computing is complicated business.

Joel, if you really do get two machines plz post back and let us know how it went.

True that, it was just a bit late here and I didn't feel like bothering to go into that much detail :)