Clovertown compared to Athlon 64 2800+ (1.8GHz, Socket 754, 130nm, single-channel DDR)
Intel showed off Clovertown quad-core server CPUs running on the Bensley platform with FB-DIMM memory at Spring IDF Taipei. Clovertown is basically two 65nm Conroe CPUs stacked together, with a total of 8MB of L2 cache. This page contained the benchmark scores for a 2P Clovertown running at 2GHz. Daniel J. Casaletto, Intel Vice President, Digital Enterprise Group Director, Microprocessor Architecture and Planning, was running the demo. In the single-threaded test, it got a Cinebench 9.5 score of 362. With 2P and 8 cores, the score scaled to 1723, or about 4.7x. In other words, adding 7 cores added only about 3.7x more performance. I think this is quite poor: each added core contributes only about half a core's worth of performance, which I attribute to an FSB bottleneck.
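The scaling figures can be checked with a few lines of arithmetic. This is a minimal sketch that uses only the two scores quoted from the demo (362 and 1723); the function and variable names are made up for illustration.

```python
def scaling_summary(single_score, multi_score, cores):
    """Return (speedup, gain per added core, parallel efficiency)."""
    speedup = multi_score / single_score
    # How much of "one core's worth" each additional core contributes.
    gain_per_added_core = (speedup - 1) / (cores - 1)
    # Fraction of ideal linear scaling actually achieved.
    efficiency = speedup / cores
    return speedup, gain_per_added_core, efficiency


speedup, per_core, efficiency = scaling_summary(362, 1723, 8)
print(f"speedup: {speedup:.2f}x")
print(f"gain per added core: {per_core:.2f}")
print(f"parallel efficiency: {efficiency:.0%}")
```

The gain per added core comes out a little above one half, which is where the "half a core's worth" remark comes from. The numbers alone do not say whether the FSB, the memory, or Cinebench itself limits the scaling.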
Let's take a closer look at this photo here, which shows the 2P Clovertown in action and is quite exciting. Look at the upper left corner: it reads "Cinebench 64 Bit Edition". Finally, we can see that Intel got 64-bit working. It's running the 64-bit version of Cinebench 9.5!
In comparison, a 3-year-old 2GHz single-core Opteron 246 achieves a score of 366 in the single-threaded test, 1.1% faster than the NGMA Core at the same clock speed. Clock for clock, Intel Core (Merom/Conroe) is slower than Hammer.
On my old Athlon 64 2800+ (1.8GHz, Socket 754, 130nm), I got a Cinebench 9.5 score of 294. My ClawHammer is a bit slower than the Conroe core, but only a little. Considering that my CPU runs at only 1.8GHz with single-channel DDR, and that my old PC has integrated S3 UniChrome graphics that eat some memory, that's quite good. I managed to overclock it to 1.9GHz and got a score of 312. Assuming the score scales linearly with clock speed, I expect the old ClawHammer to score 294*2/1.8 = 327 at 2GHz.
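The extrapolation can be sanity-checked against the overclocked run. This is a rough sketch that assumes the score scales linearly with clock speed, which is optimistic when memory speed stays fixed; the helper name is hypothetical.

```python
def scale_score(score, from_ghz, to_ghz):
    """Extrapolate a benchmark score linearly with clock speed (an assumption)."""
    return score * to_ghz / from_ghz


# Predict the 1.9GHz overclock from the stock 1.8GHz run; the measured score was 312.
predicted_1_9 = scale_score(294, 1.8, 1.9)
# Extrapolate both measured runs to 2GHz.
from_stock = scale_score(294, 1.8, 2.0)
from_overclock = scale_score(312, 1.9, 2.0)

print(f"predicted at 1.9GHz: {predicted_1_9:.1f}")
print(f"2GHz estimate from stock run: {from_stock:.1f}")
print(f"2GHz estimate from overclocked run: {from_overclock:.1f}")
```

The linear prediction for 1.9GHz lands within a couple of points of the measured 312, so linear scaling looks reasonable over this small clock range. Both 2GHz estimates fall between 326 and 329.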
I am interested in seeing some Clovertown and Sempron Socket 939 comparisons. If you have such a machine running Windows x64, please submit your results in the comments. Don't underestimate AMD desktop CPUs; check out this Athlon 64 and Xeon comparison.
The Conroe performance analysis is here. I pointed out that when the working set is larger than Conroe's cache (4MB), Conroe performs slower than the Athlon 64. Cinebench 9.5 needs over 150MB to run, so Clovertown's 8MB of cache didn't help.
The thing is, I don't see how you can justifiably compare benchmarks from completely different websites and test setups. The most glaring thing is that the 2CPU review uses Cinebench 2003, which is based on Cinema 4D R8, while the Intel benchmarks use Cinebench 9.5. We also have no idea how the Intel system is set up. It has also been pointed out by someone else in the other thread that Cinebench may not scale 100% to 8 cores anyway. In other words, this doesn't prove anything one way or another.
It should also be noted that this is still an early engineering sample, with the real launch not until 2007 or Christmas at the earliest. The current Clovertown supposedly uses a 1066MHz FSB, while Intel is tweaking the CPU and the northbridges to expand it to 1333MHz. That should help with the scaling issue.
As well, if you are the same person that wrote the blog, I really don't agree with your analysis of the cache thrashing issue. What you seem to have done is compare the shared cache of Yonah, Sossaman, and Core with that of a Hyperthreading-enabled Netburst. The thing is, there is a major difference between the two. The caches in Netburst are completely passive and have no control over their contents and allocation, which is why thrashing can occur. The entire point of the shared cache in Yonah and Core is to avoid this issue, which is why they are labelled "Smart". They can actively control cache allocation to each core, preventing thrashing from occurring. If both cores are heavily loaded and need lots of cache, each will be assigned 50%, while if one is heavily loaded and the other lightly loaded, the cache will be divided appropriately. A core cannot simply take control and eject the other core's data like in Netburst. Granted, Intel may be totally incompetent and have flawed sharing logic, but I think it's better to assume it at least kind of works, instead of being completely non-existent, until some real tests on shipping products are done.
In regards to whether Intel is wasting transistors on the large 4MB cache, it really isn't if you compare it to their previous designs or even AMD's. Smithfield on 90nm had 2x1MB, so doubling the cache size while implementing the process shrink to 65nm isn't unreasonable. AMD processors also have 2x1MB on the 90nm process. I'm also not sure if Intel production is really facing "limited capacity" as you put it in the blog.
1) Clovertown (double Conroe) 2GHz doing Cinebench 9.5 64-bit edition. The result was just reported from IDF Taipei. The Clovertown (Intel Core) got a score of 362.
This is additional proof that Intel Core won't demonstrate any IPC advantage over the current AMD64 implementation.
On cache thrashing, it's a possibility; it depends on how Intel manages the cache.
On cache size, 4MB will bring some benefit, but it won't help much in today's compute environment, where the big apps which really need performance are also memory intensive. Adding cache is not an architectural solution.
...It only shows how bandwidth-starved Intel's upcoming processors are. I expect Conroe to be a bad multitasker thanks to its shared L2 cache.
They are all relatively close to each other in score, regardless of whether it's a Netburst, K8, or Core. I fail to see any issues at this moment in time.
I don't see any bandwidth starvation problem for a 2GHz Clovertown running one instance of Cinebench 9.5. But it's slower than a single-core Opteron 246 at the same clock speed. What Intel has is a low IPC problem.
It's not; you're comparing against a Cinebench 2003 score. In 9.5, a 2GHz Opteron 270 scores 334 in 64-bit mode. Clovertown is already faster, and it's likely Maxon hasn't yet implemented optimizations to take advantage of its increased SIMD capabilities.
Look, and read the post first. That way you won't look like such an idiot. The benches were all done on 9.5.
What is worse is that Cinebench is highly Intel floptimized.
The Clovertown score is in 9.5. The quoted score for the Opteron 246 is from Cinebench 2003. His own scores in 9.5 show that Clovertown is faster at the same clock and that the 2GHz Opteron will score in the 330 range.
3) Someone did the same test on Opteron 246, got a score of 366.
What? 1.9GHz scores 312; scale it to 2GHz and you get a score of ~330, which is less than the 362 that Clovertown scored. The 366 score of the 246 comes from a 2cpu review using Cinebench 2003. The scores aren't comparable.
ment the same test.
Doesn't change the fact that a very old AMD single-channel desktop chip with bad timings scores almost as well (on one of Intel's favorite benches, to boot).
4 cores? 2 cores is more than anyone will ever need... heheheh. All jokes aside, at this time 4 cores seems to be more of a pissing contest; 99% of the programs out there still do not utilize multiple threads. Hmmm, is there any difference between programs being optimized for 2, 4, or 8 processors? If there is no difference, then I guess more IS better. If code has to be updated to go from 2 cores to 4, then it will be a long time before that happens. Anyway, the tech for this chip does seem impressive, and that unified cache sounds awesome! I wonder if you could turn off 3 cores, have a huge cache, and OC the remaining core for games? (Note: this would be most useful for current games that do not support more than one core.)
The point is that Clovertown, using Conroe's core, should "perform like a beast", according to Victor Wang. However, here we only see Clovertown perform like a puppy and get severely outperformed by an Opteron 146.
That Opteron 146 was running at 3GHz with a 300MHz base HTT, which would explain why the results are so high. In fact, if you take that 502 and multiply it by 2/3 to get the score at 2GHz, you only have 335. And that still includes the increase in HTT speed.
Besides, the point is that it's really hard to compare systems from different sites without having things side-by-side on a workbench.
I know, the Clovertown is still 10% faster than my Athlon 64 at the same clock, but compare the specs:
1) Clovertown: 65nm, 2x4MB cache, multi-channel FB-DIMM, cost $xxx
2) My Athlon 64: 130nm, 0.5MB cache, Socket 754, rev CG, no SSE3, single-channel DDR400, cost $50. Its score is within striking distance of the Clovertow