Does Rambus Help UMA? Is It The Best Solution?
Rambus seems to be able to demonstrate some performance advantage over SDRAM for UMA applications. A UMA system should be better able to realize a performance advantage from Rambus due to its faster burst, and because the data stream used by the graphics controller does not have to synchronize with the CPU bus.
In any UMA system (and with AGP) there is some probability that the CPU will begin a new DRAM access just as the graphics controller is reading from main memory. This is called an arbitration conflict, and it results in a longer CPU stall and reduced CPU performance. Rambus improves performance by letting the graphics controller complete its burst a little faster than it could with SDRAM, so the CPU regains access to DRAM a little sooner.
Rambus accomplishes this by trimming one or two clocks from the burst cycle. But the same effect can be achieved by trimming one or two clocks from latency. ESDRAM, for example, trims about four clocks from latency. This approach delivers a direct performance benefit to the CPU, in addition to offering UMA arbitration delays that are shorter than those of Rambus.
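To make the arithmetic concrete, here is a toy sketch of the arbitration argument. It is not the performance model discussed in this article. The baseline timing (6 clocks of latency, 4 clocks of burst) is invented for illustration, as are the names MemoryTiming, uncontended_stall and contended_stall. It also assumes the CPU resumes as soon as the first data arrives, and that a conflicting graphics access has only just started when the CPU request shows up. Only the trims come from the text: two clocks off the burst, four clocks off the latency.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryTiming:
    """DRAM access timing in bus clocks (hypothetical values, not measurements)."""
    latency: int  # clocks from request to first data
    burst: int    # clocks to transfer the remainder of the burst


def uncontended_stall(mem: MemoryTiming) -> int:
    """CPU stall with no conflict: the CPU waits only for first data (assumption)."""
    return mem.latency


def contended_stall(mem: MemoryTiming) -> int:
    """CPU stall when a graphics access has just started (worst case).

    The CPU waits out the whole graphics access (latency + burst),
    then waits for the first data of its own access.
    """
    return (mem.latency + mem.burst) + mem.latency


# Hypothetical baseline; only the size of the trims is taken from the article.
BASE = MemoryTiming(latency=6, burst=4)
SHORTER_BURST = MemoryTiming(latency=BASE.latency, burst=BASE.burst - 2)
LOWER_LATENCY = MemoryTiming(latency=BASE.latency - 4, burst=BASE.burst)

if __name__ == "__main__":
    for name, mem in [("baseline", BASE),
                      ("burst - 2", SHORTER_BURST),
                      ("latency - 4", LOWER_LATENCY)]:
        print(name, uncontended_stall(mem), contended_stall(mem))
```

With these made-up inputs, the shorter burst leaves the uncontended stall at 6 clocks and cuts the contended stall from 16 to 14. The lower latency cuts the uncontended stall to 2 and the contended stall to 8, because the saving applies to both the graphics access and the CPU access. Real parts would need real timing figures.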
When I ran ESDRAM against Rambus in the performance model, it produced the numbers below.
Low-latency SDRAM outperforms Rambus not only for UMA systems, but for standard-architecture systems as well.
Now would be a good time to mention the kinds of applications that this performance model simulates (in order to ensure everyone's expectations are in order). The 2D load is a synthetic business computing load characterized by the ZD Labs CPUmark32 benchmark. The multimedia load is an approximation of software motion video decode. It assumes full CPU utilization, which is very difficult to achieve in real applications, but it does happen in multimedia benchmarks. The 3D load would be typical of a cache-thrashing D3D game with advanced game logic, user interaction, audio, communications, etc. The simulation would not correlate to 3D WinBench. 3D WinBench 98 is void of any game logic, user interaction or audio load. It is a pure geometry stream processing and accelerator test. Maybe the new version will change that.
I believe this is a broad enough representation to be considered "viable," but you can be certain that Intel and Rambus will scour the earth for a few benchmarks that can show a performance advantage for Rambus. Or worse yet, if they can't find one, they will write a new one in order to satisfy their promotional goals. This would be a definite red flag.
Why do you think they call it "Bench-Marketing"?