Local and Global Data Share
With the RV770, AMD’s engineers didn’t stop at optimizing their architecture while only slightly increasing the die area; they also borrowed a few good ideas from the competition. The G80 had introduced a small, 16 KB memory area per multiprocessor that, unlike a cache, is entirely under the programmer’s control. This memory area, accessible in CUDA applications, lets threads share data. AMD has introduced its own version with the RV770. It’s called Local Data Share and is exactly the same size as its competitor’s Shared Memory. It plays a similar role, enabling GPGPU applications to share data among several threads. The RV770 goes even further, with another memory area (also 16 KB) called Global Data Share to enable communication among SIMD arrays.
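To make the idea of threads sharing data through a small programmer-managed memory more concrete, here is a minimal sketch in plain Python. It is not GPU code, and it does not reflect AMD’s or Nvidia’s actual programming interfaces. It only emulates, serially, how a group of threads might cooperate on a sum through a 16 KB scratchpad. The function name `group_sum`, the group size of 64, and the 32-bit value size are all assumptions made for illustration.

```python
LDS_BYTES = 16 * 1024                  # scratchpad size per SIMD array, from the article
VALUE_BYTES = 4                        # assumption: 32-bit values
LDS_SLOTS = LDS_BYTES // VALUE_BYTES   # number of values that fit in the scratchpad


def group_sum(values, group_size=64):
    """Emulate one thread group summing `values` via a shared scratchpad.

    `group_size` is a hypothetical thread count; it must be a power of two
    and small enough for one partial result per thread to fit in the scratchpad.
    """
    if group_size > LDS_SLOTS or group_size & (group_size - 1) != 0:
        raise ValueError("group_size must be a power of two that fits in the scratchpad")

    # Step 1: each emulated thread t sums its own strided slice of the input
    # and writes the partial result to its slot in the shared scratchpad.
    scratch = [sum(values[t::group_size]) for t in range(group_size)]

    # Step 2: tree reduction. In each round, the first `stride` threads add in
    # a value written by another thread. On real hardware a barrier would be
    # needed between rounds; the serial loop makes that implicit here.
    stride = group_size // 2
    while stride > 0:
        for t in range(stride):
            scratch[t] += scratch[t + stride]
        stride //= 2

    return scratch[0]


total = group_sum(list(range(1000)))
```

The point of the sketch is step 2: every thread reads results produced by other threads, which is only practical when the group has a fast memory area it controls explicitly.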
While the ALUs haven’t undergone a major modification, the texture units have been completely redesigned. The goal was obvious – as with the rest of the GPU, it was to increase performance significantly while maintaining as small a die area as possible. The engineers set fairly ambitious goals, aiming for an increase of 70% in performance for an equivalent die area. To do that, they focused their efforts largely on the texture cache. The bandwidth of the L1 texture cache was increased to 480 GB/s.
But that’s not all; the L1 cache that was shared by all the SIMD arrays has been broken down into 10 cache memories, one per SIMD array, and each contains only data exclusive to the corresponding SIMD array. Shared data are now stored in an L2 cache, which has also been completely redesigned and now has a bandwidth of 384 GB/s to the L1 cache. In order to reduce latency, this L2 cache has been positioned near the memory controllers. Let’s see what the results of these improvements are in practice:
Compared to its direct competitor, the 9800 GTX, the Radeon HD 4850 showed first-rate performance with single and dual texturing, while not giving up any performance in terms of raw fill rate – which is to be expected considering the 40 texture units for 16 ROPs (to simplify, “2.5 texture units per pixel,” as they used to say in another era). On the other hand, with triple and quad texturing, the RV770, logically enough, can’t compete with the G92’s 64 texture units (the equivalent of “4 texture units per pixel”); but in all cases the RV770 proved to be closer to its theoretical performance than its competitor.
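The texture-unit-to-ROP ratios quoted above can be checked with a few lines of Python. The sketch also includes a deliberately idealized throughput model, which is an assumption and not something the benchmark measures: it supposes that each pixel needs one texture fetch per texture layer and that the GPU is limited either by its ROPs or by its texture units, whichever runs out first. The function names are hypothetical.

```python
def texture_units_per_rop(texture_units, rops):
    """Ratio the article calls 'texture units per pixel'."""
    return texture_units / rops


def pixels_per_clock(texture_units, rops, layers):
    """Idealized peak pixels per clock with `layers` textures per pixel.

    Assumption: one texture fetch per layer per pixel, no other bottleneck.
    """
    return min(rops, texture_units / layers)


# Unit counts as given in the article: 40 texture units and 16 ROPs for the
# RV770; 64 texture units for the G92, with 16 ROPs implied by the "4 texture
# units per pixel" figure.
rv770_ratio = texture_units_per_rop(40, 16)
g92_ratio = texture_units_per_rop(64, 16)

for layers in (1, 2, 3, 4):
    rv770 = pixels_per_clock(40, 16, layers)
    g92 = pixels_per_clock(64, 16, layers)
    print(f"{layers} texture(s): RV770 {rv770:.2f} px/clk, G92 {g92:.2f} px/clk")
```

Under this simple model the two chips are both ROP-limited with one or two textures per pixel, and the RV770 becomes texture-limited from three textures onward. That matches the qualitative pattern described above, although it ignores clock speeds and how close each chip gets to its theoretical peak.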
$450 in Best Buy for a GTX 260.
And the 4850 is pretty close to the 280.
Ooh, the 4870 is going to give Nvidia a run for their money
for the first time in a while.
P.S. +1000 -> 2222
MaxSmoothedFrameRate=62 in the Engine.GameEngine section
"it was unavailable due to the sloppy handling of this launch"
Seriously? AMD can't control whether their retail partners screwed the pooch on the release date because they were so anxious to get people this great product. AMD made sure the product was readily available well before the launch date.
They should be praised for not having a paper launch, not told that it was a sloppy launch. It's very poor form to say that.
Hell, I went to Best Buy and bought two 4850s on Sunday, when the cards weren't even supposed to be available yet. The guy told me, "They have been in stock for over a month in the back. They aren't supposed to be available yet, but I can get two for you." Were the AMD police supposed to come and smack Best Buy on its hand and keep me from giving them profits?
Sorry if I'm ranting. Just put the blame where it belongs.
It's in French, but the graphs speak for themselves. Oh, and if you want a short translation: impressive and incredibly more efficient than Nvidia (if you compare the size of the GPU, yes, it's A LOT more efficient).