ATI's Radeon X1900 Heats Up With 48 Shader Units

New Additions

The core was redesigned to add features and to remove limitations present in previous models. The addition of 60 million transistors brings the number of pixel pipelines to 48 or, to use the term the industry is quickly adopting, 48 pixel processors. That increase of almost 20% brings the transistor count to over 384 million in the X1900 (R580).
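As a quick sanity check on those figures, the arithmetic can be sketched in a few lines. The numbers are the rounded ones quoted above ("60 million" and "over 384 million"), so the result is only approximate, and the variable names are ours.

```python
# Rounded figures quoted in the text; the exact counts may differ slightly.
added_transistors = 60_000_000
x1900_transistors = 384_000_000

# Transistor count implied for the previous core.
previous_transistors = x1900_transistors - added_transistors

# Relative increase over the previous core, in percent.
increase_pct = 100 * added_transistors / previous_transistors

print(f"Implied previous count: {previous_transistors / 1e6:.0f} million")
print(f"Increase: {increase_pct:.1f}%")
```

With these rounded inputs the increase works out to a little under 19%, which is in line with the "almost 20%" figure.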

Radeon X1900 Core Diagram

ATI continues to cluster four pixel engines together to form a "quad." Each quad is assigned its own thread to work on; alternatively, work can be batched to a group of quads. By keeping the pixel block sizes small, the Ultra-Threading dispatch processor makes sure that all of the pixel shaders have work to do, keeping throughput as high as possible. While the pixel shaders are busy doing their part, memory latencies and certain lookups can be masked, as long as the dispatch processor continues to order the steps efficiently.
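To see why small pixel blocks help keep every quad busy, consider a toy scheduling model. This is not ATI's dispatch logic: the function name `makespan`, the per-pixel costs, and the four-quad setup are all invented for illustration. The sketch simply hands each block to whichever quad frees up first and reports when the last quad finishes.

```python
import heapq

def makespan(pixel_costs, block_size, num_quads):
    """Toy model: time for all quads to finish when blocks are dispatched greedily.

    pixel_costs is a hypothetical per-pixel shading cost in arbitrary units.
    """
    # Split the pixel stream into blocks and total the cost of each block.
    blocks = [
        sum(pixel_costs[i:i + block_size])
        for i in range(0, len(pixel_costs), block_size)
    ]
    # Min-heap holding the time at which each quad becomes free.
    free_at = [0] * num_quads
    heapq.heapify(free_at)
    for cost in blocks:
        earliest = heapq.heappop(free_at)         # next quad to go idle
        heapq.heappush(free_at, earliest + cost)  # give it the next block
    return max(free_at)

# 16 expensive pixels (say, a heavily shaded region) followed by 48 cheap ones.
costs = [10] * 16 + [1] * 48

large_blocks = makespan(costs, block_size=16, num_quads=4)
small_blocks = makespan(costs, block_size=4, num_quads=4)
print(large_blocks, small_blocks)
```

With 16-pixel blocks, one quad is stuck with the entire expensive region while the other three sit idle; with 4-pixel blocks the same work spreads evenly across all four quads.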

A fifth instruction unit, the Branch Execution Unit, was added to each pixel processor with the X1800 series, and it carries over to the X1900 series. Conditional statements are passed down into the pixel shader processors when certain pixels need specialized processing. Placing the flow control units in the processors relieves some of the overhead on the dispatch processor. Under this structure, each pixel processor can handle one to five instructions per clock, depending on the ALU types involved. The diagram above shows the makeup of each pixel shader.

ATI added 50% more cache for Hi-Z. This additional cache allows the R580 to expedite Z testing - the ability to predict which surfaces will be hidden from view and remove them from further rendering calculations - at greater speeds at high resolutions (such as 2048x1536). This push, plus the raw pixel shader horsepower, makes it possible to play games at familiar frame rates, but at much higher resolutions.
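The general idea behind a hierarchical Z buffer can be sketched as follows. This is a minimal illustration of the usual textbook approach, not a description of the R580's hardware: the function names, the 2x2 tile size, and the convention that smaller depth values are closer to the viewer are all assumptions.

```python
def build_hiz(depth, tile):
    """Build a coarse buffer holding the farthest (largest) depth in each tile."""
    height, width = len(depth), len(depth[0])
    return [
        [
            max(
                depth[y][x]
                for y in range(ty, min(ty + tile, height))
                for x in range(tx, min(tx + tile, width))
            )
            for tx in range(0, width, tile)
        ]
        for ty in range(0, height, tile)
    ]

def tile_is_hidden(hiz, tile_y, tile_x, nearest_incoming_depth):
    """Conservative reject: True if the incoming surface lies behind
    everything already drawn in the tile, so no per-pixel test is needed."""
    return nearest_incoming_depth > hiz[tile_y][tile_x]

# A 4x4 depth buffer (smaller = closer), summarized in 2x2 tiles.
depth_buffer = [
    [0.20, 0.30, 0.90, 0.90],
    [0.25, 0.30, 0.90, 0.80],
    [0.50, 0.50, 0.40, 0.40],
    [0.50, 0.60, 0.40, 0.45],
]
hiz = build_hiz(depth_buffer, tile=2)

# A new surface whose nearest point is at depth 0.5:
print(tile_is_hidden(hiz, 0, 0, 0.5))  # top-left tile
print(tile_is_hidden(hiz, 0, 1, 0.5))  # top-right tile
```

One coarse comparison per tile replaces a depth test on every pixel in it. At higher resolutions there are more tiles to track, which is presumably where a larger Hi-Z cache pays off.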

In the core diagram, you should notice the 3:1 ratio of pixel shader units to texture units. ATI feels that, given the direction in which games are headed, more and more emphasis is being placed on the shaders.

To further accelerate rendering, ATI has added a feature called Fetch4, which is intended to aid in the creation of soft shadows. In the real world, shadows do not have sharp edges; there are normally various light sources that diffuse the edges where the shadows fall.

In the 3D graphics world, shadows have been rendered with a hard edge between light and dark. To create the illusion of the real world, a solution was needed to filter the shadow maps to create soft shadows.

ATI's Fetch4 works by sampling four adjacent values from the shadow map simultaneously, which theoretically should improve the texture sampling rate by a factor of four. ATI claims that, through Ultra-Threading with fast flow control and Fetch4, the X1900 will "render attractive soft shadows at speeds approaching those of traditional hard-edged shadow mapping techniques." That claim remains to be proven, but as you will see in the benchmark results, ATI has made serious improvements to the ability to play F.E.A.R. with soft shadows enabled.
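To make the idea concrete, here is a minimal software sketch of filtering a shadow map with four adjacent samples. It uses a simple percentage-closer filter, a common soft-shadow technique; the article does not say which filter ATI or any particular game actually uses. The function names `fetch4` and `soft_shadow`, the bias value, and the tiny shadow map are all hypothetical, and in hardware the four values would arrive in a single fetch rather than four separate reads.

```python
def fetch4(shadow_map, x, y):
    """Return the 2x2 block of depths whose top-left texel is (x, y).

    Coordinates are clamped at the right and bottom edges of the map.
    """
    height, width = len(shadow_map), len(shadow_map[0])
    x1 = min(x + 1, width - 1)
    y1 = min(y + 1, height - 1)
    return (
        shadow_map[y][x], shadow_map[y][x1],
        shadow_map[y1][x], shadow_map[y1][x1],
    )

def soft_shadow(shadow_map, x, y, receiver_depth, bias=0.005):
    """Fraction of the four samples that leave the receiver lit.

    0.0 means fully shadowed, 1.0 fully lit; values in between
    produce the soft edge.
    """
    samples = fetch4(shadow_map, x, y)
    # A sample is lit if nothing closer to the light than the receiver
    # was recorded there (the small bias avoids self-shadowing).
    lit = sum(1 for occluder_depth in samples
              if receiver_depth - bias <= occluder_depth)
    return lit / len(samples)

# Depth as seen from the light: 0.3 marks an occluder, 0.9 the background.
shadow_map = [
    [0.3, 0.3, 0.9],
    [0.3, 0.9, 0.9],
    [0.9, 0.9, 0.9],
]

print(soft_shadow(shadow_map, 0, 0, receiver_depth=0.5))  # near the shadow edge
print(soft_shadow(shadow_map, 1, 1, receiver_depth=0.5))  # outside the shadow
```

A hard-edged shadow test would return only 0 or 1 per pixel; averaging the four comparisons yields intermediate values along the edge, which is what makes the shadow boundary appear soft.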