Another area left open to interpretation is the approach ATI and Nvidia have each taken under DirectX 9. Nvidia has stuck to a fixed function design for its architectures, while ATI decided to fragment the traditional pipeline in order to provide greater functionality with the same limited resources.
There is nothing wrong with designing X vertex shaders to Y pixel shaders with Y ROPs (Raster Operation Processors), but as ATI has proven with its current DirectX 9.0c graphics processors, there are performance gains to be made with a dynamic core. Many game developers have shifted their titles toward heavy dependence on pixel shaders. With a fragmented design, ATI was able to capitalize on this by adding pixel shader units to new core designs. Direct3D 10 hardware could still be laid out with fixed function shader units, but what would be the advantage?
While Nvidia has not publicly commented on what structure its G80 processor is based upon, ATI has stated it will go with a unified shader core. Unified shaders make the most sense, because shader units can change function on the fly. If a frame has heavier vertex shading needs or heavier pixel shading needs, the core can assign more processing units or clusters to whichever task is in most demand.
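The idea of reassigning units on the fly can be illustrated with a minimal sketch. This is not how any ATI or Nvidia scheduler actually works; the function name `allocate_units`, the proportional-split rule, and the abstract "work" amounts are all assumptions made purely for illustration.

```python
def allocate_units(total_units, vertex_work, pixel_work):
    """Split a pool of identical shader units in proportion to demand.

    Hypothetical model: 'work' is an abstract amount of shading to do in
    one frame, not a figure taken from real hardware.
    """
    total_work = vertex_work + pixel_work
    if total_work == 0:
        return 0, 0

    # Proportional share of the pool for vertex shading.
    vertex_units = round(total_units * vertex_work / total_work)

    # Keep at least one unit on any stage that has work to do.
    if vertex_work > 0:
        vertex_units = max(1, vertex_units)
    if pixel_work > 0:
        vertex_units = min(total_units - 1, vertex_units)

    return vertex_units, total_units - vertex_units


# A vertex-heavy frame and a pixel-heavy frame on the same 24-unit pool.
print(allocate_units(24, vertex_work=300, pixel_work=100))
print(allocate_units(24, vertex_work=100, pixel_work=300))
```

With a pool of 24 units, the vertex-heavy frame gets 18 vertex and 6 pixel units, and the pixel-heavy frame gets the reverse. A fixed design would use the same split for both frames.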
Below are a few examples of workload needs, first in a "traditional" processing environment and then in a unified one (these are crude to keep it simple).
Example 1: Heavy vertex with light pixel processing needs.
In this example, the Pixel Shader is underutilized; it could be doing more in the frame, but is capped by the Vertex Shader output.
Example 2: Light vertex with heavy pixel processing needs.
In this example, the Vertex Shader is underutilized; it is capped by the Pixel Shader.
Example 3: Heavy vertex with light pixel processing needs.
In this example, the Pixel Shader can render the same amount as in Example 1, but there are 12 more vertex processors that can be utilized. This brings the graphics core to maximum capacity.
Example 4: Light vertex with heavy pixel processing needs.
In this last example, the Vertex Shader workload is the same as in Example 2, but there are 6 more pixel processors that can be utilized.
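The examples above can be approximated with a toy throughput model. The sketch below is only that: the unit counts (8 vertex, 16 pixel, 24 pooled) and the workload figures are hypothetical and are not taken from the original diagrams. It assumes a fixed design is limited by its slowest stage, and that a unified pool divides all work evenly with no scheduling overhead.

```python
def fixed_frame_time(vertex_work, pixel_work, vertex_units, pixel_units):
    """Frame time for dedicated units: the slower stage is the bottleneck."""
    return max(vertex_work / vertex_units, pixel_work / pixel_units)


def unified_frame_time(vertex_work, pixel_work, total_units):
    """Frame time for a pooled design, assuming ideal load balancing."""
    return (vertex_work + pixel_work) / total_units


def utilization(vertex_work, pixel_work, total_units, frame_time):
    """Fraction of available unit-time spent doing useful work."""
    return (vertex_work + pixel_work) / (total_units * frame_time)


# Hypothetical hardware: 8 vertex + 16 pixel units, or 24 unified units.
VERTEX_UNITS, PIXEL_UNITS = 8, 16
TOTAL_UNITS = VERTEX_UNITS + PIXEL_UNITS

# Hypothetical workloads in abstract "unit-time" amounts.
frames = {
    "heavy vertex, light pixel": (160, 80),
    "light vertex, heavy pixel": (40, 480),
}

for name, (v_work, p_work) in frames.items():
    t_fixed = fixed_frame_time(v_work, p_work, VERTEX_UNITS, PIXEL_UNITS)
    t_unified = unified_frame_time(v_work, p_work, TOTAL_UNITS)
    u_fixed = utilization(v_work, p_work, TOTAL_UNITS, t_fixed)
    print(f"{name}: fixed {t_fixed:.1f}, unified {t_unified:.1f}, "
          f"fixed utilization {u_fixed:.0%}")
```

Under these assumptions, the vertex-heavy frame takes 20 time units on the fixed design, with half the chip idle, and 10 on the unified pool. Real hardware would fall short of the ideal pooled figure, since switching units between tasks is not free.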
The point of this demonstration is that shaders are merely programs running on floating point processors. With the new standard in Direct3D 10, it does not make sense to design a fixed function shader architecture. With intelligent logic and a strong core, we should see significantly higher yields from a unified architecture than from a fixed function architecture with the same total number of floating point units (FPUs).