High-Tech And Vertex Juggling - NVIDIA's New GeForce3 GPU

Page 1:Introduction

Page 2:The General Features Of GeForce3

Page 3:GeForce3's New Vertex Shader - A Poor Name For A Great Set Of Features

Page 4:What Is A Vertex?

Page 5:Lighting

Page 6:Vertex Shader Details

Page 7:Programming The Vertex Shader

Page 8:Programming The Vertex Shader, Continued

Page 9:Programming The Vertex Shader, Continued

Page 10:Procedural Deformation

Page 11:Setup For Dot Product Bump Mapping (Per Pixel Bump Mapping)

Page 12:Reflection And Refraction

Page 13:More Effects

Page 14:The Programmable Pixel Shader Of GeForce3

Page 15:What Happens In The 3D Pipeline Before The Pixel Shader? Continued

Page 16:The Basics Of GeForce3's Pixel Shader

Page 17:2 Textures Per Clock Cycle, But 4 Textures Per Pass?

Page 18:Pixel Shader Programming, Continued

Page 19:Advances And Advantages Of The Pixel Shader

Page 20:Shadow Mapping

Page 21:Isotropic BRDF Based Lighting

Page 22:Blinn Bump Mapping = True Reflective Bump Mapping

Page 23:Anti-Aliasing - Removing The 'Jaggies'

Page 24:Quincunx ! Samples

Page 25:Higher Order Surfaces

Page 26:Higher Order Surfaces, Continued

Page 27:Higher Order Surface
Vertex Shader Details
The above diagram shows that the vertex shader can process vertices with up to 16 data entries. Each entry consists of four 32-bit floating-point numbers. Sixteen entries are quite a lot: an average vertex with its position coordinates, weight, normal, diffuse and specular color, fog coordinate and point size information fits easily, leaving plenty of space for the coordinates of several textures.
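To put a number on that, the storage works out to 16 entries x 4 values x 4 bytes = 256 bytes per vertex. The following Python sketch models such a vertex. Only the counts come from the text; the slot numbers in `SLOTS` and the helper names `make_vertex` and `pack_vertex` are assumptions made for illustration, not the hardware's actual attribute mapping.

```python
# Illustrative model of one vertex-shader input vertex: 16 entries,
# each holding four 32-bit floats. The slot assignments are assumptions
# for this sketch, not the hardware's documented mapping.
import struct

NUM_ENTRIES = 16   # entries per vertex (from the article)
COMPONENTS = 4     # floats per entry (from the article)

# Hypothetical slot layout for an "average" vertex.
SLOTS = {
    "position": 0,
    "weight": 1,
    "normal": 2,
    "diffuse": 3,
    "specular": 4,
    "fog": 5,
    "point_size": 6,
}
FIRST_FREE_SLOT = len(SLOTS)  # the remaining entries can hold texture coordinates


def make_vertex(**attributes):
    """Return a 16x4 list of floats with the given attributes filled in."""
    entries = [[0.0] * COMPONENTS for _ in range(NUM_ENTRIES)]
    for name, values in attributes.items():
        # Pad short attributes (e.g. a 3-component normal) with zeros.
        padded = list(values) + [0.0] * (COMPONENTS - len(values))
        entries[SLOTS[name]] = [float(v) for v in padded]
    return entries


def pack_vertex(entries):
    """Pack all entries into raw bytes as little-endian 32-bit floats."""
    flat = [component for entry in entries for component in entry]
    return struct.pack("<%df" % len(flat), *flat)


vertex = make_vertex(position=(1.0, 2.0, 3.0, 1.0), normal=(0.0, 1.0, 0.0))
vertex_bytes = pack_vertex(vertex)
free_slots = NUM_ENTRIES - FIRST_FREE_SLOT
```

With the seven attributes listed above occupying one entry each, this sketch leaves nine entries free for texture coordinates.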
Inside the vertex shader, data is processed one entry at a time. As we just learned, each entry is a set of four 32-bit numbers. This makes the vertex shader a SIMD (single instruction, multiple data) processor, since a single instruction operates on a set of four values at once. This makes perfect sense, because most transform and lighting operations rely on 4x4 or 3x3 matrix operations. Every value is treated as a floating-point number, so all computations executed by the vertex shader are true floating-point calculations. Basically, the vertex shader is a very powerful SIMD FPU, one that Pentium 4's SSE2 unit can barely touch.
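Why four-wide entries suit matrix work can be seen in a small sketch: multiplying a 4x4 matrix by a 4-component vector is nothing more than four 4-component dot products, one per matrix row. The Python below is only an illustration of that arithmetic. The function names `dp4` and `transform` and the sample translation matrix are choices made for this example, and plain Python of course performs the four multiplies one after another rather than in parallel.

```python
def dp4(a, b):
    """Four-component dot product: the work one 4-wide instruction would do."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3]


def transform(matrix_rows, v):
    """Multiply a 4x4 matrix (given as four rows) by a 4-component vector."""
    return [dp4(row, v) for row in matrix_rows]


# Translation by (10, 20, 30), stored row by row.
translate = [
    [1.0, 0.0, 0.0, 10.0],
    [0.0, 1.0, 0.0, 20.0],
    [0.0, 0.0, 1.0, 30.0],
    [0.0, 0.0, 0.0, 1.0],
]
position = [1.0, 2.0, 3.0, 1.0]   # w = 1 so the translation applies
moved = transform(translate, position)
```

Here `moved` holds the translated position, so a full transform costs four such dot products per vertex.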
The next important feature of the vertex shader is its 12 SIMD registers, each of which can also hold four 32-bit floating-point values. Those 12 registers are what the vertex processor can juggle with. Besides the 12 registers, which can be both read and written, the vertex shader offers a set of 96 SIMD constants of 4 x 32 bits each, which are loaded with parameters defined by the programmer before the program starts. Those constants can be used within the program and can even be addressed indirectly, but only one constant can be used per instruction, which is a bit of a bummer. If an instruction requires more than one constant, one of them has to be copied into a register by a previous load instruction. Typical uses of this large set of constant data are matrix data for the transform (usually a 4x4 matrix), light characteristics, procedural data for special animation effects, vertex interpolation data (for morphing/key frame interpolation), time (for key frame interpolation or particle systems) and more. There is a special kind of vertex program, called a 'vertex state program', which is actually able to write to the parameter block. Normal vertex programs are only able to read from it.
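The one-constant-per-instruction rule and its workaround can be sketched as a small operand check. This Python model is hypothetical: the `(kind, index)` operand notation and the function names are invented for the example, and it assumes the rule means "at most one distinct constant per instruction", which the text does not spell out. Only the register and constant counts are taken from the description above.

```python
NUM_TEMPS = 12       # read/write registers (from the article)
NUM_CONSTANTS = 96   # read-only constants (from the article)


def constants_used(sources):
    """Return the set of distinct constant indices among source operands.

    Operands are hypothetical (kind, index) pairs: ("r", 0..11) for a
    register, ("c", 0..95) for a constant.
    """
    used = set()
    for kind, index in sources:
        limit = NUM_TEMPS if kind == "r" else NUM_CONSTANTS
        if kind not in ("r", "c") or not 0 <= index < limit:
            raise ValueError("bad operand: %r" % ((kind, index),))
        if kind == "c":
            used.add(index)
    return used


def is_legal(sources):
    """Assumed rule: an instruction may read at most one distinct constant."""
    return len(constants_used(sources)) <= 1


# Adding two constants directly breaks the rule ...
direct = [("c", 4), ("c", 5)]
# ... so one constant is first copied into a register.
copy_step = [("c", 4)]             # r0 = c4
add_step = [("r", 0), ("c", 5)]    # r1 = r0 + c5
```

In this model `is_legal(direct)` is false, while `copy_step` and `add_step` both pass, at the cost of one extra instruction and one occupied register.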
The instructions themselves are very simple, and therefore easy to understand. The vertex shader does not allow loops, jumps or conditional branches, which means that it executes the program linearly, one instruction after the other. The maximum length of a vertex program is 128 instructions. By then the vertex should have been changed into what the developer intended, and it must have been transformed and lit. If more instructions are required, the vertex can enter the vertex shader once more.
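A straight-line execution model like this is easy to mimic. The sketch below is a toy interpreter in Python, not the real instruction set: the three operations in `OPS`, the tuple format of an instruction and the register names are all assumptions made for illustration. What it does take from the text is the 128-instruction limit and the absence of any branching, so every instruction runs exactly once, in order.

```python
MAX_INSTRUCTIONS = 128   # program length limit (from the article)

# Hypothetical three-instruction set, each working on four floats at once.
OPS = {
    "mov": lambda a, b: list(a),
    "add": lambda a, b: [x + y for x, y in zip(a, b)],
    "mul": lambda a, b: [x * y for x, y in zip(a, b)],
}


def run(program, registers):
    """Execute each instruction exactly once, in order, with no branching.

    An instruction is (op, dest, src_a, src_b); operands are register names.
    """
    if len(program) > MAX_INSTRUCTIONS:
        raise ValueError("program exceeds %d instructions" % MAX_INSTRUCTIONS)
    regs = {name: list(value) for name, value in registers.items()}
    for op, dest, src_a, src_b in program:
        regs[dest] = OPS[op](regs[src_a], regs[src_b])
    return regs


# Scale a position, then offset it: r1 = v0 * c0 + c1.
program = [
    ("mul", "r0", "v0", "c0"),
    ("add", "r1", "r0", "c1"),
]
state = run(program, {
    "v0": [1.0, 2.0, 3.0, 1.0],
    "c0": [2.0, 2.0, 2.0, 1.0],
    "c1": [0.5, 0.5, 0.5, 0.0],
})
```

Because there is no control flow, the cost of a program in this model is simply its instruction count, which is the same for every vertex.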
The final result that comes out of the vertex shader is yet another vertex, transformed into 'homogeneous clip space' and lit. It is important to note that the vertex shader is not able to create vertices or to destroy them. One vertex goes in and one vertex comes out.