High-Tech And Vertex Juggling - NVIDIA's New GeForce3 GPU

This section is aimed at readers who have some programming experience and at least a basic understanding of 3D math.

These are the instructions:

Instruction  Parameters               Action
nop                                   Do nothing
mov          dest, src                Move src into dest
mul          dest, src1, src2         Set dest to the product of src1 and src2
add          dest, src1, src2         Set dest to the sum of src1 and src2
                                      (the optional negation turns this into subtraction)
mad          dest, src1, src2, src3   Multiply src1 by src2, add src3, and store the result in dest
rsq          dest, src                Reciprocal square root of src
                                      (much more useful than a straight square root):
                                      dest.x = dest.y = dest.z = dest.w = 1/sqrt(src)
dp3          dest, src1, src2         3-component dot product
dp4          dest, src1, src2         4-component dot product
dst          dest, src1, src2         Calculate distance vector.
                                      src1 is (NA, d*d, d*d, NA) and src2 is (NA, 1/d, NA, 1/d);
                                      dest is set to (1, d, d*d, 1/d)
lit          dest, src                Calculate lighting coefficients from two dot products and a power. src is:
                                      • src.x = n · l (unit normal and light vectors)
                                      • src.y = n · h (unit normal and half-angle vectors)
                                      • src.z is unused
                                      • src.w = power (in the range -128 to +128)
min          dest, src1, src2         Component-wise min operation
max          dest, src1, src2         Component-wise max operation
slt          dest, src1, src2         dest = (src1 < src2) ? 1 : 0
sge          dest, src1, src2         dest = (src1 >= src2) ? 1 : 0
expp         dest, src.w              • dest.x = 2 ** (int)src.w
                                      • dest.y = fractional part (src.w)
                                      • dest.z = 2 ** src.w
                                      • dest.w = 1.0
log          dest, src.w              • dest.x = exponent((int)src.w)
                                      • dest.y = mantissa(src.w)
                                      • dest.z = log2(src.w)
                                      • dest.w = 1.0
rcp          dest, src.w              dest.x = dest.y = dest.z = dest.w = 1 / src.w
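
To get a feel for how these instructions combine, here is a minimal Python sketch that emulates dp3, rsq, mul and dst on 4-component registers and uses them to normalize a vector and to build the distance vector described in the table. It is an illustration only, not NVIDIA's implementation: the helper names (normalize3, distance_vector) are made up for this example, and the sketch assumes that dp3 copies its scalar result into all four components of dest, which the table does not spell out.

```python
import math

# Registers are modeled as 4-tuples (x, y, z, w).
# All function names below are illustrative, not part of any real API.

def dp3(a, b):
    # 3-component dot product. Assumption of this sketch: the scalar
    # result is replicated into every component of the destination.
    d = a[0] * b[0] + a[1] * b[1] + a[2] * b[2]
    return (d, d, d, d)

def rsq(s):
    # dest.x = dest.y = dest.z = dest.w = 1/sqrt(src)
    r = 1.0 / math.sqrt(s)
    return (r, r, r, r)

def mul(a, b):
    # Component-wise product of two registers.
    return tuple(x * y for x, y in zip(a, b))

def dst(src1, src2):
    # src1 = (NA, d*d, d*d, NA), src2 = (NA, 1/d, NA, 1/d)
    # result = (1, d, d*d, 1/d); note that d*d times 1/d gives d.
    return (1.0, src1[1] * src2[1], src1[2], src2[3])

def normalize3(v):
    len_sq = dp3(v, v)         # squared length in every component
    inv_len = rsq(len_sq[0])   # 1/length in every component
    return mul(v, inv_len)     # scale the vector (w is scaled as well here)

def distance_vector(v):
    len_sq = dp3(v, v)         # provides the d*d inputs
    inv_len = rsq(len_sq[0])   # provides the 1/d inputs
    return dst(len_sq, inv_len)

print(normalize3((3.0, 0.0, 4.0, 0.0)))
print(distance_vector((3.0, 0.0, 4.0, 0.0)))
```

The normalization takes three instructions and never needs a division or a plain square root, which is why the reciprocal square root is the more useful primitive. The (1, d, d*d, 1/d) vector produced by dst is typically combined with a dot product against attenuation constants, although that step is not shown here.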

These few instructions are already quite powerful. To make them easier to work with, NVIDIA added a few more features: 'costless' negation and swizzling...
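
As a rough illustration of why negation and swizzling matter, the following Python sketch models swizzling as reordering (or repeating) the components of a source register and negation as flipping the sign of every component. With both available on the sources, a cross product fits into one mul and one mad. The helper names (swz, neg, cross) and the register notation in the comments are assumptions made for this example, not the actual shader syntax.

```python
# Registers are modeled as 4-tuples (x, y, z, w).
# Helper names and the assembly-like comments are illustrative only.
_INDEX = {"x": 0, "y": 1, "z": 2, "w": 3}

def swz(reg, pattern):
    # Swizzle: read the source components in any order, e.g. "yzxw" or "xxxx".
    return tuple(reg[_INDEX[c]] for c in pattern)

def neg(reg):
    # Negation of a source register.
    return tuple(-c for c in reg)

def mul(a, b):
    # Component-wise product.
    return tuple(x * y for x, y in zip(a, b))

def mad(a, b, c):
    # Component-wise multiply-add: a * b + c.
    return tuple(x * y + z for x, y, z in zip(a, b, c))

def cross(a, b):
    # mul r0, a.yzxw, b.zxyw
    r0 = mul(swz(a, "yzxw"), swz(b, "zxyw"))
    # mad r0, -a.zxyw, b.yzxw, r0
    return mad(neg(swz(a, "zxyw")), swz(b, "yzxw"), r0)

print(cross((1, 0, 0, 0), (0, 1, 0, 0)))
```

Without these source modifiers, the same computation would need extra mov instructions to shuffle components and an extra step to form the subtraction.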

Summary