ATi Radeon256 Preview

Lessons Learned

Aside from the recent release of the Rage Fury MAXX, ATi hasn't really been in touch with the high-end 3D consumer/gamer. Its cards were mainly aimed at the average Joe and didn't appeal to the 3D fanatic who needs the latest and greatest. Those needs were mainly filled by 3dfx, Matrox and NVIDIA, and although ATi continued to do extremely well where it really counts (making money by selling tons of cards), it didn't touch the competition in that part of the market. Well, things have changed, and ATi is going full steam ahead in its attempt to produce a leading-edge 3D graphics card for the desktop. This was already apparent with its first big thrust into this market, the Rage Fury MAXX. Seemingly out of nowhere, ATi produced a card that nearly reached the performance of the competition. Although the card didn't have on-board T&L, it offered extremely good fill-rate and memory performance. Unfortunately, it still came up short of the competition's performance, and the price was, in most people's opinion, too high, so it didn't exactly take over the market. The thing to note here is that ATi went from supplying the world with low-priced, average-performing video cards to producing a high-performance graphics card, and did pretty well with its first attempt. The introduction of this latest technology shows that ATi is paying attention to what the market is asking for and to what developers are looking to take advantage of, right now and in the near future.

ATi's Next Generation Technology Feature Set

So what's all this fuss about? Features, features, features! ATi looked at the current trend and did a little research with developers to come up with a rather complete list of features. Let's take a look.

Important Features

  • Hardware T&L (features Vertex Skinning that blends up to four matrices and Keyframe Interpolation, pretty much equal to NVIDIA's 'vertex blending')
  • Large memory support (launching with 32 MB but capable of 128 MB)
  • High speed memory (launching with 333 MHz DDR but 366-400 MHz DDR a possibility later on)
  • Dual rendering pipelines capable of Tri-Texturing in a single pass (launching between 150-200 MHz core speed)
  • Full featured Pixel Shader support (DX8)
  • Hyper Z (lossless compression of Z-buffer data)
  • FSAA/Motion Blur/Depth of Field (hardware supported), all the features that require very high fill rates
  • Texture Compression (supports all DX modes)
  • Emboss, Dot Product 3 and Environmental Bump Mapping
  • Texture Transforming Capabilities
  • HDTV support (18 formats)
  • Adaptive De-Interlacing (enhanced video playback)

The Charisma Engine

The first feature is the Charisma Engine, ATi's fancy name for hardware T&L that is capable of 30M triangles/sec and up to 8 lights, local or infinite. ATi calls it TCL, for Transform, Clipping and Lighting, but it is basically the same T&L we know from NVIDIA, with a few enhancements.

Vertex Skinning with more than two matrices allows developers to give models or objects joints that appear more realistic. As you can see in the example below, the 2-matrix joint has visual flaws that make the joint look odd, while the 3-matrix joint looks well rounded. We saw all of that last year when NVIDIA presented its 'vertex blending'.

Using more than two matrices is possible on a card like the GeForce 256, but it must be done in software and may cost a bit of performance (keep in mind I'm referring to the NVIDIA T&L that's currently available and not to a future product). What this boils down to is that ATi's solution will be able to create, entirely in hardware, a life-like joint that stays well rounded when the model is moved through different angles.
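To make the idea of blending several matrices concrete, here is a minimal sketch of weighted vertex skinning in Python. It is not ATi's or NVIDIA's implementation; the function names (`transform`, `skin_vertex`) and the 3x4 matrix layout are assumptions chosen for illustration. Each vertex is transformed by up to four matrices, and the results are mixed according to per-vertex weights.

```python
# Illustrative sketch only: names and matrix layout are hypothetical.

MAX_MATRICES = 4  # the preview says the hardware blends up to four matrices


def transform(matrix, point):
    """Apply a 3x4 affine matrix (three rows of [a, b, c, t]) to a 3D point."""
    x, y, z = point
    return tuple(row[0] * x + row[1] * y + row[2] * z + row[3] for row in matrix)


def skin_vertex(point, matrices, weights):
    """Blend the positions produced by each matrix using the given weights."""
    if len(matrices) != len(weights):
        raise ValueError("need one weight per matrix")
    if not 1 <= len(matrices) <= MAX_MATRICES:
        raise ValueError("this sketch blends between 1 and 4 matrices")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights should sum to 1")

    blended = [0.0, 0.0, 0.0]
    for matrix, weight in zip(matrices, weights):
        moved = transform(matrix, point)
        for axis in range(3):
            blended[axis] += weight * moved[axis]
    return tuple(blended)


# Two example "bones": one leaves the vertex alone, one shifts it 2 units in x.
IDENTITY = ((1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0))
SHIFT_X = ((1, 0, 0, 2), (0, 1, 0, 0), (0, 0, 1, 0))

# A vertex at the joint, influenced equally by both bones.
joint_vertex = skin_vertex((1, 0, 0), [IDENTITY, SHIFT_X], [0.5, 0.5])
```

A vertex near a joint gets weights spread across several matrices, which is what smooths out the crease; adding a third or fourth matrix simply gives the artist more influences to balance.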

Keyframe Interpolation will allow developers to interpolate (or insert) frames between "keyframes". A good example is a face mesh with a plain expression (our first keyframe) and the same face mesh smiling (our last keyframe). The hardware is able to create as many in-between frames as a developer wants to make the transition from normal to smiling. In our example below, the model only has two frames created by the feature, but the developer could request more to build an animation that's even more convincing. This would be an excellent feature for having characters lip-sync their dialogue.
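As a rough illustration of what interpolating between two keyframes means, the following Python sketch blends two meshes linearly. The preview does not say which interpolation the hardware uses, so linear blending is an assumption here, and the function names are hypothetical.

```python
# Illustrative sketch only: assumes simple linear interpolation.

def interpolate_keyframes(start_mesh, end_mesh, t):
    """Return the mesh at fraction t (0.0 = start keyframe, 1.0 = end keyframe)."""
    if len(start_mesh) != len(end_mesh):
        raise ValueError("both keyframes need the same number of vertices")
    return [
        tuple(a + (b - a) * t for a, b in zip(start_vertex, end_vertex))
        for start_vertex, end_vertex in zip(start_mesh, end_mesh)
    ]


def in_between_frames(start_mesh, end_mesh, count):
    """Generate `count` evenly spaced frames strictly between the two keyframes."""
    return [
        interpolate_keyframes(start_mesh, end_mesh, step / (count + 1))
        for step in range(1, count + 1)
    ]


# One mouth-corner vertex: neutral face vs. smiling face (made-up coordinates).
neutral = [(0.0, 0.0, 0.0)]
smiling = [(0.0, 1.0, 0.0)]

frames = in_between_frames(neutral, smiling, 3)  # three frames between the keyframes
```

Requesting a larger `count` produces a smoother transition from the same two keyframes, which is the trade-off the developer controls.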

Pixel Tapestry Architecture

The second big feature is ATi's Pixel Tapestry Architecture (PTA), a fancy set of texturing abilities that the new hardware will possess. The new chip will have two rendering pipelines, and each pipeline will be able to apply three filtered textures per pixel. This ability comes from the fact that ATi has three texturing units per pipeline. That's awesome for multi-textured and/or heavily filtered games, but keep in mind this will NOT improve unfiltered, single-textured gaming environments. Consequently, this feature might not improve the pixel fill rate, but it'll definitely boost the texel rate through the roof. There are still many games today that don't use multi-texturing; however, almost all games are moving towards it, so don't think this will be a useless feature. The table below shows what the Radeon256's dual pipeline will be able to do compared to the GeForce 256's quad pipeline. Keep in mind that the pass counts below are per pipeline and not total.

Textures Per Pixel        | Radeon256 (3 Texturing Units) | GeForce 256   | GeForce2GTS
--------------------------|-------------------------------|---------------|---------------
1 Bilinear Filtered       | 1 pass                        | 1 pass        | 1 pass
1 Trilinear Filtered      | 1 pass                        | 1-2 passes    | 1 pass
2 Bilinear                | 1 pass                        | 2 passes      | 1 pass
1 Trilinear + 1 Bilinear  | 1 pass                        | 2-3 passes    | 1-2 passes
3 Bilinear                | 1 pass                        | 3 passes      | 2 passes
Number of Pipelines       | 2                             | 4             | 4
Core Clock                | 200 MHz                       | 120 MHz       | 200 MHz
Texel Fill Rate           | 1200 Mtexel/s                 | 480 Mtexel/s  | 1600 Mtexel/s
Pixel Fill Rate           | 400 Mpixel/s                  | 480 Mpixel/s  | 800 Mpixel/s

An important thing to note about the above information is that, apart from the superior performance of NVIDIA's upcoming GeForce2GTS chip, the Radeon256 and the GeForce 256 each have an advantage in certain circumstances. In a single-textured situation with tons of texture filtering, the GeForce 256 would theoretically provide the better performance, while in a heavily textured scene (many textures per pixel) the Radeon256 would take control. Keep in mind that currently there aren't any other graphics solutions that can apply three texels in one cycle in a single pipeline. On the other hand, NVIDIA's latest chips sport four pipelines vs. the Radeon256's two.
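The fill-rate figures in the table above follow from simple arithmetic: pixel fill rate is pipelines times core clock, and texel fill rate is that figure times the texturing units per pipeline. The Python sketch below reproduces the numbers. The texturing-unit counts for the GeForce 256 (one) and GeForce2GTS (two) are inferred from the table's fill rates rather than stated in the text, and the "effective" rate for multi-pass rendering is a simplification that ignores trilinear filtering and any per-pass overhead.

```python
import math

# Illustrative sketch only: theoretical peak figures, not measured performance.

def fill_rates(pipelines, texture_units, core_clock_mhz):
    """Return (pixel fill rate in Mpixel/s, texel fill rate in Mtexel/s)."""
    pixel_rate = pipelines * core_clock_mhz
    texel_rate = pixel_rate * texture_units
    return pixel_rate, texel_rate


def passes_needed(bilinear_textures, texture_units):
    """Passes one pipeline needs to apply the given number of bilinear textures."""
    return math.ceil(bilinear_textures / texture_units)


def effective_pixel_rate(pipelines, texture_units, core_clock_mhz, bilinear_textures):
    """Peak pixel rate divided by the number of passes (a deliberate simplification)."""
    pixel_rate, _ = fill_rates(pipelines, texture_units, core_clock_mhz)
    return pixel_rate / passes_needed(bilinear_textures, texture_units)


# (pipelines, texturing units per pipeline, core clock in MHz)
CHIPS = {
    "Radeon256": (2, 3, 200),
    "GeForce 256": (4, 1, 120),    # unit count inferred from the table
    "GeForce2GTS": (4, 2, 200),    # unit count inferred from the table
}

for name, (pipes, units, clock) in CHIPS.items():
    pixel, texel = fill_rates(pipes, units, clock)
    triple = effective_pixel_rate(pipes, units, clock, 3)
    print(f"{name}: {pixel} Mpixel/s, {texel} Mtexel/s, "
          f"{triple:.0f} Mpixel/s with 3 bilinear textures")
```

Under these assumptions the Radeon256 trails the GeForce 256 on single-textured pixels but pulls well ahead once three textures per pixel are in play, which matches the argument above.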

3D Textures

Another nifty little feature the PTA provides is 3D textures. Have you ever broken apart a tree (or any object, for that matter) with a weapon and noticed that the innards look very unrealistic, because the chopped surface is usually just a single-colored texture? With 3D textures you can give the tree "rings" inside, so that no matter how you cut the tree, the rings are drawn accurately, as if they were really there inside the tree, giving a very realistic appearance. 3D textures aren't limited to this, as developers may also create complex/dynamic light maps, volumetric fog, smoke, and liquid/fluid effects using this feature.
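The tree-ring example can be sketched in a few lines of Python. This is only an illustration of the concept: it builds a tiny volume of texels with concentric rings around the trunk axis and samples it with nearest-texel lookup. The function names, grid size and ring count are made up, and real hardware would add filtering.

```python
import math

# Illustrative sketch only: a tiny "3D texture" stored as nested lists.

def build_ring_volume(size, rings):
    """Fill a size x size x size grid with ring values (0 or 1) around the v axis."""
    volume = []
    for w in range(size):
        plane = []
        for v in range(size):
            row = []
            for u in range(size):
                # Distance of this texel's centre from the trunk axis (u = w = 0.5).
                du = (u + 0.5) / size - 0.5
                dw = (w + 0.5) / size - 0.5
                radius = math.hypot(du, dw)
                row.append(int(radius * 2 * rings) % 2)
            plane.append(row)
        volume.append(plane)
    return volume


def sample(volume, u, v, w):
    """Nearest-texel lookup with texture coordinates in the range 0.0 to 1.0."""
    size = len(volume)

    def index(coord):
        return min(int(coord * size), size - 1)

    return volume[index(w)][index(v)][index(u)]


tree = build_ring_volume(16, 4)

# Cutting the trunk at two different heights exposes the same ring at the same spot.
low_cut = sample(tree, 0.7, 0.1, 0.5)
high_cut = sample(tree, 0.7, 0.9, 0.5)
```

Because the color comes from the 3D position rather than from a flat image wrapped around the surface, any cut plane through the volume shows a consistent interior.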

Bump Mapping is something we've known about for some time now thanks to Matrox, which has offered it in its G400 product line. You can check out the G400 MAX review to see in greater detail what bump mapping looks like. Thanks to the triple texturing units of this upcoming hardware, ATi will be able to let developers use various effects such as Embossing, Environment Mapped Bump Mapping or Spherical/Dual-Paraboloid/Cubic Environment Mapping at virtually no performance hit.

Priority Buffers

One of the most interesting innovations derived from PTA is the Priority Buffer. A Priority Buffer is like a Z-buffer, but it stores the order of the objects rather than each object's distance from the viewpoint, as a Z-buffer does. Priority Buffers have been available in software, and ATi has now brought them into its hardware.

Shadow Mapping

Using Priority Buffers, ATi is pushing the use of Shadow Mapping, which renders the scene using the light source as the viewpoint. The scene is rendered to a Priority Buffer so that the shadow-casting object closest to the light has the highest priority, the next closest has the second-highest priority, and so on. This method of adding shadows to a game is easier and faster, and it removes issues that Volumetric Shadows have (e.g. objects unable to cast a shadow on themselves).
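Here is a deliberately simplified Python sketch of that idea, not ATi's hardware algorithm. It assumes a one-dimensional light view, one priority per whole object, and a hypothetical tuple format of (name, distance from light, first texel, last texel). Objects are ranked by distance from the light, the buffer records the highest-priority object seen at each texel, and a point is in shadow when something with a higher priority covers its texel.

```python
# Illustrative sketch only: 1D light view, one priority per object.

def build_priority_buffer(objects, width):
    """Rank objects by distance from the light and rasterise their priorities.

    objects: list of (name, distance_from_light, first_texel, last_texel).
    Returns (priority per object name, buffer of priorities per texel).
    Priority 0 is the highest (closest to the light).
    """
    ordered = sorted(objects, key=lambda obj: obj[1])
    priority = {obj[0]: rank for rank, obj in enumerate(ordered)}

    buffer = [None] * width
    # Draw from farthest to nearest so that nearer objects overwrite farther ones.
    for name, _distance, first, last in reversed(ordered):
        for texel in range(first, last + 1):
            buffer[texel] = priority[name]
    return priority, buffer


def in_shadow(name, texel, priority, buffer):
    """A point is shadowed if a higher-priority object covers its texel."""
    occupant = buffer[texel]
    return occupant is not None and occupant < priority[name]


scene = [
    ("floor", 10.0, 0, 7),  # far from the light, spans the whole buffer
    ("crate", 4.0, 2, 4),   # nearer the light, covers texels 2 to 4
]
priority, buffer = build_priority_buffer(scene, 8)
```

Comparing small integer priorities avoids the precision issues of comparing depth values, which is presumably part of the appeal; the trade-off in this sketch is that shadows are resolved only between whole objects.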

FSAA/Motion Blur/Depth Of Field

Some features that I was surprised to find out about are Full Screen Anti-Aliasing (FSAA), Motion Blur and Depth of Field. The AA is full screen and is supported in hardware through a special memory space in the frame buffer (similar to 3dfx's T-Buffer). The frame rate with FSAA is said to be excellent up to 800x600x32, and at 1024x768x32 in some games. I'll have to judge that for myself before I buy it. Motion Blur and Depth of Field are both supported, but no details were given about performance when these features are used. All three features require very high fill rates, because each one involves rendering either one very large frame or several different frames to create the frame that is finally displayed. 3dfx has created some mystique around those three features, which are supposed to make up for the lack of T&L support in the upcoming VSA100, but a high fill rate is really what it takes to offer them. Let's see if the 400 Mpixel/s of the Radeon256 will be enough. I have my doubts about that.
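To see why fill rate is the limiting factor, the Python sketch below estimates the fill rate needed when every displayed pixel is built from several rendered samples, and shows one way a larger frame can be averaged down to the displayed one. ATi has not said how its FSAA works, so the 2x2 box filter, the sample count, the target frame rate and the overdraw factor are all assumptions made for illustration.

```python
# Illustrative sketch only: all parameters are hypothetical.

def required_fill_rate(width, height, samples_per_pixel, frames_per_second, overdraw=1.0):
    """Estimate the fill rate in Mpixel/s needed to render every sample each frame."""
    samples = width * height * samples_per_pixel * overdraw * frames_per_second
    return samples / 1e6


def downsample_2x2(image):
    """Average each 2x2 block of a grayscale image (list of rows) into one pixel."""
    result = []
    for y in range(0, len(image), 2):
        row = []
        for x in range(0, len(image[0]), 2):
            total = image[y][x] + image[y][x + 1] + image[y + 1][x] + image[y + 1][x + 1]
            row.append(total / 4)
        result.append(row)
    return result


# Four samples per pixel at 60 frames per second, with no overdraw at all.
low_res = required_fill_rate(800, 600, 4, 60)
high_res = required_fill_rate(1024, 768, 4, 60)

# A hard black/white edge rendered at double resolution becomes a grey pixel.
smoothed = downsample_2x2([[0, 255], [255, 0]])
```

Real scenes draw many pixels more than once, so the `overdraw` factor multiplies these estimates further; that is the reason for the doubts about 400 Mpixel/s expressed above.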