<<<<<To Teasy: As for the overdraw problem, I realised I kind of repeated myself with the pipeline statement after I posted, but was too tired to bother changing it before going to bed
Still, what I meant was, possibly a hardware solution that would force either front to back or back to front sorting of objects/layers/whatever to maximise use of the pipelines... Kyro's 8 is sufficient for maybe the next 2 years, but once games start coming out that begin using more passes, more textures, etc, then you will still need a way to reduce the overdraw... or am I getting myself confused again... ah heck! That's what you get when a vet has a hobby interest in computers
>>>>>
Yeah you are getting confused between overdraw and texture layers.
The Kyro II can put 8 texture layers on each pixel in one pass. That means a pixel with up to 8 textures on it only has to be sent from the chip to RAM once, even though the chip only has one texture mapping unit on each pixel pipe.
Usually, if a card has only one texture mapping unit on each pixel pipe and wants more than 1 texture layer on the pixel it's working on, it first puts 1 texture on the pixel being rendered and that pixel passes to the framebuffer in RAM. Then the card needs to make a second pass over the same pixel to add another texture. This also means the poly being worked on needs to be resent over the AGP port to the card for each extra texture layer. So for 8 layers of textures on each pixel, a normal card with one texture mapping unit per pipe would need to do what I just described 4 times (8 passes to the framebuffer altogether), which would kill memory bandwidth completely and could also clog the AGP port, depending on how many polys are in the scene being rendered.
Now the Kyro II does only have one texture mapping unit on each pipe, but what it does is this. It adds the first texture layer to the pixel and, instead of sending it out to RAM, it uses its small on-chip cache to hold that pixel inside the chip. Then in the next clock cycle it adds the second texture layer while still keeping the pixel inside the chip, and then it can add another layer and another and so on until it has all 8 layers on the pixel (at this point the pixel has never left the chip). Then it sends the 8 layered pixel out to the framebuffer in RAM only once, so there's no memory bandwidth wasted on intermediate passes.
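To put rough numbers on the pass counting above, here is a minimal Python sketch. The function name `framebuffer_writes` and its parameters are made up for illustration, and the model is deliberately crude: it assumes one framebuffer write per pass and ignores the framebuffer reads a multipass card also needs for blending.

```python
def framebuffer_writes(layers, tmus_per_pipe=1, on_chip_blend=False):
    """Framebuffer writes per pixel for a given number of texture layers.

    Assumed model, not vendor data:
    - each pass applies at most `tmus_per_pipe` layers and then writes
      the pixel to the framebuffer once;
    - with `on_chip_blend` the pixel stays on the chip until every layer
      is applied, so it is written out exactly once.
    """
    if layers < 1 or tmus_per_pipe < 1:
        raise ValueError("layers and tmus_per_pipe must be at least 1")
    if on_chip_blend:
        return 1
    # Ceiling division: a partly filled last pass still costs a write.
    return -(-layers // tmus_per_pipe)


print(framebuffer_writes(8))                      # multipass card, 1 TMU per pipe
print(framebuffer_writes(8, on_chip_blend=True))  # Kyro II style
```

In this model the geometry also has to be resubmitted once per pass, so the same count stands in for the number of times the poly crosses the AGP port.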
Now overdraw is a different thing. Overdraw is when a card renders pixels over pixels it's already rendered in the framebuffer. The standard way of doing things is this. The card is sent each poly in turn and renders every pixel. Since the card doesn't know which pixels will actually be seen when the full frame is rendered, and only checks after the pixel has been rendered (depth testing), this leads to lots of pixels being overdrawn.
What the Kyro II does is first collect all the polys in the scene and then cut the scene up into tiles. The tiles are then sent to an on-chip cache one tile at a time, where the Kyro II checks which polys will be seen on the monitor when the frame is finished, and it fully renders that tile only after each pixel has been depth tested. Once the tile has been fully rendered it's sent to the framebuffer in RAM and the next tile is sent to the chip. So because the Kyro II checks which pixels will be seen before rendering, it never renders over pixels it's already rendered. This saves a massive amount of fillrate and also saves a massive amount of memory bandwidth.
In case anyone wants to know, the on-chip z-buffer can check 32 pixels in 1 clock cycle. Each tile is 32x16 pixels in size (512 pixels), so each tile needs 16 clock cycles to test the whole tile. The Kyro II can render the scene at the same time as depth testing, and since it can test 32 pixels per clock cycle but can only render 2 pixels per clock cycle, the depth testing is 16 times faster than the rendering. So the depth testing doesn't slow down rendering. There don't need to be any optimisations for this method of rendering; it gets rid of 100% of overdraw.
Most newish games have an overdraw average of at least 2 (this is an average and not a constant, because each frame will have a different amount of overdraw), which means that on average each pixel in the frames you're seeing on the screen has been written three times in total.
A game like Serious Sam has an overdraw level significantly higher than that, as do Tribes 2 and many other games.
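The difference between the two approaches can be sketched in a few lines of Python. This is only a toy model under loud assumptions: polys are opaque, axis-aligned rectangles `(x0, y0, x1, y1, z)` that lie fully on screen, smaller `z` is nearer, and the function names are invented for the example. It is not how the Kyro II hardware is actually implemented.

```python
TILE_W, TILE_H = 32, 16  # tile size quoted in the post


def immediate_shaded(polys):
    """Pixels shaded when every poly is rendered in full as it arrives."""
    return sum((x1 - x0) * (y1 - y0) for x0, y0, x1, y1, _z in polys)


def deferred_visible(polys, width, height):
    """Resolve visibility tile by tile before shading anything.

    Returns {(x, y): z} holding only the nearest poly depth per covered
    pixel, so each visible pixel is shaded exactly once.
    """
    visible = {}
    for ty in range(0, height, TILE_H):
        for tx in range(0, width, TILE_W):
            for y in range(ty, min(ty + TILE_H, height)):
                for x in range(tx, min(tx + TILE_W, width)):
                    covering = [z for x0, y0, x1, y1, z in polys
                                if x0 <= x < x1 and y0 <= y < y1]
                    if covering:
                        visible[(x, y)] = min(covering)  # nearest wins
    return visible


def tile_cycles(pixels_per_clock):
    """Clock cycles to process one full tile at the given rate."""
    return (TILE_W * TILE_H) // pixels_per_clock


# Three full-screen layers on a 64x32 screen, submitted back to front.
scene = [(0, 0, 64, 32, 3.0), (0, 0, 64, 32, 2.0), (0, 0, 64, 32, 1.0)]
shaded_now = immediate_shaded(scene)
shaded_deferred = len(deferred_visible(scene, 64, 32))
print(shaded_now, shaded_deferred, shaded_now / shaded_deferred)
print(tile_cycles(32), tile_cycles(2))  # depth test vs render, per tile
```

With three stacked layers the immediate path shades every pixel three times while the deferred path shades each one once, and the per-tile cycle counts reproduce the 16 times gap between depth testing and rendering.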
Warden:
<<<<<Are you sure about every card (except the GF3 of course) using software T&L in Aquanox? I understand your reasoning, in that since they don't have Vertex Shaders they can't do the native T&L that Aquanox is written for. But I am unclear if this requires ALL T&L duties to be done by the CPU. I have read rumors that the CPU only has to do part of the work (the vertex shader modifications) but that the hardwired T&L unit still does some or most of it. Do you have some more info on this?>>>>>
Anand's and others' Aquanox tests show that every card, including the Kyro II, has the same poly throughput. The only card that showed a much higher poly throughput was the GeForce 3. 3DMark2001 uses a custom skinning technique for all the characters when it runs on a DX7 HW T&L unit. On a GeForce 3 or a Kyro II it uses normal vertex shader skinning, done either totally in hardware on the GeForce 3 or totally in software on the Kyro II. But on a card with a DX7 HW T&L unit, the skinning is done by the CPU and the HW T&L unit transforms and lights the skinned vertices.
Though this method is not used in Aquanox at this point. Whether they will use this method or not in future I have no idea.
Something I have to say about the Aquanox benchmark is that the final game will not be as slow as the benches shown. The benchmark (Aquamark) is made specifically to stress the graphics card, and the final game will be a lot faster. I got this info from Massive, the people making the game, when I signed an NDA to get a copy of the benchmark. Yep, I had to sign an NDA just to get the bench; they're really keeping this bench under wraps at the moment.
Edited by Teasy on 04/26/01 10:41 PM.