Texture units & more than 1 texture?

OneOfMany

Distinguished
Jan 8, 2004
1
0
18,510
I can't remember which article it was, but it claimed something along the lines of a card having 4 pixel pipelines, each with 1 texture unit, and each texture unit being capable of applying up to 16 textures.

Huh? Now, I haven't kept up with the graphics world like I did up till the Geforce2, but at that time it was 1 texture per texture unit. And I know that with the shaders it got a bit convoluted (they could apply multiple light maps in a single pass), but this is making it hard for me to determine if I really should upgrade or not.

Currently I've got a Geforce2 32MB video card: 4x2 (pixel pipelines x texel units) at a 200MHz core with 166MHz DDR RAM. Now I know memory speed has gone way up, but a lot of the more current cards I've looked at in the third VGA roundup only have a 4x1 design, with a core speed about double the Geforce2's. Wouldn't that make them about the same speed? Or are the T&L engines and shaders (and everything else) really helping that much? I'm only running a dual PIII 700MHz, so my CPU would definitely be a limiting factor. Any insight would be greatly appreciated.
 

Vimp

Distinguished
Jul 13, 2003
358
0
18,780
Although I can't answer your question, I may be helpful. You have pretty much the same card I have, except mine has 64MB of DDR compared to your 32MB; both have a 200MHz core and memory at 333MHz DDR.

However, I believe there are only 3 main determining factors that reveal a card's general value. The first 2 mainly affect a card's performance, while the third affects how nice the visuals are.

<b>1.</b> Fill rates: pixel fill rate and texel fill rate. These are determined by a card's core speed times its pipelines and their design. For instance, a 4x2 pipeline design would work like this to determine the fill rates (I'll use our Geforce cards as an example):
Pixel fill rate: 200MHz core x 4 pixel pipelines = 800Mpixels/s.
Texel fill rate: 200MHz core x 8 texture units (4x2 design) = 1600Mtexels/s.
Every program stresses each of these to a different degree, so performance varies depending on which is in most demand by that program.

<b>2.</b> Memory bandwidth. Memory bandwidth is calculated by multiplying the memory speed by the width of the memory bus.
Today's video cards use a 128-bit (our Geforce2 cards use 128-bit) or a 256-bit bus. However, you multiply by bytes, not bits, and since there are 8 bits in a byte, 128 bits is 16 bytes and 256 bits is 32 bytes. So (using our Geforce cards as an example again) we multiply the memory speed by the bytes:
333MHz DDR x 16 bytes (128 bits) = 5328Mbytes/s
I believe memory bandwidth is a large determining factor in the ability to quickly render a lot of textures, or high-quality textures, which many games make use of. However, the amount of memory also plays a part.

<b>3.</b> Special effects and technology. DirectX support has a large bearing on which special effects you'll be able to see in games. Our Geforce2 cards only support up to DirectX 7 effects, whereas the latest cards can show up to DirectX 9 effects, allowing for a much nicer-looking game if the game makes use of the latest DirectX features. However, both Nvidia and ATI also have their own unique special-effect features, as well as their own technology. The technology seems to have a lot to do with how effective the video card is at making full use of its pixel and texel fill rates, as well as its potential memory bandwidth.

So to sum things up, let's compare our Geforce2 GTS/Pro cards to Nvidia's GeforceFX 5700 Ultra.

<b>Geforce2 GTS:</b>
<i>200MHz core,
4 pipelines with a 4x2 design,
200 x 4 = <font color=purple>800Mpixels/s</font color=purple>. 200 x 8 (4x2) = <font color=purple>1600Mtexels/s</font color=purple>
333MHz memory,
128-bit (16-byte) bus width,
333 x 16 = <font color=purple>5328MB/s</font color=purple>
<font color=purple>DirectX 7</font color=purple> effects.</i>

<b>GeforceFX 5700 Ultra:</b>
<i>475MHz core,
4 pipelines with a 4x1 design,
475 x 4 = <font color=purple>1900Mpixels/s</font color=purple>. 475 x 4 (4x1) = <font color=purple>1900Mtexels/s</font color=purple>
900MHz memory,
128-bit (16-byte) bus width,
900 x 16 = <font color=purple>14400MB/s</font color=purple>
<font color=purple>DirectX 9</font color=purple> effects.</i>

Comparing these two shows the 5700 Ultra doing a lot better than the Geforce2 in most regards, and a fair bit better in terms of texel fill rate.
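
If you want to double-check the math yourself, here's the same arithmetic as a quick Python script (the helper names are just mine, for clarity):

```python
# Fill rate and bandwidth formulas from this post, with the
# card figures quoted above.

def fillrates(core_mhz, pipelines, texture_units_per_pipe):
    pixel = core_mhz * pipelines                            # Mpixels/s
    texel = core_mhz * pipelines * texture_units_per_pipe   # Mtexels/s
    return pixel, texel

def bandwidth(effective_mem_mhz, bus_width_bits):
    return effective_mem_mhz * (bus_width_bits // 8)        # MB/s

# Geforce2 GTS: 200MHz core, 4x2 design, 333MHz DDR, 128-bit bus
print(fillrates(200, 4, 2), bandwidth(333, 128))   # (800, 1600) 5328

# GeforceFX 5700 Ultra: 475MHz core, 4x1 design, 900MHz DDR, 128-bit bus
print(fillrates(475, 4, 1), bandwidth(900, 128))   # (1900, 1900) 14400
```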
 

cleeve

Illustrious
<i>OK, first of all, you have to understand what texels are.</i>

The texel fill rate describes the number of texture pixels ('texels' for short) that can be applied in the rendering process. In the case of the original GeForce 256, each pipeline can apply one texel to each pixel it's rendering, or several pipelines (up to four, because the GeForce 256 has 4 pipelines) can apply texels to one pixel in parallel.

<b>So a theoretical GeForce 256 chip running at 100 MHz has a pixel fillrate of 400 Megapixels/second and a texel fillrate of 400 Megatexels/second (100 MHz x 4 pipelines).</b>

The Geforce2 has the same number of pixel pipelines as the GeForce 256, but the difference is that each pipeline can output 2 texels.

Those four pipelines can either produce one pixel per clock each, or they can work together for things like multitexturing.

<b>Because it has 2 texels per pipeline, a theoretical Geforce2 chip running at 100 MHz has the same pixel fillrate of 400 Megapixels/second (100 MHz x 4 pipelines)...

BUT the TEXEL FILLRATE is DOUBLED to 800 Megatexels/second (100 MHz x 4 pipelines x 2 texels)!!!</b>

<i>Next, a little background on rendering passes:</i>

Now, let's have a look at the Geforce3.
From a brute force hardware point of view, the pixel pipeline strength is pretty similar to the GeForce2. It has 4 pipelines and each pipeline can fetch two texels per clock cycle.
<b>BUT... the geforce 3 can utilize up to four textures per rendering pass.</b>

This difference cannot be seen if you only count clock cycles or check theoretical fill rates.
<b>GeForce3's pixel shader has a significant advantage over GeForce2 once three or four textures are used per pixel.</b>
The GeForce2 can only apply two textures per pixel. If you want to apply more, the pixel has to go through another rendering pass. GeForce3's pixel shader may require 2 clock cycles for three or four textures, but still only one pass.

Because textures have to be read every pass, this saves a lot of bandwidth in multiple-texture situations on the Geforce3. In a 3- or 4-texture situation, the Geforce2 will be reading the textures from memory every clock cycle (using bandwidth), whereas the Geforce3 only has to read the textures in once per rendering pass, which can consist of multiple clock cycles (saving bandwidth).
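
If it helps, here's the pass-counting idea as a toy Python sketch (my own simplification; it ignores the Geforce3's 2-clock case I just mentioned):

```python
import math

# A card that can apply at most N textures per pass needs
# ceil(textures / N) rendering passes, and it has to re-read
# its textures from memory on every pass.

def passes_needed(textures, max_textures_per_pass):
    return math.ceil(textures / max_textures_per_pass)

for textures in (2, 3, 4):
    gf2 = passes_needed(textures, 2)   # Geforce2: 2 textures per pass
    gf3 = passes_needed(textures, 4)   # Geforce3: up to 4 per pass
    print(f"{textures} textures -> Geforce2: {gf2} pass(es), Geforce3: {gf3} pass(es)")
```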

<i>Finally, you need to learn about Texture Units:</i>

<b>The 8500 still has the same 4 pipelines... but each pipeline has TWO TEXTURE UNITS, each of which can process 3 TEXELS.</b>
It gets more confusing here because, as I understand it, the 8500 can only access one of the texture units per clock cycle, but it can access 11 separate textures in one pass. This makes it an ideal GPU for multitextured operations.

<b>So a Radeon 8500 running at 100 MHz would have the exact same pixel fillrate as the Geforce2 and Geforce3... 400 Megapixels/second (100 MHz x 4 pipelines)... but it has a theoretical texel fillrate of 1200 Megatexels/second
(100 MHz x 4 pipelines x 1 texture unit x 3 texels).</b>

Not only that, but it can handle up to 12 textures in one pass! So in heavily multitextured situations, the Radeon 8500 can complete in one pass what the Geforce3 would need two or three passes to accomplish.
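
To put the whole 100 MHz thought experiment in one place, here's the multiplication as a script (per-clock figures as I quoted them above):

```python
CLOCK_MHZ = 100  # the theoretical clock used throughout this rant

chips = {
    # name: (pipelines, texels per pipeline per clock)
    "GeForce 256": (4, 1),
    "Geforce2":    (4, 2),
    "Geforce3":    (4, 2),
    "Radeon 8500": (4, 3),  # only one 3-texel texture unit usable per clock
}

for name, (pipes, texels) in chips.items():
    pixel = CLOCK_MHZ * pipes            # Mpixels/s
    texel = CLOCK_MHZ * pipes * texels   # Mtexels/s
    print(f"{name}: {pixel} Mpixels/s, {texel} Mtexels/s")
```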


So to conclude, really all I had to say was that you've gotten the <b>TEXELS</b> in the Geforce2 (4 pipelines with 2 texels each) confused with the <b>TEXTURE UNITS</b> in newer video cards (like the Radeon 8500, which has 4 pipelines with 2 texture units of 3 texels each).

But I was in the mood to rant a bit.

Remember also that the biggest difference between the Geforce2 and newer video cards is DirectX shader capability. The Geforce2 has no pixel shaders, so it is incapable of pixel shader effects like the nifty water in Morrowind, or all of those neat effects coming in Half-Life 2.


________________
<b>Radeon <font color=red>9500 PRO</b></font color=red> <i>(hardmodded 9500, o/c 322/322)</i>
<b>AthlonXP <font color=red>2600+</b></font color=red> <i>(o/c 2400+ w/143Mhz fsb)</i>
<b>3dMark03: <font color=red>4,055</b></font color=red>
 

Vimp

Distinguished
Jul 13, 2003
358
0
18,780
Starting from the bottom up.
The Geforce2 cards do have pixel shaders and can show the special effects in Morrowind. However, they may not look as good as on a more recent card, since the Geforce2 uses pixel shaders version 0.5, but the difference, if any, is probably unnoticeable in games like Morrowind, which probably don't make use of effects that the Geforce2 can't do. I know UT2k3, with its amazing effects, doesn't use any special effects that my Geforce2 card can't do, and UT2k3 is nicer looking than Morrowind. What the Geforce2 does not have is vertex shaders; whatever they do, I don't know.

Also, other sources, such as The Tech Report, claim that the Radeon 8500 does not have a pixel fillrate of 400Mpixels/s like you say, but rather 1100Mpixels/s and 2200Mtexels/s, as opposed to the 1200 you stated. The results found at The Tech Report coincide with the formula I mentioned: core speed multiplied by the pipelines for pixel fill rate, and core speed multiplied by the textures per clock for texel fill rate. The 8500, according to this same source, uses the same pipeline design of 4 pipelines times 2 texture units per pipeline, just like the Geforce2 does.
However, I'm not saying you're wrong and they are right, merely pointing out that there's conflicting info going around about what you said.
 

cleeve

Illustrious
Vimp dude, I gotta respectfully disagree with you here.

1. The Geforce2 had something called the Nvidia Shading Rasterizer, or something like that; NSR, I think it was called. It didn't have what we'd consider to be a pixel shader. It wasn't programmable; it just had a suite of pre-made DirectX 7 effects... like bump mapping, transparency, etc.

A pixel shader is programmable. It can run custom effects created by a software developer. Effects like realistic water, blurs, lighting effects... anything a developer can dream up and program.

The Geforce2 can't do pixel-shaded effects like the shader-calculated water in Morrowind.
The only cards that can do that are DirectX 8 (or better) class cards like the 8500, the Geforce3, and better.

If you saw the difference pixel-shaded water makes in Morrowind or Lock On: Modern Air Combat, or the blur effect in Need for Speed Underground... you would know there is a HUGE difference. It's why all the hype about pixel shaders goes on here. Pixel shaders are the future... at least for a while, until the next big thing comes along... but we have hardly seen current software scratch the surface of their capabilities. There have been a few instances up till now, but Doom3 and Half-Life 2 are the first games that will really showcase pixel shader use, IMHO.

Unreal Tournament 2003 is not a really good example to compare, because it's basically a DirectX 7 title with a few DirectX 8 effects thrown in for people who have DX8-class cards... but none of those effects are very striking.


2. In my rant, I said each of the GPUs was running at a <b>theoretical 100 MHz</b> to compare what their capabilities would be at the same speed... and because 100 MHz is an easy multiplier.


3. The Geforce2 doesn't have texture units like the Radeon 8500, but the terminology sounds so similar it might be confusing.

The Geforce2 has 4 pipelines that can process 2 <b>texels</b> per pipeline.
Could be considered a 4x2 architecture, right?

The Radeon 8500 has 4 pipelines that have 2 <b>texture units</b> per pipeline.
Also sounds like a 4x2 architecture, right?

To the uninformed viewer they look identical. But in actuality, the Radeon 8500 can process <b>3 texels in <i>each</i> one of its 2 texture units!</b>

So the Radeon 8500 is a 4x2x3 architecture! But since the Radeon 8500 and its texture units arrived, the old way of quoting texels per pipe has become obsolete.

Note that the Radeon 9000 is considered a 4x1 architecture even though each of its pipes can process 6 texels! So if we were still quoting the old way, the Radeon 9000 would be considered a 4x6 architecture... which it is never quoted as.

________________
<b>Radeon <font color=red>9500 PRO</b></font color=red> <i>(hardmodded 9500, o/c 322/322)</i>
<b>AthlonXP <font color=red>2600+</b></font color=red> <i>(o/c 2400+ w/143Mhz fsb)</i>
<b>3dMark03: <font color=red>4,055</b></font color=red>
 

phial

Splendid
Oct 29, 2002
6,757
0
25,780
I think I read somewhere that the GF4 Ti series were only 4x1 when doing aniso filtering; dunno how many textures they can render per clock though...

-------


<A HREF="http://www.albinoblacksheep.com/flash/you.html" target="_new">please dont click here! </A>
dhlucke - "Phew...ok my wrists are hurting. I'm taking a break."
 

Vimp

Distinguished
Jul 13, 2003
358
0
18,780
I won't claim to be an expert on graphics cards, so it's quite possible that I'm ignorant of many things. However, in the relatively short while that I have been learning more deeply about how video cards work, I have come to know a fair bit. Naturally, then, I will tend to argue points based on what I have learned until I have seen evidence to the contrary, as I'm sure you can appreciate.

This is a quote from a review of the Geforce2 GTS when it was still a new card. If cards at that time never had more than one texture per pipeline, then they certainly wouldn't have simply gotten confused about what they were saying. Also, if it were a mere misprint, then the misprint has consistently been made every other time I see the Geforce2's pipeline design mentioned.
NVIDIA claims "all the engines" in the GeForce 2 have been "significantly tweaked and optimized." Comparing the GeForce 2 to the original GeForce, NVIDIA emphasizes the GF2's "second generation" T&L engine and its "GigaTexel Shader." Near as we can tell, the GeForce 2 GTS is basically a die-shrunk GeForce DDR with integrated digital flat-panel support, an improved motion video processor, <b>and the added ability to process a second texture per pixel pipeline.</b> It has a higher core clock speed and a slightly higher memory clock.
I can't comment much on Morrowind's water effects, since I don't have the game to check what my card can do. However, comparing NFS: Underground is kinda silly, since some of its effects are done using DirectX 9, including the motion blur effect, I believe. But most of NFS's effects are shown on my card, and motion blur effects can be seen using my card in other games like GTA3.
 

cleeve

Illustrious
***<b>PIPELINES AND TEXTURES</b>***

If cards at that time never had more than one texture per pipeline, then they certainly wouldn't have simply gotten confused about what they were saying. Also, if it were a mere misprint, then the misprint has consistently been made every other time I see the Geforce2's pipeline design mentioned.
Vimp, I've said over and over that the geforce2 has four pipelines and can process two texels per pipeline.

A texel is a texture on a pixel.

Four pipelines that can process two textures each. That is exactly what I have been saying all along.

What the Geforce2 does NOT have is <b>texture units</b>.

This is key:
<b>A TEXTURE UNIT is not the same as a TEXEL</b>

EACH pipeline of the Radeon 8500 has two TEXTURE UNITS, which are not texels. Think of a texture unit as hardware that can handle a number of textures, or texels. The radeon 8500's Texture Units can handle three texels each.

So if you're describing them in the format "Pipelines x Textures", then a Geforce2 is a 4x2 architecture, and the Radeon 8500 is a 4x6 architecture.

But like I said, nobody uses this way of describing pixel pipeline power anymore. Now, the first number is used for pixel pipelines, and the second number is NOT texels, but TEXTURE UNITS.

If you describe them the way hardware reviewers do today, you describe them as "Pipelines x Texture UNITS". Then the Radeon 8500 is a 4x2 architecture, and the Geforce2 is a 4x0 architecture. You dig?

That is why the Radeon 8500 is described as a 4x2 architecture.
And that's why the Radeon 9000 is described as a 4x1 architecture in all the reviews, despite the fact that it has 4 pipes and can process 6 texels per pipe.

Look at this chart of Tom's that shows the Radeon 8500 having 2 texture units per pipe that can process 3 texels (textures) each, and the 9000 having 1 texture unit per pipe that can process 6 texels (textures) each.

<A HREF="http://www6.tomshardware.com/graphic/200207181/index.html" target="_new">http://www6.tomshardware.com/graphic/200207181/index.html</A>


***<b>PIXEL SHADER EFFECTS</b>***

As far as pixel shader effects, here is a screenshot of Morrowind with a DirectX 7 card like the Geforce2... no pixel shaders:

<A HREF="http://www.xgr.com/pic.php?img=/Images/Articles/2523/Morrowind23.jpg" target="_new">http://www.xgr.com/pic.php?img=/Images/Articles/2523/Morrowind23.jpg</A>

Now, here is a screenshot of morrowind on a DirectX 8 card, like the Radeon 8500:

<A HREF="http://www.gamespy.com/asp/image.asp?/articles/march02/morrowind/morrowind1big.jpg" target="_new">http://www.gamespy.com/asp/image.asp?/articles/march02/morrowind/morrowind1big.jpg</A>

Now, the water looks noticeably better in this screenshot, but in actual play it's MUCH better, because what it doesn't show is that the pixel-shaded water is MOVING like real water, REFLECTING beautifully while it does.

It can do this because its behavior is programmed by a developer who created the effect. Developers can do amazing things with programmable pixel shaders, but they cannot do those same things with the simple shaders in DirectX 7 class cards, because those shaders are not programmable. They can only run the effects they were built for.

Here is a shot of LOMAC's water, which also requires a DirectX 8 card to view in the game:

<A HREF="http://www.lo-mac.com/ss/Mirage-Tomcat.jpg" target="_new">http://www.lo-mac.com/ss/Mirage-Tomcat.jpg</A>

Programmable pixel shaders allow for this, but the Geforce2's NSR (primitive non-programmable shader) is not capable of this sort of thing.


As far as motion blur in GTA vs. NFS: Underground, they are very different things, and if you saw both games you'd understand why.

The motion blur in GTA3 is approximated using DirectX 7 class effects... basically, re-rendering simple geometry multiple times for the effect (which is very processor intensive).

The motion blur in NFS: Underground is pixel-shader based. That means the processor does not have to calculate extra geometry to approximate the blur (the pixel shader does it all), and the blur also looks *much* more realistic.
If the developers only allow the blur to appear on DX9 cards, that was a design decision by them; perhaps the added programmability of the DirectX 9 spec made it easier to implement than making a backwards-compatible DX8 shader blur.
But regardless, a DirectX 7 card does not have the capability to process custom shader instructions at all.

Here, have a look at Tom's chart that shows how the Geforce2 Ti... the last and greatest version of the Geforce2... does not have what would today be considered a pixel shader:

<A HREF="http://www6.tomshardware.com/graphic/20011218/geforce-ti-01.html" target="_new">http://www6.tomshardware.com/graphic/20011218/geforce-ti-01.html</A>

The first cards to have what we consider "pixel shaders" didn't arrive until the Geforce3 and Radeon 8500.

One of the reasons the Geforce4 MXs got a bad rep is because they don't even have pixel shaders... only the Geforce4 Tis do, not the Geforce4 MXs...

The Geforce4 MXs (the 420, 440, and 460) are only DirectX 7 hardware, and can't show pixel-shaded effects.

Of course, the Geforce4 Ti series (the 4200, 4400, 4600, and 4800s) DO have pixel shaders, and are true DirectX 8 class hardware.

________________
<b>Radeon <font color=red>9500 PRO</b></font color=red> <i>(hardmodded 9500, o/c 322/322)</i>
<b>AthlonXP <font color=red>2600+</b></font color=red> <i>(o/c 2400+ w/143Mhz fsb)</i>
<b>3dMark03: <font color=red>4,055</b></font color=red>
 

cleeve

Illustrious
Did I clear this up for you Vimp? Don't tell me that massive essay was for nothing...

________________
<b>Radeon <font color=red>9500 PRO</b></font color=red> <i>(hardmodded 9500, o/c 322/322)</i>
<b>AthlonXP <font color=red>2600+</b></font color=red> <i>(o/c 2400+ w/143Mhz fsb)</i>
<b>3dMark03: <font color=red>4,055</b></font color=red>
 

GeneticWeapon

Splendid
Jan 13, 2003
5,795
0
25,780
When I get off work, my balls smell like spoiled eggs.

<b>Athlon XP 2100+ @2.02Ghz
MSI K7N2 Delta-L nForce2 Ultra 400
768mb of Generic DDR266 @310 6-3-3-2
Built by ATi R9800 @410/660</b>
 

cleeve

Illustrious
Me too.

At least that's what the wife tells me. Poor lass.

________________
<b>Radeon <font color=red>9500 PRO</b></font color=red> <i>(hardmodded 9500, o/c 322/322)</i>
<b>AthlonXP <font color=red>2600+</b></font color=red> <i>(o/c 2400+ w/143Mhz fsb)</i>
<b>3dMark03: <font color=red>4,055</b></font color=red>
 

Vimp

Distinguished
Jul 13, 2003
358
0
18,780
Yeah, I now see what you're saying with the pixel shaders, and concede. I'm actually glad, too, because I now realize I have been missing out on a lot of cool effects that I wasn't aware of, which gives me something to look forward to.

However, I still disagree about the pipelines thing. Is the Mtexels/s fill rate determined by multiplying the core speed by the number of texels, or by the number of texture units?

If it's by texels, then according to your stats the Geforce2 has a fill rate of 1600Mtexels/s (200MHz x 8 texels) and the 8500 has a fill rate of 6600Mtexels/s (275MHz x 24 texels). If this is correct, then The Tech Report has almost all of its texel fill rate stats incorrect on its video card charts.
However, if the fill rate is determined by the texture units, then the Geforce2 has a fill rate of 800Mtexels/s and the 8500 has a fill rate of 2200Mtexels/s.

The first case gives the correct fill rate for the Geforce2 but a very incorrect fill rate for the 8500.
Likewise, using the second method, the fill rate for the Geforce2 ends up being half what it really is, but the 8500's fill rate is now correct. However, this all assumes that the Geforce2 has a 4x0 pipeline with 2 texels per pipe and the 8500 has a 4x2 pipeline with 3 texels each.

However, the only way to get both showing the correct fill rates is to multiply the core speed by the texture units and pretend that the Geforce2 actually does have a 4x2 pipeline with 8 texture units total, just like the 8500. Only then does the math give the correct fill rates for each card, which means the Geforce2 must have a 4x2 pipeline, with that 2 representing texture units per pipe and not texels per pipe like you suggest.
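
Written out as a quick script, the mismatch I'm describing looks like this (numbers exactly as quoted above; the layout is just mine, for checking):

```python
cards = {
    # name: (core MHz, total texels/clock, total texture units, quoted Mtexels/s)
    "Geforce2 GTS": (200, 8, 4, 1600),   # 4 units if you count one per pipe
    "Radeon 8500":  (275, 24, 8, 2200),  # 4 pipes x 2 units, 3 texels each
}

for name, (core, texels, units, quoted) in cards.items():
    by_texels = core * texels   # core speed x texels per clock
    by_units = core * units     # core speed x texture units
    print(f"{name}: by texels {by_texels}, by units {by_units}, quoted {quoted}")

# Neither formula matches the quoted figure for BOTH cards at once,
# which is exactly the problem.
```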
 

cleeve

Illustrious
Now a fill rate lesson, eh? OK...

Instead of writing another essay I'll let Tom do it for me. He said it best here, comparing the Geforce2 to the Geforce3:

<A HREF="http://www.tomshardware.com/graphic/20010227/geforce3-18.html" target="_new">http://www.tomshardware.com/graphic/20010227/geforce3-18.html</A>



________________
<b>Radeon <font color=red>9500 PRO</b></font color=red> <i>(hardmodded 9500, o/c 322/322)</i>
<b>AthlonXP <font color=red>2600+</b></font color=red> <i>(o/c 2400+ w/143Mhz fsb)</i>
<b>3dMark03: <font color=red>4,055</b></font color=red>
 

Vimp

Distinguished
Jul 13, 2003
358
0
18,780
Well, let me fall back on my ignorant ways. So what exactly is the formula used to decipher texel fill rates, then? It apparently isn't either of the things I mentioned.
 

Nights_L

Distinguished
Jan 25, 2003
1,452
0
19,280
the Radeon 8500 having 2 texture units per pipe that can process 3 texels (textures) each, and the 9000 having 1 texture unit per pipe that can process 6 texels (textures) each.
This is what I don't understand..
(Cleeve, I'm not trying to argue, just trying to understand.)
Then theoretically, the Radeon 9000 should be as fast as an 8500, right? But then how come the 8500 is faster than the 9000, as we all know and as Tom's said,
since one is 4x2x3 and one is 4x1x6, as you said?
 

phial

Splendid
Oct 29, 2002
6,757
0
25,780
he has a good point ;P

-------


<A HREF="http://www.albinoblacksheep.com/flash/you.html" target="_new">please dont click here! </A>
dhlucke - "Phew...ok my wrists are hurting. I'm taking a break."
 

cleeve

Illustrious
Why is the Radeon 8500's 4x2x3 architecture
(24 texels per clock)
better than the Radeon 9000's 4x6 architecture
(also 24 texels per clock)?

Until we started this discussion, I was under the impression that more texture units allowed more work to be done in a single pass, by making more textures available to the pipelines without another texture read (saving time and bandwidth).
But now I'm not so sure the number of texture units is directly related to the number of textures a pipeline can read in a single pass... for starters, have a look at what John Carmack said about the 8500:

The fragment level processing is clearly way better on the 8500 than on the Nvidia products, including the latest GF4. You have six individual textures, but you can access the textures twice, giving up to eleven possible texture accesses in a single pass, and the dependent texture operation is much more sensible. This wound up being a perfect fit for Doom, because the standard path could be implemented with six unique textures, but required one texture (a normalization cube map) to be accessed twice. The vast majority of Doom light / surface interaction rendering will be a single pass on the 8500, in contrast to two or three passes, depending on the number of color components in a light, for GF3/GF4 (*note GF4 bitching later on).
Since he notes two texture reads in one pass here, I have always thought that each texture unit in the 8500 allowed all available textures to be accessed in a single pass.

But then, when I was doing more looking around, I saw this on Tom's preview of the 8500:

'Pixel Tapestry II', the pixel-rendering unit, consists of four pipelines and it looks (ATi's information is rather inconclusive) as if 3 texels can be applied per pixel per clock, while twice the amount of texels (6) can be applied per pass (which can take quite a lot longer than one clock cycle). With Radeon 8500's core clock of 250 MHz this would lead to a pixel fill rate of 1 GPixels/s and a texel fill rate of 3 GTexels/s. For some strange reason, two different ATi white papers are claiming two different texel fill rates of 2 GTexels/s or 2.4 GTexels/s respectively. Our benchmarks lead to believe that the actual texel fill rate should be around 2 GTexels/s, making it look as if Radeon 8500 is only able to apply two texels per pixel per clock cycle.
So here we have Tom's saying that the Radeon can only handle 6 textures per pass.
Honestly though, I'm more inclined to believe Carmack when it comes to stuff like this, but it casts doubt on the whole thing.
Now I'm not positive what relation the TMUs have to anything. I've also seen some material suggesting the TMUs have something to do with mipmapping and filtering.

Unfortunately this type of information seems to be hard to come by, so if anyone knows for sure please let us know...

Until then, it's more research. As always.

________________
<b>Radeon <font color=red>9500 PRO</b></font color=red> <i>(hardmodded 9500, o/c 322/322)</i>
<b>AthlonXP <font color=red>2600+</b></font color=red> <i>(o/c 2400+ w/143Mhz fsb)</i>
<b>3dMark03: <font color=red>4,055</b></font color=red>
 

Vimp

Distinguished
Jul 13, 2003
358
0
18,780
Again, I'm not pretending to fully understand what I read (that includes what I read in this thread), but I had already concluded that per pass and per clock were 2 different things, since every Nvidia manufacturer advertises the FX5900 cards as rendering 16 textures per pass even though all the other sources I read said 8 textures per clock. So naturally I concluded that it can do 2 clocks in a single pass. PNY's site kind of confirmed this for me when it says "16 textures per pixel (max in a single rendering pass) with 8 textures applied per clock" in its list of performance features.

I would imagine the Geforce2 GTS, however, was only capable of 1 clock per pass. But this 1 and 2 we are talking about is obviously not the same thing as the 2 in the 4x2 pipeline design. This would mean that even if both the Geforce2 GTS and the FX5900 use a 4x2 pipeline, like The Tech Report says they do, the FX5900 can still render twice as many textures per pass as a Geforce2 GTS, even though both do 8 textures per clock.
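
In script form, that reasoning looks like this (numbers as advertised; the interpretation is just mine):

```python
# Clocks per (maximal) rendering pass = textures per pass / textures per clock.

def clocks_per_pass(textures_per_pass, textures_per_clock):
    return textures_per_pass // textures_per_clock

print(clocks_per_pass(16, 8))  # FX5900: 2 clocks per pass
print(clocks_per_pass(8, 8))   # Geforce2 GTS (if 8 per pass): 1 clock per pass
```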
 

cleeve

Illustrious
The reason for your confusion, Vimp, is that if you're going to quote numbers for specs, you have to compare apples to apples.

The Geforce2 is a 4x1 architecture because it only has one TMU per pipe... regardless of how many textures per TMU it can handle.

The GeforceFX architecture is a 4x2 architecture because it has two TMUs per pipe, once again, regardless of how many textures it can handle per TMU.

When you're quoting the Geforce2 as "4x2", you're quoting "Pipes x Textures".

When you're quoting the GeforceFX as "4x2", you're quoting something different, which is "Pipes x Texture Units (TMUs)".

Texture units are NOT the same as textures per pipe. That is why you are having trouble with this.

________________
<b>Radeon <font color=red>9500 PRO</b></font color=red> <i>(hardmodded 9500, o/c 322/322)</i>
<b>AthlonXP <font color=red>2600+</b></font color=red> <i>(o/c 2400+ w/143Mhz fsb)</i>
<b>3dMark03: <font color=red>4,055</b></font color=red>
 

Vimp

Distinguished
Jul 13, 2003
358
0
18,780
I may have been confusing texture units with textures, but not necessarily. So far I have 1 source that shows the Geforce2 GTS has 4 pipelines with 2 texture units per pipeline, and another source showing my card to have 4 pipelines and 2 TMUs. (Anyone know what TMU stands for?) I'm guessing that TMU is the same as texture units per pipeline.

So far, The Tech Report says the Geforce2 GTS has 2 texture units per pipe, just like it says the FX5900 series does. Having looked back at the article you showed me on Tom's site, I came to realize that Tom's site may not be contradicting anything I've read elsewhere. However, Tom's site doesn't explain things in a comprehensive and obvious manner, since it assumes you already know a whole bunch of things, whereas The Tech Report does not; so it could be that Tom's site actually says the same thing as The Tech Report and I just don't know it. Either way, The Tech Report very obviously states that the Geforce2 GTS has 4 pixel pipelines with 2 texture units per pipeline and 8 textures per clock, just like the FX5900 does. The stats in its chart do not include textures per pass, though. Other sources show that the FX5900 does 16 textures per pass. However, sources for the Geforce2 GTS seem to suggest that it can only do 8 textures per pass, suggesting that the FX5900 can do 2 clocks per pass where the G2 GTS can only do 1 clock per pass.

One assumption I did in fact make is the idea that the pipelines multiplied by the texture units per pipeline equals the textures per clock. The reason I made this assumption is that The Tech Report's charts show the textures per clock for dozens of cards, and for each and every one of them the listed textures-per-clock number happens to be the same number you get when you multiply the card's pipelines by its texture units per pipe, shown in the same chart. This, along with Tom's article, implies that texture units are actually the same as textures. In fact, Tom's article even implies to me that texels, too, are the same as textures. However, I don't fully understand what's being said in Tom's article, since it's written in such a way that it assumes the reader knows a lot of very complicated things that aren't even common knowledge among computer experts, so I may not be interpreting it correctly.
My other source, which shows my G2 GTS card to have 2 TMUs per pipeline, is a program called AIDA32, which is used to show your computer's system information. This program shows my Geforce2 GTS to have the following stats:

GPU Code Name: NV15
Transistors: 25 million
Process Technology: 0.18u
Memory Size: 64 MB
GPU Clock: 200 MHz
RAMDAC Clock: 350 MHz
<b>Pixel Pipelines: 4
TMU Per Pipeline: 2</b>
Vertex Shaders: Not Supported
Pixel Shaders: 1 (v0.5)
DirectX Hardware Support: DirectX v7.0
Pixel Fillrate: 800 MPixel/s
Texel Fillrate: 1600 MTexel/s

Bus Type: DDR
Bus Width: 128-bit
Real Clock: 166 MHz (DDR)
Effective Clock: 333 MHz
Bandwidth: 5328 MB/s

The above info completely coincides with The Tech Report's info, if TMU is the same as texture units. This also means that, since I see no formula for determining texel fill rate other than multiplying the core speed by the textures per clock, I'm inclined to agree with The Tech Report's stats, which show both my Geforce2 GTS and the FX5900 to have the same 8 textures per clock. The reason I assume it's this formula is that for every card listed in The Tech Report's chart, the texel fill rate numbers just happen to be the numbers you get when multiplying each card's core speed by its textures per clock.

If I'm missing something in all this I ask that you please point it out to me so that I can understand all this better.
 

cleeve

Illustrious
The reason I assume it's this formula is that for every card listed in The Tech Report's chart, the texel fill rate numbers just happen to be the numbers you get when multiplying each card's core speed by its textures per clock.

Yes, absolutely. But textures per clock does not equal TMUs per clock.

TMU stands for "Texture Mapping Unit"

If "TMU"s were the same as textures, then this chart must be grossly in error. But I don't believe it is:

<A HREF="http://www6.tomshardware.com/graphic/200207181/index.html" target="_new">http://www6.tomshardware.com/graphic/200207181/index.html</A>

I have looked at more than ten Radeon 8500/Geforce2 reviews, and it seems to me that the reviewers know about as much as we do: that is to say, they're flying by the seat of their pants just like we are. They pretty much disagree with each other on the point.

What I do know for sure is this:

1. The Radeon 8500 has 4 pipelines, 2 TMUs per pipe, and can process 3 textures per TMU.

2. The Geforce2 has 4 pipelines and can process 2 textures per pipeline.

So if the Geforce2 has 2 TMUs per pipeline... then <b>perhaps it can only process 1 texture per TMU.</b>

Which would really clear a lot of things up.

What it wouldn't clear up is why having more TMUs with fewer textures (i.e. the Radeon 8500) is preferable to having more textures with fewer TMUs (the Radeon 9000), when both yield the same number of texels per clock.

I'm still looking into it, but I hypothesize that each TMU allows for a pass per clock cycle.

So the Radeon 8500's 2 TMUs with 3 textures each would yield 12 textures per pass (2 TMUs = 2 passes of 6 textures, which would jibe with what Carmack said, but would go against what a lot of review sites have stated: that the 8500 can process 6 textures per pass).

That would mean that even though the Radeon 9000 can process the same number of textures per clock cycle (6 per pipe), because all 6 of those textures come from 1 TMU, it is limited to 6 textures per pass...

Which also would explain a lot.
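
As a toy script, that guess would look like this (pure speculation on my part, to be clear):

```python
# Hypothesis only: per-pipe textures per clock = TMUs x textures per TMU,
# and each TMU buys one extra clock's worth of textures per pass.

def textures_per_clock(tmus, textures_per_tmu):
    return tmus * textures_per_tmu

def textures_per_pass(tmus, textures_per_tmu):
    # one pass per TMU, each pass worth one clock's textures
    return tmus * textures_per_clock(tmus, textures_per_tmu)

print(textures_per_clock(2, 3), textures_per_pass(2, 3))  # Radeon 8500: 6, 12
print(textures_per_clock(1, 6), textures_per_pass(1, 6))  # Radeon 9000: 6, 6
```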

But I'm still looking for a good explanation, and whatever the answer is, it seems to me that a lot of hardware review sites have it wrong, because they aren't consistent in their explanations...



________________
<b>Radeon <font color=red>9500 PRO</b></font color=red> <i>(hardmodded 9500, o/c 322/322)</i>
<b>AthlonXP <font color=red>2600+</b></font color=red> <i>(o/c 2400+ w/143Mhz fsb)</i>
<b>3dMark03: <font color=red>4,055</b></font color=red>