Two GPUs fused into one chip?

April 8, 2013 8:55:14 PM

Video cards like the 6990 and GTX 690 have two separate GPUs. Why don't Nvidia and AMD put the two GPUs onto one larger chip?

April 8, 2013 8:58:27 PM

I want to say that the heat output would be too much to handle in one spot.
April 8, 2013 9:00:41 PM

affroman112 said:
I want to say that the heat output would be too much to handle in one spot.

^ I think that would be it right there. I can't imagine the cooler you would need to keep a chip like that from going up in flames.
April 8, 2013 9:06:07 PM

That would require designing another die, which would make them even more expensive. By reusing the GPUs from other cards, they keep prices down without affecting performance. Many of the other GPUs use the same die with sections disabled.
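
Roughly how that binning works, as a toy sketch (the cluster counts and SKU tiers here are made up for illustration, not real product specs):

```python
# Toy binning sketch: dies from the same wafer land in different SKUs
# depending on how many shader clusters passed testing. Cluster counts
# and tier names are invented, not real product specs.
def bin_die(working_clusters: int) -> str:
    if working_clusters >= 8:
        return "flagship SKU (all 8 clusters enabled)"
    if working_clusters == 7:
        return "cut-down SKU (1 cluster disabled)"
    if working_clusters == 6:
        return "budget SKU (2 clusters disabled)"
    return "scrap"

for clusters in (8, 7, 6, 4):
    print(f"{clusters} working clusters -> {bin_die(clusters)}")
```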
April 8, 2013 9:10:55 PM

Two reasons I can think of:
1) Yields: The manufacturing process is not 100% accurate, and a percentage of chips will always have faults, meaning they either have to be thrown away or have the faulty areas disabled so they can be sold on as lower-spec chips. The larger you go, the more silicon you 'waste': if you have 2 smaller chips and one has a fault, you still have 1 working chip; if 1 double-sized chip has a fault, the whole thing is faulty. (See the yield sketch after this list.)
2) Design costs: The higher/more expensive you go, the fewer units you sell. There would be significant costs associated with designing a massive chip which would only be purchased by a tiny percentage of consumers. While there are some design costs associated with creating dual-GPU boards (690, 7990, etc.), they're basically just two already-designed chips on a single card. I expect they're far simpler from a design (and thus R&D cost) perspective than designing an equivalent monster single GPU for this small market.
Possibly other reasons too, but these are the ones I can think of.
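
To put rough numbers on the yield point (1) above, here's a back-of-the-envelope sketch using a simple exponential defect model; the defect density is an invented figure, and the die area is only roughly GTX 680 class:

```python
import math

# Simple Poisson yield model: P(die has no fatal defect) = exp(-D * A),
# where D is defect density and A is die area. D is an invented
# illustrative number, not a real fab figure.
defects_per_cm2 = 0.4
small_die_cm2 = 2.94              # roughly a GTX 680-class die (~294 mm^2)
big_die_cm2 = 2 * small_die_cm2   # hypothetical double-sized GPU

small_yield = math.exp(-defects_per_cm2 * small_die_cm2)
big_yield = math.exp(-defects_per_cm2 * big_die_cm2)

# Same total silicon, two outcomes: with two small dies, one defect only
# kills half the area; with one big die, it kills the whole chip.
pair_at_least_one_good = 1 - (1 - small_yield) ** 2

print(f"small die yield:              {small_yield:.1%}")
print(f"double-size die yield:        {big_yield:.1%}")
print(f"pair with at least one good:  {pair_at_least_one_good:.1%}")
```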
April 8, 2013 10:20:50 PM

They did, or at least almost did, with GK110 (Titan). It's got close to double a 680's hardware, just not quite double the performance, unfortunately.

Edit: By close I meant around a 50% increase, which may or may not improve with drivers (crosses fingers :) ).
April 8, 2013 11:22:00 PM

One interesting option in the future is to increase performance by SPLITTING the chip into two or more dies.

They currently try to get the maximum performance by having multiple, IDENTICAL GPUs. In theory, they could design something with 2x the area of a GTX 680 die, split it into two pieces for manufacturing, then place the pieces back together on the chip.
April 9, 2013 12:16:36 AM

photonboy said:
One interesting option in the future is to increase performance by SPLITTING the chip into two or more dies.

They currently try to get the maximum performance by having multiple, IDENTICAL GPUs. In theory, they could design something with 2x the area of a GTX 680 die, split it into two pieces for manufacturing, then place the pieces back together on the chip.


Trouble with that is driver support - the scaling for tri- or quad-SLI is absolutely HORRIBLE, as the drivers are a nightmare to write.
April 9, 2013 4:20:54 AM

photonboy said:
One interesting option in the future is to increase performance by SPLITTING the chip into two or more dies.

They currently try to get the maximum performance by having multiple, IDENTICAL GPUs. In theory, they could design something with 2x the area of a GTX 680 die, split it into two pieces for manufacturing, then place the pieces back together on the chip.


Is this something you've read about or seen discussed? Or is it just your theory? Not wishing to be rude, but that sounds highly implausible to me. The "manufacturing" process creates billions of transistors out of a silicon die. I can't imagine how two halves manufactured separately could be 'placed back together'. You'd need billions of electrical connections between the two 'pieces'. Please point me to your source if this has been discussed somewhere, because it sounds virtually impossible to me and certainly not economically viable even if theoretically possible.

The main issue with getting multiple smaller chips to work together effectively (like in SLI/CF) is that the interface between them is nowhere near fast or low-latency enough to allow effective teaming. Even PCIe 3.0 x16 is slow and high-latency compared to the sort of interface you'd need to effectively distribute a GPU workload in real time. Thus the drivers have to 'guess' how to evenly distribute the workload (by giving, for example, alternate frames to each GPU). This is a main source of micro-stutter, because the drivers have no way of guessing correctly, so one frame is processed quickly and the next takes much longer than expected, introducing uneven frame rates: stutter.
Just theorising now, but as I understand it, effective teaming would require a super-high-speed interface (something like an additional memory controller, but used to connect the chips) which would take significant die space and add heat on each chip. I suspect - educated guessing here - that the cost of going that route would be greater than just fabbing a larger chip and accepting the lower yields, which, for the reasons already discussed, just isn't economically viable.
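
To make that micro-stutter point concrete, here's a toy alternate-frame-rendering simulation; the frame times are invented and real drivers are far more sophisticated than this:

```python
import random

# Toy AFR model: two GPUs take turns rendering whole frames. Per-frame cost
# varies, and frames must be displayed in submission order. All numbers are
# invented for illustration.
random.seed(1)
frame_cost_ms = [16 + random.uniform(-6, 6) for _ in range(12)]

gpu_free_at = [0.0, 0.0]          # when each GPU can start its next frame
display_times = []
for i, cost in enumerate(frame_cost_ms):
    gpu = i % 2                   # alternate frames between the two GPUs
    finish = gpu_free_at[gpu] + cost
    gpu_free_at[gpu] = finish
    # A frame can't be shown before the one submitted ahead of it.
    display_times.append(max(finish, display_times[-1] if display_times else 0.0))

intervals = [display_times[0]] + [
    later - earlier for earlier, later in zip(display_times, display_times[1:])
]
print("frame-to-frame intervals (ms):",
      ", ".join(f"{t:.1f}" for t in intervals))
# The average frame rate roughly doubles versus one GPU, but the spacing is
# uneven: some frames arrive nearly back-to-back while others lag (stutter).
```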

Titan is an interesting example. It's actually just a consumer version of the Tesla K20 which powers the Titan supercomputer. I imagine this massive order would have covered all the R&D costs and then some. Once they had sufficient supply to meet their K20 demand with chips left over, they released the consumer version. It's also clocked lower, which supports what others suggested above: bigger chips have bigger heat issues (necessitating lower clocks).

Interesting discussion here (for me anyway!) Thanks for raising it.
April 9, 2013 1:35:40 PM

rhysiam said:
photonboy said:
One interesting option in the future is to increase performance by SPLITTING the chip into two or more dies.

They currently try to get the maximum performance by having multiple, IDENTICAL GPUs. In theory, they could design something with 2x the area of a GTX 680 die, split it into two pieces for manufacturing, then place the pieces back together on the chip.


Is this something you've read about or seen discussed? Or is it just your theory? Not wishing to be rude, but that sounds highly implausible to me. The "manufacturing" process creates billions of transistors out of a silicon die. I can't imagine how two halves manufactured separately could be 'placed back together'. You'd need billions of electrical connections between the two 'pieces'. Please point me to your source if this has been discussed somewhere, because it sounds virtually impossible to me and certainly not economically viable even if theoretically possible.

The main issue with getting multiple smaller chips to work together effectively (like in SLI/CF) is that the interface between them is nowhere near fast or low-latency enough to allow effective teaming. Even PCIe 3.0 x16 is slow and high-latency compared to the sort of interface you'd need to effectively distribute a GPU workload in real time. Thus the drivers have to 'guess' how to evenly distribute the workload (by giving, for example, alternate frames to each GPU). This is a main source of micro-stutter, because the drivers have no way of guessing correctly, so one frame is processed quickly and the next takes much longer than expected, introducing uneven frame rates: stutter.
Just theorising now, but as I understand it, effective teaming would require a super-high-speed interface (something like an additional memory controller, but used to connect the chips) which would take significant die space and add heat on each chip. I suspect - educated guessing here - that the cost of going that route would be greater than just fabbing a larger chip and accepting the lower yields, which, for the reasons already discussed, just isn't economically viable.

Titan is an interesting example. It's actually just a consumer version of the Tesla K20 which powers the Titan supercomputer. I imagine this massive order would have covered all the R&D costs and then some. Once they had sufficient supply to meet their K20 demand with chips left over, they released the consumer version. It's also clocked lower, which supports what others suggested above: bigger chips have bigger heat issues (necessitating lower clocks).

Interesting discussion here (for me anyway!) Thanks for raising it.


The concept is fairly simple:
The architecture would be similar to a multi-core CPU. AMD already makes a FUSION chip that has the CPU and GPU on a single die, as well as a version that uses two separate dies.

The GPUs wouldn't require billions of connections; they would mainly have the input stage on the GPU bus.

The entire design isn't totally different from SLI, except that instead of alternating frame rendering between multiple GPUs, all GPUs are seen as one GPU (so the GPU is the same as a CPU, and each GPU die is the same as a CPU CORE).

I don't have any links, but as a computer technician I see no issues here, though it would obviously be a high-end card.

I suppose there could be some latency involved, as you've mentioned, but that can be addressed, and SLI has major issues with latency/stutter anyway. The big bonus is that no SLI drivers would be required, so they could simply drop SLI completely, which would save a lot of money. I see no downside.
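
Purely as a sketch of that "many dies, one logical GPU" idea (and not how any real GPU or driver is built): a shared input stage hands small pieces of work to whichever die is free, the way an OS spreads threads across CPU cores. All the names, tile counts and timings here are invented:

```python
import time
from queue import Queue
from threading import Thread

# Hypothetical sketch: one shared work queue feeds tiles of a frame to
# whichever die is idle, instead of handing whole alternate frames to each
# GPU as SLI does. Everything here is invented for illustration.
def die_worker(die_id, work, results):
    while True:
        tile = work.get()
        if tile is None:          # sentinel: no more work for this die
            return
        time.sleep(0.001)         # pretend to shade the tile
        results.append((die_id, tile))

work, results = Queue(), []
dies = [Thread(target=die_worker, args=(d, work, results)) for d in range(2)]
for t in dies:
    t.start()

for tile in range(16):            # one frame split into 16 tiles
    work.put(tile)
for _ in dies:
    work.put(None)
for t in dies:
    t.join()

counts = [sum(1 for d, _ in results if d == die) for die in range(2)]
print(f"tiles done: {len(results)} (die 0: {counts[0]}, die 1: {counts[1]})")
```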
April 9, 2013 1:39:41 PM

photonboy said:
The concept is fairly simple:
The architecture would be similar to a multi-core CPU. AMD already makes a FUSION chip that has the CPU and GPU on a single die, as well as a version that uses two separate dies.

The GPUs wouldn't require billions of connections; they would mainly have the input stage on the GPU bus.

The entire design isn't totally different from SLI, except that instead of alternating frame rendering between multiple GPUs, all GPUs are seen as one GPU.

I don't have any links, but as a computer technician I see no issues here, though it would obviously be a high-end card.


"The entire design isn't totally different from SLI, except that instead of alternating frame rendering between multiple GPU's, all GPU's are seen as one GPU."

That's all well and good, but how in the world do you propose the GPU is going to function.. at all?
Is the data just magically going to be worked on by whatever chip doesn't happen to be busy at the moment? How's it going to figure that out and direct the workload efficiently?

Just in terms of an armchair electronics discussion, it's a wonderful idea, but it's hideously implausible and wouldn't be anywhere close to effective - at least not with technology as we know it now.
April 9, 2013 3:01:19 PM

DarkSable said:
photonboy said:
The concept is fairly simple:
The architecture would be similar to a multi-core CPU. AMD already makes a FUSION chip that has the CPU and GPU on a single die, as well as a version that uses two separate dies.

The GPUs wouldn't require billions of connections; they would mainly have the input stage on the GPU bus.

The entire design isn't totally different from SLI, except that instead of alternating frame rendering between multiple GPUs, all GPUs are seen as one GPU.

I don't have any links, but as a computer technician I see no issues here, though it would obviously be a high-end card.


"The entire design isn't totally different from SLI, except that instead of alternating frame rendering between multiple GPU's, all GPU's are seen as one GPU."

That's all well and good, but how in the world do you propose the GPU is going to function.. at all?
Is the data just magically going to be worked on by whatever chip doesn't happen to be busy at the moment? How's it going to figure that out and direct the workload efficiently?

Just in terms of an armchair electronics discussion, it's a wonderful idea, but it's hideously implausible and wouldn't be anywhere close to effective - at least not with technology as we know it now.


I explained this poorly.

Every GPU has an input stage that dispatches incoming jobs in parallel. Many of the units are in parallel but not physically connected. I'm simply talking about splitting the GPU into THREE physical pieces and joining them together via the bus on the card:
1) the Input Stage
2) LEFT SIDE
3) RIGHT SIDE

It's actually not that complicated, but I'm not going to hijack this thread further.
April 9, 2013 3:12:00 PM

photonboy said:
I explained this poorly.

Every GPU has an input stage that dispatches incoming jobs in parallel. Many of the units are in parallel but not physically connected. I'm simply talking about splitting the GPU into THREE physical pieces and joining them together via the bus on the card:
1) the Input Stage
2) LEFT SIDE
3) RIGHT SIDE

It's actually not that complicated, but I'm not going to hijack this thread further.


Gotcha, I see what you're saying now. Yes, that's certainly do-able, but the question is whether the increased latencies would be worth the extra heat dispersion.
April 9, 2013 5:39:48 PM

DarkSable said:
photonboy said:
I explained this poorly.

Every GPU has an input stage that dispatches incoming jobs in parallel. Many of the units are in parallel but not physically connected. I'm simply talking about splitting the GPU into THREE physical pieces and joining them together via the bus on the card:
1) the Input Stage
2) LEFT SIDE
3) RIGHT SIDE

It's actually not that complicated, but I'm not going to hijack this thread further.


Gotcha, I see what you're saying now. Yes, that's certainly do-able, but the question is whether the increased latencies would be worth the extra heat dispersion.


There are latency issues with SLI, and they do that anyway.

Let's be clear though, I'm not sure there would be any latency added. I'm not talking about sticking a multi-tasking unit in front of multiple GPUs like in a CPU; I'm simply talking about modifying a SINGLE GPU design so it can be split into pieces. These two, three, or four dies would simply be connected together.

Also, it's not mainly due to HEAT; it's also the die size (the Titan die is about as large as we can manufacture).

So imagine taking a GTX 680 design with more processing units etc. so the die has 3x the area. Now figure out how to separate it into three or so pieces (many functions are in parallel, so it's not as hard as you might think). You can basically draw a big box around each major portion, make it an individual die, then carefully join all the dies together with precision soldering.

So FUNCTIONALLY these connected dies would be identical to one GPU with the following advantages:
1) Cheaper to manufacture
2) Eliminates the SLI driver team costs
3) Superior performance to SLI
4) No VRAM cloning (the GTX 690 is 2GB usable despite 2x2GB physically; see the sketch after this list)
5) Superior cooling? (actually, probably still an issue as the dies would be side-by-side)
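
A trivial bit of arithmetic on the VRAM-cloning point (4 above); the per-GPU capacity is the GTX 690's, while the unified figure is purely hypothetical:

```python
# VRAM cloning: in SLI / on a dual-GPU card, each GPU keeps its own full copy
# of the working set, so installed memory doesn't add up for the application.
per_gpu_vram_gb = 2        # GTX 690: 2 GB per GPU, 4 GB on the board
num_gpus = 2

mirrored_usable_gb = per_gpu_vram_gb             # each GPU holds a copy
unified_usable_gb = per_gpu_vram_gb * num_gpus   # hypothetical shared pool

print(f"dual-GPU (mirrored): {mirrored_usable_gb} GB usable "
      f"of {per_gpu_vram_gb * num_gpus} GB installed")
print(f"multi-die, one memory space (hypothetical): {unified_usable_gb} GB usable")
```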
April 9, 2013 5:44:44 PM

Hmm... that's a very interesting prospect. It might almost make sense to do what I thought you meant, separating the parts to help disperse heat, but that would introduce latency.

The only reason I can see that this would be prohibitively expensive is the amount of precision required in manufacturing.
April 9, 2013 6:08:57 PM

DarkSable said:
Hmm... that's a very interesting prospect. It might almost make sense to do what I thought you meant, separating the parts to help disperse heat, but that would introduce latency.

The only reason I can see that this would be prohibitively expensive is the amount of precision required in manufacturing.


Well, AMD has a single-die CPU/GPU chip as well as functionally the same thing on two separate dies, and they appear to perform identically, so I'm not sure this is an issue.

The PS4 is expected to have the CPU and GPU elements on separate dies at first, but to move them onto the same die whenever cost makes that sensible.