GTX 760 / R9 270 performance as integrated graphics of an APU

deathcall666

Honorable
Nov 23, 2012
264
0
10,860
Why isn't AMD making an APU with an R9 270 as integrated graphics? It would mean 150W TDP from an underclocked 270 and 100W TDP from a quad-core processor like an FX-4300 (maybe better and more energy efficient) => 250W TDP. With a stock Cooler Master V8 GTS or a liquid solution it would be a killer combo for 1080p gaming. So why aren't they doing such a thing rather than spending resources on potato CPUs like the 9590 (which is rebranded shit)? I'd buy such an APU in the 250-350, maybe even 400€ range. Answers?
 
APUs are designed to meet low graphics demands, e.g. laptops. Discrete GPUs are for higher graphics demand applications, e.g. gaming.

If you tried to combine the graphics power of a discrete card with a CPU, you'd end up with a GPU-size CPU which wouldn't fit a laptop.
 

Iron124

Reputable
Jun 1, 2014
607
0
5,360
I understand what you're saying and agree completely. The newest APUs should've had at least one enthusiast-grade on-die equivalent of an R9 chip; that way we might actually consider them as an alternative to a dedicated GPU outside of el-cheapo home PC boxes for light multimedia. Imagine if you could CrossFire it with an R9 270, similar to how you can CrossFire an R7 with the newest APUs right now. It would be fantastic.

Sadly, AMD seems to be falling behind, but I believe they will get there eventually. Their CPUs are undergoing a total architecture rebuild, and I don't think we'll see them come back strong in anything but the graphics card department until 2015 or 2016.
 

Andrew Buck

Honorable
This is because dedicated GPUs are much bigger and have their own dedicated cooling. You could not fit a 270 in a tiny processor without uncharted levels of heat output and mandatory watercooling. That would defeat the purpose of the integrated GPU. It could maybe be done, but it isn't worth it. To keep heat manageable, the APUs have 8 GPU cores (compute units). An R9 270 has 20. That is a big difference.
 

InvalidError

Titan
Moderator
The R9-270 needs a 256-bit memory interface operating at over 5GT/s to push its performance numbers. An APU only has a 128-bit memory interface operating at ~2GT/s. That's only 1/5th as much memory bandwidth, nowhere near enough to deliver anything close to R9-270 performance.

Even trying to cram R7-260 level performance in an APU would be difficult since the APU only has about 40% as much available memory bandwidth as the discrete GPU would.
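
To make the 1/5th figure concrete, here is a rough sketch of the arithmetic. It assumes peak bandwidth is simply bus width times transfer rate, and it uses the rounded 5GT/s and 2GT/s figures from the post above rather than exact product specs. The function name is made up for illustration.

```python
def bandwidth_gb_per_s(bus_width_bits: int, transfer_rate_gt_per_s: float) -> float:
    """Peak bandwidth in GB/s: bytes moved per transfer times transfers per second."""
    return bus_width_bits / 8 * transfer_rate_gt_per_s


# Rounded figures quoted in the post (assumptions, not exact specs).
r9_270 = bandwidth_gb_per_s(256, 5.0)  # discrete card: 256-bit at ~5 GT/s
apu = bandwidth_gb_per_s(128, 2.0)     # APU: 128-bit at ~2 GT/s

print(f"R9-270: {r9_270:.0f} GB/s")        # R9-270: 160 GB/s
print(f"APU:    {apu:.0f} GB/s")           # APU:    32 GB/s
print(f"APU / R9-270: {apu / r9_270:.2f}")  # APU / R9-270: 0.20
```

Half the bus width at 2/5ths of the transfer rate is where the 1/5th comes from, and on an APU the CPU cores have to share that bandwidth too.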
 
Solution


the problem is the RAM

I think they're basically at the limit of what they can do with GPU power and DDR3 RAM. Making a bigger on-chip graphics section won't help if the RAM bandwidth is too small and slow.

 
Roughly half the die (125 mm²) is 512 stream processors. If you more than double the stream processors (the R9 270 has 1280), the complete die size would zoom past 400 mm² (not counting the necessary additional space for a 256-bit memory interface).

I would not be surprised with the next die shrink bringing 4 more CUs (or another 256 stream processors) to the die -- especially if HSA gets legs.
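
A back-of-the-envelope version of that die-size estimate, as a sketch only. It assumes GPU area scales linearly with stream processor count and that the non-GPU half of the die stays at about 125 mm²; both are simplifications, and the function name is made up for illustration.

```python
SP_PER_CU = 64  # stream processors per compute unit, as quoted in this thread


def scaled_die_area_mm2(gpu_area_mm2: float, current_sp: int,
                        target_sp: int, rest_of_die_mm2: float) -> float:
    """Estimate total die area if the GPU block grows linearly with SP count."""
    scaled_gpu = gpu_area_mm2 * target_sp / current_sp
    return scaled_gpu + rest_of_die_mm2


# 512 SPs in ~125 mm² today; the other ~125 mm² (CPU, cache, I/O) is assumed unchanged.
total = scaled_die_area_mm2(125.0, 512, 1280, 125.0)
print(f"Estimated die with 1280 SPs: {total} mm²")  # 437.5 mm²

# Four extra compute units after a die shrink:
print(f"4 CUs = {4 * SP_PER_CU} stream processors")  # 256
```

That lands at roughly 437 mm² before adding anything for a wider memory interface, which is in line with the "past 400 mm²" figure.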

 

InvalidError

Titan
Moderator

The memory controller itself uses almost no die area. The really big problem is having a cost-effective 256-bit-wide memory bus between the CPU and DIMM slots: this requires a socket with 500+ extra pins, a CPU substrate with 2-3 extra layers, a motherboard with 2-4 extra layers, extra die area just to fit the extra uBGA lands between the die and CPU substrate, etc.

There is not much point in increasing the amount of die area dedicated to the GPU when the socket interface lacks the necessary memory bandwidth to actually support it.

And there is also the "handicap" of the CPU needing access to RAM too.
 

deathcall666

Honorable
So, summarizing everything you've all said, the RAM is the problem. Cooling isn't that much of a problem, since I've mentioned a solution: either air cooling like the V8 GTS or liquid cooling. Not to mention that for such a combo OEMs would go with super-mini high-power PCs. So my new question is: can the new DDR4 change the situation, or maybe the Hybrid Memory Cube? Or simply do it like the PS4 and put in 8GB of GDDR5. It would have AMD building a PC almost from scratch.
 

InvalidError

Titan
Moderator

DDR4 on a 128-bit memory bus still has only about half as much bandwidth as GDDR5 on 128-bit GPUs.

Using GDDR5 as system memory is expensive and GDDR5 signaling is not meant to go through DIMM slot connections so the memory would most likely need to be soldered on the motherboard. So, now you would have a PC with non-upgradable memory that only has enough memory bandwidth to perform about on par with an R7-260.

If you want more performance than that, you need a wider memory interface. All those extra pins and signal traces across connectors if you want to make the system upgradable add significant cost and complexity.

The reason why GDDR clocks can be pushed so much harder than DDR is precisely because there are no sockets between the memory and GPU/SoC to mess up signal integrity at high speeds. Maintaining signal integrity is becoming so tricky at over 2GT/s that many DDR3 kits fail to work in a two-DIMM-per-channel configuration, and the ability got dropped from the DDR4 specification altogether.
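
As a rough illustration of the DDR4 vs GDDR5 gap on the same 128-bit bus, here is a small sketch. The transfer rates are assumptions picked for illustration (2.133 and 3.2 GT/s for DDR4, 6.0 GT/s for GDDR5), not figures from this thread, and the function name is hypothetical.

```python
def bandwidth_gb_per_s(bus_width_bits: int, transfer_rate_gt_per_s: float) -> float:
    """Peak bandwidth in GB/s: bytes moved per transfer times transfers per second."""
    return bus_width_bits / 8 * transfer_rate_gt_per_s


BUS_WIDTH_BITS = 128
GDDR5_RATE = 6.0             # GT/s, assumed typical for a 128-bit discrete card
DDR4_RATES = (2.133, 3.2)    # GT/s, assumed entry-level and faster DDR4

gddr5 = bandwidth_gb_per_s(BUS_WIDTH_BITS, GDDR5_RATE)
for rate in DDR4_RATES:
    ddr4 = bandwidth_gb_per_s(BUS_WIDTH_BITS, rate)
    print(f"DDR4 at {rate} GT/s: {ddr4:.1f} GB/s, "
          f"{ddr4 / gddr5:.0%} of GDDR5 at {GDDR5_RATE} GT/s")
```

Under these assumed rates DDR4 delivers somewhere between a third and a bit over half of the GDDR5 figure, so "about half" is the optimistic end.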
 

deathcall666

Honorable
Well, the R7 265 (the highest with a 128-bit interface) would be a start. Not to mention that 8GB of GDDR is enough for anyone at 1080p. As for GDDR vs DDR, couldn't they isolate the circuitry so there won't be any interference to mess up the speeds? Maybe a new 20nm fab process or some sort of multilayer processor craziness like the HMC memory cube?
 

titanHUNTER

Reputable
BANNED
Jun 24, 2014
207
0
4,710
From my understanding, AMD is trying to create an entire system on one chip. Not just CPU and GPU integration, but RAM also. With the CPU, GPU and RAM working together in a seamless integrated process, the entire architecture (or power requirements) of GPUs may change!! Their vision is many years away. However, do not be surprised if and when AMD hits this breakthrough (HSA is the seed of this "breakthrough"). We need innovation like this (and competition) to take computers to the next level.
 

InvalidError

Titan
Moderator

The R7-265 is actually a 256-bit model AMD was forced to introduce because there was too much of a performance gap between AMD's and Nvidia's best 128-bit GPUs. The gap is so wide that Nvidia's 128-bit Maxwell gives even AMD's 256-bit R7-265 a fair fight.

AMD has lots of work to do on their GPUs' memory bandwidth efficiency.
 

deathcall666

Honorable
My bad, the R7 265 is actually a 7850 with a 256-bit interface. But the R7 260X (128-bit) would be nice, and it's just 12% slower than a GTX 750 Ti (which is newer technology), which means full detail at under 1080p and even 1080p on older titles.
 
Soo sumarizing the total amount of data from you the RAM is the problem...

Not really.

AMD has IP for stacked on-die "T-RAM", but the bottom line is that the graphics function is secondary to the "SIMD Engine" capability being developed for the stream processors.

Kaveri doubled the number of 256-bit 'Fusion Control Links' between CPU cores and RAM. As part of HSA, AMD has developed 'Unified Memory Addressing' equally accessible by the CPU cores and SIMD Engine 'CUs' (there are 64 Radeon stream processors in each CU). This actually ties in well with the serial nature of DDR4 and the further advancement of the AMD 'IOMMU' (Look it up :lol:)

In order to make the Unified Memory Addressing function at a high level, they added a 256-bit 'Radeon Control Link' whereby the SIMD CUs sniff the L2 cache of the Steamroller cores for 'coherency'.

It's safe to assume that as more CUs are added to the die, a 2nd 256-bit Radeon Control Link will be added, and the question becomes whether it will be used for CPU core cache coherency or not. The CUs could have their own on-die T-RAM to accelerate SIMD instructions without having to bother with the IMC or IOMMU ... OR ....

DDR4 does away with the concept of memory 'channels.' The CUs conceivably could have as much address space "as required" by the task at hand --- 512MB .... 1GB .... 2GB ...





 

InvalidError

Titan
Moderator

Memory channels are still very much there with DDR4. There are basically three big changes from DDR3 to DDR4:
1- lower voltages
2- the number of banks per device goes up from 8 to 16
3- each channel can only have one DIMM

Beyond that, DDR4 works almost exactly the same way as DDR3; just with entry-level clocks that match high(-ish)-end DDR3.
 

deathcall666

Honorable
InvalidError said that it would need another 500+ pins for this wider memory interface.
Considering that the FCH will also be on the die in the next APU series, what I'm proposing would be like this:
Southbridge, northbridge, GPU and CPU will all be on one die.
Also, the embedded memory will already be in the system (8GB GDDR5).
What else could you drop from the traditional motherboard?
The PCI Express slot for the GPU. That makes it sort of a console.
Also, it wouldn't require much and it would have good cooling. 300W would be enough, really. So?