AMD Unveils its Heterogeneous Uniform Memory Access (hUMA) Technology
AMD's hUMA technology builds upon the "intelligent computing architecture" featured in the company's APUs and aims to properly utilise the processing power offered by GPUs.
It takes only a cursory glance at the increasing graphical fidelity offered by games to see the advances in GPU computing power over the past decade. To truly appreciate just how vast this improvement has been, however, one needs to look beyond visuals and examine the raw computing ability offered by GPUs.
To illustrate this, consider the following example: in 2002, the Radeon 9700 Pro offered 31.2 GFLOPS of performance; five years later, the Radeon HD 2900 XT offered 473.6 GFLOPS; and by 2012, the Radeon HD 7970 GHz Edition was capable of 4301 GFLOPS - an increase of roughly 13,700% compared to the Radeon 9700 Pro.
Though this change can reasonably be attributed to Moore’s Law and the continued decline of the $:GFLOP ratio, it is important to note that CPUs have not kept pace with the exponential growth of GPUs’ computing power. Between 2002’s Pentium 4 “Northwood” processor and 2012’s Core i7-3970X processor, computing ability rose by “just” 2600%, from 12.24 GFLOPS to 336 GFLOPS.
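The growth figures quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch that uses only the GFLOPS numbers given in the article; the helper name `percent_increase` is an illustrative choice, not anything from AMD.

```python
# Peak GFLOPS figures exactly as quoted in the article.
RADEON_9700_PRO = 31.2       # GPU, 2002
RADEON_HD_7970_GHZ = 4301.0  # GPU, 2012
PENTIUM_4_NORTHWOOD = 12.24  # CPU, 2002
CORE_I7_3970X = 336.0        # CPU, 2012


def percent_increase(old, new):
    """Return the percentage increase when going from old to new."""
    return (new - old) / old * 100.0


gpu_increase = percent_increase(RADEON_9700_PRO, RADEON_HD_7970_GHZ)
cpu_increase = percent_increase(PENTIUM_4_NORTHWOOD, CORE_I7_3970X)

# Ratio between the two 2012 parts (GPU peak over CPU peak).
gpu_to_cpu = RADEON_HD_7970_GHZ / CORE_I7_3970X

print(f"GPU increase 2002-2012: {gpu_increase:.0f}%")
print(f"CPU increase 2002-2012: {cpu_increase:.0f}%")
print(f"2012 GPU vs CPU: {gpu_to_cpu:.1f}x")
```

The GPU increase works out to roughly 13,700% and the CPU increase to a little over 2,600%, while the 2012 GPU offers a bit under 13 times the peak throughput of the 2012 CPU. These are peak theoretical numbers, so they say nothing about how much of that throughput a real application can use.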
The significance of these comparisons is twofold: firstly, current generation GPUs boast well in excess of 10 times the computing ability of current generation CPUs, and secondly, the vast majority of applications and computing tasks do not take advantage of the processing power offered by GPUs.
AMD aims to address this with its Heterogeneous Uniform Memory Access (hUMA) technology, which builds upon HSA, the “intelligent computing architecture” utilized in the company’s APUs that “enables CPU, GPU and other processors to work in harmony on a single piece of silicon by seamlessly moving the right tasks to the best suited processing element”.
The hardware coherency provided by hUMA brings three key features to the table.
- Coherent Memory: Ensures that CPU and GPU caches both see an up-to-date view of the data
- Pageable Memory: Allows the GPU to seamlessly access virtual memory addresses that are not (yet) present in physical memory
- Entire Memory Space: Both CPU and GPU can access and allocate any location in the system’s virtual memory space.
AMD demonstrates the technology's functionality with the following example. Without hUMA, the CPU must first explicitly copy data to GPU memory, the GPU completes the computation, and then the CPU must explicitly copy the result back to CPU memory in order for it to be read. With hUMA, the CPU can simply pass a pointer to the GPU, which completes the computation and produces a result that the CPU can directly read without any copying required.
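The difference between the two flows can be illustrated with a CPU-only analogy. The sketch below is not HSA or GPU code: `gpu_kernel`, `run_with_copies` and `run_with_shared_pointer` are hypothetical names, a `bytearray` stands in for system memory, and a `memoryview` plays the role of the pointer handed to the GPU.

```python
def gpu_kernel(buf):
    """Stand-in for a GPU computation: doubles every byte in place."""
    for i in range(len(buf)):
        buf[i] = (buf[i] * 2) % 256


def run_with_copies(host_data):
    """Model of the flow without hUMA: copy in, compute, copy back."""
    device_data = bytearray(host_data)  # explicit copy to "GPU memory"
    gpu_kernel(device_data)             # compute on the separate copy
    host_data[:] = device_data          # explicit copy back to "CPU memory"
    return 2                            # number of explicit copies made


def run_with_shared_pointer(host_data):
    """Model of the flow with hUMA: pass a reference, no copies."""
    with memoryview(host_data) as view:  # view shares the same memory
        gpu_kernel(view)                 # compute directly on shared data
    return 0                             # no explicit copies needed


a = bytearray([1, 2, 3])
copies_a = run_with_copies(a)

b = bytearray([1, 2, 3])
copies_b = run_with_shared_pointer(b)

print(list(a), copies_a)
print(list(b), copies_b)
```

Both functions leave the same result in the caller's buffer; the only difference is whether the data had to be duplicated on the way in and out. On real hardware the copies cross a bus between separate memory pools, which is where the cost the article describes comes from.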
In addition to the "Top 10 Reasons" noted in the above slide, AMD cites an additional six benefits that its hUMA technology brings to both developers and consumers:
- Ease and simplicity of programming through single, standard computing environments
- Support for mainstream programming languages including Python, C++ and Java
- Lower development costs as the more efficient architecture enables fewer people to do the same work
- Better experiences through "radically different user experiences"
- Enables more performance from the same form factor
- Longer battery life without sacrificing performance
Through both the HSA Foundation and partnerships with companies such as ARM, Qualcomm, Samsung and Texas Instruments, AMD can already claim broad support from major industry players. The company is evidently confident enough in its HSA / hUMA technology that it boldly predicts that HSA-based devices have the potential to constitute two-thirds of the 2.1 billion connected devices it expects to see by 2016.
Further information on this technology will be revealed during APU '13, AMD's Developer Summit, which takes place in San Jose between November 11 and November 14, 2013 and will offer "14 different tracks with over 140 individual presentations".



I think you meant compared to the 9700 Pro... The difference between the 7970 and 2900XT is 908%, or 9-times the GFLOP performance.
On top of that... AMD stated that they were focusing on fixing single-threaded performance on Steamroller cores and that with all the modifications, 30% to 40% performance gains in CPU performance should be seen.
So effectively speaking, the 1 module that contains 2 cores won't behave like a single core Intel CPU, but instead will behave like a 2 core i series CPU.
Right now (at least in the mobile area), the A10 4600m is comparable to an entry level i5 Ivy Bridge in terms of overall CPU performance.
If Kaveri CPU performance reflects AMD statements, then quad core Kaveri APU's will basically be comparable to Ivy Bridge/Haswell in CPU power alone... and probably surpass Haswell in IGP performance.
Adding to all of this, seeing how consoles like PS4 and Xbox have adopted AMD x86 hardware, hUMA will probably find itself in widespread adoption... not to mention utilization in multicore cpu's, and adoption of x86 instruction set (it will be far easier to port games from consoles to PC's... they should be far more optimized compared to what's done today).
And of course, graphics will likely experience a very high jump.
All in all, I'm interested to see what Kaveri brings.
I just hope that OEM's decide to build proper mobile platforms for those APU's instead of putting an A10 4600m into a 17" monster.
Current AMD APU's are ideal for 15" and lower form factors... but their adoption is extremely low to the point of non-existence.
Heck, even AMD's mid-range mobile 7xxx series GPU's cannot be found virtually anywhere.
If AMD can put some pressure on Intel with heavy use of unified CPU GPU computing we can see somewhat faster advancement also in the CPU department...
Please see below:
"In addition to the "Top 10 Reasons" noted in the above slide, AMD cites an additional six benefits that its hUMA technology brings to both developers and consumers:" -> Horrible sentence structure -> Additionally, AMD cites six benefits to its hUMA technology for both developers and consumers.
"Lover development costs as the more efficient architecture enables less people to do the same work" -> Lower development costs come from improved efficiency and enable smaller resources for the same job.
Better experiences through "radically different user experiences" -> What does this mean? Sounds like marketing gibberish.
"Further information on this technology" -> Further information about this technology.
Respectfully, please correct the above.