AMD Unveils its Heterogeneous Uniform Memory Access (hUMA) Technology

It takes only a cursory glance at the increasing graphical fidelity offered by games to see the advances in GPU computing power over the past decade. To truly appreciate just how vast this improvement has been, however, one needs to look beyond visuals and examine the raw computing ability offered by GPUs.

Though this change can reasonably be attributed to Moore’s Law and the continued decline of the $:GFLOP ratio, it is important to note that CPUs have not kept pace with the exponential growth of GPUs’ computing power. Between 2002’s Pentium 4 “Northwood” processor and 2012’s Core i7-3970X processor, computing ability rose by “just” roughly 2,600%, from 12.24 GFLOPS to 336 GFLOPS.
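The CPU growth figure can be checked with a few lines of arithmetic. The sketch below uses only the two GFLOPS values quoted above; the helper name `percent_increase` is ours, chosen for illustration.

```python
def percent_increase(old: float, new: float) -> float:
    """Return the relative growth from old to new, in percent."""
    return (new - old) / old * 100.0


# GFLOPS figures as quoted in the article
northwood_gflops = 12.24  # 2002 Pentium 4 "Northwood"
i7_3970x_gflops = 336.0   # 2012 Core i7-3970X

ratio = i7_3970x_gflops / northwood_gflops
increase = percent_increase(northwood_gflops, i7_3970x_gflops)

print(f"{ratio:.1f}x the throughput, an increase of {increase:.0f}%")
```

The result is about 27 times the original throughput, or an increase of roughly 2,645%, which the article rounds to 2,600%.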

The significance of these comparisons is twofold: first, current-generation GPUs boast well in excess of 10 times the computing ability of current-generation CPUs; second, the vast majority of applications and computing tasks do not take advantage of the processing power offered by GPUs.

AMD aims to address this with its Heterogeneous Uniform Memory Access (hUMA) technology, which builds upon HSA, the “intelligent computing architecture” utilized in the company’s APUs that “enables CPU, GPU and other processors to work in harmony on a single piece of silicon by seamlessly moving the right tasks to the best suited processing element”.

The hardware coherency provided by hUMA brings three key features to the table.

Coherent Memory: Ensures that CPU and GPU caches both see an up-to-date view of the data.
Pageable Memory: Allows the GPU to seamlessly access virtual memory addresses that are not (yet) present in physical memory.
Entire Memory Space: Both CPU and GPU can access and allocate any location in the system’s virtual memory space.

AMD demonstrates the technology's functionality with the following example. Without hUMA, the CPU must first explicitly copy data to GPU memory, the GPU completes the computation, and then the CPU must explicitly copy the result back to CPU memory in order for it to be read. With hUMA, the CPU can simply pass a pointer to the GPU, which completes the computation and produces a result that the CPU can read directly without any copying required.
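The difference between the two flows can be sketched in plain Python. This is only a simulation under stated assumptions, not AMD's API or real GPU code: `gpu_kernel`, `run_with_copies` and `run_shared` are hypothetical names, and an ordinary function stands in for the GPU.

```python
from array import array


def gpu_kernel(buf):
    """Hypothetical stand-in for GPU work: doubles every element in place."""
    for i in range(len(buf)):
        buf[i] *= 2.0


def run_with_copies(host_data):
    """Separate-memory model: copy in, compute, copy the result back out."""
    device_buf = array("d", host_data)    # explicit copy: host -> "device"
    gpu_kernel(device_buf)
    host_result = array("d", device_buf)  # explicit copy: "device" -> host
    return host_result, 2                 # two copies were needed


def run_shared(host_data):
    """Shared-memory model: the kernel receives a reference to the same buffer."""
    gpu_kernel(host_data)                 # no copy; works on host_data itself
    return host_data, 0


data_a = array("d", [1.0, 2.0, 3.0])
result_a, copies_a = run_with_copies(data_a)

data_b = array("d", [1.0, 2.0, 3.0])
result_b, copies_b = run_shared(data_b)

print(list(result_a), copies_a)
print(list(result_b), copies_b)
```

Both paths produce the same values, but the first allocates two extra buffers and leaves the original data untouched, while the second returns the very buffer the caller passed in. That saving in copies is the benefit AMD describes.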

Beyond the "Top 10 Reasons" listed in its presentation slide, AMD cites six further benefits that its hUMA technology brings to both developers and consumers:

Ease and simplicity of programming through single, standard computing environments
Support for mainstream programming languages including Python, C++ and Java
Lower development costs as the more efficient architecture enables fewer people to do the same work
Better experiences through "radically different user experiences"
Enables more performance from the same form factor
Longer battery life without sacrificing performance

Through both the HSA Foundation and partnerships with companies such as ARM, Qualcomm, Samsung and Texas Instruments, AMD can already claim broad support from major industry players. The company is evidently confident enough in its HSA / hUMA technology that it boldly predicts that HSA-based devices have the potential to constitute two-thirds of the 2.1 billion connected devices it expects to see by 2016.

Further information on this technology will be revealed during APU '13, AMD's Developer Summit, which takes place in San Jose from November 11 to November 14, 2013, and will offer "14 different tracks with over 140 individual presentations".

Tarun Iyer was a contributor for Tom's Hardware who wrote news covering a wide range of technology topics, including processors, graphics cards, cooling systems, and computer peripherals. He also covered tech trends such as the development of adaptive all-in-one PCs.

31 Comments from the forums

wintermint

"Lover development costs as the more efficient architecture enables less people to do the same work" I think we should lower the development costs instead of trying to love it :P
RazberyBandit

"the Radeon 9700 Pro could provide a performance of 31.2 GFLOPS of performance, 5 years later the Radeon HD 2900 XT offered 473.6 GFLOPS and by 2012, the Radeon HD 7970 GHz Edition was capable of computing 4301 GFLOPS - an increase of 13,700% when compared to the HD 2900 XT."
I think you meant compared to the 9700 Pro... The difference between the 7970 and 2900XT is 908%, or 9-times the GFLOP performance.
digiex

AMD still has some aces in its sleeve. Go AMD, make this a reality, not just a press release.
rolli59

Certainly tech worth following.
Philippe Leblanc

As usual, AMD is ahead of the curve in terms of ideas. This is trully a good leap of progress and will set the tone for HPC computing for the next decade. However, they are usually behind the curve in terms of implementing their awesome ideas. AMD get this stuff to market as soon as possible!!!
slomo4sho

So which is it? Heterogeneous or uniform? Non-uniform uniform memory access! Great job naming this!
ct001

I figured this would happen sometime. It'll be nice when the ususal APIs (OpenGL/Direct3D) support this directly but until then drivers should still be able to optimize for older applications. This tight integration has been needed for a while and should open up alot of new techniques.
DjEaZy

AMD Unveils its Heterogeneous Uniform Memory Access (hUMA) Technology? Of course... it will be in PS4... the 8 core + GPU + 8Gb of GDDR5...
vmem

long story short, AMD plans to further utilize their superiority in the GPU department over Intel to boost their systems' overall performance, hoping to catch up and surpass intel in the coming years. hmm, lots of hurdles to jump through but sounds promising. nice work and good luck to you AMD!