
GPU Performance: More is Better

Apple's iPad 2 Review: Tom's Goes Down The Tablet Rabbit Hole

Apple's A5 effectively underwent a three-part upgrade. Aside from its processor and memory, the iPad 2 now sports Imagination Technologies’ dual-core PowerVR SGX 543MP2. In comparison, the original iPad employs a single-core PowerVR SGX 535. That's the same GPU Intel used on its GMA 500, built into the Poulsbo series of System Controller Hubs for Atom.

PowerVR SGX 535

PowerVR SGX 543

GPU (System-on-Chip) | PowerVR SGX 535 (Apple A4) | PowerVR SGX 543 (Apple A5)
Bus Width (in bits) | |
Triangle rate @ 200 MHz | 14 MTriangles/s | 35 MTriangles/s

The SGX 543 includes four USSE2 (Universal Scalable Shader Engine 2.0) pipes. The SGX 535 only has two USSE pipes. This unified shader design is similar to what we've seen from competing graphics vendors for years, as it allows vertex and pixel shader code to share the same hardware. The idea is to get better performance, even if you’re rendering more of one type of shader. It’s not clear how these are second-generation shaders aside from their name, but Imagination Technologies states that the pipeline effectively delivers “twice the peak floating point and instruction throughput of the Series5 USSE.”

This isn't just about a revised SIMD architecture though. Apple also doubles the number of rendering pipelines again by placing two SGX 543 GPU cores on the A5 (the MP2 in its name represents that pair of GPU cores). This helps account for the iPad 2’s quadrupling of available GPU resources.
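The "quadrupling" claim above is simple arithmetic, and a minimal sketch makes it explicit. The pipe and core counts come from the text; the function and variable names are illustrative only. The sketch counts shader pipes and nothing else, so it ignores clock rate (which is unknown) and any per-pipe throughput gain from USSE2.

```python
# Shader pipe counts per GPU core, as stated in the article.
SGX535_PIPES_PER_CORE = 2  # single-core SGX 535 (Apple A4)
SGX543_PIPES_PER_CORE = 4  # each SGX 543 core (Apple A5 carries two)


def total_pipes(pipes_per_core: int, cores: int) -> int:
    """Total shader pipes across all GPU cores (hypothetical helper)."""
    return pipes_per_core * cores


ipad_pipes = total_pipes(SGX535_PIPES_PER_CORE, cores=1)
ipad2_pipes = total_pipes(SGX543_PIPES_PER_CORE, cores=2)

# Ratio of raw pipe counts only; real-world scaling depends on clocks,
# memory bandwidth, and the workload.
resource_ratio = ipad2_pipes / ipad_pipes

print(f"iPad: {ipad_pipes} pipes, iPad 2: {ipad2_pipes} pipes, "
      f"ratio: {resource_ratio:.0f}x")
```

Doubling the pipes per core and then doubling the cores gives the 4x figure. As the benchmarks show, measured gains do not have to track that number exactly.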

GLBenchmark 2.0

The GPU clock rate remains an unknown, so benchmarks remain the best way to determine effective performance. Unfortunately, it's difficult to measure real-world graphics capabilities in a meaningful way. There's no real equivalent to Fraps, and we're still a ways away from game developers including frame rate counters in their code. That’s why we're turning to GLBenchmark 2.0, a synthetic OpenGL ES 2.0 metric that emphasizes texture performance. Think of it as the 3DMark of mobile devices.

GLBenchmark 2.0 | Apple iPad | Apple iPad 2 | Motorola Xoom
Triangle Test (textured), Mtriangles/s | 8.6 | 29.0 | 15.3
Triangle Test (textured, fragment lit), Mtriangles/s | 4.2 | 19.9 | 8.6

Like on a desktop, graphics rendering on a tablet begins with an application sending a GPU an array of vertices, vertex shaders, fragment shaders, and a bunch of other control information. The sum of this information is used to draw millions (or billions) of triangles that are used to assemble a larger 3D object. It's important to know the number of triangles that a GPU is capable of rendering because more triangles translate into greater graphics detail. GLBenchmark offers a glimpse into real-world triangle performance because it measures the triangle rate for an actual gaming scene. The results aren't that much of a surprise. At 29.0 Mtriangles/s, the second-generation iPad delivers more than 3x the performance of its predecessor. This means that game developers can conceivably increase geometric detail three-fold on the iPad 2 and still get the same performance they saw on the original iPad.

The fragment lit test taxes texturing performance, with an additional focus on lighting. Thus, it's a more stressful benchmark. As the geometry becomes more complex, the iPad 2 demonstrates its improved handling of more detailed graphics workloads. It actually delivers nearly five times the performance of the original iPad.
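For readers who want to check the multiples quoted above, here is a small sketch that derives them from the two triangle-test rows. The scores are copied from the table; the dictionary layout and the `speedup` helper are illustrative choices, not part of GLBenchmark.

```python
# Triangle-test scores in Mtriangles/s, copied from the table above.
results = {
    "Triangle Test (textured)": {
        "iPad": 8.6, "iPad 2": 29.0, "Xoom": 15.3,
    },
    "Triangle Test (textured, fragment lit)": {
        "iPad": 4.2, "iPad 2": 19.9, "Xoom": 8.6,
    },
}


def speedup(new_score: float, old_score: float) -> float:
    """How many times faster new_score is than old_score (higher is better)."""
    return new_score / old_score


for test, scores in results.items():
    vs_ipad = speedup(scores["iPad 2"], scores["iPad"])
    vs_xoom = speedup(scores["iPad 2"], scores["Xoom"])
    print(f"{test}: {vs_ipad:.1f}x the original iPad, {vs_xoom:.1f}x the Xoom")
```

The textured test works out to roughly 3.4x the original iPad and the fragment-lit test to roughly 4.7x, which is where the "3x" and "five times" figures come from.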

GLBenchmark 2.0: Frames per set duration
Test | Apple iPad | Apple iPad 2 | Motorola Xoom
Egypt (frames) | | |
Pro (frames) | | |

The performance of an actual graphics scene is easier to understand. When you look at this in terms of frames rendered in a set period of time, you're getting a lot more performance with the iPad 2. Conservatively, you're looking at no less than 3x more frames rendered according to the Pro test, and up to 8x more according to the Egypt test.

Comparing GPU Performance: Words of Caution

If you really want to go to the trouble of researching tablet-based graphics performance (there may be a few of you out there), bear in mind that potential won't always match up to the numbers you see in the real world. The form factor's constraints prevent vendors from pairing graphics hardware with the memory that'd best demonstrate its peak specifications, for example. Instead, you end up with the configuration that hits the performance profile needed, and nothing more. 

On the desktop, a graphics card manufacturer has the freedom to balance performance between a GPU and its memory subsystem, altering data rate, memory bus width, and capacity to best exploit the processor's capabilities, whether they're bleeding-edge or decidedly mainstream. When you're dealing with smartphones and tablets, that's no longer the case. In order to cut back on power, minimize heat, or avoid monopolizing too much space on the PCB, engineers might tolerate a memory bottleneck in order to achieve other design goals. So, forget comparing individual and theoretical pieces of the graphics puzzle. Rather, focus on the end product's measurable performance.

GLBenchmark 2.0 | Apple iPad | Apple iPad 2 | Motorola Xoom
Egypt with FSAA (frames) | | |
Pro with FSAA (frames) | | |
Egypt with FSAA Fixed Time (sec) | | |
Pro with FSAA Fixed Time (sec) | | |
Swap Buffer Test (frames) | | |
Fill Test (texture fetch), ktexel/s | 170980 | 918551 | 129897
Trigonometric Test (vertex weighted), kvertex/s | 1039 | 3326 | 2632
Trigonometric Test (fragment weighted), kfragment/s | 1191 | 3512 | 4452
Trigonometric Test (balanced), kshader/s | 1259 | 3158 | 2543
Exponential Test (vertex weighted), kvertex/s | 3130 | 3535 | 2628
Exponential Test (fragment weighted), kfragment/s | 3774 | 11165 | 3003
Exponential Test (balanced), kshader/s | 2043 | 11735 | 1656
Common Test (vertex weighted), kvertex/s | 1524 | 3727 | 1973
Common Test (fragment weighted), kfragment/s | 1634 | 3699 | 4451
Common Test (balanced), kshader/s | 1065 | 4114 | 2530
Geometric Test (vertex weighted), kvertex/s | 1949 | 3776 | 1316
Geometric Test (fragment weighted), kfragment/s | 2081 | 6388 | 2888
Geometric Test (balanced), kshader/s | 1281 | 6181 | 1628
For Loop Test (vertex weighted), kvertex/s | 1671 | 3860 | 1315
For Loop Test (fragment weighted), kfragment/s | 1842 | 6237 | 7271
For Loop Test (balanced), kshader/s | 1275 | 3718 | 3583
Branching Test (vertex weighted), kvertex/s | 3906 | 3778 | 2633
Branching Test (fragment weighted), kfragment/s | 6045 | 22557 | 3211
Branching Test (balanced), kshader/s | 2106 | 11193 | 1493
Array Test (uniform array access), kvertex/s | 2918 | 3658 | 3946
Triangle Test (white), ktriangle/s | 9548 | 29957 | 12595
Triangle Test (textured, vertex lit), ktriangle/s | 7058 | 21129 | 10520