Snapdragon 820 Architecture
Qualcomm has remained tight-lipped about its latest processor designs. Unlike ARM’s full-disclosure model, Qualcomm is much more Apple-like when it comes to releasing low-level information, particularly about its GPUs.
While technical details are in short supply, our curiosity about Snapdragon 820’s most interesting component—the 64-bit Kryo CPU core—remains high. The move from TSMC's 20nm HKMG planar process to Samsung’s 14nm FinFET process should mitigate the thermal issues experienced by Snapdragon 810 and allow the 820 to use less power and/or reach higher clock speeds. According to Qualcomm, with the “Kryo CPU and Snapdragon 820, you can expect up to two times the performance and up to two times the power efficiency” when compared to the A57 CPU in Snapdragon 810. This is certainly a bold claim, but based on the overheating issues we observed with the 810, coupled with the excellent performance from Samsung’s Exynos 7420 SoC, which uses Samsung’s first-generation 14nm LPE (Low Power Early) FinFET process, it might be possible. It’s not clear if the 820 will use LPE or Samsung’s second-generation 14nm LPP (Low Power Plus) FinFET process, which, according to Samsung, can achieve 10% higher frequency at lower power than LPE because of a better fin aspect ratio.
As mobile workloads continue to evolve, so do mobile SoCs. One parameter in constant flux is the optimal number of CPU cores, with designs ranging from Apple’s A9, which uses two CPU cores, to MediaTek’s Helio X20, which uses ten CPU cores in a tri-cluster, big.LITTLE arrangement. According to Tim McDonough, Qualcomm's VP of Marketing, “people don't really need more than four cores.” While this statement will likely spark some heated debate, Qualcomm’s data seems to point in that direction, since the Snapdragon 820 uses four Kryo CPU cores in a dual-cluster, heterogeneous configuration. While the underlying architecture of each CPU core is the same, the clusters are optimized to operate at different frequencies and power levels, somewhat like ARM’s big.LITTLE approach. The two Kryo cores in the lower-power “Silver” cluster operate at frequencies up to 1.6GHz and share a 512KB L2 cache. The second pair of Kryo cores in the higher-performing “Gold” cluster operate at frequencies up to 2.2GHz and share a 1MB L2 cache. While L2 cache is not shared between the Gold and Silver clusters, the two L2 caches use a snooping mechanism to maintain coherency. Unlike Apple’s A9, Snapdragon 820 does not use an L3 cache. Qualcomm says it considered using an L3 cache, but ultimately decided the benefits did not outweigh the additional cost in power and die space for its design. Qualcomm is not divulging any lower-level details of its Kryo architecture, so we’ll have to see what, if anything, we can infer from our test data.
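To make the dual-cluster layout easier to picture, here is a minimal Python sketch that models the two clusters using only the core counts, peak frequencies, and L2 sizes quoted above. The `Cluster` type and the `pick_cluster` rule are hypothetical illustrations; Qualcomm has not described how work is actually assigned to the Silver and Gold clusters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    name: str
    cores: int
    max_ghz: float  # peak frequency quoted in the text
    l2_kb: int      # L2 cache shared by the cores in this cluster


# Figures taken from the paragraph above.
SILVER = Cluster("Silver", cores=2, max_ghz=1.6, l2_kb=512)
GOLD = Cluster("Gold", cores=2, max_ghz=2.2, l2_kb=1024)
CLUSTERS = (SILVER, GOLD)  # ordered from lowest to highest power


def pick_cluster(required_ghz: float) -> Cluster:
    """Hypothetical placement rule (an assumption, not Qualcomm's scheduler):
    return the lowest-power cluster whose peak frequency covers the demand,
    falling back to Gold when nothing does."""
    for cluster in CLUSTERS:
        if required_ghz <= cluster.max_ghz:
            return cluster
    return GOLD


total_cores = sum(c.cores for c in CLUSTERS)
total_l2_kb = sum(c.l2_kb for c in CLUSTERS)

print(f"{total_cores} cores, {total_l2_kb} KB of L2 in total")
for c in CLUSTERS:
    print(f"{c.name}: {c.l2_kb // c.cores} KB of L2 per core, up to {c.max_ghz} GHz")
print(pick_cluster(1.0).name, pick_cluster(2.0).name)
```

The per-core figures show that the Gold cores get twice the L2 capacity per core as well as the higher clock, which is consistent with the clusters being tuned for different power and performance points.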
Qualcomm's Snapdragon 8xx Flagship Family
| ||Snapdragon 820||Snapdragon 810||Snapdragon 805||Snapdragon 801|
|Manufacturing Process||14nm FinFET||20nm HKMG||28nm HPm||28nm HPm|
|Architecture||ARMv8-A (32/64-bit)||ARMv8-A (32/64-bit)||ARMv7-A (32-bit)||ARMv7-A (32-bit)|
|CPU||Qualcomm Kryo (2x @ 2.15GHz + 2x @ 1.59GHz)||ARM Cortex-A57 (4x @ 2.0GHz) + ARM Cortex-A53 (4x @ 1.5GHz) [big.LITTLE]||Qualcomm Krait 450 (4x @ 2.65GHz)||Qualcomm Krait 400 (4x @ 2.45GHz)|
|GPU||Qualcomm Adreno 530 @ 624MHz||Qualcomm Adreno 430 @ 630MHz||Qualcomm Adreno 420 @ 600MHz||Qualcomm Adreno 330 @ 578MHz|
|Memory Interface||LPDDR4-1866 2x 32-bit (29.9GBps)||LPDDR4-1600 2x 32-bit (25.6GBps)||LPDDR3-800 2x 64-bit (25.6GBps)||LPDDR3-800/933 2x 32-bit (12.8/14.9GBps)|
|Camera ISP||14-bit dual ISPs (1.5GP/s throughput, image sensors up to 2x 25MP)||14-bit dual ISPs (1.2GP/s throughput, image sensors up to 55MP)||12-bit dual ISPs (1.2GP/s throughput, image sensors up to 55MP)||dual ISPs (930MP/s throughput, image sensors up to 21MP)|
|DSP||Hexagon 680 @ less than 1GHz||Hexagon V56 @ 800MHz||Hexagon V50 @ 800MHz||Hexagon V50 @ 800MHz|
|Integrated Modem||X12, LTE Cat 12/13, up to 600 Mbps DL & 150 Mbps UL||X10, LTE Cat 9, up to 450 Mbps||✗||MDM9x25, LTE Cat 4, up to 150 Mbps|
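The memory bandwidth figures in the table can be reproduced with simple arithmetic. The sketch below assumes the number after "LPDDR" is the I/O clock in MHz, that data moves twice per clock (double data rate), and that GBps means 10^9 bytes per second. The function name is our own, and the results are theoretical peaks, not measured throughput.

```python
def peak_bandwidth_gbps(clock_mhz: float, channels: int, bits_per_channel: int) -> float:
    """Theoretical peak bandwidth in GB/s (10^9 bytes/s) for a DDR interface."""
    transfers_per_second = clock_mhz * 1e6 * 2          # two transfers per clock
    bytes_per_transfer = channels * bits_per_channel / 8
    return transfers_per_second * bytes_per_transfer / 1e9


# (clock in MHz, channels, bits per channel), as listed in the table
configs = {
    "Snapdragon 820": (1866, 2, 32),
    "Snapdragon 810": (1600, 2, 32),
    "Snapdragon 805": (800, 2, 64),
    "Snapdragon 801 (LPDDR3-800)": (800, 2, 32),
    "Snapdragon 801 (LPDDR3-933)": (933, 2, 32),
}

for name, cfg in configs.items():
    print(f"{name}: {peak_bandwidth_gbps(*cfg):.1f} GBps")
```

Under these assumptions the Snapdragon 805 matches the 810's 25.6GBps despite its slower LPDDR3 memory, because its two 64-bit channels give it a bus twice as wide.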
While information about Kryo is scarce, details about Snapdragon 820’s Adreno 530 GPU are nonexistent. Beyond the name, the only thing we know for sure is that it runs at 133-624MHz. When pushed for more info, Qualcomm said it made lots of small architectural changes throughout the design, which would imply that the Adreno 530 is not a drastic redesign, but an evolution of the Adreno 430. One of the changes it mentioned was making better use of data compression when moving data around within the GPU in order to reduce power consumption.
Given Qualcomm’s focus on heterogeneous computing, it’s no surprise that the GPU and CPU can each snoop the other’s cache. Because both processors use 64-bit virtual addresses, this makes sharing data between them easier. With a heavy focus on compute capability, we also expect the Adreno 530 to further improve ALU performance, something Qualcomm has done for the past several generations.
The Adreno 530 supports the latest graphics API standards, including OpenGL ES 3.1 + Android Extension Pack, DirectX 12, and Vulkan (once ratified by Khronos). Like the Adreno 430, the 530 includes a dedicated fixed-function block in hardware for accelerating tessellation.
The Snapdragon 820 comes with the brand-new Kryo CPU core, a new Adreno 530 GPU, a new Image Signal Processor (Spectra ISP), and a new Digital Signal Processor (Hexagon 680 DSP). Each comes with a significant boost in performance over the previous generation, but they can also work together through heterogeneous computing to finish tasks more than twice as fast as the CPU alone while using up to 40 percent less energy.
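It helps to separate energy from power when reading the heterogeneous computing claim. The short sketch below uses energy = average power × time, and assumes for illustration that a 2x speedup and a 40 percent energy saving apply to the same task. Qualcomm's figures are "more than" and "up to" bounds, so this is a rough reading of them, not a measurement.

```python
def relative_average_power(speedup: float, energy_saving: float) -> float:
    """Average power relative to the CPU-only baseline.

    Since energy = average power * time, the power ratio is the
    energy ratio divided by the time ratio.
    """
    relative_time = 1.0 / speedup            # 2x faster -> half the time
    relative_energy = 1.0 - energy_saving    # 40% saving -> 60% of the energy
    return relative_energy / relative_time


ratio = relative_average_power(speedup=2.0, energy_saving=0.40)
print(f"Average power relative to CPU-only: {ratio:.2f}x")
```

Under these assumptions the SoC would draw somewhat more power while the task runs, but would finish sooner and spend less energy overall, which is the figure that matters for battery life.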