Arm has announced the Cortex-R82, its first 64-bit Cortex-R core and one that can run both real-time and high-level operating systems. The new core offers roughly double the performance of its predecessor and is aimed primarily at ultra-high-end SSDs that need up to 1 TB of DRAM, all-flash arrays, and emerging in-storage processing applications.
Storage Needs More Compute Performance
Modern SSDs require significant compute performance to decode signals from newer types of memory, such as 3D QLC NAND. In-storage processing applications, such as SSDs with compute capabilities, are only beginning their journey, but promise to become useful for both datacenter and edge servers.
Both types of storage devices are expected to demand considerably more compute performance than they have today, along with a number of other features, but not at the cost of significantly higher power consumption.
According to Arm, 85% of HDD and SSD controllers today are based on its cores. A substantial share of modern SSD controllers use Arm’s rather mature Cortex-R5 or Cortex-R8 cores, whereas SSDs supporting in-storage processing rely on ASICs or FPGAs with Arm’s Cortex-A53 cores, which were not originally designed for SSDs and are generally not the best fit for storage.
Arm’s Cortex-R82: Both for Storage and Compute
Arm’s Cortex-R82, the company’s first 64-bit R-series processor core, is based on the Armv8-R architecture. It can be equipped with a memory protection unit (MPU) to run bare-metal code and an RTOS, as well as a memory management unit (MMU) to run high-level operating systems. Furthermore, the core supports optional Neon acceleration for machine learning (ML) and floating-point computations, which will be particularly useful for storage applications with compute capabilities. In addition, the Cortex-R82 features 40-bit memory addressing and can address up to 1 TB of DRAM, which makes it possible to build in-storage processing devices with large amounts of memory as well as client SSDs with more than 4 GB of DRAM.
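The link between address width and the 1 TB and 4 GB figures is simple arithmetic, sketched below in Python. The sketch assumes a flat, byte-addressed space and binary units (GiB, TiB); the helper name `addressable_bytes` is made up for illustration and is not part of any Arm tooling.

```python
def addressable_bytes(address_bits: int) -> int:
    """Bytes reachable in a flat, byte-addressed space of the given width."""
    return 2 ** address_bits

GIB = 2 ** 30  # one gibibyte
TIB = 2 ** 40  # one tebibyte

limit_32 = addressable_bytes(32)  # ceiling of a 32-bit address space
limit_40 = addressable_bytes(40)  # ceiling with the 40-bit addressing quoted for the Cortex-R82

print(f"32-bit addressing: {limit_32 // GIB} GiB")
print(f"40-bit addressing: {limit_40 // TIB} TiB")
print(f"40-bit space is {limit_40 // limit_32}x larger")
```

Each extra address bit doubles the reachable memory, so the eight additional bits lift the ceiling by a factor of 256, from 4 GiB to 1 TiB.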
The Cortex-R82 core is designed to run at over 1.80 GHz (when made using a 5 nm process technology and implemented using standard performance-cell libraries) and includes the low-latency features that real-time applications need, including tightly coupled memories (TCMs), caches, and low-latency ports. Meanwhile, the core can run both Linux and an RTOS at the same time, which gives controller developers a lot of flexibility.
As far as scalability is concerned, Arm says that the Cortex-R82 can be used in clusters with up to eight cores. Meanwhile, the company gives examples of ‘typical’ quad-core clusters, which might give an idea of what to expect from future SSD controllers.
One of the major features of the Cortex-R82 is its considerably higher performance compared to its predecessor, the Cortex-R8, which was launched several years ago and is currently used in SSD controllers. According to Arm, the Cortex-R82 is typically 1.74x to 2.25x faster than the Cortex-R8 in real-world applications. Furthermore, the new core is said to be 21% and 23% faster than the Cortex-A55 in SPECint2006 and SPECfp2006, respectively.
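To put the speedup range into more tangible terms, the short Python sketch below converts a speedup factor into the share of run time saved for a fixed amount of work. It assumes the quoted factors apply uniformly to the whole workload, which real controller firmware may not match; the function name `time_saved_fraction` is illustrative only.

```python
def time_saved_fraction(speedup: float) -> float:
    """Fraction of run time eliminated when the same work runs `speedup` times faster."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return 1.0 - 1.0 / speedup

# Speedup range Arm quotes for the Cortex-R82 versus the Cortex-R8.
for speedup in (1.74, 2.25):
    saved = time_saved_fraction(speedup)
    print(f"{speedup:.2f}x faster -> {saved:.1%} less time for the same work")
```

Under that assumption, the quoted range corresponds to a task finishing in roughly 43% to 56% less time than on a Cortex-R8.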
As for efficiency, Arm’s Cortex-R82 offers over 30 DMIPS per mW, based on Arm’s internal preliminary estimates.
The higher performance of the Cortex-R82 compared to other solutions aimed at the storage market will let developers of SSD controllers use more sophisticated ECC algorithms, which has two key implications for actual drives. Firstly, more advanced ECC technologies make SSDs more reliable in general. Secondly, sophisticated ECC opens the door to new types of NAND memory, which makes it possible to boost drive capacities and lower the cost per TB.
One Chip, Multiple Devices
The ability of Arm’s Cortex-R82 to run both compute and real-time storage workloads at the same time enables developers of SSD controllers to target both traditional and in-storage compute applications with only one controller SoC. This will somewhat reduce their mask-set costs, which tend to be high on leading-edge processes. Furthermore, it will make it possible to build controllers that can run different workloads at different times.
Such an approach may not be optimal for client storage, but for controllers aimed at datacenters and edge servers, reducing the number of SKUs might make sense.
Available for Licensing
Arm’s Cortex-R82 core is now available for licensing along with a suite of technologies and tools to enable its implementation. Arm is also developing a TSMC 7FF POP implementation of the Cortex-R82 so that controller makers can simply drop a ready-made core into their designs.
Developers of SSD controllers rarely use leading-edge manufacturing technologies like TSMC’s N7. Meanwhile, even Arm itself describes a quad-core Cortex-R82-based cluster implemented using a 5 nm fabrication process.
Perhaps, as performance requirements of storage devices increase, Arm expects designers of SSD and HDD controllers to switch to more advanced nodes with Cortex-R82 cores.