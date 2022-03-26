Modern enthusiast-grade solid-state drives (SSDs) deliver unbeatable performance. Still, they tend to produce a significant amount of heat, which can hurt performance and harm data on NAND flash chips. Phison says that as drives get more advanced, they are getting hotter too, so expect high-end PCIe Gen 5 SSDs to require active cooling using a fan (perhaps, like the one pictured above). Meanwhile, the company says that it is working on reducing the heat of its controllers as applications like notebooks or compact desktops cannot accommodate large coolers.

"There are lots of things that we are doing to keep the SSD power within a reasonable envelope," said Sebastien Jean, Chief Technical Officer at Phison, in a recent interview with MSI Insider and StorageReview. "But for sure, the SSDs are going to be hotter, in the same way that CPU and GPU got hotter in the 1990s. As we move to Gen5 and Gen6, we may need to consider active cooling."

Higher Capacity Leads to Complex Controllers

To gain capacity, 3D NAND flash memory either adds layers or increases the number of charge levels stored in a single cell. However, increasing the number of layers typically reduces a cell's physical size, which fundamentally affects its ability to store charge reliably (somewhat mitigated by the adoption of new materials). Meanwhile, the transition from triple-level to quad-level cell architecture radically reduces the number of program/erase (P/E) cycles that a cell can endure. Because both methods of increasing 3D NAND recording density coexist, the quality of the signals that NAND cells produce degrades. Hence, SSD controllers must adopt more sophisticated error correction methods to process those signals.

Those methods rely on low-density parity-check (LDPC) codes. LDPC decoding tends to be compute-intensive, so in recent years, SSD controllers have gained complexity and compute capabilities. Since these algorithms are here to stay, controllers will keep getting more capable and hotter.
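To give a feel for what a parity-check code does, here is a minimal sketch. It uses the tiny Hamming(7,4) code as a stand-in, which is not an LDPC code and not what any Phison controller runs; production LDPC decoders work on far larger, sparse matrices and iterate over soft (probabilistic) inputs, which is where the compute cost comes from. The function names are illustrative only.

```python
# Parity-check matrix H of the Hamming(7,4) code (a stand-in, NOT an LDPC code).
# Column i (1-based) is the binary representation of i, so a non-zero
# syndrome directly names the position of a single flipped bit.
H = [
    [0, 0, 0, 1, 1, 1, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [1, 0, 1, 0, 1, 0, 1],
]


def syndrome(word):
    """Evaluate every parity check (one per row of H) modulo 2."""
    return [sum(h * b for h, b in zip(row, word)) % 2 for row in H]


def correct_single_error(word):
    """Return (corrected_word, position); position 0 means no error was found."""
    s = syndrome(word)
    position = s[0] * 4 + s[1] * 2 + s[2]
    fixed = list(word)
    if position:
        fixed[position - 1] ^= 1  # flip the bit the syndrome points at
    return fixed, position


if __name__ == "__main__":
    codeword = [1, 1, 1, 0, 0, 0, 0]   # satisfies all three parity checks
    noisy = list(codeword)
    noisy[4] ^= 1                      # simulate one misread bit (position 5)
    print(syndrome(noisy), correct_single_error(noisy))
```

Even this toy version has to evaluate every parity check for every word it reads. Scaling the same idea to codewords thousands of bits long, with repeated decoding passes when the NAND signal is poor, hints at why the controller needs more compute.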

SSD controller developers like Phison adopt more advanced process technologies for their most advanced controllers. Still, it looks like the pace of node adoption is somewhat slower than the growth in controller complexity, which is why, for now, SSD controllers for enthusiast-grade and enterprise/server applications keep gaining thermal design power (TDP).

Temperatures: 120ºC for Controller, But Only 70ºC for NAND ICs

Avid readers know that modern CPUs and GPUs can sustain rather extreme temperatures of about 100 degrees Celsius and even higher (albeit at the cost of silicon degradation: the chip still performs reliably within specifications but loses overclocking headroom). The same is true for SSD controllers. Produced by contract manufacturers like TSMC, these chips can survive temperatures of up to 120 degrees Celsius, according to Phison.

But the problem is that a controller running at 120 degrees Celsius heats the 3D NAND ICs on the same drive, and these become much less reliable at temperatures of 75 degrees Celsius and above. So to prevent data loss, controllers usually start throttling at high temperatures, which hurts performance.

"3D NAND memory can handle from 0ºC (32ºF) to anywhere between 70ºC and 85ºC (158ºF to 185ºF), depending on the grade of the NAND," said Jean, "And as heat goes up, retention of data in NAND goes down. […] If most of your data was written really hot and you read it really cold, you have a huge cross-temp swing. The SSD can handle that, translating into more error corrections. So lower maximum throughput. The sweet spot for an SSD is between 25ºC and 50ºC (77ºF to 122ºF)."

Meanwhile, not all SSD controllers are complex and power-hungry; controllers for mainstream SSDs have reasonable power consumption and temperatures. They fit into laptops and will keep fitting there for the foreseeable future. As a bonus, their performance is going to increase as well.
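The temperature figures in this section can be collected into a small sketch. The thresholds come from the text above (a 0ºC lower bound, a 25ºC to 50ºC sweet spot, reduced reliability from 75ºC, and a 70ºC to 85ºC upper limit depending on NAND grade). The function names and state labels are made up for illustration, and real firmware throttling policies are vendor-specific and not described in this article.

```python
# Temperature figures quoted in this article, in degrees Celsius.
NAND_MIN_C = 0
SWEET_SPOT_C = (25, 50)
REDUCED_RELIABILITY_C = 75


def celsius_to_fahrenheit(temp_c):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return temp_c * 9 / 5 + 32


def nand_thermal_state(temp_c, max_rated_c=70):
    """Classify a NAND temperature reading (hypothetical labels).

    max_rated_c depends on the NAND grade; the article gives 70 to 85.
    """
    if not 70 <= max_rated_c <= 85:
        raise ValueError("article quotes a 70-85 C upper limit, by NAND grade")
    if temp_c < NAND_MIN_C or temp_c > max_rated_c:
        return "outside rated range"
    if temp_c >= REDUCED_RELIABILITY_C:
        return "reduced reliability"
    if SWEET_SPOT_C[0] <= temp_c <= SWEET_SPOT_C[1]:
        return "sweet spot"
    return "within rated range"


if __name__ == "__main__":
    for reading in (-5, 10, 40, 60, 72, 80):
        print(reading, nand_thermal_state(reading, max_rated_c=85))
```

Note that with the default 70ºC grade the "reduced reliability" state is never reached, because the part is already outside its rated range before it gets to 75ºC.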

Cooling: Many Ways

SSDs tend to go into critical shutdown when they detect that the temperature of the NAND is above 80 degrees Celsius, so cooling is essential for these drives. An M.2 SSD has two natural ways of shedding heat: conduction (via the copper/gold contacts on the drive and the screw that holds it in place) and convection (dissipating heat into the air). However, that is not enough to cool high-performance SSDs, so they already come equipped with large heat spreaders and will need active cooling.

“I would expect to see heatsinks for Gen5,” said Jean in the context of higher-end drives. “But eventually we will need to have a fan that’s pushing air right over the heatsink, too.”

Interestingly, makers of cheap SSDs tend to bundle plastic or nylon screws, essentially depriving the drives of a crucial cooling path. Phison recommends using proper screws made of aluminum, but sometimes bill-of-materials cost prevails over reliability and performance.

Different SSDs

But not all SSDs can accommodate active cooling or even an oversized heatsink. For example, laptops (which represent some 75% of PC sales) cannot. Therefore, SSD developers like Phison have to adopt different strategies.

One of them is to use a finer process technology. At the same transistor count, a 7-nm-class chip will consume less power and dissipate less heat than a similar chip made using a 16-nm-class node. A finer node also makes these chips smaller and, in some cases, cheaper to manufacture (TSMC’s N7 is significantly more expensive than TSMC’s N16, so the transition to a finer node does not necessarily mean lower manufacturing cost).

A smaller chip also means less space for physical interfaces, so designers tend to reduce the number of NAND channels in client drives, which helps with costs and reduces power consumption. After all, the high data rates that modern NAND devices support these days (we are talking about 1200 MT/s interfaces here, and with subsequent generations, these transfer rates will get higher) are enough to saturate a PCIe Gen4 x2 interface today. PC OEMs believe that is enough for mainstream systems. While Tom’s Hardware constantly agitates for the highest performance possible, we understand that many applications benefit from compact dimensions.
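A rough back-of-the-envelope check of that saturation claim can be sketched as follows. It assumes an 8-bit NAND bus per channel and compares a few hypothetical channel counts (the article does not say how many channels such controllers use), and it ignores command and protocol overhead on both the NAND and PCIe sides, so real-world numbers will be lower.

```python
# Raw NAND bus bandwidth vs. a PCIe Gen4 x2 link.
# Assumptions (not from the article): 8-bit NAND bus per channel,
# illustrative channel counts, no protocol overhead on either side.

PCIE_GEN4_GT_PER_S = 16.0               # raw transfer rate per lane (PCIe 4.0)
PCIE_ENCODING_EFFICIENCY = 128 / 130    # 128b/130b line encoding


def pcie_gen4_mb_per_s(lanes):
    """Link bandwidth in MB/s after line encoding, before packet overhead."""
    bits_per_s = lanes * PCIE_GEN4_GT_PER_S * 1e9 * PCIE_ENCODING_EFFICIENCY
    return bits_per_s / 8 / 1e6


def nand_mb_per_s(channels, mt_per_s, bus_width_bits=8):
    """Peak NAND interface bandwidth in MB/s summed over all channels."""
    return channels * mt_per_s * bus_width_bits / 8


if __name__ == "__main__":
    link = pcie_gen4_mb_per_s(lanes=2)
    for channels in (2, 4, 8):
        nand = nand_mb_per_s(channels, mt_per_s=1200)
        verdict = "saturates" if nand >= link else "does not saturate"
        print(f"{channels} channels: {nand:.0f} MB/s {verdict} "
              f"a {link:.0f} MB/s Gen4 x2 link")
```

Under these assumptions, four channels at 1200 MT/s offer about 4.8 GB/s of raw NAND bandwidth against roughly 3.9 GB/s for the Gen4 x2 link, while two channels at the same speed would fall short.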

Summary

As SSDs gain capacity and performance, their controllers also need to gain compute capabilities and complexity, which in many cases means an increase in power consumption. However, developers of SSD controllers, like Phison, are offsetting the growing need for compute capabilities by using finer process technologies and reducing the number of NAND channels and PCIe lanes.

While Phison expects mainstream SSDs to stay cool enough to fit into laptops, it envisions that high-performance SSDs will require sophisticated cooling systems with a fan.