Phison: Enthusiast PCIe 5.0 SSDs Will Require Active Cooling

(Image credit: ElecGear)

Modern enthusiast-grade solid-state drives (SSDs) deliver unbeatable performance. Still, they tend to produce a significant amount of heat, which can hurt performance and harm data on NAND flash chips. Phison says that as drives get more advanced, they are getting hotter too, so expect high-end PCIe Gen 5 SSDs to require active cooling using a fan (perhaps, like the one pictured above). Meanwhile, the company says that it is working on reducing the heat of its controllers as applications like notebooks or compact desktops cannot accommodate large coolers. 

"There are lots of things that we are doing to keep the SSD power within a reasonable envelope," said Sebastien Jean, Chief Technical Officer at Phison, in a recent interview with MSI Insider and StorageReview. "But for sure, the SSDs are going to be hotter, in the same way that CPU and GPU got hotter in the 1990s. As we move to Gen5 and Gen6, we may need to consider active cooling."

Higher Capacity Leads to Complex Controllers

To gain capacity, 3D NAND flash memory either adds layers or increases the number of charge levels it can store in a single cell. However, increasing the number of layers typically reduces a cell's physical size, which fundamentally affects its ability to store charges reliably (somewhat mitigated by the adoption of new materials). Meanwhile, the transition from triple-level to quad-level cell architecture radically reduces the number of program/erase (P/E) cycles that a cell can endure. As both methods of increasing 3D NAND recording density coexist, the quality of the signals that those NAND cells produce ultimately degrades. Hence, SSD controllers must adopt more sophisticated error correction methods to process those signals.

Those sophisticated methods feature low-density parity-check (LDPC) code algorithms. These tend to be compute-intensive, so in recent years, SSD controllers have gained complexity and compute capabilities. Since we are not going anywhere without those algorithms, controllers will keep getting more capable and hotter. 
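To make the idea of a parity check concrete, here is a minimal sketch in Python. It is not an LDPC decoder: it uses the tiny (7,4) Hamming parity-check matrix purely as an illustration, and the function names are hypothetical. Real LDPC codes rely on far larger, sparse matrices and iterative decoding, which is where the compute cost comes from.

```python
# Toy parity-check demo. H is the (7,4) Hamming parity-check matrix, chosen
# only because it is small; it is an illustrative assumption, not what any
# SSD controller actually uses.
H = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]


def syndrome(word):
    """Return H * word (mod 2); all zeros means every parity check passes."""
    return [sum(h * b for h, b in zip(row, word)) % 2 for row in H]


def correct_single_error(word):
    """Fix at most one flipped bit; the syndrome encodes its 1-based position."""
    s = syndrome(word)
    position = s[0] + 2 * s[1] + 4 * s[2]
    fixed = list(word)
    if position:
        fixed[position - 1] ^= 1
    return fixed


codeword = [1, 1, 1, 0, 0, 0, 0]  # satisfies all three parity checks
noisy = list(codeword)
noisy[4] ^= 1                     # simulate one bit read back incorrectly
repaired = correct_single_error(noisy)
```

The noisier the signal coming from the NAND, the more checks fail and the more work the decoder has to do, which is the link between denser flash and hotter controllers described above.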

SSD controller developers like Phison adopt more advanced process technologies for their most capable controllers. Still, it looks like the pace of node adoption is somewhat slower than the pace at which complexity grows, which is why for now, SSD controllers for enthusiast-grade and enterprise/server applications keep gaining thermal design power (TDP).

Temperatures: 120ºC for Controller, But Only 70ºC for NAND ICs

Avid readers know that modern CPUs and GPUs can sustain rather extreme temperatures of about 100 degrees Celsius and even higher (albeit at the cost of silicon degradation: the chip still performs reliably within specifications but loses its overclocking headroom). The same is true for SSD controllers. Produced by contract manufacturers like TSMC, these chips can survive temperatures of up to 120 degrees Celsius, according to Phison.

But the problem is that at 120 degrees Celsius the controller heats the 3D NAND ICs, and these become much less reliable at temperatures of 75 degrees Celsius and above. So to prevent data loss, controllers usually start throttling at high temperatures, which hurts performance.

"3D NAND memory can handle from 0ºC (32ºF) to anywhere between 70ºC and 85ºC (158ºF to 185ºF), depending on the grade of the NAND," said Jean, "And as heat goes up, retention of data in NAND goes down. […] If most of your data was written really hot and you read it really cold, you have a huge cross-temp swing. The SSD can handle that, translating into more error corrections. So lower maximum throughput. The sweet spot for an SSD is between 25ºC and 50ºC (77ºF to 122ºF)."

Meanwhile, not all SSD controllers are complex and power-hungry; devices for mainstream SSDs have reasonable power consumption and temperatures. They fit into laptops and will keep fitting there for the foreseeable future. As a bonus, they are going to increase performance as well.

Cooling: Many Ways

SSDs tend to go into critical shutdown when they detect that the temperature of the NAND is above 80 degrees Celsius, so cooling is essential for these drives. An M.2 form-factor SSD has two natural ways of shedding heat: conduction (via the copper/gold contacts on the drive and the screw that fixes it in position) and convection (dissipating heat into the air). However, these are not enough to cool high-performance SSDs, so such drives are already equipped with large heat spreaders and will need active cooling.
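The temperature figures quoted in this article can be arranged into a simple decision ladder. The Python sketch below is only an illustration: the state names and the `thermal_state` function are hypothetical, and using 75 degrees Celsius as the throttling point is an assumption based on the reliability figure above, since actual firmware thresholds vary by controller and NAND grade.

```python
# Thresholds in degrees Celsius, taken from figures quoted in the article.
SWEET_SPOT_MAX_C = 50  # upper end of the 25-50 C "sweet spot"
THROTTLE_C = 75        # assumption: throttle where NAND reliability drops
SHUTDOWN_C = 80        # critical shutdown above this NAND temperature


def thermal_state(nand_temp_c):
    """Classify a NAND temperature reading into a hypothetical drive state."""
    if nand_temp_c > SHUTDOWN_C:
        return "shutdown"
    if nand_temp_c >= THROTTLE_C:
        return "throttle"
    if nand_temp_c > SWEET_SPOT_MAX_C:
        return "warm"
    return "normal"


# Classify a handful of sample readings.
states = {temp: thermal_state(temp) for temp in (40, 60, 75, 81)}
```

The narrow gap between the assumed throttling point and the shutdown point is why a heatsink, and eventually a fan, matters for drives that run their controllers hard.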

“I would expect to see heatsinks for Gen5,” said Jean in the context of higher-end drives. “But eventually we will need to have a fan that’s pushing air right over the heatsink, too.” 

Interestingly, makers of cheap SSDs tend to bundle plastic or nylon screws, essentially leaving the drives without a crucial way of cooling. Phison recommends using proper screws made of aluminum, but sometimes bill-of-materials cost prevails over reliability and performance.

Different SSDs

But not all SSDs can accommodate active cooling or even an oversized heatsink. For example, laptops (which represent some 75% of PC sales) cannot. Therefore, SSD developers like Phison have to adopt different strategies.

One of them is to use a finer process technology. At the same transistor count, a 7-nm-class chip will consume less power and dissipate less heat than a similar chip made on a 16-nm-class node. A finer node also makes these chips smaller and, in some cases, cheaper to manufacture (TSMC’s N7 is significantly more expensive than TSMC’s N16, so the transition to a finer node does not necessarily mean lower manufacturing cost).

A smaller chip also means less space for physical interfaces, so designers tend to reduce the number of NAND channels in client drives, which helps with costs and reduces power consumption. After all, the high data rates that modern NAND devices support these days (we are talking about 1200 MT/s interfaces here, and these transfer rates will get higher with subsequent generations) are enough to saturate a PCIe Gen4 x2 interface today. PC OEMs believe that is enough for mainstream systems. While Tom’s Hardware constantly agitates for the highest performance possible, we understand that many applications benefit from compact dimensions.
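A rough back-of-the-envelope calculation shows why a handful of fast NAND channels can fill a PCIe Gen4 x2 link. The Python sketch below assumes an 8-bit NAND channel (one byte per transfer) and ignores protocol overhead and the NAND array's own throughput limits; the function names are hypothetical. The 16 GT/s lane rate and 128b/130b encoding are the standard PCIe 4.0 figures.

```python
import math

NAND_MT_PER_S = 1200        # interface speed quoted in the article
BYTES_PER_TRANSFER = 1      # assumption: 8-bit-wide NAND channel
PCIE4_GT_PER_S = 16         # PCIe 4.0 signaling rate per lane
PCIE_ENCODING = 128 / 130   # 128b/130b line encoding efficiency


def nand_bandwidth_mb_s(channels):
    """Peak NAND interface bandwidth in MB/s across the given channels."""
    return channels * NAND_MT_PER_S * BYTES_PER_TRANSFER


def pcie_bandwidth_mb_s(lanes):
    """Peak PCIe 4.0 bandwidth in MB/s, before protocol overhead."""
    return lanes * PCIE4_GT_PER_S * 1000 * PCIE_ENCODING / 8


def min_channels_to_saturate(lanes):
    """Smallest channel count whose peak bandwidth meets the PCIe link."""
    return math.ceil(pcie_bandwidth_mb_s(lanes) / nand_bandwidth_mb_s(1))


x2_link = pcie_bandwidth_mb_s(2)           # about 3938 MB/s
channels_needed = min_channels_to_saturate(2)
```

Under these assumptions, four channels at 1200 MT/s offer 4800 MB/s of peak interface bandwidth, which exceeds the roughly 3938 MB/s that a Gen4 x2 link can carry.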

Summary

As SSDs gain capacity and performance, their controllers also need to gain compute capabilities and complexity, which in many cases means an increase in power consumption. However, developers of SSD controllers like Phison are offsetting the growing need for compute capabilities by using finer process technologies and reducing the number of NAND channels and PCIe lanes.

While Phison expects mainstream SSDs to stay cool enough to fit into laptops, it envisions that high-performance SSDs will require sophisticated cooling systems with a fan.

Anton Shilov
Contributing Writer

Anton Shilov is a contributing writer at Tom’s Hardware. Over the past couple of decades, he has covered everything from CPUs and GPUs to supercomputers and from modern process technologies and latest fab tools to high-tech industry trends.

  • Phaaze88
    Admin said:
    Active cooling will be required for high-performance PCIe 5.0 SSDs, as other drives will get hotter.
    This is supposed to say the NAND ICs, right?
  • hotaru251
    still waiting for immersion cooling system to advance....thats about at point we are heading.
  • InvalidError
    The other obvious option: wait for the 2nd or 3rd-generation PCIe 5.0 controllers that will likely be far more power-efficient and run that much cooler.

    Right now, there is almost no meaningful difference between a good SATA SSD and the fastest NVMes currently available for most everyday uses besides massive file copying, especially when you have enough RAM to keep working data cached, so there is no need to rush to replace your current 3.0x4 or 4.0x4 NVMe SSD.
  • Geef
    A fan that small will most definitely make noise you don't want. Just having the extra metal from the cooler without a fan in the middle of the case would work well. Most cases have good airflow anyway.
  • InvalidError
    Geef said:
    A fan that small will most definitely make noise you don't want. Just having the extra metal from the cooler without a fan in the middle of the case would work well. Most cases have good airflow anyway.
    The 5.0x4 slot would be either under the GPU HSF or squeezed between the CPU HSF and GPU backplate due to trace length limitations at least without PCIe 5.0 re-timers, two of the worst possible locations for airflow no matter how good the case ventilation may be overall.
  • randomizer
    InvalidError said:
    The 5.0x4 slot would be either under the GPU HSF or squeezed between the CPU HSF and GPU backplate due to trace length limitations at least without PCIe 5.0 re-timers, two of the worst possible locations for airflow no matter how good the case ventilation may be overall.

    And not very good locations for trying to fit active cooling either.
  • So this is a completely stupid and unusable product. Got it.
  • watzupken
    At this point,
    Higher CPU power requirements - Checked
    Higher MOBO power requirements - Checked
    Higher GPU power requirements - Checked
    High SSD power requirements - Checked
    Potentially higher DDR5 power requirements as we start pushing clockspeed upwards. In summary, the system is getting quite substantially more power hungry and hot. The future of PCs seems to be trending towards an ”eco-unfriendly” space heater.
  • InvalidError
    watzupken said:
    Potentially higher DDR5 power requirements as we start pushing clockspeed upwards. In summary, the system is getting quite substantially more power hungry and hot. The future of PCs seems to be trending towards an ”eco-unfriendly” space heater.
    The bleeding-edge has been a power hog ever since parts capable of blowing total system power past 500W have become available. 500W GPUs aren't exactly new, they just used to be SLI/CF-on-a-card type monstrosities instead of single GPUs.

    If you go the other way, as in looking into the comfortably viable average office, point-of-sale and other everyday use computers, many of those are being replaced by smartphones, tablets, handhelds, laptops and other low-power devices. Heaps of everyday compute is actually getting far more power-efficient.

    One thing to make sure to keep in mind is that if you need 30% more power for 100% more performance, you still have 50% better overall performance per watt, which means efficiency is technically still improving nicely.
  • jp7189
    Wait.. the m.2 screw is a significant contributor to heat dissipation?! The controller is all the way at the other end and even the NAND packages aren't close enough to benefit as far as I can imagine.

    What am I failing to grasp here?