Igor's Lab reports that AMD's new Ryzen 7000 burnout firmware fix, AGESA ComboAM5 patch 1.0.0.7, addresses further bugs in Ryzen 7000's temperature control system. The new combo patch sets a forced SoC voltage limit of 1.3 V and includes new SMU bug fixes to ensure the processors do not exceed their temperature specifications.
AMD doesn't give us specific details on the issue, but apparently, there were optimization issues (in previous AGESA microcode updates) surrounding the temperature control system inside Ryzen 7000's SMU, causing the chip to operate incorrectly when hitting TJmax (i.e., its temperature ceiling). We don't know the extent of the potential damage this has caused, but there's a high likelihood this SMU bug was causing the Ryzen 7000 burnouts in conjunction with unsafe SoC voltages.
The issue relates specifically to two of Ryzen 7000's CBS SMU_COMMON settings: 'PROCHOT Control' and 'PROCHOT Deassertion Ramp Time'. The former is a thermal safety mechanism that protects the CPU from overheating once it reaches its target thermal limit: the SMU issues a PROCHOT signal that reduces CPU power and frequency to stay below the thermal threshold and prevent possible damage.
The latter is the inverse of the former mechanism that allows the CPU to boost power and frequency again when the thermal limit is not being hit (and the PROCHOT signal is inactive). This system is time-based, so the CPU can gradually increase power and clock speeds when there's temperature headroom to spare. This is an essential function of the SMU so that CPU clock speeds aren't bouncing all over the place due to temperature fluctuations, which would cause inconsistent performance.
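To make the interaction between the two settings concrete, here is a minimal sketch of a PROCHOT-style throttle with a gradual deassertion ramp. It is a toy model, not AMD's SMU logic: the names `ThrottleState`, `step`, and `simulate` are hypothetical, and the temperature limit, power limits, and ramp size are illustrative assumptions rather than values from the patch notes.

```python
from dataclasses import dataclass


@dataclass
class ThrottleState:
    power_limit_w: float   # current power limit applied to the CPU
    prochot: bool = False  # True while the PROCHOT signal is asserted


def step(state, temp_c, *, tjmax_c=95.0, max_power_w=142.0,
         min_power_w=45.0, ramp_w_per_step=10.0):
    """Advance the toy controller by one time step (all numbers are assumptions)."""
    if temp_c >= tjmax_c:
        # Thermal limit reached: assert PROCHOT and cut power immediately.
        return ThrottleState(power_limit_w=min_power_w, prochot=True)
    # Below the limit: deassert PROCHOT and restore power gradually,
    # one ramp increment per step, instead of jumping back to full power.
    restored = min(max_power_w, state.power_limit_w + ramp_w_per_step)
    return ThrottleState(power_limit_w=restored, prochot=False)


def simulate(temps_c):
    """Return the power limit after each temperature sample."""
    state = ThrottleState(power_limit_w=142.0)
    history = []
    for temp_c in temps_c:
        state = step(state, temp_c)
        history.append(state.power_limit_w)
    return history


if __name__ == "__main__":
    print(simulate([90, 96, 90, 90]))
```

In this model, one sample at or above the limit drops the power limit at once, while recovery takes several steps. That slow recovery is the behavior the ramp time is meant to provide, so clocks do not oscillate with every temperature fluctuation.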
According to the AMD patch notes in Igor's report, both mechanisms had no effect on Ryzen 7000 CPUs with previous AGESA code updates. We don't know what this means exactly, but it seems like the SMU was allowing the CPU to go past TJmax at least a little bit and causing performance issues with the CPU below TJmax in some way.
Again, we don't know how extensive this issue has become. Nonetheless, it's a big enough issue for us to highly recommend that all Ryzen 7000 users upgrade their motherboard BIOS/UEFI to a version with AMD's AGESA ComboAM5 patch 1.0.0.7 as soon as possible. The firmware update also provides many other bug fixes, including improved boot times, deep sleep fixes, curve optimizer fixes, and a plethora of DDR5 memory bug fixes.
- By following the mitigation from AMD, fix CPU SoC voltage upper limit for Ryzen 7000X3D and non-X3D series CPU, which might affect the performance of certain EXPO memory modules.
- Support 48/24GB high density DDR5 memory module.
Will update when I get home. The BIOS I was running when my CPU and board died (and am running now with EXPO disabled) has been removed from that page.
My RMA was approved yesterday for the dead parts.
I dunno... I was just pointing out that, after reading yesterday that BIOS updates were coming to fix the issues that are blowing up PCs (including mine), there's an update for my board today, and Ryzen 7000 users might want to check theirs.
The link at the bottom of the page from the OP leads to https://www.tomshardware.com/news/preventing-ryzen-burnout-motherboard-makers-issue-new-firmware with the headline "Most of the major AM5 motherboard vendors have new BIOS downloads ready to limit SoC voltages to safe levels" which is why I posted.