Intel Resumes Shipping Xeon MCC Processors After Bug is Mitigated

Sapphire Rapids
(Image credit: Tom's Hardware)

We reported last week that Intel had confirmed it paused shipments of some of its fourth-gen Xeon Sapphire Rapids processors due to a newly discovered bug, but it hadn't set a specific date for shipments to resume. The company has now issued a statement to Tom's Hardware indicating that it has developed a firmware fix and has resumed shipments.

"Last week, we informed you of an issue on a subset of 4th Generation Intel Xeon Medium Core Count Processors (SPR-MCC) that could interrupt system operation under certain conditions. Out of an abundance of caution, we temporarily paused some SPR-MCC shipments while we thoroughly evaluated a firmware mitigation. We are now confident the firmware mitigation addresses the issue. We have resumed shipping all versions of SPR-MCC and are working with customers to deploy the firmware as needed." — Intel spokesperson to Tom's Hardware

Intel originally paused shipments of a subset of its oft-delayed fourth-gen Sapphire Rapids processor line after it discovered a bug that "could interrupt system operation under certain conditions," but the company still hasn't shared specifics of the now-mitigated issue.

Intel said that it decided to pause shipments of the impacted processors out of an abundance of caution while it worked on the firmware fix. Discovering new bugs in already-shipping processors certainly isn't uncommon, but it is uncommon for those types of bugs to lead to a halt in shipments, implying that this was more than a garden-variety erratum. Intel did say the firmware mitigation isn't expected to have any performance impacts, though.

Intel's fourth-gen Xeon has two types of underlying designs: the XCC package, which employs four compute tiles (dies) to create a single chip, and the MCC package, which uses a single monolithic die. As shown in the slides above, the MCC design is used for chips up to 32 cores, which are the source of high-volume sales for Intel, while the XCC variants are used for the halo chips between 36 and 60 cores.

The bug impacted only the models built on the MCC die. Intel hasn't confirmed how long it paused shipments, though unofficial reports indicate that the pause began in mid-June. Intel says it is now resuming shipments of the impacted models and is distributing its firmware fix to its partners, meaning that the company won't be required to replace any of the chips it has already shipped to customers. 

Paul Alcorn
Managing Editor: News and Emerging Tech

Paul Alcorn is the Managing Editor: News and Emerging Tech for Tom's Hardware US. He also writes news and reviews on CPUs, storage, and enterprise hardware.

  • rluker5
    Seems like a pretty responsible way to handle the problem: found, stopped from spreading, fixed, shipments resumed, and fix sent to owners of parts already shipped.

    It also seems like Intel would receive at least some negative press if they released more details that people could speculate on. While I would prefer to know more, I can understand that there may be reasons why Intel would not want to share more information.
  • thestryker
    Usually the details on stuff like this don't get publicized until the products are EOL and they release full documentation unless there's some sort of exploit that can hit it.

    The most surprising part of this to me is, as bit_user commented on the original article, that it's impacting the MCC, not the XCC, CPUs.

    On a complete tangent I'm still hoping they drop prices on the MCC die Xeon X SKUs. Not that the platform is affordable at all, but if the CPUs (at least the 12/16c) were closer in price to their desktop counterparts I think the market would open up more.
  • kjfatl
    I wouldn't be surprised if the test engineers at the factory kept this from shipping. There aren't a lot of these parts made and the next batch might be a month or more away. They might have wanted a few days to test their fix with a reasonable number of parts in order to make sure the fix really works.
  • bit_user
    thestryker said:
    The most surprising part of this to me is, as bit_user commented on the original article, that it's impacting the MCC, not the XCC, CPUs.
    Yeah, because (for anyone who doesn't know) the XCC version is the multi-die one and therefore presumably much more complex.

    thestryker said:
    On a complete tangent I'm still hoping they drop prices on the MCC die Xeon X SKUs. Not that the platform is affordable at all, but if the CPUs (at least the 12/16c) were closer in price to their desktop counterparts I think the market would open up more.
    I gave up my last hopes of them being remotely affordable, when it came out that they would use the same physical socket as the XCC version.

    That's okay, though. Since mainstream CPUs reversed the trend of cutting down on PCIe lanes, and now that they're available with up to 32 threads, I don't really need a full-fledged workstation anyhow.
  • bit_user
    the MCC design is used for chips up to 32 cores, which are the source of high-volume sales for Intel, while the XCC variants are used for the halo chips between 36 and 60 cores.
    That's funny. I'd have assumed the XCC is where most of their volume lies. Since hyperscalers seem to care about density as much as anything else, I'd guess they want the one with more cores and accelerators.
  • thestryker
    bit_user said:
    I gave up my last hopes of them being remotely affordable, when it came out that they would use the same physical socket as the XCC version.

    That's okay, though. Since mainstream CPUs reversed the trend of cutting down on PCIe lanes, and now that they're available with up to 32 threads, I don't really need a full-fledged workstation anyhow.
    Me too, but I was hoping the 2455X would be around $800. I'd still much rather have a proper PCIe lane setup with the extra memory bandwidth (without super clocking the DDR5, though it seems pretty much all 13th gen can do 7200 without much trouble). My current system has been in use since 2016 and the one it replaced was from 2008. For my desktop usage I could go with a regular desktop setup now and probably be okay with it, though I would have to buy a new 10Gb card. The HEDT platforms just had more longevity built into them, so I'd rather go that route if it wasn't for the cost being basically double now.

    The machine I use for 24/7 operation wouldn't have worked with a regular desktop setup if I wasn't going TrueNAS instead of Windows this time. Not enough lanes for full NAND SSD performance, but there were for P1600X drives. As it was, I still had to go buy some bifurcation cards since Intel client platforms (including W680) can't do anything other than x8 splits.