Behind the Tech: Sandy Bridge Recall, An Insider's Story

The Timeline

When Intel announced its stop-shipment, the implication was that the company had just concluded its investigation. Everyone assumed that Intel was preemptively pulling chipsets back from the channel before a real problem occurred. However, according to our friends in the motherboard business, this turned out to be only one part of the story.

Out of seven vendors surveyed, we learned that there was at least one company outside of Intel that knew about the SATA 3Gb/s PLL clocking tree error prior to January 31. Intel acknowledged that one of its OEM partners discovered the problem in November when reference designs were being built. Much of this is sensitive information, and it is possible that other companies were part of the discovery process. However, we are going to follow the story of how the problem was discovered at one specific company.

In early November, a team of engineers exposed some odd behavior as they were making the reference design for the B1 version of the H67/P67 chipsets. At that stage, there was an internal debate within the company before it approached Intel with the problem, because it was unsure if the fault was with its motherboard design. After the Pentium FDIV bug and Rambus bug, the company thought it was unlikely that Intel could have let this issue through. Eventually, the manufacturer brushed aside the strange behavior as an idiopathic aberration of early designs. However, Intel was notified.

Sometime in late November or early December, motherboard manufacturers started receiving the B2 version of the chipset. This is when the company’s R&D engineers discovered that the strange leakage behavior persisted through validation. As a result, the discussion within that company grew more urgent. According to our source, this is the point at which Intel began to take the problem more seriously.

We don’t have an exact date as to when Intel produced a solution (the B3 revision) to the problem, but we do know that the bug was identified in late December or early January. It takes weeks to redo the R&D, retest it, send the design to the fab, produce the chipset, package it, and test it before it can go into final production. This type of process requires a two- to three-month lead time, which resulted in the product being shipped during the third week of February.

Intel was able to reduce the lead time for one reason: the problem was resolved at the final metal layer. Therefore, there was no need to completely retool the chipset fab process. In addition, this was how the company avoided a complete halt of chipset production. The only thing we don't know is whether Intel knew about the solution prior to the Sandy Bridge launch on Jan. 3. From what we have been told, it seems unlikely. If it did, a three-month launch delay wouldn't have been a terrible price to pay. On the other hand, some of our sources believe Intel was attempting to capitalize on the hype during CES and then put a long-term warranty process in place. This part of the timetable is a bit unclear, but spending a billion dollars on hype is certainly unlikely.

Intel made a deliberate decision to announce its stop-shipment when it did. For those unaware, the notification went out the day before many companies in Taiwan and China started celebrating the Chinese Lunar New Year. As a result, most companies were caught completely off guard. The majority of our motherboard contacts believe this was a strategic move on Intel's part to reduce the amount of "organized outrage from Taiwan," as one person said. For many people, this is the only time during the year when they get a vacation, so there was a high level of frustration.

For example, one top marketing official who was on vacation outside of Taiwan received a personal holiday email greeting from the VP. Three hours later, he received another email from the same VP apologizing and asking him to come back to the Taiwan office. Some companies recalled large parts of their marketing and engineering staff to begin working on tools that would help customers identify which ports their drives were connected to. Gigabyte was the first to release its 6-Series SATA check utility, and MSI followed with its SATA Scan.

For many manufacturers, the stop-shipment was a nightmare because management had to recall vacationing (and consequently unavailable) staff members. Some vendors were better off than others. A few small motherboard companies had just started to ship H67/P67 motherboards. The first batch of motherboards was literally sitting on a boat waiting to sail. In that specific example, there was no need to issue a customer-wide recall notice.

  • Assmar
    Lol, that neighborhood watch picture still creeps me out to this day.
  • acku
    9512301 said:
    Lol, that neighborhood watch picture still creeps me out to this day.

    Well that's why you shouldn't talk to strangers. :)
  • dogman_1234
    So Intel noticed the problem before hand and did nothing until it became apparent? I am not one to criticize Intel, but if any error is noticed, i would warn of it immediately.
  • ferelden
    Sandybridge performance is just too beastly for this to affect it long term, most people who have just come into the cpu market to upgrade didn't even realize there was a recall in the past and just got a b3 p67 and i5-2500k without any hesitation
  • Stardude82
    Knowing the problem, I'd totally buy a B2 board at a serious discount. Most PC's only have 2 drives anyway and SATA add-on cards are cheap.
  • f-14
    i didn't see any problem for sales, people just bought core i-7 9XX's
    AMD could crow all they wanted, they didn't have a new product going to the shelf capable of beating the previous i-series. now had AMD released llano during this, they would have made a pre-emeptive death knell strike against sandybridge, but they didn't so it didn't matter what happened until AMD does.
  • f-14
    oh and also the problem could have been more easily rectified by intel giving out coupons for sata3 devices priced to match sata2 devices and pull it off as a promo that would have sold them millions more in profits.
    as for the people who already bought a 'free' sata 2 device to sata3 replacement device coupon would have been cheaper also.
    Intel just blew a huge marketing opprotunity to grind AMD under the imperial capitalist yankee boot.
    IMO
  • joytech22
    All I can say after reading this is that Intel did a great job with this issue.

    But it somehow still brings me back to how AMD is still spending money on a product we haven't even seen benchmarks about yet.

    Somebody who uses an OEM machine somewhere around the world MUST have released a benchmark of Llano or BD at some point in time.
  • rolli59
    It is just business! I wonder if the speculations on the time line is true but most definitely the OEM's had discovered problems well before release.
  • emergancy exit
    or people are just brand loyal to the point of insanity. and intel is cashing in and and keeping their prices high.