Gfail: Gmail Goes Down for Nearly 2 Hours

Google yesterday evening took to its Gmail Blog to discuss what the company called a "Big Deal." Ben Treynor, VP Engineering and Site Reliability Czar, attributed the outage to a miscalculation of capacity. He wrote that the company had already thoroughly investigated what happened and was compiling a list of things it intended to fix or improve as a result.

It's nice to know the team has things under control, but what actually went wrong? Treynor says that yesterday morning the team took a small fraction of Gmail's servers offline to perform routine upgrades, a procedure that normally goes off without a hitch. This time, he explained, the team underestimated the load that some recent changes (ironically, some designed to improve service availability) placed on the request routers. At approximately 12:30, a few of the request routers became overloaded, and their load was transferred to the remaining request routers. That pushed more of them into overload, and the failure cascaded until they were all down.
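To see why a handful of overloaded routers can take down the rest, here is a rough toy model of that kind of cascade. It is only a sketch under simplifying assumptions and not a description of Google's actual system: the function name `simulate_cascade`, the even split of load across surviving routers, and all capacity and load figures are made up for illustration.

```python
def simulate_cascade(capacities, total_load):
    """Toy model: load is split evenly across live routers (an assumption).

    Any router whose share exceeds its capacity fails, and its load is
    redistributed to the survivors in the next round.
    Returns (failures_per_round, routers_still_alive).
    """
    alive = list(capacities)
    failures_per_round = []
    while alive:
        share = total_load / len(alive)
        survivors = [c for c in alive if c >= share]
        failed = len(alive) - len(survivors)
        if failed == 0:
            break  # every remaining router can carry its share
        failures_per_round.append(failed)
        alive = survivors
    return failures_per_round, len(alive)


# Hypothetical capacities for ten routers, in arbitrary units.
capacities = [100, 100, 120, 120, 140, 140, 170, 170, 250, 250]

print(simulate_cascade(capacities, 1000))  # load just fits: no failures
print(simulate_cascade(capacities, 1010))  # 1% more load: total collapse
```

In this made-up scenario, a 1% increase in load is enough to knock out the two weakest routers, and each round of redistribution then overloads the next tier until none are left. That is the general shape of the failure Treynor describes, and it suggests why extra router capacity is among the fixes Google lists.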

Steps Google is taking to ensure the same thing doesn’t happen again include increasing request router capacity, and figuring out a way to make sure problems in datacenter A don't affect datacenter B.

How many of you were unable to access your Gmail yesterday? Let us know in the comments below how you dealt with the outage!

37 comments
    Top Comments
  • It didn't affect me. I would like to add that I hardly ever have service issues with GMail. Kudos to them for addressing the problem and being open about it, and for excellent FREE service.
    15
  • It doesn't make it a "Gfail" - sheesh.
    13
    Other Comments
  • Only the web interface was down. Gmail worked fine via POP throughout the "outage."
    5
  • It affected me as far as accessing the web site but POP worked fine. Google still rocks for getting it back up so fast. Keep doing your thing Google!
    3