Wow - EQ 2 Servers Still Down?

Archived from groups: alt.games.everquest (More info?)

What's it been - 24 hours? Any eta on this mess?

Lou
  1. Archived from groups: alt.games.everquest (More info?)

    Thomas T. Veldhouse wrote:
    > Meaffwin <suka_@cox.net> wrote:
    > >
    > > /Delurk
    > >
    > > You admit you don't know much, but then you call them idiots because
    > > you assume they can't go back to the pre-patch setup? That's pretty
    > > cute.
    > >
    > > /Relurk
    > >
    >
    > I don't know much as far as details of what happened. What is
    > absolutely clear is that they don't have a backout strategy, or they
    > would have used it. Is that clear enough?

    So you think bringing up the servers, even though they don't know where
    the bug was that trashed everything, into a previous state (that
    possibly includes the bug) is a good idea. That way we can go through
    this all over again. Gotcha.
  2. Archived from groups: alt.games.everquest (More info?)

    On Sat, 18 Dec 2004 16:11:02 GMT, "Lou Vincze" <biglou@ix.netcom.com>
    wrote:

    >What's it been - 24 hours? Any eta on this mess?
    >
    >Lou
    >

    From EQ2 chat;

    GMGrog: ... back. I'll explain what I know about what happened for
    those that haven't heard.

    Manev: Norpan, there is no ETA about the servers coming up. My
    apologies for the inconvenience.

    GMGrog: (It's 8:24am PST now for time reference to your own time
    zone.) Yesterday at 7:00am servers went down briefly, the usual
    morning reset. They came back up at 7:30am and folks logged in...

    GMGrog: Players logging in or after zoning a few times found their
    quest journal was empty. (Not everyone but very many.)

    GMGrog: The recipe books were empty, spells were missing like Call of
    Qeynos/Overlord or Orc Master Strike.

    GMGrog: They brought the servers back down at 10:00am. They'd been up
    for 2.5 hours.

    Manev: Thanks for the information Duvaries. Our development team is
    working on it.

    GMGrog: Devs, ops, programmers, all were working on the problem by
    then already.

    GMGrog: The servers have been down since. They've made progress,
    servers are up internally now being tested. They still don't have a
    hard time for when the servers will be back up.

    GMGrog: They rolled everything back to 7:00am PST before the servers
    reset. That means anyone that was online, the things that happened
    during the 2.5 hour up time are gone as if they didn't happen.

    GMGrog: They know how frustrating this very long wait is. They'll be
    wiping all exp debt and giving free play time. They haven't explained
    that "free time" in detail yet, but will once everything is back on
    track.

    GMGrog: That's pretty much all I know.

    GMGrog: (cut & pasting that to save it because I keep typing all that
    out every 30 mins or so...)
  3. Archived from groups: alt.games.everquest (More info?)

    Meaffwin <suka_@cox.net> wrote:
    >
    > /Delurk
    >
    > You admit you don't know much, but then you call them idiots because
    > you assume they can't go back to the pre-patch setup? That's pretty
    > cute.
    >
    > /Relurk
    >

    I don't know much as far as details of what happened. What is
    absolutely clear is that they don't have a backout strategy, or they
    would have used it. Is that clear enough?

    --
    Thomas T. Veldhouse
    Key Fingerprint: 2DB9 813F F510 82C2 E1AE 34D0 D69D 1EDC D5EC AED1
    Spammers please contact me at renegade@veldy.net.
  4. Archived from groups: alt.games.everquest (More info?)

    On 18 Dec 2004 17:11:59 GMT, "Thomas T. Veldhouse" <veldy71@yahoo.com>
    wrote:

    >Meaffwin <suka_@cox.net> wrote:
    >>
    >> /Delurk
    >>
    >> You admit you don't know much, but then you call them idiots because
    >> you assume they can't go back to the pre-patch setup? That's pretty
    >> cute.
    >>
    >> /Relurk
    >>
    >
    >I don't know much as far as details of what happened. What is
    >absolutely clear is that they don't have a backout strategy, or they
    >would have used it. Is that clear enough?

    I don't think the delay is rolling back, I think the delay is finding
    which obscure line of code caused things to go to hell and back.

    And, you know, any IT guy can tell you that even the best "back out
    strategy" can fail.

    --
    Dark Tyger

    Sympathy for the retailer:
    http://www.actsofgord.com/index.html
    "Door's to your left" -Gord
    (I have no association with this site. Just thought it was funny as hell)

    Protect free speech: http://stopfcc.com/
  5. Archived from groups: alt.games.everquest (More info?)

    Dark Tyger wrote:
    > On 18 Dec 2004 17:11:59 GMT, "Thomas T. Veldhouse" <veldy71@yahoo.com>
    > wrote:
    >
    >
    >>Meaffwin <suka_@cox.net> wrote:
    >>
    >>>/Delurk
    >>>
    >>>You admit you don't know much, but then you call them idiots because
    >>>you assume they can't go back to the pre-patch setup? That's pretty
    >>>cute.
    >>>
    >>>/Relurk
    >>>
    >>
    >>I don't know much as far as details of what happened. What is
    >>absolutely clear is that they don't have a backout strategy, or they
    >>would have used it. Is that clear enough?
    >
    >
    > I don't think the delay is rolling back, I think the delay is finding
    > which obscure line of code caused things to go to hell and back.
    >
    > And, you know, any IT guy can tell you that even the best "back out
    > strategy" can fail.
    >
    I gather from what I have read that they just rebooted the machines and
    they went fubar in the next few hours. If that's the case they are
    tracking down an active bug that is very very serious, that just
    suddenly reared its head.
  6. Archived from groups: alt.games.everquest (More info?)

    "Dark Tyger" <darktiger@somewhere.net> wrote in message
    news:icp8s0db047rumbbv5kddgm7nfha7s6m3d@4ax.com...
    > On 18 Dec 2004 17:11:59 GMT, "Thomas T. Veldhouse" <veldy71@yahoo.com>
    > wrote:
    >>
    >>I don't know much as far as details of what happened. What is
    >>absolutely clear is that they don't have a backout strategy, or they
    >>would have used it. Is that clear enough?
    >
    > I don't think the delay is rolling back, I think the delay is finding
    > which obscure line of code caused things to go to hell and back.
    >
    > And, you know, any IT guy can tell you that even the best "back out
    > strategy" can fail.
    >
    > --
    > Dark Tyger

    Working for one of the 'big boys' in IT myself, I can definitely confirm
    that as truth. And I also know from personal experience that when a down
    system event occurs, the source of the outage or offending line of code is
    not always easily found.


  7. Archived from groups: alt.games.everquest (More info?)

    In article <41c464df$0$200$8046368a@newsreader.iphouse.net>,
    "Thomas T. Veldhouse" <veldy71@yahoo.com> wrote:
    > I don't know much as far as details of what happened. What is
    > absolutely clear is that they don't have a backout strategy, or they
    > would have used it. Is that clear enough?

    More likely is that they have a backout strategy, but this was not a
    situation in which that helps. For example, here is one possible
    sequence of events that would give results like we are seeing:

    1. They roll out a patch. People play for a couple hours and serious
    problems are found (in this case, quests disappearing).

    2. They roll back the patch. That's not sufficient, though...all that
    will do is prevent further people from losing quests and whatever else
    was zapped for the people who played.

    3. They roll back the database to restore the deleted quests. Perhaps
    this is not really a roll back, but something more time consuming, such
    as restoring the database to an alternate DB server, and then
    selectively updating the character tables on the live database from the
    restored alternate, to try to restore the quests WITHOUT rolling back XP
    or items that people acquired while the bad patch was live.

    4. They test with the restored data and the pre-patched servers, and
    find out the problem is STILL there--quests are disappearing. Upon
    investigating further, they find that the problem was not due to the
    patch at all, but rather corruption in the database--it was just a
    horrible coincidence that this showed up right after a patch (anyone who
    works in software can tell you dozens of stories of coincidences like
    that). So, they have to do a full rollback on the database (or, rather,
    restore from backup, after reinitializing the database).

    5. However, before that, they have to run a thorough check of the
    hardware to make sure it wasn't bad hardware that corrupted the DB.
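
The selective restore described in step 3 can be sketched roughly as follows. This is only an illustration under stated assumptions: the record layout, the field names (`xp`, `items`, `quests`) and the function `restore_quests` are hypothetical and say nothing about how SOE actually stores characters. The idea is to copy just the quest journal from the restored backup while leaving XP and items as they are on the live side.

```python
import copy

def restore_quests(live, backup):
    """Return a repaired copy of `live` in which each character's quest
    journal comes from `backup`; XP and items stay as they are on live.
    Both arguments are hypothetical {name: record} dictionaries."""
    repaired = copy.deepcopy(live)
    for name, record in repaired.items():
        saved = backup.get(name)
        if saved is not None:
            # Only the damaged field is overwritten.
            record["quests"] = list(saved["quests"])
    return repaired

# Made-up data: the backup predates the bad uptime, the live record lost
# its quests but gained XP and an item while the servers were up.
backup = {"Testchar": {"xp": 1000, "items": ["axe"], "quests": ["Orc Hunt"]}}
live = {"Testchar": {"xp": 1250, "items": ["axe", "helm"], "quests": []}}

repaired = restore_quests(live, backup)
print(repaired["Testchar"])
```

A merge like this is slower and riskier than a plain rollback, which may be one reason a full restore to the 7:00am state was chosen in the end.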

    --
    --Tim Smith

  8. Archived from groups: alt.games.everquest (More info?)

    On 19 Dec 2004 15:53:58 GMT, johndoe@example.com wrote:

    >I'm a software guy, but I've seen
    >hardware guys at work, and usually they can identify issues pretty
    >quickly using self-diagnostics and such (assuming you're using "real"
    >hardware and have a "real" support contract).

    Key word here: -USUALLY-.

    --
    Dark Tyger

    Sympathy for the retailer:
    http://www.actsofgord.com/index.html
    "Door's to your left" -Gord
    (I have no association with this site. Just thought it was funny as hell)

    Protect free speech: http://stopfcc.com/
  9. Archived from groups: alt.games.everquest (More info?)

    johndoe@example.com wrote:
    ] What else could happen that would require a team of hardware engineers
    ] 40 hours to figure out the problem? I'm a software guy, but I've seen
    ] hardware guys at work, and usually they can identify issues pretty
    ] quickly using self-diagnostics and such (assuming you're using "real"
    ] hardware and have a "real" support contract).

    It depends... I've seen Cray hardware engineers spend a day or two
    trying to figure out why a Cray YMP-2 was acting the way it was. So
    it's not always obvious.

    JimP.
    --
    http://www.linuxgazette.net/ Linux Gazette
    http://blue7green.drivein-jim.net/ December 4, 2004
    http://www.drivein-jim.net/ October 24, 2004:
    http://crestar.drivein-jim.net/new.html Dec 5, 2004 AD&D
  10. Archived from groups: alt.games.everquest (More info?)

    In article <41c5a416$0$95329$a1866201@visi.com>, johndoe@example.com
    wrote:
    > As a professional computer geek, I'm terribly curious what went wrong.
    > I'm under the impression EQ1 uses a farm of Linux machines on the
    > server side, and assumed SOE would do the same for EQ2. If that's
    > true, it's probably not the servers themselves that had the issue, at
    > least I wouldn't think so, since they could be easily swapped. I
    > don't recall where I heard the rumor SOE uses a farm of Linux machines
    > to host EQ1, though, so I may be remembering wrong, and even if it's
    > true, who knows what they used for EQ2...

    I doubt they are using Linux for EQ1. EQ1 came out early enough that
    Linux was probably not even considered. Also, they are known to like
    Oracle for databases at SOE, and I am not sure Oracle was available for
    Linux when EQ1 came out.

    Linux certainly can be used for MMORPGs. DAoC is on Dell servers
    running a version of Red Hat customized by Mythic, and using MySQL for
    their database. But DAoC came out a couple years after EQ1, when Linux
    had advanced quite a bit.

    --
    --Tim Smith
  11. Archived from groups: alt.games.everquest (More info?)

    On 19 Dec 2004 15:53:58 GMT, johndoe@example.com wrote:

    >What else could happen that would require a team of hardware engineers
    >40 hours to figure out the problem? I'm a software guy, but I've seen
    >hardware guys at work, and usually they can identify issues pretty
    >quickly using self-diagnostics and such (assuming you're using "real"
    >hardware and have a "real" support contract).

    I think what you need to also consider is the "Try
    this..test...nope..." cycle. For some systems, running through the
    checklist of fixes takes a few seconds to a few minutes each. Other
    times, it takes hours.

    A program I worked on would sometimes manifest bugs only under extreme
    -- but real world -- conditions. It was a financial simulator, and
    clients would set up runs for a weekend and then come back on Monday.
    If what they came back to was an error screen, I heard about it. Big
    time. In-house testing usually used ten minute to 1 hour runs; for
    obvious reasons, testing EVERY feature in a three day run was not
    practical. We'd set off one "big run" just to be sure, but since there
    were thousands of options, we couldn't test every combination of
    settings to be 100% sure one particular blend -- often with only one
    type of data -- wasn't going to crash it.

    If it takes even a half hour to apply a patch, boot, test, and try
    again, even a smallish checklist of "stuff to try" can take a long
    time. THEN, you have to apply it on every server, and check them all.
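
To put rough numbers on that cycle, here is a back-of-the-envelope sketch. Only the half hour per attempt comes from the post; the 20 candidate fixes, the 30 servers, the half hour per server, and the assumption that servers are patched one at a time are all made up for illustration.

```python
def total_hours(fixes_to_try, hours_per_cycle, servers, hours_per_server):
    """Estimate downtime: time spent on the try/test/nope loop plus the
    time to roll the final fix out to every server, one after another."""
    diagnosis = fixes_to_try * hours_per_cycle
    rollout = servers * hours_per_server
    return diagnosis + rollout

# Hypothetical figures: 20 things to try at half an hour each, then
# 30 servers patched and checked at half an hour each.
estimate = total_hours(fixes_to_try=20, hours_per_cycle=0.5,
                       servers=30, hours_per_server=0.5)
print(estimate)  # 25.0
```

Even with these modest made-up inputs the total passes a full day, before counting any time spent working out what to try.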

    This isn't "Windows crashed, reboot, what's the big deal?" stuff here.
    This is uber-complicated. (I know YOU probably know this, but a lot of
    people with no experience in complex networked systems think their
    methods of dealing with a single-system glitch scale effortlessly to a
    server farm running an astoundingly complex program.)

    And bugs aren't always evident. A long time ago, I was a 4D programmer
    with Peat Marwick. One of our divisions constantly lost data with the
    program I was working on. I eventually tracked it down to the fact
    they had entered "Aetna insurance" with the "AE" ligature character,
    instead of "A" "E". 4D's indexing for text couldn't handle high-ASCII
    in a field. Kaboom. This never showed in any test cases, because we
    never used high ASCII in names.
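
A small sketch of how that kind of bug hides from test data. The function `ascii_index_key` is a hypothetical stand-in for an index that only understands 7-bit ASCII; it is not how 4D works internally, it just shows the same failure pattern.

```python
def ascii_index_key(name):
    """Build a lookup key for an index that only accepts 7-bit ASCII.
    Hypothetical stand-in, not 4D's real indexing code."""
    return name.encode("ascii").lower()

# Typical in-house test data: plain ASCII names, so every key builds fine.
test_names = ["Aetna insurance", "Peat Marwick"]
keys = [ascii_index_key(n) for n in test_names]

# Real-world input: the same company name typed with the AE ligature.
try:
    ascii_index_key("\u00c6tna insurance")
    failed = False
except UnicodeEncodeError:
    failed = True

print(keys)
print("ligature input failed:", failed)
```

Every test case passes, and the first customer who types the ligature breaks it, which is the pattern described above.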

    People who call themselves "programmers" because they took a Visual
    Basic class during summer school very rarely have any grasp of what
    coding is like in the real world, and have ridiculous expectations of
    both the predictability of bugs and the difficulty of fixing them in a
    timely manner.
    *----------------------------------------------------*
    Evolution doesn't take prisoners:Lizard
    "I've heard of this thing men call 'empathy', but I've never
    once been afflicted with it, thanks the Gods." Bruno The Bandit
    http://www.mrlizard.com
  12. Archived from groups: alt.games.everquest (More info?)

    Graeme Faelban <RichardRapier@netscape.net> wrote:
    >
    > It actually sounded like they ran into a hardware issue. They
    > specifically mentioned working with their vendors to address some issues.
    >

    Yes they did mention that. However, I am not convinced that was the
    crux of the problem. First, if it was a routing issue, or a central
    database issue, then all the servers would have come up at the same
    time. However, they brought up machines one at a time over many many
    hours, so it appears that there was an issue with each of the servers.
    I HIGHLY doubt they had a hardware issue on each and every machine. If
    they did, then there are some very odd circumstances leading up to this
    indeed.

    --
    Thomas T. Veldhouse
    Key Fingerprint: 2DB9 813F F510 82C2 E1AE 34D0 D69D 1EDC D5EC AED1
    Spammers please contact me at renegade@veldy.net.
  13. Archived from groups: alt.games.everquest (More info?)

    Thomas T. Veldhouse <veldy71@yahoo.com> wrote:
    > Graeme Faelban <RichardRapier@netscape.net> wrote:
    >>
    >> It actually sounded like they ran into a hardware issue. They
    >> specifically mentioned working with their vendors to address some issues.
    >>

    > Yes they did mention that. However, I am not convinced that was the
    > crux of the problem. First, if it was a routing issue, or a central
    > database issue, then all the servers would have come up at the same
    > time. However, they brought up machines one at a time over many many
    > hours, so it appears that there was an issue with each of the servers.
    > I HIGHLY doubt they had a hardware issue on each and every machine. If
    > they did, then there are some very odd circumstances leading up to this
    > indeed.
    Aw, come on. All we can do is guess. Sure you may be able to make
    an educated guess - but it's a guess nonetheless. It could be the load
    balancing that has gone haywire and after replacing it and tweaking the
    new unit they did bring up the servers slowly. But as I said - even this
    is a guess. ;)

    I doubt SoE need to invent another reason for such a downtime if they
    DID have a real one to cope with. So they oversimplify the matter but it
    should be real though.


    Hagen
  14. Archived from groups: alt.games.everquest (More info?)

    Hagen Sienhold <durragon@web.de> wrote:
    > Aw, come on. All we can do is guess. Sure you may be able to make
    > an educated guess - but it's a guess nonetheless. It could be the load
    > balancing that has gone haywire and after replacing it and tweaking the
    > new unit they did bring up the servers slowly. But as I said - even this
    > is a guess. ;)
    >

    Load balancing would not have caused people to find that their quest log
    was empty and other such software issues.

    > I doubt SoE need to invent another reason for such a downtime if they
    > DID have a real one to cope with. So they oversimplify the matter but it
    > should be real though.
    >

    I don't think they invented the hardware problem. I do believe though
    that the hardware problem was probably not the sole cause of the outage
    and I might go so far as to say that it was not the primary cause.
    Based on their own notes about missing quests in the quest logs, etc.,
    it doesn't look like hardware alone was the issue.

    --
    Thomas T. Veldhouse
    Key Fingerprint: 2DB9 813F F510 82C2 E1AE 34D0 D69D 1EDC D5EC AED1
    Spammers please contact me at renegade@veldy.net.
  15. Archived from groups: alt.games.everquest (More info?)

    On 20 Dec 2004 17:09:10 GMT, "Thomas T. Veldhouse" <veldy71@yahoo.com>
    wrote:

    >I don't think they invented the hardware problem. I do believe though
    >that the hardware problem was probably not the sole cause of the outage
    >and I might go so far to say that it was not the primary cause of the
    >outage [based on their own notes about missing quests in the quest logs,
    >etc] it doesn't look like hardware alone was the issue.

    It's very possible the initial bug wasn't due to hardware -- but a
    hardware issue prevented a patch/rollback from working as desired. So
    one software bug runs into one hardware bug, and nine months from now,
    a lot of EQ widows give birth. :)
    *----------------------------------------------------*
    Evolution doesn't take prisoners:Lizard
    "I've heard of this thing men call 'empathy', but I've never
    once been afflicted with it, thanks the Gods." Bruno The Bandit
    http://www.mrlizard.com
  16. Archived from groups: alt.games.everquest (More info?)

    > People who call themselves "programmers" because they took a Visual
    > Basic class during summer school very rarely have any grasp of what
    > coding is like in the real world, and have ridiculous expectations of
    > both the predictability of bugs and the difficulty of fixing them in a
    > timely manner.

    I think many people who call themselves "programmers" because they've
    got several years experience doing development in real world situations
    are still dismal at designing robust systems.

    Not everyone of course... but far far far too many.

    Of course, I also happen to think the greatest source of bugs in the
    real world is the *deadline*. :p
  17. Archived from groups: alt.games.everquest (More info?)

    42 <nospam@nospam.com> wrote in news:MPG.1c3109fcd3b1bfdf989942@shawnews:

    >> People who call themselves "programmers" because they took a Visual
    >> Basic class during summer school very rarely have any grasp of what
    >> coding is like in the real world, and have ridiculous expectations of
    >> both the predictability of bugs and the difficulty of fixing them in a
    >> timely manner.
    >
    > I think many people who call themselves "programmers" because they've
    > got several years experience doing development in real world situations
    > are still dismal at designing robust systems.
    >
    > Not everyone of course... but far far far too many.
    >
    > Of course, I also happen to think the greatest source of bugs in the
    > real world is the *deadline*. :p
    >

    Yep, 22 years as an embedded systems software developer here, and that is
    still the biggest source of bugs. You just end up not having time to
    address and test everything.

    --
    On Erollisi Marr in <Sanctuary of Marr>
    Ancient Graeme Faelban, Barbarian Prophet of 69 seasons

    On Steamfont
    Graeme, 18 Dwarven Shaman, 15 Scholar
  18. Archived from groups: alt.games.everquest (More info?)

    On Mon, 20 Dec 2004 15:27:05 -0500, Lizard wrote:

    >So one software bug runs into one hardware bug, and nine months from now,
    >a lot of EQ widows give birth. :)

    Darn! I knew there was something I should have done this weekend rather than
    trying to login every half hour :)
    --
    Henrik Dissing
    Vork - Dwarf Warrior on Highkeep
    Member of Highkeep Ring

    (e-mail: hendis AT post DOT tele DOT dk)
  19. Archived from groups: alt.games.everquest (More info?)

    On Tue, 21 Dec 2004 00:11:21 GMT, 42 <nospam@nospam.com> wrote:

    >Of course, I also happen to think the greatest source of bugs in the
    >real world is the *deadline*. :p

    Well, of course.

    If you had infinite time to fix bugs, programs would be released
    bug-free.

    You don't. And there's a constant pressure, esp. in MMORPGs, to add
    new features (which means new bugs) all the time.

    If you're waiting for a bug-free game, you'll be waiting a long, long,
    time.

    I just said in an earlier post: Every product you buy was shipped with
    a list of 'known bugs' on some developer's desk. It was determined that
    it was 'good enough', and that was that.
    *----------------------------------------------------*
    Evolution doesn't take prisoners:Lizard
    "I've heard of this thing men call 'empathy', but I've never
    once been afflicted with it, thanks the Gods." Bruno The Bandit
    http://www.mrlizard.com