6 Bare Metal Backup and Recovery Options
We look at full disk backup and restore packages for Microsoft Small Business Server and Vista desktops to help you decide what's best for your networks and systems.
We've used Symantec's Backup Exec System Recovery in its various incarnations since they purchased it (when it was V2i Protector).
I can NOT recommend it to anyone. We have had no end of trouble with it on various servers, and endless support calls wasting many hours. Its stability and reliability seem to have deteriorated with later versions. Problems seen include the backup service randomly stopping, out-of-memory errors, the version 7 (Vista-based) recovery environment being unable to access some RAID cards even with Vista drivers, and failure to restore to new hardware (it took them 2 months to find a workaround!)... and the list goes on...
By comparison, we are currently trialling the server version of ShadowProtect, and it is a dream: everything seems to just work, which is what is expected of software like this. Problems that choked the Symantec product are no issue with ShadowProtect; it just performs simply and easily.
Wow, thanks Bryce, I'm glad you like our ShadowProtect disaster-recovery backup product. It's not a perfect product (is there such a thing?) but I think it's pretty sweet too. Although admittedly I'm biased.
I'd really like to see a follow-up article with actual hard-hitting comparisons of the core functionality of these products. For instance, I would like a review that did something like this (which I think would be of interest to many business and enterprise users):
Perform the following test for all products under comparison (ShadowProtect, Backup Exec System Recovery, True Image, etc):
1) Start with a clean SBS 2003 with Exchange, on which none of the above products have ever been installed.
2) Check the Exchange database to ensure that it's good (not corrupt) before beginning tests
3) Run chkdsk on the volume containing the Exchange database to ensure that the file system is not corrupt
4) Install the disaster-recovery backup product for this particular test
5) Configure the disaster-recovery backup product to back up the volume containing the Exchange database on a frequent schedule, using incremental imaging capability, preferably every 15 minutes if possible
6) Use LoadSim to place a heavy simulated load on the Exchange Server
7) Wait a few hours, allowing the backup product to generate a base/full and around 10 incremental images
8) Stop LoadSim
9) Stop the backup job
10) Test the Exchange database to see if the original database is now corrupt (before restoring anything)
11) Run chkdsk on the volume containing the Exchange database to see if the file system is corrupt (before restoring anything)
12) Now restore each point-in-time, starting with the base and then progressively restoring each incremental. After each restore operation, test the Exchange database and the file system to see if either is corrupt.
13) Report the ghastly results.
These tests will reveal if the product corrupts your original data (a big "no no") or if its backups are actually useless for restore purposes.
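The restore loop in step 12 could be driven by a small harness along these lines. This is only a sketch: the restore-point labels are made up, and the `restore`, `check_database` and `check_filesystem` callables are hypothetical hooks that you would wire to the backup product under test and to your own database and chkdsk checks.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class RestoreResult:
    restore_point: str   # hypothetical label such as "base" or "inc-01"
    database_ok: bool    # outcome of the Exchange database check
    filesystem_ok: bool  # outcome of the file-system check


def verify_restore_chain(
    restore_points: Iterable[str],
    restore: Callable[[str], None],
    check_database: Callable[[], bool],
    check_filesystem: Callable[[], bool],
) -> List[RestoreResult]:
    """Restore each point in order and run both checks after every restore."""
    results = []
    for point in restore_points:
        restore(point)
        results.append(
            RestoreResult(point, check_database(), check_filesystem())
        )
    return results


def failures(results: List[RestoreResult]) -> List[str]:
    """Labels of the restore points where either check failed."""
    return [
        r.restore_point
        for r in results
        if not (r.database_ok and r.filesystem_ok)
    ]


# Stubbed run: pretend the second incremental restores a bad database.
restored: List[str] = []
report = verify_restore_chain(
    ["base", "inc-01", "inc-02"],
    restore=restored.append,
    check_database=lambda: restored[-1] != "inc-02",
    check_filesystem=lambda: True,
)
print(failures(report))  # ['inc-02']
```

Keeping the product-specific restore call and the checks as injected functions means the same loop can be reused unchanged for each product in the comparison.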
A few things to note if you do run these tests:
1) Before beginning the tests, you will need to enable the Exchange VSS writer on SBS 2003 as Microsoft does not enable this writer by default for SBS. More detail here: http://support.microsoft.com/default.aspx?scid=kb;EN-US;Q838183
2) You should apply the latest service packs for Windows Server 2003 as Microsoft's VSS framework has some bugs in non-service packed versions.
3) You should never install ShadowProtect and True Image at the same time. True Image's snapman.sys driver has a bug which will (usually) blue-screen your machine if ShadowProtect is installed at the same time. Worse, the True Image uninstaller often will not remove snapman.sys, which means you'll have to remove it manually if you aren't rolling back the machine to a clean state as I suggested in the first step. Manual removal of snapman.sys requires you to unregister it as a PnP filter on the disk and volume device classes. If you merely delete the snapman.sys file then your system will not boot, because the PnP Manager will try to load snapman.sys as a filter on all disk and volume devices, and if it can't find snapman.sys it will panic and BSOD the OS. Like I say, though, it's best to start all such tests from a clean baseline where none of the reviewed products have ever been installed.
4) If you perform similar stress tests backing up SQL Server, you should first update SQL Server with a hotfix or else you may receive SQL VDI errors in the event log. Unfortunately you have to phone in to MS to obtain the hotfix (KB934396). See http://support.microsoft.com/kb/934396/en-us
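To illustrate note 3 (unregistering the filter rather than just deleting the file), here is a minimal sketch. It assumes the driver is listed in a class key's UpperFilters or LowerFilters multi-string value, which is the usual way class filter drivers are registered; the note above does not say which value True Image uses. The function only computes the new list. Reading and writing the registry is deliberately left out, and the sample list is hypothetical.

```python
from typing import List


def without_filter(filters: List[str], driver: str) -> List[str]:
    """Return the filter list with every entry for `driver` removed.

    Matching is case-insensitive, and the order of the remaining
    filters is preserved, since filter order can matter.
    """
    target = driver.lower()
    return [name for name in filters if name.lower() != target]


# Hypothetical contents of a class key's UpperFilters value.
upper_filters = ["snapman", "PartMgr"]
print(without_filter(upper_filters, "SnapMan"))  # ['PartMgr']
```

The important part is that the other filters stay in place and in order. Only after the value has been rewritten on both the disk and volume classes would it be safe to delete the driver file itself.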
If you actually take the time to do your due diligence and perform these tests, you'll be truly amazed at the results. And if you actually DO care about your data, you really should run these tests if you are considering these products.
Please e-mail me at etittel at spamarrest dot com and let's get some sideband discussion going. It's going to take a while to gear up for this (not to mention all the other stuff I'm working on right now), but we should start talking about how best to approach this matter. I'd also like to rope in site manager Barry Gerber (who also happens to be the author of <a href="http://www.amazon.com/Mastering-Microsoft-Exchange-Server-2007/dp/0470042893">Mastering Exchange Server 2007</a>, along with four previous editions as far back as Exchange 5.5, if memory serves correctly). Among the three of us I think we could probably formulate some very interesting testing work...
Thanks for your detailed posts, and very interesting suggestion.
As my salutation should demonstrate, I did indeed get your message and would like to talk to you about gearing up for another round of testing. Alas, I've been fighting hardware problems (but interesting ones, which are more fun than difficult and boring ones) and also fighting off some kind of flu bug this week, so it's delayed my reply to you. Shoot your phone number to the address I gave you and let's talk by phone early next week.
Here are some more useful instructions regarding the suggested test (above):
5) When a backup application uses Microsoft's VSS framework, the framework will "quiesce" VSS-aware applications (such as Exchange 2003), causing them to flush their data to a clean state and pause momentarily while the volume snapshot is established. Unfortunately, VSS is a rather complex collection of components and services (VSS Writers, VSS Requestors, VSS Providers, and the VSS Service itself), and if any of these components misbehaves then VSS may not work properly. Some VSS Requestors (backup applications) are notorious for leaving various VSS components in bad states (amazingly, ntbackup.exe is one such application). You can view the states of the writers and providers using the commands "vssadmin list writers" and "vssadmin list providers". It's important to understand that if VSS fails to work properly, ShadowProtect will still perform the backup (just without VSS's assistance). Whether or not ShadowProtect used VSS for a given backup can be seen in the job's detailed log. It's also important to know that some ShadowProtect jobs allow you to specify that you do NOT want to use VSS for some backups (for some incrementals, for instance). If you are backing up Exchange then it's recommended that you use VSS for *all* backups; make sure this is set on the schedule page of the ShadowProtect backup job wizard. If Exchange is backed up without assistance from VSS (which is what ShadowProtect will do if VSS is in a bad state), the Exchange database may be captured in a mid-transaction (dirty) state. Don't confuse "dirty" with "corrupt"; they are very different things. A dirty database is not a bad database; it's just one that needs to apply completed transactions from its logs and discard any partial transactions. Corrupt means that, well, the database's own metadata (and possibly its data as well) are messed up.
6) Exchange's log files and your .edb files can be in different directories, and even on different volumes. If they are on different volumes then it is critical that you configure the backup job to back up all volumes on which .edb and log files exist as part of the same single job. This will ensure that all of the backup images are based on a multi-volume snapshot which atomically captures the states of the combined volumes at one moment in time.
7) Whenever you run integrity and checksum tests, you must include the log files along with the .edb files.
8) Exchange's VSS Writer will issue errors, and informational entries, in the event log. Check there to see if the Exchange Writer is working or failing.
9) eseutil /mh isn't really a very interesting test. It doesn't actually test the integrity of the database. More interesting are the eseutil /K and eseutil /G tests. In fact, if Microsoft's Exchange team's blog is to be trusted, we should note that eseutil /mh will always show the database as being dirty even if VSS is used. The solution is to mount the database and allow the logs to replay, then dismount the database and it should be clean. Then do full integrity tests with eseutil /K and /G:
"Whether you do the backup with generic shadow copy [VSS] capabilities or via the streaming API, the database will be "crash consistent" - meaning, it is marked as being in "Dirty Shutdown" state and must have log files replayed into it after restoration before it can be mounted. This "Dirty shutdown" state can be seen if you dump the database header using ESEUTIL /mh command. When databases do not require any logs to start, they are marked as being in "Clean shutdown" state. Please note here that if you restore a full online backup of the database and let the database go through the "recovery" (meaning the database replays the logs that were restored plus possibly some logs that were already on the disk) - the database will get into the "Clean shutdown" state without user intervention."
(see http://msexchangeteam.com/archive/2004/06/25/166104.aspx for the blog entry)
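A few of the checks from notes 5, 6 and 9 lend themselves to scripting. The sketch below only parses text. It assumes the usual layout of "vssadmin list writers" output (a "Writer name: '...'" line followed by an indented "State: [n] ..." line) and a "State:" line in the "eseutil /mh" header dump. The sample output and file paths are hand-written illustrations, not captured from a real server.

```python
import ntpath
import re
from typing import Dict, Iterable, List, Optional


def parse_vss_writers(text: str) -> Dict[str, str]:
    """Map each writer name to its state from `vssadmin list writers`."""
    writers: Dict[str, str] = {}
    current: Optional[str] = None
    for line in text.splitlines():
        name = re.match(r"\s*Writer name:\s*'(.+)'", line)
        if name:
            current = name.group(1)
            continue
        state = re.match(r"\s*State:\s*\[\d+\]\s*(.+?)\s*$", line)
        if state and current is not None:
            writers[current] = state.group(1)
    return writers


def volumes_to_back_up(paths: Iterable[str]) -> List[str]:
    """Drive letters that must all be included in one backup job (note 6)."""
    return sorted({ntpath.splitdrive(p)[0].upper() for p in paths})


def shutdown_state(header_dump: str) -> Optional[str]:
    """Extract the State field from an `eseutil /mh` dump, if present."""
    match = re.search(r"^\s*State:\s*(.+?)\s*$", header_dump, re.MULTILINE)
    return match.group(1) if match else None


# Illustrative, abbreviated sample; real output has more fields per writer.
SAMPLE_WRITERS = """\
Writer name: 'Microsoft Exchange Writer'
   State: [1] Stable
   Last error: No error

Writer name: 'System Writer'
   State: [9] Failed
   Last error: Timed out
"""

# Hypothetical database and log locations.
EXCHANGE_FILES = [
    r"D:\Exchsrvr\mdbdata\priv1.edb",
    r"d:\Exchsrvr\mdbdata\pub1.edb",
    r"E:\Exchsrvr\logs\E00.log",
]

unstable = {n: s for n, s in parse_vss_writers(SAMPLE_WRITERS).items()
            if s != "Stable"}
print(unstable)                                       # {'System Writer': 'Failed'}
print(volumes_to_back_up(EXCHANGE_FILES))             # ['D:', 'E:']
print(shutdown_state("      State: Dirty Shutdown"))  # Dirty Shutdown
```

In line with note 9, a "Dirty Shutdown" result right after a restore is not a failure by itself; the meaningful verdict comes from the eseutil /K and /G passes run after the logs have been replayed.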