Automated Backup to Tape Not Working

Michael Paulmeno

Honorable
Aug 30, 2013
86
0
10,640
Recently, the university library I work for purchased a new tape drive for its server (a Dell PowerEdge 2800). The new unit is a DAT72 by Tandberg Data and replaces a similar one by Quantum. We retired the Quantum because its cleaning light was coming on every few days, leading us to believe it was reaching the end of its life.

We have a backup script that shuts down a few services that run in the background, starts the backup, and then ejects the tape. The problem is that we can't seem to make a backup. The script shuts down what it needs to, but then nothing happens: the log file for that night is blank and no backup is created.

The interesting thing is that we have managed to get it to work with two tapes that had never been used. Previously we had been using tapes that were only about three months old. We have about twelve of them, which we rotate over a two-week period (the library is closed Saturdays, so no backup is done then).

My assumption is that the tapes (HP brand) are at fault, and we have ordered new ones. However, last night a backup failed with a tape that had been used only once, several months ago. The only thing all the failed backups have in common is that they were attempted with tapes that had been in the old drive.

So is it possible that a tape can only be used with one brand of tape drive? And has anyone encountered similar problems? My assumption is that new tapes will solve it, but I am taking nothing for granted and am absolutely baffled about what the cause could be.

Finally, we are having a separate problem: the tapes do not eject even though the script tells the drive to eject them. I have not given it much thought yet, but any input would be great.
 

Michael Paulmeno

The tapes are DAT72 and are not write protected. I do not know about the SCSI ID. However we had to put the new drive in a different slot than the old one as it was slightly larger. How would that affect the ability of the drive to write to tapes?
 

popatim

Titan
Moderator
If your script is trying to talk to SCSI ID 0, device 2, and this drive is on SCSI ID 1, device 5, then the new drive is not going to think your script is talking to it. Where's your university's IT department?

Typically the SCSI ID is set on the rear of the device. I hope you still have the old drive to compare to.

 

Michael Paulmeno

The relationship between the University's IT department and the library is complicated. They were outsourced some years ago, but the library has remained independent and maintains its own computers. As the systems librarian the server falls within my realm of responsibility, hence I have come here as this is an issue we must solve on our own.

As for the SCSI ID, my colleague did have to change some of the pins in the back after she first installed it. So the SCSI ID may have been wrong initially before being corrected. We do have the original, however. I will ask about this on Tuesday.

Where the script is concerned, it does (mostly) work, except that it won't eject the tapes. It renames the log file to include the date, shuts down the proper background services, and creates a log. The one exception is Oracle, which is not shut down; that hasn't stopped backups from being created, so I've discounted it as a possible cause. (In any case, the vendor who provides our Integrated Library System also provides us with Oracle, and we are working with their customer service on that.) We have even made a backup and verified it is of the proper date (by checking a file that is modified daily). The only variable that changes between failed and successful backup attempts is the tapes themselves. The problem seems to be there, except that that makes no sense, at least to me.

It might help to know that we are running Windows Server 2003 R2 and use NTBackup, which has not failed us before now.
 

Michael Paulmeno


NTBackup would not give me the option of formatting a tape so I used TDTool, which is from Tandberg. It has a function called "erase tape" which I assumed meant format. It did not help. No backup was created last night despite my having erased the contents of the tape that was in the machine.

However the backup did work on Wednesday. I had put that tape in on Tuesday and while it failed then, the attempt the next day (using the same tape) succeeded. Why exactly the same tape should work after a second attempt baffles me. Furthermore this was a tape I had dismissed as being bad after it failed the first time.

UPDATE:
When I open the command prompt, type diskpart, and then list volume, the tape drive does not show up. A colleague of mine said the old drive had shown up there. She also viewed the same information elsewhere in the OS, but I can't recall where. The tape drive does appear in Device Manager (albeit under the name Hewlett Packard instead of Tandberg) and in the Removable Storage section of Computer Management. Nevertheless, I am wondering if this might be a SCSI issue after all.
 

Michael Paulmeno

The script is below. Unicorn refers to our Integrated Library System. There must be something wrong with the script since it will not eject the tape (a problem that started when we got the new drive). But other than that I am not sure what could be wrong.

rem -------------------------------------------------------
rem - Rename the backup file to the date and time.
rem - In case the file exists before the backup starts.
rem -------------------------------------------------------
cd\
d:
cd\
C:
cd C:\Documents and Settings\Administrator\Local Settings\Application Data\Microsoft\Windows NT\NTBackup\data

rem -------------------Get the Date------------------------
FOR /f "tokens=6-8 delims=/ " %%G IN ('NET TIME \\merlin') DO (
SET _mm=%%G
SET _dd=%%H
SET _yy=%%I
)

rem Get the Time
FOR /f "tokens=1,2 delims=: " %%G IN ('time/t') DO (
SET _hr=%%G
SET _min=%%H
)


If exist backup01.log rename backup01.log "backup%_yy%-%_mm%-%_dd%-%_hr%%_min%.log"
If exist backup02.log rename backup02.log "backup%_yy%-%_mm%-%_dd%-%_hr%%_min%.log"
If exist backup03.log rename backup03.log "backup%_yy%-%_mm%-%_dd%-%_hr%%_min%.log"
If exist backup04.log rename backup04.log "backup%_yy%-%_mm%-%_dd%-%_hr%%_min%.log"
If exist backup05.log rename backup05.log "backup%_yy%-%_mm%-%_dd%-%_hr%%_min%.log"
If exist backup06.log rename backup06.log "backup%_yy%-%_mm%-%_dd%-%_hr%%_min%.log"
If exist backup07.log rename backup07.log "backup%_yy%-%_mm%-%_dd%-%_hr%%_min%.log"
If exist backup08.log rename backup08.log "backup%_yy%-%_mm%-%_dd%-%_hr%%_min%.log"
If exist backup09.log rename backup09.log "backup%_yy%-%_mm%-%_dd%-%_hr%%_min%.log"
If exist backup10.log rename backup10.log "backup%_yy%-%_mm%-%_dd%-%_hr%%_min%.log"

cd\
D:


rem ---------------Stop Unicorn Services-------------------
NET STOP "Unicorn Ipcm" /y
NET STOP "Unicorn Starter" /y
NET STOP "Unicorn Webstarter" /y
NET STOP "Unicorn Sockserver" /y
NET STOP "Unicorn Reportcron" /y
NET STOP "Unicorn Sockwrapper" /y
NET STOP "Hyperion Arfserver" /y
NET STOP "Apache2.2" /y
rem --- NET STOP "OracleServicedelt" /y
rem --- New stop oracle command one line below with net stop
NET STOP "oradim -SHUTDOWN -sid delt -SHUTMODE immediate -SHUTTYPE srvc,inst" /y


rem -------------------------------------------------------


RSM REFRESH /LF"Dell (tm) PowerVault (tm) 100T DAT72"
C:\Windows\SYSTEM32\NTBACKUP BACKUP systemstate C: D: /P "4mm DDS" /N "SIRSI Backup - %DATE%" /v:yes /HC:ON /L:s /M normal /UM

rem ----------------Start Unicorn Services-----------------
rem --- NET START "OracleServicedelt" /y
rem ---New Oracle 10g start command one line below
NET START "oradim -STARTUP -sid delt -STARTTYPE srvc,inst" /y
NET START "Apache2.2" /y
NET START "Unicorn Ipcm" /y
NET START "Unicorn Starter" /y
NET START "Unicorn Webstarter" /y
NET START "Unicorn Sockserver" /y
NET START "Unicorn Reportcron" /y
NET START "Unicorn Sockwrapper" /y
NET START "Hyperion Arfserver" /y

rem -------------------------------------------------------

rem Eject tape upon completion
RSM EJECT /LF"Dell (tm) PowerVault (tm) 100T DAT72"

rem -------------------------------------------------------
rem - Rename the new backup file to the date and time.
rem -------------------------------------------------------

cd\
c:
cd C:\Documents and Settings\Administrator\Local Settings\Application Data\Microsoft\Windows NT\NTBackup\data

rem -------------------Get the Date------------------------
FOR /f "tokens=6-8 delims=/ " %%G IN ('NET TIME \\merlin') DO (
SET _mm=%%G
SET _dd=%%H
SET _yy=%%I
)

rem Get the Time
FOR /f "tokens=1,2 delims=: " %%G IN ('time/t') DO (
SET _hr=%%G
SET _min=%%H
)


If exist backup01.log rename backup01.log "backup%_yy%-%_mm%-%_dd%-%_hr%%_min%.log"
If exist backup02.log rename backup02.log "backup%_yy%-%_mm%-%_dd%-%_hr%%_min%.log"
If exist backup03.log rename backup03.log "backup%_yy%-%_mm%-%_dd%-%_hr%%_min%.log"
If exist backup04.log rename backup04.log "backup%_yy%-%_mm%-%_dd%-%_hr%%_min%.log"
If exist backup05.log rename backup05.log "backup%_yy%-%_mm%-%_dd%-%_hr%%_min%.log"
If exist backup06.log rename backup06.log "backup%_yy%-%_mm%-%_dd%-%_hr%%_min%.log"
If exist backup07.log rename backup07.log "backup%_yy%-%_mm%-%_dd%-%_hr%%_min%.log"
If exist backup08.log rename backup08.log "backup%_yy%-%_mm%-%_dd%-%_hr%%_min%.log"
If exist backup09.log rename backup09.log "backup%_yy%-%_mm%-%_dd%-%_hr%%_min%.log"
If exist backup10.log rename backup10.log "backup%_yy%-%_mm%-%_dd%-%_hr%%_min%.log"

cd\
 

popatim

I think I see the problem.

This is your backup command, which looks normal enough:
C:\Windows\SYSTEM32\NTBACKUP BACKUP systemstate C: D: /P "4mm DDS" /N "SIRSI Backup - %DATE%" /v:yes /HC:ON /L:s /M normal /UM


This line is a refresh command being sent to the Removable Storage Manager (RSM):
RSM REFRESH /LF"Dell (tm) PowerVault (tm) 100T DAT72"

The /LF switch tells RSM which library to refresh ('library' in this case being the stand-alone tape drive), and I'm guessing the name given here is the old tape drive. The Refresh is requested because RSM has no way to know that the tape has been changed unless backup is still running, as in a multi-tape backup. If backup ends and you load another tape for tomorrow, RSM has no way to know this occurred otherwise.

So what I believe is happening, since you are able to get one backup done on a new tape, is that the Refresh is not refreshing the correct library, so RSM thinks you are 'accidentally' trying to overwrite its current tape and is not allowing it.

The corrective action would be to make a backup copy of this script, named something different of course, like backup_script.OLD, and then change the Refresh line in the active one to reference the new library name.

At least that's what I think is happening, without digging into your library and pools setup.

Please do some research and investigate your server's setup to find out the correct library name and whether you agree with my assessment.
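
For finding the name, something along these lines should work. Treat it as a sketch only: RSM VIEW /TLIBRARY lists the libraries RSM currently knows about, and "NEW LIBRARY NAME" is a placeholder I made up, not the actual name of your new drive.

```bat
rem Run this at a command prompt on the server.
rem It prints the friendly names of the libraries RSM knows about.
RSM VIEW /TLIBRARY

rem In the script, the refresh line would then take this form.
rem "NEW LIBRARY NAME" is a placeholder: copy the real name
rem exactly as the listing above shows it.
RSM REFRESH /LF"NEW LIBRARY NAME"
```

If the old "Dell (tm) PowerVault (tm) 100T DAT72" name is the only one listed, or the new drive is listed under a different name, that would go some way toward confirming or ruling out the theory.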

Disclaimer: I assume no liability for any of my suggestions. LoL
 
Solution

Michael Paulmeno

Your explanation might also be why the tape won't eject: the command is being sent to a tape drive that doesn't exist. I have backed up the script and made the change on the original. Now to wait 24 hours (or more) to see if it worked. Thank you for your help.
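
Since both the refresh and the eject name the library, one way to keep them from drifting apart is to set the name once near the top of the script. This is a minimal sketch, not the exact change made here: TAPE_LIBRARY is a hypothetical variable name, and "NEW LIBRARY NAME" stands in for whatever name RSM reports for the new drive.

```bat
rem Placeholder only: replace with the name RSM reports for the new drive.
SET "TAPE_LIBRARY=NEW LIBRARY NAME"

rem Before the NTBACKUP line: tell RSM to re-read the tape in the drive.
RSM REFRESH /LF"%TAPE_LIBRARY%"

rem After the services are restarted: eject the tape from the same library.
RSM EJECT /LF"%TAPE_LIBRARY%"
```

With this layout a future drive replacement needs only the SET line changed.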