SCSI tape data loss

Kern Sibbald kern at sibbald.com
Tue Jun 3 06:07:40 PDT 2003


Hello,

Dan has now re-run our test of writing to two tapes. In
this test, he told Bacula not to attempt to re-read the
last block written, so Bacula wrote until write() returned
-1 with errno=ENOSPC, wrote two EOF marks, then put up
the next volume.
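
For readers who want the write sequence spelled out, here is
a minimal sketch in C. It is not Bacula's code. The struct
fake_tape and the helpers tape_write(), tape_weof() and
fill_volume() are hypothetical stand-ins so the sketch can
run without a drive; on a real device the write would be
write(2) on the tape fd, and the EOF marks would normally be
written with the MTIOCTOP ioctl and the MTWEOF operation
from <sys/mtio.h>.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical stand-in for a tape drive: it accepts `capacity`
 * blocks, then fails the way write(2) does at end of medium. */
struct fake_tape {
    long capacity;   /* blocks that fit before EOM */
    long blocks;     /* blocks accepted so far */
    int  filemarks;  /* EOF marks written so far */
};

/* Simulated write(2): returns len on success, or -1 with
 * errno=ENOSPC once the medium is full. */
static ssize_t tape_write(struct fake_tape *t, const void *buf, size_t len)
{
    (void)buf;
    if (t->blocks >= t->capacity) {
        errno = ENOSPC;
        return -1;
    }
    t->blocks++;
    return (ssize_t)len;
}

/* Simulated "write one EOF mark". */
static void tape_weof(struct fake_tape *t)
{
    t->filemarks++;
}

/* Write the same block repeatedly until -1/ENOSPC, then write two
 * EOF marks. Returns the number of blocks that were written
 * successfully (the failed write is NOT counted), or -1 on a short
 * write or any error other than ENOSPC. */
static long fill_volume(struct fake_tape *t, const void *block, size_t len)
{
    long written = 0;

    assert(len > 0);
    for (;;) {
        ssize_t n = tape_write(t, block, len);
        if (n == (ssize_t)len) {
            written++;          /* count only completed writes */
            continue;
        }
        if (n == -1 && errno == ENOSPC)
            break;              /* end of medium reached */
        return -1;              /* short write or unexpected error */
    }
    tape_weof(t);
    tape_weof(t);
    return written;
}
```

If fill_volume() returns N, the blocks on the volume are
numbered 0 to N-1, and the caller would then ask for the
next volume.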

The results were the same (more or less): 12 blocks of
data were lost, which corresponds to the smaller size
of the restored file that was split across two tapes.

These 12 blocks were also at the end of the first tape.

During the restore, Bacula reported the following:

03-Jun-2003 05:01 undef-sd: RestoreFiles.2003-06-03_04.36.59 Error:
Invalid block number. Expected 6060, got 6072

and in Bacula's database, Bacula indicates that blocks
0 to 6072 were written to the first tape. In fact, only
blocks 0 to 6071 were written to the first tape. I
see that Bacula has included the failed block in its
count, which is wrong, but this doesn't change the
results at all.
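
The size of the gap follows directly from the error message:
6072 - 6060 = 12 blocks. A minimal sketch of such a sequence
check in C is shown below. It is an illustration only, not
Bacula's restore code; struct seq_check and check_block()
are hypothetical names.

```c
#include <stdint.h>

/* Result of comparing one block number against the expected one. */
struct seq_check {
    int      ok;       /* 1 if the block number matched, else 0 */
    uint32_t missing;  /* blocks skipped when got > expected, else 0 */
};

/* Compare the block number read from the volume with the next
 * number the reader expects, and report how many blocks are
 * missing in between. */
static struct seq_check check_block(uint32_t expected, uint32_t got)
{
    struct seq_check r = { expected == got, 0 };

    if (got > expected)
        r.missing = got - expected;
    return r;
}
```

With expected=6060 and got=6072, the check fails and reports
12 missing blocks, which matches the loss seen in the
restored file.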

Bottom line: 

Even when we eliminate the code that backs
up and re-reads the last block, we still see
the last 12 or 13 blocks being lost. They were
written by the program but are not physically 
on the tape.

Next step: 

Dan is now running a test where Bacula will stop
writing on the first tape before the EOM is reached.

Best regards,

Kern







More information about the freebsd-scsi mailing list