[Fwd: Re: SCSI tape data loss]

Kern Sibbald kern at sibbald.com
Sun Jun 1 14:41:10 PDT 2003


Oops. Sorry, I didn't include the list.

-----Forwarded Message-----

From: Kern Sibbald <kern at sibbald.com>
To: Justin T. Gibbs <gibbs at scsiguy.com>
Subject: Re: SCSI tape data loss
Date: 01 Jun 2003 23:37:09 +0200

On Sun, 2003-06-01 at 22:08, Justin T. Gibbs wrote:
> > Hello,
> > 
> > I'm the author of a GPL'ed network backup program called
> > Bacula (www.bacula.org). For the last three years, it
> > has been working flawlessly on Solaris and Linux systems.
> > When users attempted to use it recently on FreeBSD,
> > it did not work. I subsequently modified Bacula so that
> > it would work on FreeBSD -- basically, I had to program
> > around some important differences in the way FreeBSD 
> > handles EOFs compared to Solaris and Linux.  At some point
> > in the future, I would like to discuss the problems
> > I had in detail, if that interests you.
> 
> I would be interested as I'm sure would other readers of this
> list.

OK, in the next few days, I will document the differences
between Solaris/Linux and FreeBSD that I have run into.

> 
> > We've now worked on this problem for several weeks, and
> > I believe we have now isolated the problem (data loss) to occur
> > when the end of medium is reached.
> > 
> > We have now confirmed that Bacula correctly wrote
> > to the tape, but when it was read back 13 blocks
> > of 64512 bytes were missing.
> > 
> > Below, I have listed in pseudo-language what
> > Bacula was doing. Each write with the exception
> > of the first block on the second tape is 64512
> > bytes:
> > 
> >   first tape mounted
> >   write(block 1)
> >   ...
> >   write(block 1554);
> >   write(block 1555);   <=== block lost
> >   ...                  <=== blocks lost
> >   write(block 1567);   <=== block lost
> >   write(block 1568) failed because of EOM detected
> >   ioctl(MTIOCERRSTAT);
> 
> What was the residual reported by MTIOCERRSTAT?  If the
> device is in buffered mode, that residual can be larger than
> the last transaction that was failed.  My guess is that either
> MTIOCERRSTAT is not properly pulling the residual out of the
> info field, or you are not backing up far enough in the data
> stream when the EOM occurs.
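
To illustrate the point about backing up far enough, here is a minimal sketch of the arithmetic in Python. It assumes the residual counts every byte that never reached the medium (the failed write plus whatever was still in the drive buffer) and that all blocks are 64512 bytes; the function name and the residual value are hypothetical, not taken from any driver.

```python
BLOCK_SIZE = 64512  # bytes per block, as stated in the message


def first_block_to_rewrite(failed_block: int, residual_bytes: int,
                           block_size: int = BLOCK_SIZE) -> int:
    """Return the number of the earliest block that must be written again.

    Assumption: residual_bytes covers the failed write plus any blocks
    that were still sitting in the drive buffer when EOM was hit.
    """
    if residual_bytes <= 0 or residual_bytes % block_size != 0:
        raise ValueError("residual must be a positive multiple of the block size")
    unwritten_blocks = residual_bytes // block_size
    return failed_block - unwritten_blocks + 1


# Hypothetical residual: failed block 1568 plus 13 buffered blocks.
residual = 14 * BLOCK_SIZE
restart = first_block_to_rewrite(1568, residual)
print(restart)
```

If the residual covered only the failed write, the function would return 1568 itself, which is the case an application written for unbuffered drives implicitly assumes.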
> 
> > I have verified that Bacula did successfully write 1567 blocks to the
> > first tape, but in reading back the tape, blocks 1555-1567 are not
> > on the tape.
> > 
> > Now, the big question is: what caused the loss of those blocks?
> > The most likely causes I can think of are:
> > 
> > 1. Bacula is doing something (e.g. MTIOCERRSTAT, or the MTBSF)
> >    to cause the data to be lost.  If this is the case, it is
> >    something specific to FreeBSD since this sequence of commands
> >    works on both Solaris and Linux (except that MTIOCERRSTAT is
> >    MTIOCLRERR on those systems).
> 
> Perhaps both Linux and Solaris force the tape drives to run in
> unbuffered mode?

Both of these systems run in synchronous write (unbuffered)
mode by default. It is possible to run with asynchronous
writes (buffered mode), but I am not aware of any 
program that does so.  The mt program can be used to set
synchronous/asynchronous writes, or other modes such
as Sys V compatibility rather than BSD style.


> 
> > 2. The SCSI driver is doing asynchronous writes (very bad) and
> >    the End of Medium is not sent to Bacula until many writes after
> >    the end of the tape.
> 
> Disabling the tape drive's write buffer kills performance.  All
> of the information required to handle buffered writes should be
> available to you.

My personal preference is for data security before performance.

If you are in fact doing asynchronous writes (buffered mode), then
Bacula will not support FreeBSD without essentially duplicating the
driver's buffering code inside Bacula -- something I don't plan
to do in the near future, if for no other reason than doing so
would mean a different driver for every operating system.

I'm not convinced that there is really much loss in performance,
and even if I am wrong (quite possible), it can easily be
compensated for by having Bacula do its own buffering and use a
separate thread dedicated to writing, with synchronous
(non-buffered) writes in the OS driver.
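
A rough sketch of that idea in Python, not Bacula's actual code: the application keeps its own bounded queue, and one dedicated thread performs the blocking writes. The names run_writer, write_block and fake_write are hypothetical; write_block stands in for a synchronous write on the tape device that returns only once the block is on the medium.

```python
import queue
import threading


def run_writer(blocks, write_block, depth=8):
    """Feed blocks to write_block() from a dedicated writer thread.

    write_block is a hypothetical callable standing in for a synchronous
    device write; it raises OSError on failure.
    Returns (number_of_blocks_written, error_or_None).
    """
    q = queue.Queue(maxsize=depth)      # application-level buffer
    result = {"written": 0, "error": None}
    failed = threading.Event()

    def writer():
        while True:
            block = q.get()
            if block is None:           # sentinel: producer is done
                return
            if failed.is_set():
                continue                # drain the queue after a failure
            try:
                write_block(block)      # blocks until the write completes
                result["written"] += 1
            except OSError as exc:
                result["error"] = exc
                failed.set()

    t = threading.Thread(target=writer)
    t.start()
    for block in blocks:
        if failed.is_set():
            break                       # stop producing once a write failed
        q.put(block)
    q.put(None)
    t.join()
    return result["written"], result["error"]


# Usage with a fake tape that reports end of medium on the sixth block.
tape = []


def fake_write(block):
    if len(tape) == 5:
        raise OSError("end of medium")
    tape.append(block)


written, err = run_writer([b"x"] * 10, fake_write)
print(written, err)
```

Because each write is synchronous, the count of written blocks is exact when the error arrives, so the application knows precisely which block to continue with on the next tape.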
 

How do you support tar?  Tar knows nothing about buffering --
at least not GNU tar to the best of my knowledge.

> 
> Perhaps we should also implement the MTCACHE/MTNOCACHE opcodes so
> that userland apps can control this.  It's not clear if this is
> exactly what they were created for, but it may be better to use
> these than to add some other opcodes.

From my experience with Solaris/Linux (absolutely no problems in
3 years), I'd recommend implementing a non-buffered mode (your
MTNOCACHE I assume), and it should be the default.  In fact,
though it is certainly possible and possibly worth the effort,
I've never heard of any standard Unix program handling a 
buffered tape drive.  If you know one, I would certainly like to
know about it.

Exactly what ioctl() does what is not critical for me as I can
always code it -- what counts is that it is well documented.
Of course, the more things are standard across systems, the
easier it is to program.

Maybe I missed it, but I didn't see anything that indicated that
FreeBSD does asynchronous writes.

> 
> > 3. The SCSI driver has some sort of bug that causes buffers to be
> >    lost.
> 
> I doubt that this would occur only at EOM.

Well, if the drive is running in asynchronous write mode, then
data loss will occur in every Unix program that I know of at EOM
and any time there is a tape write error.
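
To make the failure mode concrete, here is a small Python simulation. It is only a model: the medium capacity of 1554 blocks and the buffer depth of 13 are assumptions picked to reproduce the block numbers reported earlier in this message, and whether the FreeBSD driver really behaves this way is exactly the open question.

```python
def simulate(capacity, buffer_depth, blocks_to_write):
    """Model a drive that acknowledges writes before they reach the tape.

    All parameters are illustrative, not measured drive properties.
    Returns (acknowledged, on_tape): the blocks the application was told
    were written, and the blocks that actually fit on the medium.
    """
    acknowledged, on_tape, buffered = [], [], []
    for block in range(1, blocks_to_write + 1):
        buffered.append(block)
        if len(buffered) > buffer_depth:
            # Buffer overflow: the oldest block must go to tape first.
            if len(on_tape) == capacity:
                break                   # EOM is only reported now
            on_tape.append(buffered.pop(0))
        acknowledged.append(block)
    return acknowledged, on_tape


# Buffered (asynchronous) mode: 13 blocks are acknowledged but never written.
acked, taped = simulate(capacity=1554, buffer_depth=13, blocks_to_write=2000)
lost = sorted(set(acked) - set(taped))
print(len(acked), len(taped), lost[0], lost[-1])

# Unbuffered (synchronous) mode: every acknowledged block is on the tape.
acked_sync, taped_sync = simulate(capacity=1554, buffer_depth=0,
                                  blocks_to_write=2000)
lost_sync = sorted(set(acked_sync) - set(taped_sync))
print(lost_sync)
```

In the buffered run the application sees its first error on block 1568, although nothing after block 1554 is on the medium. A program that simply rewrites the failed block on the next tape therefore loses blocks 1555-1567.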

Could someone confirm whether or not the driver is doing 
asynchronous writes, and whether or not I can turn it off?
(I think this is the case from your email, but am not
100% sure).

Best regards,

Kern



More information about the freebsd-scsi mailing list