SCSI tape data loss

Kern Sibbald kern at sibbald.com
Mon Jun 2 01:57:53 PDT 2003


On Mon, 2003-06-02 at 00:39, Justin T. Gibbs wrote:
> >> Perhaps both Linux and Solaris force the tape drives to run in
> >> unbuffered mode?
> > 
> > Both of these systems run in synchronous write (unbuffered)
> > mode by default. It is possible to run with asynchronous
> > writes (buffered mode), but I am not aware of any 
> > program that does so.  The mt program can be used to set
> > synchronous/asynchronous writes, or other modes such
> > as Sys V compatibility rather than BSD style.
> 
> Does Solaris have the drvbuffer command that is in Linux?

I'm not 100% sure -- they have just about everything, and their
documentation is very good.  All their documentation is online
at http://docs.sun.com  -- their AnswerBook.  However, if you have
not read their mt documentation, I recommend it -- that is the
definition of what I consider the "correct" driver behavior.

See for example: http://docs.sun.com/db/doc/802-5747-07/6i9g1cn4u?a=view

For me, it is the bible. Unfortunately, not all Unices behave
like that.

> 
> >> > 2. The SCSI driver is doing asynchronous writes (very bad) and
> >> >    the End of Medium is not sent to Bacula until many writes after
> >> >    the end of the tape.
> >> 
> >> Disabling the tape drive's write buffer kills performance.  All
> >> of the information required to handle buffered writes should be
> >> available to you.
> > 
> > My personal preference is for data security before performance.
> 
> There is no potential for lost data if you handle the status that
> is presented to you.

Could you explain that in more detail?  If you mean digging into the
OS/driver-specific details of an MTIOCERRSTAT packet, that *shouldn't*
be necessary -- at least it is not necessary on Solaris/Linux to
guarantee data integrity.

> 
> > If you are in fact doing asynchronous writes (buffered mode), then
> > Bacula will not support FreeBSD without essentially duplicating the
> > driver's buffering code inside Bacula -- something I don't plan
> > to do in the near future, if for no other reason than doing so
> > would mean a different driver for every operating system.
> 
> The tape driver doesn't have any buffering code (unlike Linux which
> does).  The tape drive has a buffer.  We are just enabling the use
> of that buffer.  If you really want to do this simply, just do a
> write filemarks of 0 marks every time you are about to switch input
> files.  The write marks flushes the device's buffer and guarantees
> that any residual will be within the fd that you are currently using.
> This would imply that you only need to explicitly buffer if you support
> backups from stdin.

I don't mind if the tape drive buffers data as long as it writes
*all* of that data to the tape and informs me on the next write
that the LEOM (logical EOM in Solaris parlance, or early EOM)
has been hit.

If the drive cannot write *all* the data it has accepted to the
tape because of EOM or some other condition (such as an I/O error),
then I *much* prefer to turn that mode off and write a block at a time.
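
The failure case described above can be illustrated with a toy model.
This is only a sketch: ToyTape and fill() are hypothetical names, not
part of any driver or of Bacula, and whether a real drive actually
drops buffered data at EOM depends on the drive and the driver.  The
model simply assumes the worst case, where blocks still sitting in the
drive buffer when the medium fills up never reach the tape.

```python
class ToyTape:
    """Hypothetical tape model: capacity in blocks, optional drive buffer."""

    def __init__(self, capacity, buffer_size=0):
        self.capacity = capacity        # blocks that fit on the medium
        self.buffer_size = buffer_size  # 0 models unbuffered (synchronous) mode
        self.on_tape = []               # blocks that really reached the medium
        self.buffer = []                # blocks accepted but not yet written

    def _drain(self):
        """Move buffered blocks to the medium; return False if EOM is hit."""
        while self.buffer:
            if len(self.on_tape) >= self.capacity:
                return False
            self.on_tape.append(self.buffer.pop(0))
        return True

    def write(self, block):
        """Return True if the drive acknowledges the block."""
        self.buffer.append(block)
        if len(self.buffer) > self.buffer_size:
            return self._drain()
        return True  # acknowledged, but only held in the drive buffer


def fill(tape):
    """Write numbered blocks until a write fails.

    Returns (blocks acknowledged, blocks actually on tape).
    """
    acked = 0
    while tape.write(acked):
        acked += 1
    return acked, len(tape.on_tape)
```

With a 10-block tape, fill(ToyTape(10)) reports 10 acknowledged and 10
on tape, while fill(ToyTape(10, buffer_size=4)) reports 14 acknowledged
and 10 on tape: the application has been told that 4 blocks succeeded
which were never written.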

In such a single-write, non-buffered mode, Bacula is faster
than Networker, which for the moment is good enough for me. I
think that I can get even more speed by internally buffering and
possibly using asynchronous writes -- but that is for the fairly
distant future and will undoubtedly be OS dependent since there seems
to be no standard interface for enabling/disabling such modes.
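
The internal buffering idea can be sketched as a producer/consumer
pair: the application queues blocks, and a dedicated thread issues one
blocking write per block, so every block is confirmed before the next
one is sent.  This is a minimal sketch under stated assumptions, not
Bacula's implementation; spool(), writer_thread() and the write_block
callback are hypothetical names chosen for illustration.

```python
import queue
import threading


def writer_thread(blocks, write_block, errors):
    """Take blocks off the queue and write them one at a time."""
    while True:
        block = blocks.get()
        if block is None:       # sentinel: the producer has finished
            return
        if errors:
            continue            # after a failure, discard so the producer never blocks
        try:
            write_block(block)  # assumed to block until the write completes
        except OSError as exc:
            errors.append(exc)


def spool(data_blocks, write_block, depth=8):
    """Feed data_blocks to write_block through a bounded in-memory queue."""
    blocks = queue.Queue(maxsize=depth)
    errors = []
    worker = threading.Thread(target=writer_thread,
                              args=(blocks, write_block, errors))
    worker.start()
    for block in data_blocks:
        blocks.put(block)       # waits when the queue already holds `depth` blocks
    blocks.put(None)
    worker.join()
    if errors:
        raise errors[0]         # report the first write failure to the caller
```

In practice write_block might be something like
lambda b: os.write(fd, b) on a tape device opened in unbuffered mode;
that pairing is an assumption and is not tested here.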

> 
> > I'm not convinced that there is really much loss in performance,
> > and even if I am wrong (quite possibly) 
> > it can be easily compensated by having Bacula
> > buffer itself and using a separate thread dedicated to writing
> > and using synchronous (non-buffered) writes in the OS driver.
> 
> You can never recover the round trip time on the SCSI bus unless
> you either have a device that allows you to queue more than one
> command at a time or that buffers.  I believe that only FC tape
> devices support queuing more than one command at a time, but few
> programs support this anyway (unless you lie and say that a previous
> write has completed).

I can see that performance concerns you because you wrote the
driver, but for me (and most users I believe) what counts is
data integrity first and then performance.  In addition, as a
systems applications writer, I look for the common
denominator so that my program will work on the largest number
of machines.  Writing for a specific machine is very difficult
for me since I only have access to Linux and at times
Solaris machines with tape drives.

> 
> > How do you support tar?  Tar knows nothing about buffering --
> > at least not GNU tar to the best of my knowledge.
> 
> I think few people use tar for multi-volume backups unless they
> specify a specific tape length, but I really don't know.

I'm beginning to understand why Amanda doesn't handle multi-volume
backups.  I guess I can tell FreeBSD users that they can use the
tape drive *if* they specify a tape length, but that seems a pity.

> 
> >> Perhaps we should also implement the MTCACHE/MTNOCACHE opcodes so
> >> that userland apps can control this.  It's not clear if this is
> >> exactly what they were created for, but it may be better to use
> >> these than to add some other opcodes.
> > 
> > From my experience with Solaris/Linux (absolutely no problems in
> > 3 years), I'd recommend implementing a non-buffered mode (your
> > MTNOCACHE I assume), and it should be the default.  In fact,
> > though it is certainly possible and possibly worth the effort,
> > I've never heard of any standard Unix program handling a 
> > buffered tape drive.  If you know one, I would certainly like to
> > know about it.
> 
> Standard program?  I don't know about that, but the commercial
> apps have always supported buffered mode.

Well, in the case of Networker on Solaris, that hasn't helped them
much -- in any case, I *will* support buffered mode someday 
even if it is my own buffering.

> 
> > Exactly what ioctl() does what is not critical for me as I can
> > always code it -- what counts is that it is well documented.
> > Of course, the more things are standard across systems, the
> > easier it is to program.
> 
> It's not clear to me that there is a standard.

Yes, it is a pity, isn't it?  I'm certainly not blaming anyone,
least of all you.

> 
> > Maybe I missed it, but I didn't see anything that indicated that
> > the FreeBSD does asynchronous writes.
> 
> From looking at the sa driver, it appears that it always tries to
> do buffered writes unless there is a device quirk indicating that
> mode select doesn't work.

Hmmm. Well, in the short term, it looks like the user must specify the
tape size -- something almost impossible to do with any precision given
hardware compression on drives these days.  In the longer run, I
hope you will consider either turning off buffering by default or
at least letting me (in user land) do so.

Best regards,

Kern



More information about the freebsd-scsi mailing list