Differences between Solaris/Linux and FreeBSD
Matthew Jacob
mjacob at feral.com
Tue Jul 1 23:11:26 PDT 2003
The last I heard, this was in the Bacula court. I've been away, and will
be away again shortly for the Fourth of July. There's a compat issue
with fixing the stuff mentioned below. What do you think is broken at this
point?
On Tue, 1 Jul 2003, Dan Langille wrote:
> As a bacula fan I'd like to see it working on FreeBSD but I don't
> know what I can do in order to achieve that objective. Any ideas?
>
> On 3 Jun 2003 at 11:39, Matthew Jacob wrote:
>
> > > As promised, in this email, I will try my best to describe
> > > the differences I found between Solaris/Linux and FreeBSD
> > > concerning tape handling. There were five separate areas
> > > where I noticed differences:
> > >
> > > 1. On Solaris/Linux, the default behavior for ioctl(MTEOM)
> > > is to run in what they call slow mode. In this mode, the
> > > tape is positioned to the end of the data, and the driver
> > > returns the correct file number in the MTIOCGET packet.
> > > It is possible to enable fast-EOM, but no one uses it to
> > > my knowledge.
> > >
> > > On FreeBSD, you apparently always use the fast-EOM so that
> > > the tape position is unknown after the ioctl().
> >
> > You *could* read block position. Particularly for h/w blocks this works
> > very fast when you need to locate.
> >
> > NB: SCSI-3 changed the layout for h/w block position stuff and I haven't
> > updated the FreeBSD driver to handle this yet.
> >
> > > Bacula always knows how many files are on a tape, and when
> > > appending to a tape that is already written and newly opened,
> > > it MUST know where it is on the tape. As a consequence, on
> > > FreeBSD, I must explicitly use MTFSF with read()s in between
> > > to position to the end of the tape -- a fairly slow affair.
> >
> > Uh, this is how 'slow' EOM works. It's not really faster to do it in the
> > kernel as opposed to in the driver.
> >
> > I must point out that you cannot, and should not, depend absolutely on
> > reported position. For tape you can ensure BOT or end of recorded media,
> > but otherwise you really must use self-referential data on the tape if
> > tape location is important.
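
A minimal sketch of the Solaris/Linux-style sequence discussed above: space to the end of data with MTEOM, then ask MTIOCGET for the file number and treat a negative value as "position unknown". The helper name eom_file_number is hypothetical, and whether mt_fileno means anything after MTEOM depends on the driver, so the caller should be ready to fall back to MTFSF plus read().

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/ioctl.h>
#include <sys/mtio.h>

/*
 * Hypothetical helper: position at end of recorded data and report the
 * file number the driver claims. Returns 0 and stores the file number on
 * success, or -1 if an ioctl fails or the driver reports an unknown
 * (negative) position. *fileno_out is left untouched on failure.
 */
static int eom_file_number(int fd, long *fileno_out)
{
    struct mtop op;
    struct mtget st;

    assert(fileno_out != NULL);

    op.mt_op = MTEOM;
    op.mt_count = 1;
    if (ioctl(fd, MTIOCTOP, &op) < 0)
        return -1;
    if (ioctl(fd, MTIOCGET, &st) < 0)
        return -1;
    if (st.mt_fileno < 0)   /* driver does not know where it is */
        return -1;

    *fileno_out = (long)st.mt_fileno;
    return 0;
}
```

Even when this succeeds, the reported number is only a hint; comparing it against position data stored in the tape blocks themselves is the safer check.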
> >
> > > 2. Your handling of EOM differs from Solaris/Linux. On both of
> > > those systems, when Bacula reads the first EOF, the driver
> > > returns 0 bytes read. On reading the second EOF, the driver
> > > returns 0 bytes read, but before returning backspaces over
> > > the EOF, leaving you positioned correctly for appending to the
> > > tape and having told you that you are at the end of the tape by
> > > giving two consecutive 0-byte reads. Any further read()
> > > request returns an I/O error.
> > >
> > > On FreeBSD, reading the first EOF returns 0 bytes, reading
> > > the second EOF also returns 0 bytes (sometimes, I apparently
> > > get "Illegal operation"). However, the tape is left positioned
> > > after the second EOF, so appending from that point effectively
> > > "loses" the data.
> > >
> > > To handle this correctly the FreeBSD user must add a configuration
> > > statement to Bacula telling it to backspace one file at EOM.
> >
> > Yes. This is a problem.
> >
> > But part of the problem here is that dual-filemark at EOM is only one
> > tape convention- and a poorly thought out one at best- it exists
> > *solely* because a *few* (ancient) tape drives would unwind off the feed
> > reel if you kept advancing them. For QIC drives, you *cannot* write dual
> > filemarks (really).
> >
> > Note that there is a setting that can change the model to single EOM. If
> > I could have gotten away with it, I would have made this the default.
> >
> > I think, though, I'd accept that the FreeBSD behaviour is a bug that
> > should be fixed. If we have a dual fmk EOT model and are advancing along
> > and hit two in a row, we *probably* should say we're at logical EOT and
> > backspace over one of them. After all, this is what we do when we're
> > *writing* to tape and close the no-rewind device.
> >
> > I also would agree that this situation is exacerbated by the 'space to
> > end of recorded data' model for the MTEOM command. This now leaves us
> > with a legacy of tapes with spurious dual filemarks in the middle.
> >
> > Oops. This means that I really can't fix things the way you'd like :-(.
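
A rough sketch of the "two filemarks in a row means logical EOT, then back up over one" behaviour described above, done on the application side. The names eot_tracker, eot_update and backspace_one_filemark are hypothetical. It assumes the dual-filemark convention, so a tape with spurious double filemarks in the middle would fool it.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/ioctl.h>
#include <sys/mtio.h>

/* Counts consecutive zero-byte reads; each one means a filemark was crossed. */
struct eot_tracker {
    int zero_reads;
};

/*
 * Feed the return value of each successful read(). Returns 1 once two
 * filemarks in a row have been seen, which the dual-filemark convention
 * treats as logical end of tape; returns 0 otherwise.
 */
static int eot_update(struct eot_tracker *t, ssize_t nread)
{
    assert(t != NULL && nread >= 0);
    t->zero_reads = (nread == 0) ? t->zero_reads + 1 : 0;
    return t->zero_reads >= 2;
}

/* Back up over one filemark so that appending overwrites the second one. */
static int backspace_one_filemark(int fd)
{
    struct mtop op;

    op.mt_op = MTBSF;
    op.mt_count = 1;
    return ioctl(fd, MTIOCTOP, &op);
}
```

A reader loop would call eot_update() after every read() and, when it returns 1, call backspace_one_filemark() before writing.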
> >
> > >
> > > 3. I have previously described this but will do so again for
> > > completeness here. On Solaris/Linux when Bacula does:
> > >
> > > write();
> > > ioctl(MTEOF);
> > > ioctl(MTEOF);
> > > ioctl(MTBSF);
> > > ioctl(MTBSF);
> > > ioctl(MTBSR);
> > > read();
> > >
> > > the read() re-reads the last write. On FreeBSD, the read returns
> > > 0 bytes (there is also a problem of freezing the tape wrapped into
> > > this example if I am not mistaken). Apparently the 0 bytes read is
> > > because FreeBSD adds an additional EOF mark (not necessary) and
> > > leaves the drive positioned *after* the mark thus re-reading the
> > > last record fails when it logically should not.
> >
> > I don't believe that FreeBSD adds an additional filemark here, but I
> > should add this as a test case. I have another tester program that I use
> > for testing block locate, but I haven't really validated it or finished
> > it yet.
> >
> > Why, btw, are you issuing two MTEOFs? The mtop has a count field y'know
> > :-).
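
For reference, a small sketch of using the count field: one MTIOCTOP call with mt_count set to 2 in place of two separate calls. The operation code in <sys/mtio.h> is MTWEOF; the helper names make_weof_op and write_filemarks are made up for this example.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/ioctl.h>
#include <sys/mtio.h>

/* Build an MTIOCTOP request that writes `count` filemarks in one call. */
static struct mtop make_weof_op(int count)
{
    struct mtop op;

    assert(count > 0);
    op.mt_op = MTWEOF;
    op.mt_count = count;
    return op;
}

/* Write `count` filemarks to an open tape descriptor; returns ioctl()'s result. */
static int write_filemarks(int fd, int count)
{
    struct mtop op = make_weof_op(count);

    return ioctl(fd, MTIOCTOP, &op);
}
```

With this, write_filemarks(fd, 2) stands in for the two MTEOF calls in the sequence quoted above.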
> >
> > >
> > > 4. Tape freezing: On Solaris/Linux, the tape never "freezes". On
> > > FreeBSD it does freeze. As best I can determine, you freeze the
> > > drive when you lose track of where you are. Typically, this
> > > occurs when I do a MTBSR to re-read the last record. On Solaris/Linux
> > > the tape is never frozen, but when they don't know the position,
> > > they simply return -1 in the MTIOCGET packet, which is fine with
> > > me because Bacula only uses that info when initially reading a
> > > tape to append to it.
> > >
> > > Freezing the tape causes all sorts of problems because it generates
> > > a flood of unexpected errors. Within a large complicated program like
> > > Bacula, when a low level routine re-reads a record during writing and
> > > the tape freezes, it cannot simply rewind the drive as this could
> > > cause chaos and possible overwriting of the beginning of the tape.
> > >
> > > I've attempted to overcome tape freezing by providing the user a
> > > means to turn off MTBSR (but they don't always do so), and by issuing
> > > ioctl(MTIOCERRSTAT) after every return of -1 from any I/O request.
> > >
> > > I recommend that you do away with freezing the drive -- it seems to
> > > me that it only causes more problems. In saying that, I have to admit
> > > that I really do not understand tape freezing or why you do it since
> > > I found no documentation on it, and everything I write above I have
> > > deduced from what Dan has reported back to me.
> >
> > Freezing the drive is precisely what Solaris and Linux *should* do. If
> > you've lost position, you have to take some action to bring the tape to
> > a known position. The unaware application should not be allowed to
> > overwrite in random spots on the tape. If your low level read/write
> > routines get any kind of error, you have to move to a "what do I have in
> > my tape drive now?" state anyway.
> >
> > You know, I was pretty sure I'd documented the freeze option, but I
> > cannot find it in the man page (sa(4)) now at all.
> >
> >
> > >
> > > 5. I am quite fuzzy on this point because I forget exactly what happened
> > > and what I did about it.
> > >
> > > It seems to me that on Linux, if I read a block but specify a number
> > > of bytes less than the number actually in the block on the tape, the
> > > driver returns the data anyway. I then check if the block is
> > > internally complete and if not, increase my record size to the size
> > > indicated in the data received, backspace one record, and re-read it.
> > >
> > > If I am not mistaken, on FreeBSD, the first read returns an error,
> > > and Bacula just immediately gives up. Your documentation specifies
> > > that one can never read a partial record from a tape, but it does not
> > > specify what error code is generated. As a consequence, rather than
> > > recovering and re-reading the record, Bacula has to assume it was
> > > a fatal error.
> >
> > The reason linux 'succeeds' here is because linux internally reads all
> > tape data to an oversized buffer in kernel memory anyway. This means
> > that it doesn't suffer an 'overrun' condition which is what you are
> > doing if you attempt to read *less* than a tape record size. Solaris
> > will fail the same way, btw, as FreeBSD.
> >
> > What you should always do is start out by reading the largest possible
> > record size (a pathetic 64KB for FreeBSD) and adjust *downward* (if
> > desired and you are just autosizing to find a tape record size).
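
A minimal sketch of that autosizing approach, assuming a drive in variable-block mode where each read() returns exactly one tape record: read with the largest buffer the driver allows and take the returned length as the record size. The name probe_record_size and the MAX_TAPE_RECORD constant are illustrative, with the 64KB value taken from the figure quoted above.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

#define MAX_TAPE_RECORD (64 * 1024)   /* the FreeBSD limit mentioned above */

/*
 * Read one record using the largest possible buffer and report its size.
 * Returns the record length, 0 at a filemark (or end of input), or -1 on
 * error. The data itself is discarded; this only probes the size.
 */
static ssize_t probe_record_size(int fd)
{
    char *buf;
    ssize_t n;

    assert(fd >= 0);
    buf = malloc(MAX_TAPE_RECORD);
    if (buf == NULL)
        return -1;
    n = read(fd, buf, MAX_TAPE_RECORD);
    free(buf);
    return n;
}
```

After probing, the caller would backspace one record and continue with a buffer of the size found, so it never asks for less than a full record.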
> >
> >
> > Thanks for doing the critique. There's definitely food for thought here
> > and some changes that *should* be made.
>
> --
> Dan Langille : http://www.langille.org/
>
>