Differences between Solaris/Linux and FreeBSD
Dan Langille
dan at langille.org
Tue Jul 1 17:07:43 PDT 2003
As a bacula fan I'd like to see it working on FreeBSD but I don't
know what I can do in order to achieve that objective. Any ideas?
On 3 Jun 2003 at 11:39, Matthew Jacob wrote:
> > As promised, in this email, I will try my best to describe
> > the differences I found between Solaris/Linux and FreeBSD
> > concerning tape handling. There were five separate areas
> > where I noticed differences:
> >
> > 1. On Solaris/Linux, the default behavior for ioctl(MTEOM)
> > is to run in what they call slow mode. In this mode, the
> > tape is positioned to the end of the data, and the driver
> > returns the correct file number in the MTIOCGET packet.
> > It is possible to enable fast-EOM, but no one uses it to
> > my knowledge.
> >
> > On FreeBSD, you apparently always use the fast-EOM so that
> > the tape position is unknown after the ioctl().
>
> You *could* read block position. Particularly for h/w blocks this works
> very fast when you need to locate.
>
> NB: SCSI-3 changed the layout for h/w block position stuff and I haven't
> updated the FreeBSD driver to handle this yet.
>
> > Bacula always knows how many files are on a tape, and when
> > appending to a tape that is already written and newly opened,
> > it MUST know where it is on the tape. As a consequence, on
> > FreeBSD, I must explicitly use MTFSF with read()s in between
> > to position to the end of the tape -- a fairly slow affair.
>
> Uh, this is how 'slow' EOM works. It's not really faster to do it in the
> kernel as opposed to in the driver.
>
> I must point out that you cannot, and should not, depend absolutely on
> reported position. For tape you can ensure BOT or end of recorded media,
> but otherwise you really must use self-referential data on the tape if
> tape location is important.
>
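For readers following along, the positioning calls discussed in point 1 can be sketched in C roughly as below. This is a minimal sketch, not Bacula's actual code: the helper names `tape_op` and `tape_seek_eod` are hypothetical, and the `MTEOM`/`MTEOD` shim assumes a platform whose `<sys/mtio.h>` provides only one of the two names.

```c
#include <sys/types.h>
#include <sys/ioctl.h>
#include <sys/mtio.h>

/* Assumption: some systems name the "space to end of data" op MTEOD. */
#if !defined(MTEOM) && defined(MTEOD)
#define MTEOM MTEOD
#endif

/* Issue one tape operation; returns 0 on success, -1 on error. */
int tape_op(int fd, short op, int count)
{
    struct mtop cmd;

    cmd.mt_op = op;
    cmd.mt_count = count;
    return ioctl(fd, MTIOCTOP, &cmd);
}

/* Space to the end of recorded data, then ask the driver for the file
 * number. Returns -1 if either ioctl fails; a driver that does not
 * know the position may also report a negative file number. */
long tape_seek_eod(int fd)
{
    struct mtget st;

    if (tape_op(fd, MTEOM, 1) < 0)
        return -1;
    if (ioctl(fd, MTIOCGET, &st) < 0)
        return -1;
    return (long)st.mt_fileno;
}
```

As the reply above notes, the reported file number should be treated as a hint; self-referential data on the tape remains the reliable source of position.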
> > 2. Your handling of EOM differs from Solaris/Linux. On both of
> > those systems, when Bacula reads the first EOF, the driver
> > returns 0 bytes read. On reading the second EOF, the driver
> > returns 0 bytes read, but before returning backspaces over
> > the EOF, leaving you positioned correctly for appending to the
> > tape and having told you that you are at the end of the tape by
> > giving two consecutive 0-byte reads. Any further read()
> > request returns an I/O error.
> >
> > On FreeBSD, reading the first EOF returns 0 bytes, reading
> > the second EOF also returns 0 bytes (sometimes, I apparently
> > get "Illegal operation"). However, the tape is left positioned
> > after the second EOF, so appending from that point effectively
> > "loses" the data.
> >
> > To handle this correctly the FreeBSD user must add a configuration
> > statement to Bacula telling it to backspace one file at EOM.
>
> Yes. This is a problem.
>
> But part of the problem here is that dual-filemark at EOM is only one
> tape convention- and a poorly thought out one at best- it exists
> *solely* because a *few* (ancient) tape drives would unwind off the feed
> reel if you kept advancing them. For QIC drives, you *cannot* write dual
> filemarks (really).
>
> Note that there is a setting that can change the model to single EOM. If
> I could have gotten away with it, I would have made this the default.
>
> I think, though, I'd accept that the FreeBSD behaviour is a bug that
> should be fixed. If we have a dual fmk EOT model and are advancing along
> and hit two in a row, we *probably* should say we're at logical EOT and
> backspace over one of them. After all, this is what we do when we're
> *writing* to tape and close the no-rewind device.
>
> I also would agree that this situation is exacerbated by the 'space to
> end of recorded data' model for the MTEOM command. This now leaves us
> with a legacy of tapes with spurious dual filemarks in the middle.
>
> Oops. This means that I really can't fix things the way you'd like :-(.
>
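The dual-filemark convention discussed in point 2 amounts to counting consecutive zero-byte reads. A minimal sketch of that bookkeeping, with the hypothetical names `eom_detector` and `eom_feed` (neither comes from Bacula or the driver):

```c
#include <stdbool.h>
#include <sys/types.h>

/* Tracks consecutive zero-byte reads (filemarks) seen while scanning. */
struct eom_detector {
    int zero_reads;   /* consecutive read() calls that returned 0 */
};

/* Feed the return value of each read(). Returns true once two
 * consecutive zero-byte reads suggest logical end of medium. */
bool eom_feed(struct eom_detector *d, ssize_t nread)
{
    if (nread == 0)
        d->zero_reads++;
    else
        d->zero_reads = 0;   /* data (or an error) breaks the run */
    return d->zero_reads >= 2;
}
```

On a driver that leaves the tape positioned after the second filemark, the caller would presumably follow a `true` result with one backspace-file operation (`MTBSF`, count 1) before appending. Errors (negative returns) still need separate handling by the caller.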
> >
> > 3. I have previously described this but will do so again for
> > completeness here. On Solaris/Linux when Bacula does:
> >
> > write();
> > ioctl(MTEOF);
> > ioctl(MTEOF);
> > ioctl(MTBSF);
> > ioctl(MTBSF);
> > ioctl(MTBSR);
> > read();
> >
> > the read() re-reads the last write. On FreeBSD, the read returns
> > 0 bytes (there is also a problem of freezing the tape wrapped into
> > this example if I am not mistaken). Apparently the 0 bytes read is
> > because FreeBSD adds an additional EOF mark (not necessary) and
> > leaves the drive positioned *after* the mark thus re-reading the
> > last record fails when it logically should not.
>
> I don't believe that FreeBSD adds an additional filemark here, but I
> should add this as a test case. I have another tester program that I use
> for testing block locate, but I haven't really validated it or finished
> it yet.
>
> Why, btw, are you issuing two MTEOFs? The mtop has a count field y'know
> :-).
>
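The count field mentioned above lets one request write both filemarks. A minimal sketch, assuming the MTEOF in the example corresponds to the `MTWEOF` operation defined in `<sys/mtio.h>`; the helper name `tape_write_filemarks` is hypothetical:

```c
#include <sys/types.h>
#include <sys/ioctl.h>
#include <sys/mtio.h>

/* Write `count` filemarks with a single MTIOCTOP request.
 * Returns 0 on success, -1 on error. */
int tape_write_filemarks(int fd, int count)
{
    struct mtop cmd;

    cmd.mt_op = MTWEOF;     /* write end-of-file record(s) */
    cmd.mt_count = count;   /* e.g. 2 for the dual-filemark convention */
    return ioctl(fd, MTIOCTOP, &cmd);
}
```

Calling `tape_write_filemarks(fd, 2)` would replace the two separate MTEOF calls in the sequence quoted above.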
> >
> > 4. Tape freezing: On Solaris/Linux, the tape never "freezes". On
> > FreeBSD it does freeze. As best I can determine, you freeze the
> > drive when you lose track of where you are. Typically, this
> > occurs when I do a MTBSR to re-read the last record. On Solaris/Linux
> > the tape is never frozen, but when they don't know the position,
> > they simply return -1 in the MTIOCGET packet, which is fine with
> > me because Bacula only uses that info when initially reading a
> > tape to append to it.
> >
> > Freezing the tape causes all sorts of problems because it generates
> > a flood of unexpected errors. Within a large complicated program like
> > Bacula, when a low level routine re-reads a record during writing and
> > the tape freezes, it cannot simply rewind the drive as this could
> > cause chaos and possible overwriting of the beginning of the tape.
> >
> > I've attempted to overcome tape freezing by providing the user a
> > means to turn off MTBSR (but they don't always do so), and by issuing
> > ioctl(MTIOCERRSTAT) after every return of -1 from any I/O request.
> >
> > I recommend that you do away with freezing the drive -- it seems to
> > me that it only causes more problems. In saying that, I have to
> > admit that I really do not understand tape freezing or why you do it since
> > I found no documentation on it, and everything I write above I have
> > deduced from what Dan has reported back to me.
>
> Freezing the drive is precisely what Solaris and Linux *should* do. If
> you've lost position, you have to take some action to bring the tape to
> a known position. The unaware application should not be allowed to
> overwrite in random spots on the tape. If your low level read/write
> routines get any kind of error, you have to move to a "what do I have in
> my tape drive now?" state anyway.
>
> You know, I was pretty sure I'd documented the freeze option, but I
> cannot find it in the man page (sa(4)) now at all.
>
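The "what do I have in my tape drive now?" state described above can also be tracked on the application side. The sketch below is purely illustrative: every name in it is hypothetical, and it shows only the idea of refusing writes until the position is re-established, not how the driver implements freezing.

```c
#include <stdbool.h>

/* Hypothetical application-level view of the tape position. */
enum tape_pos { TAPE_POS_UNKNOWN, TAPE_POS_KNOWN };

struct tape_state {
    enum tape_pos pos;
};

/* Record the result of any tape I/O call: an error return
 * (negative value) invalidates the known position. */
void tape_note_result(struct tape_state *t, long rc)
{
    if (rc < 0)
        t->pos = TAPE_POS_UNKNOWN;
}

/* Call after a successful rewind or space to end of data. */
void tape_note_repositioned(struct tape_state *t)
{
    t->pos = TAPE_POS_KNOWN;
}

/* Writes are allowed only while the position is known. */
bool tape_may_write(const struct tape_state *t)
{
    return t->pos == TAPE_POS_KNOWN;
}
```

A low-level routine would pass each return value through `tape_note_result()` and check `tape_may_write()` before any write, leaving the recovery policy to a higher layer.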
>
> >
> > 5. I am quite fuzzy on this point because I forget exactly what happened
> > and what I did about it.
> >
> > It seems to me that on Linux, if I read a block but specify a number
> > of bytes less than the number actually in the block on the tape, the
> > driver returns the data anyway. I then check if the block is
> > internally complete and if not, increase my record size to the size
> > indicated in the data received, backspace one record, and re-read it.
> >
> > If I am not mistaken, on FreeBSD, the first read returns an error,
> > and Bacula just immediately gives up. Your documentation specifies
> > that one can never read a partial record from a tape, but it does not
> > specify what error code is generated. As a consequence, rather than
> > recovering and re-reading the record, Bacula has to assume it was
> > a fatal error.
>
> The reason linux 'succeeds' here is because linux internally reads all
> tape data to an oversized buffer in kernel memory anyway. This means
> that it doesn't suffer an 'overrun' condition which is what you are
> doing if you attempt to read *less* than a tape record size. Solaris
> will fail the same way, btw, as FreeBSD.
>
> What you should always do is start out by reading the largest possible
> record size (a pathetic 64KB for FreeBSD) and adjust *downward* (if
> desired and you are just autosizing to find a tape record size).
>
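The advice above (start with the largest record size and adjust downward) can be sketched as follows. The 64KB figure is the FreeBSD limit cited in the reply; the names `MAX_RECORD_SIZE` and `probe_record_size` are hypothetical, and the sketch assumes the drive is in variable block mode so that read() returns one record per call.

```c
#include <sys/types.h>
#include <unistd.h>

#define MAX_RECORD_SIZE (64 * 1024)  /* 64KB limit cited for FreeBSD */

/* Read one record using the largest buffer. The byte count returned
 * by read() is then the record's actual size.
 * Returns -1 on error and 0 at a filemark. */
ssize_t probe_record_size(int fd)
{
    static char buf[MAX_RECORD_SIZE];

    return read(fd, buf, sizeof buf);
}
```

Once the size is known, the application can backspace one record and continue with a buffer of that size, which avoids the overrun condition described above.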
>
> Thanks for doing the critique. There's definitely food for thought here
> and some changes that *should* be made.
--
Dan Langille : http://www.langille.org/
More information about the freebsd-scsi
mailing list