sa: write returns 0 = LEOM?
Matthew Jacob
mj at feral.com
Wed Jun 16 23:31:48 UTC 2010
On 6/16/2010 3:52 PM, Dustin J. Mitchell wrote:
> I'm investigating a user bug report in Amanda:
> http://forums.zmanda.com/showthread.php?t=2832
>
> The problem boils down to a write(2) call for a SCSI tape device
> (/dev/nsa0) returning 0 after quite a bit of data and a number of
> filemarks have been written. Jean-Louis suspected that this was an
> early warning EOM indication, and that a subsequent write() would
> succeed, with Amanda having been duly warned that a physical EOM is
> coming up.
That is, I believe, a specific feature of Solaris (EOM detection
triggers a zero write, but allows for trailer records). I seem to
recall helping architect this back in 1996.
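A minimal sketch of how a userspace writer might interpret the return value of write(2) under the Solaris-style convention described above, where a zero return is an early warning rather than a failure. The names classify_write and write_outcome are hypothetical and come from neither Amanda nor any driver; whether FreeBSD's sa driver follows this convention is exactly what the thread is asking.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical classification of a single write(2) result on a tape device. */
enum write_outcome {
    WROTE_ALL,      /* the whole record was accepted */
    EARLY_WARNING,  /* zero return: assumed to mean "EOM is near" */
    SHORT_WRITE,    /* some, but not all, of the record was accepted */
    WRITE_ERROR     /* write(2) returned -1; the caller should inspect errno */
};

/* n is the value returned by write(2); requested is the record length. */
static enum write_outcome
classify_write(ssize_t n, size_t requested)
{
    assert(requested > 0);
    if (n < 0)
        return WRITE_ERROR;
    if (n == 0)
        return EARLY_WARNING;
    if ((size_t)n < requested)
        return SHORT_WRITE;
    return WROTE_ALL;
}
```

Under this reading, a caller that sees EARLY_WARNING would retry the same record once and then start writing its trailer records.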
> But looking at scsi_sa.c, this doesn't seem to be the
> case. It looks like an early warning would result in a successful
> write instead, because resid is set to zero.
>
> cam/scsi/scsi_sa.c:
> /*
>  * Handle filemark, end of tape, mismatched record sizes....
>  * From this point out, we're only handling read/write cases.
>  * Handle writes && reads differently.
>  */
>
> if (csio->cdb_io.cdb_bytes[0] == SA_WRITE) {
>     if (sense_key == SSD_KEY_VOLUME_OVERFLOW) {
>         csio->resid = resid;
>         error = ENOSPC;
>     } else if (sense->flags & SSD_EOM) {
>         softc->flags |= SA_FLAG_EOM_PENDING;
>         /*
>          * Grotesque as it seems, the few times
>          * I've actually seen a non-zero resid,
>          * the tape drive actually lied and had
>          * written all the data!.
>          */
>         csio->resid = 0;
>     }
>
>
Yes, I remember this code. I remember that when I did test readbacks, the
residual reported was in fact incorrect: the data had actually been
written. But this was a long while back (at least 8 years ago).
> That said, I don't know my way around the kernel source, so I'm
> probably missing something obvious. So:
>
> 1. What could cause a write syscall to return 0?
>
I'll try and look into this.
Do you happen to know whether the device you experienced this on was set
in fixed block or variable block mode?
> 2. Since we will be using early warning in the next version of Amanda,
> hints as to the best way to handle early warning from userspace would
> be appreciated.
>
>
Urrr....
I used to have opinions about this. Now I'm not so sure. Expecting
consistent behaviour from platform to platform is tough.
Can't you write until you get a hard failure, back up one record (which,
of course, you've hung onto), write a trailer label and then ask for a
new tape?
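The strategy above could be sketched roughly as follows, using a toy in-memory tape so the example runs without a drive. Everything here (toy_tape, toy_write, put_record, the 8-byte records) is hypothetical. The sketch also assumes one particular reading of "back up one record": the trailer overwrites the last record that reached the tape, and that record is replayed on the next tape.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TAPE_RECORDS 4   /* toy capacity, in records */
#define RECORD_SIZE  8   /* toy record length, in bytes */

/* Toy in-memory "tape": a stand-in for a real device, illustration only. */
struct toy_tape {
    char records[TAPE_RECORDS][RECORD_SIZE];
    int  used;    /* records currently on this tape */
    int  volume;  /* incremented on each tape change */
};

/* Returns RECORD_SIZE on success, -1 when the tape is full (hard failure). */
static long toy_write(struct toy_tape *t, const char *rec)
{
    if (t->used >= TAPE_RECORDS)
        return -1;
    memcpy(t->records[t->used++], rec, RECORD_SIZE);
    return RECORD_SIZE;
}

/* Stand-in for backspacing one record (MTIOCTOP with MTBSR on a real drive). */
static void toy_backspace(struct toy_tape *t)
{
    if (t->used > 0)
        t->used--;
}

/* Stand-in for the operator loading a fresh tape. */
static void toy_next_tape(struct toy_tape *t)
{
    t->used = 0;
    t->volume++;
}

/*
 * Write rec; prev is the last record that reached the tape, or NULL.
 * On hard failure: back up over prev, write a trailer in its place,
 * change tapes, then replay prev and rec on the new tape.
 * Returns 0 if rec fit, 1 if a tape change happened, -1 on error.
 */
static int put_record(struct toy_tape *t, const char *prev, const char *rec)
{
    static const char trailer[RECORD_SIZE] = "TRAILER";

    assert(t != NULL && rec != NULL);
    if (toy_write(t, rec) == RECORD_SIZE)
        return 0;
    toy_backspace(t);                 /* give up the last record's space */
    if (toy_write(t, trailer) != RECORD_SIZE)
        return -1;
    toy_next_tape(t);
    if (prev != NULL && toy_write(t, prev) != RECORD_SIZE)
        return -1;
    return toy_write(t, rec) == RECORD_SIZE ? 1 : -1;
}
```

The caller passes the previous record on every call, which is the "hung onto" part: nothing already handed to the tape is discarded until the next record has been written successfully.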