SCSI tape data loss
Kern Sibbald
kern at sibbald.com
Tue Jun 3 06:37:46 PDT 2003
Thanks for the lesson in how blocks are written to
a tape -- especially the example.
I'm now leaning strongly toward aligning
my buffers. However, a couple more questions, please.
- When using tar (or say Bacula), how do you know
your writes are split by the kernel? In the case
of Bacula, with the buffer size I use, it ALWAYS
gets back exactly what it wrote. From my userland
perspective I see no double writes.
- What is the "page" that you are referring to?
Paged memory? If I am not mistaken the page
size can be radically different depending on the
OS and hardware, e.g. 1024 to 4096 or even more.
- How does one determine what a page size is,
preferably in a system independent way?
Thanks,
Kern
On Tue, 2003-06-03 at 15:19, Carl Reisinger wrote:
> >Concerning the maximum buffer size: I have chosen
> >the default maximum buffer size to be 64512 bytes so
> >that it is smaller than 65536. In fact, 64512 bytes is
> >the size (126 blocks) that I used for tar in 1982
> >and never had any problems.
>
> Try using the FreeBSD tar with the multi-volume flag (-M) and
> your record size.
>
> Without the flag the writes are page aligned, with the flag
> the writes are offset some, either 512 or 1536 bytes (I forget
> which), and the writes will be split by the kernel physio
> function into a 60K and 3K write. (This is with the tar
> shipped with FreeBSD up to at least 4.2. Later ones may also
> do this, I have not tried them)
>
> >
> >From what I understand, the 65536 point at which
> >buffers are always split only applies to devices in
> >fixed block mode, and probably older devices at that.
>
> This magic number has nothing to do with the device. I've only
> used variable block mode and newer technologies, SDLT, LTO.
>
> >
> >Though Bacula can run in fixed block mode, the
> >default is variable block, so I don't see that as
> >an issue here -- unless I am missing something?
> >
> >Can you explain why you mention 61440 bytes? and
> >why it might be a better choice than 64512?
> >
>
> 61440 was mentioned since that is the largest write that can
> be done without the physio function doing some surprising and
> annoying things to your write. 61440 is the size that, no
> matter its address alignment, can always be mapped with one
> page register.
>
> If you are careful to page align all writes then you can write
> up to 65536 and have one record sent to the tape device.
>
> (Actually, with a minor change to scsi_sa.c and limiting
> oneself to newer SCSI HBAs you can go as high as 128KB for
> read/write)
>
> An example:
>
> Write 64512 bytes with a starting address of 4096. Physio will
> take this, see that the address is page aligned, check that
> it can be mapped with one page register and perform one write.
>
> Now let's write 64512 bytes but with an address of 5632. In
> this case physio will notice it is not page aligned and
> adjust the starting address to be 4096. Now 66048 bytes need
> to be mapped which exceeds the default size of 65536. In this
> case physio will map the first 60K (64K to him because of the
> starting address change), write that and then map and write
> the remainder.
>
> Now when one goes back to read 64512 bytes, the first read
> returns 61440 bytes and the second 3072 instead of just one
> read returning 64512.
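
The arithmetic in the example above can be sketched in C. This is only
a model of the behaviour as described, not the kernel's physio code. It
assumes a 4096-byte page and the 65536-byte default limit quoted above,
and first_chunk() is a hypothetical name. It also assumes that an
unaligned request which does not fit is cut back to the limit minus one
page, which reproduces the 61440 + 3072 split in the example.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_ASSUMED 4096u   /* assumption: 4 KB pages */
#define MAX_MAPPED        65536u  /* default limit quoted in the example */

/*
 * Hypothetical helper: size of the first transfer for a write of
 * len bytes starting at address addr. Whatever is left over
 * (len - first_chunk) would go out as a separate write.
 */
static size_t first_chunk(uintptr_t addr, size_t len)
{
    /* Distance from the preceding page boundary to addr. */
    size_t offset = (size_t)(addr % PAGE_SIZE_ASSUMED);

    /* The page-rounded span fits: one write, one tape record. */
    if (offset + len <= MAX_MAPPED)
        return len;

    /* Too big: an unaligned request loses one page of capacity. */
    return offset != 0 ? MAX_MAPPED - PAGE_SIZE_ASSUMED : MAX_MAPPED;
}
```

With these assumptions, first_chunk(4096, 64512) is 64512 (one write),
while first_chunk(5632, 64512) is 61440, leaving 3072 bytes for the
second write.
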
>
> >On aligning the buffers on a page boundary: interesting
> >idea, I'll look into it, but I'm not too keen on the
> >idea.
> >
>
> If your software has no problem with short reads and records
> being split into two, then don't bother page aligning.
> But, if you want to read exactly what you know you wrote then
> alignment is a must.
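
A minimal sketch of page aligning a write buffer from userland, assuming
a POSIX system that provides sysconf(_SC_PAGESIZE) and posix_memalign().
Older systems may lack posix_memalign(); valloc() or rounding a pointer
up inside a larger malloc() block are possible substitutes. The helper
names alloc_page_aligned() and is_page_aligned() are hypothetical and
are not taken from Bacula or tar.

```c
#define _POSIX_C_SOURCE 200112L   /* ask for posix_memalign() */

#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: allocate len bytes starting on a page
 * boundary. Returns NULL on failure; release with free().
 */
static void *alloc_page_aligned(size_t len)
{
    long page = sysconf(_SC_PAGESIZE);   /* page size at run time */
    void *buf = NULL;

    if (page <= 0)
        return NULL;
    if (posix_memalign(&buf, (size_t)page, len) != 0)
        return NULL;
    return buf;
}

/* Hypothetical helper: nonzero if p sits on a page boundary. */
static int is_page_aligned(const void *p)
{
    long page = sysconf(_SC_PAGESIZE);

    return page > 0 && ((uintptr_t)p % (uintptr_t)page) == 0;
}
```

A buffer obtained with alloc_page_aligned(64512) starts on a page
boundary, so a write of the whole buffer stays within the 65536 limit
discussed above regardless of the actual page size.
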
>
> Carl