FreeBSD 10.2-RELEASE #0 r286666: Panic and crash

Karl Denninger karl at denninger.net
Mon Feb 6 21:23:33 UTC 2017


On 2/6/2017 15:01, Shawn Bakhtiar wrote:
> Hi all!
>
> http://pastebin.com/niXrjF0D
>
> Please refer to full output from crash above.
>
> This morning our IMAP server decided to go belly up. I could not remote in, and the machine would not respond to any pings.
>
> Checking the physical console I had the following worrisome messages on screen:
>
> • g_vfs_done():da1p1[READ(offset=7265561772032, length=32768)]error = 5
> • g_vfs_done():da1p1[WRITE(offset=7267957735424, length=131072)]error = 16
> • /mnt/USBBD: got error 16 while accessing filesystem
> • panic: softdep_deallocate_dependencies: unrecovered I/O error
> • cpuid = 5
>
> /mnt/USBBD is a MyBook USB 8TB drive that we use for daily backups of the IMAP data using rsync. Everything so far has worked without issue.
>
> I also noticed a bunch of:
>
> • fstat: can't read file 2 at 0x4000000001fffff
> • fstat: can't read file 4 at 0x780000ffff
> • fstat: can't read file 5 at 0x600000000
> • fstat: can't read file 1 at 0x200007fffffffff
> • fstat: can't read file 2 at 0x4000000001fffff
> • fstat: can't read file 4 at 0x780000ffff
> • fstat: can't read file 5 at 0x600000000
>
>
> but I have no idea what these are from.
>
> df -h output:
> /dev/da0p2    1.8T    226G    1.5T    13%    /
> devfs         1.0K    1.0K      0B   100%    /dev
> /dev/da1p1    7.0T    251G    6.2T     4%    /mnt/USBBD
>
>
> da0p2 is a RAID level 5 on an HP Smart Array
>
> Here is the output of dmesg after reboot:
> http://pastebin.com/rHVjgZ82
>
> Obviously both the RAID and USB drive did not walk away from the crash cleanly. Should I be running a fsck at this point on both from single user mode to verify and clean up? My concern is the:
> WARNING: /: mount pending error: blocks 0 files 26
> when mounting /dev/da0p2
>
> For some reason I was under the impression that fsck was run automatically on reboot.
>
> Any help in this matter would be greatly appreciated. I'm a little concerned that a backup strategy that has worked for us for many MANY years would so easily throw the OS into panic. If an I/O error occurred on the USB Drive I would frankly think it should just back out, without panic. Or am I missing something?
>
> Any recommendations / insights would be most welcome.
> Shawn
>
>
The "mount pending error" is normal on a disk that has softupdates
turned on; fsck runs in the background after the boot, and this is
"safe" because of how the metadata and data writes are ordered.  In
other words the filesystem in this situation is missing uncommitted
data, but its on-disk state is consistent.  As a result the system
can mount root read-write without having to fsck it first, and the
background cleanup can run without risking filesystem consistency.
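
For anyone who would rather have the check finish before the
filesystems go read-write, a minimal sketch of the relevant
/etc/rc.conf knobs follows.  The variable names are the stock ones from
/etc/defaults/rc.conf; the values are only an illustration and not a
recommendation for this particular machine.

```shell
# /etc/rc.conf fragment (illustrative values, adjust to taste)
background_fsck="NO"   # check dirty filesystems in the foreground during boot
fsck_y_enable="YES"    # if the automatic preen fails, retry with fsck -y
```

The trade-off is a slower boot after an unclean shutdown, since a
multi-terabyte volume can take a while to check in the foreground.  A
one-off manual check from single-user mode would be something along the
lines of "fsck -t ufs -y /dev/da1p1" with the volume unmounted.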

The panic itself appears to have been caused by an I/O error on da1p1
(the USB drive) that made a write fail.

I was part of a thread on this in 2016, which you can find here:
https://lists.freebsd.org/pipermail/freebsd-stable/2016-July/084944.html

The basic problem is that the softupdates code cannot deal with a hard
I/O error on write because it no longer can guarantee filesystem
integrity if it continues.  I argued in that thread that the superior
solution would be to forcibly detach the volume, which would leave you with
a "dirty" filesystem and a failed operation but not a panic.  The
file(s) involved in the write error might be lost, but the integrity of
the filesystem is recoverable (as it is in the panic case) -- at least
it is if the fsck doesn't require writing to a block that *also* errors out.

The decision in the code is to panic rather than detach the volume,
however, so panic it is.  This one has bitten me with SD cards in small
embedded-style machines (where turning off softupdates makes things VERY
slow) and at some point I may look into developing a patch to
forcibly-detach the volume instead.  That obviously won't help you if
the system volume is the one the error happens on (now you just forcibly
detached the root filesystem which is going to get you an immediate
panic anyway) but in the event of a data disk it would prevent the
system from crashing.

-- 
Karl Denninger
karl at denninger.net <mailto:karl at denninger.net>
/The Market Ticker/
/[S/MIME encrypted email preferred]/