panic while zfs scrubbing
Andriy Gapon
avg at FreeBSD.org
Wed Aug 22 13:11:46 UTC 2012
on 22/08/2012 00:09 Roger Hammerstein said the following:
>
>
> I have a zpool where scrub seems to cause panics.
>
> I do not have zfs in rc.conf, but import manually
> on boot.
>
> I start a scrub on a zpool, and some time through will get a panic
> and reboot.
> After panic and reboot, re-importing the pool and allowing
> the scrub to restart on its own will cause another panic.
> So I import and immediately stop the scrub for now.
>
> ls -la *.{9,8,10}
> -rw------- 1 root wheel 150744 Aug 21 16:46 core.txt.10
> -rw------- 1 root wheel 147280 Aug 21 11:04 core.txt.8
> -rw------- 1 root wheel 148572 Aug 21 14:53 core.txt.9
> -rw------- 1 root wheel 457 Aug 21 16:45 info.10
> -rw------- 1 root wheel 456 Aug 21 11:04 info.8
> -rw------- 1 root wheel 458 Aug 21 14:52 info.9
> -rw------- 1 root wheel 643919872 Aug 21 16:46 vmcore.10
> -rw------- 1 root wheel 767168512 Aug 21 11:04 vmcore.8
> -rw------- 1 root wheel 1097850880 Aug 21 14:53 vmcore.9
>
>
> 9.1-BETA1 FreeBSD 9.1-BETA1 #34: Thu Jul 12 05:57:44 EDT 2012
> amd64
> 4GB of ram, 4gb of swap.
>
>
> panic: integer divide fault
>
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB. Type "show warranty" for details.
> This GDB was configured as "amd64-marcel-freebsd"...
>
> Unread portion of the kernel message buffer:
>
>
> Fatal trap 18: integer divide fault while in kernel mode
> cpuid = 5; apic id = 05
> instruction pointer = 0x20:0xffffffff81674a14
> stack pointer = 0x28:0xffffff810c3d4520
> frame pointer = 0x28:0xffffff810c3d4540
> code segment = base 0x0, limit 0xfffff, type 0x1b
> = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags = interrupt enabled, resume, IOPL = 0
> current process = 9480 (txg_thread_enter)
> trap number = 18
> panic: integer divide fault
> cpuid = 5
>
> KDB: stack backtrace:
> #0 0xffffffff80920346 at kdb_backtrace+0x66
> #1 0xffffffff808ea35e at panic+0x1ce
> #2 0xffffffff80bd7a30 at trap_fatal+0x290
> #3 0xffffffff80bd80c5 at trap+0x105
> #4 0xffffffff80bc295f at calltrap+0x8
> #5 0xffffffff816818cf at vdev_mirror_io_start+0x2bf
> #6 0xffffffff81699542 at zio_vdev_io_start+0x232
> #7 0xffffffff81698fe3 at zio_execute+0xc3
> #8 0xffffffff8165ea1c at dsl_scan_scrub_cb+0x3ec
> #9 0xffffffff8165fe14 at dsl_scan_visitbp+0x534
> #10 0xffffffff8165fd99 at dsl_scan_visitbp+0x4b9
> #11 0xffffffff81660c84 at dsl_scan_visitdnode+0x84
> #12 0xffffffff81660070 at dsl_scan_visitbp+0x790
> #13 0xffffffff8165fd99 at dsl_scan_visitbp+0x4b9
> #14 0xffffffff8165fd99 at dsl_scan_visitbp+0x4b9
> #15 0xffffffff8165fd99 at dsl_scan_visitbp+0x4b9
> #16 0xffffffff8165fd99 at dsl_scan_visitbp+0x4b9
> #17 0xffffffff8165fd99 at dsl_scan_visitbp+0x4b9
> Uptime: 1h51m55s
> Dumping 614 out of 3818 MB:..3%..11%..21%..32%..42%..53%..63%..71%..81%..92%
>
>
> Reading symbols from /boot/kernel/zfs.ko...Reading symbols from /boot/kernel/zfs.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/zfs.ko
> Reading symbols from /boot/kernel/opensolaris.ko...Reading symbols from /boot/kernel/opensolaris.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/opensolaris.ko
> #0 doadump (textdump=Variable "textdump" is not available.
> ) at pcpu.h:224
> 224 pcpu.h: No such file or directory.
> in pcpu.h
> (kgdb) #0 doadump (textdump=Variable "textdump" is not available.
> ) at pcpu.h:224
> #1 0xffffffff808e9e41 in kern_reboot (howto=260)
> at /usr/src/sys/kern/kern_shutdown.c:448
> #2 0xffffffff808ea337 in panic (fmt=0x1 <Address 0x1 out of bounds>)
> at /usr/src/sys/kern/kern_shutdown.c:636
> #3 0xffffffff80bd7a30 in trap_fatal (frame=0x12, eva=Variable "eva" is not available.
> )
> at /usr/src/sys/amd64/amd64/trap.c:857
> #4 0xffffffff80bd80c5 in trap (frame=0xffffff810c3d4470)
> at /usr/src/sys/amd64/amd64/trap.c:599
> #5 0xffffffff80bc295f in calltrap ()
> at /usr/src/sys/amd64/amd64/exception.S:228
> #6 0xffffffff81674a14 in spa_get_random (range=0)
> at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c:1165
I am not sure what triggers this problem, but spa_get_random() was called with
range=0, so it looks like a zio was issued for a block pointer with no valid
DVA. That is either the result of a logic bug in the ZFS code or of severe
on-disk corruption.
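
To illustrate why a block pointer without valid DVAs would end in a divide
fault, here is a small userland sketch. It is not the kernel code: the
model_* names and the simplified structures are made up for illustration, and
it assumes that the mirror code passes the number of valid DVAs as the range
and that the random helper reduces its value modulo that range.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for dva_t and blkptr_t. */
typedef struct {
	uint64_t asize;		/* allocated size; 0 means "no DVA here" */
} model_dva_t;

typedef struct {
	model_dva_t blk_dva[3];	/* a block pointer has up to three DVAs */
} model_blkptr_t;

/* Count the DVAs that look allocated (assumed analogue of BP_GET_NDVAS). */
static int
model_bp_ndvas(const model_blkptr_t *bp)
{
	int n = 0;

	for (int i = 0; i < 3; i++)
		if (bp->blk_dva[i].asize != 0)
			n++;
	return (n);
}

/*
 * Guarded version of "r % range".  The unguarded modulo with range == 0
 * is what would raise an integer divide fault on amd64; here it is
 * reported as an error (-1) instead.
 */
static int
model_get_random(uint64_t range, uint64_t *out)
{
	assert(out != NULL);
	if (range == 0)
		return (-1);
	*out = (uint64_t)rand() % range;
	return (0);
}
```

With an all-zero model_blkptr_t, model_bp_ndvas() returns 0 and
model_get_random() refuses it, which is the case that the unguarded code
would turn into trap 18.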
> #7 0xffffffff816818cf in vdev_mirror_io_start (zio=0xfffffe0037e5e000)
> at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_mirror.c:89
Could you please print *zio and *zio->io_bp in this frame (frame 7)?
It might also be a good idea to report this issue to zfs-discuss at opensolaris.org.
> #8 0xffffffff81699542 in zio_vdev_io_start (zio=0xfffffe0037e5e000)
> at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:2305
> #9 0xffffffff81698fe3 in zio_execute (zio=0xfffffe0037e5e000)
> at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1196
> #10 0xffffffff8165ea1c in dsl_scan_scrub_cb (dp=0xffffff810c3d4538,
> bp=0xffffff8003c53480, zb=0xffffff810c3d4970)
> at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c:1737
Please print *bp and *scn in this frame (frame 10) too.
> #11 0xffffffff8165fe14 in dsl_scan_visitbp (bp=0xffffff8003c53480,
> zb=0xffffff810c3d4970, dnp=0xffffff8003642200, pbuf=Variable "pbuf" is not available.
> )
> at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c:858
> #12 0xffffffff8165fd99 in dsl_scan_visitbp (bp=0xffffff8003642240,
> zb=0xffffff810c3d4a00, dnp=0xffffff8003642200, pbuf=Variable "pbuf" is not available.
> )
> at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c:684
> #13 0xffffffff81660c84 in dsl_scan_visitdnode (scn=0xfffffe001523dc00,
> ds=0xfffffe0037abf400, ostype=DMU_OST_ZFS, dnp=0xffffff8003642200,
> buf=0xfffffe00befda9c0, object=291417, tx=0xfffffe00151fc400)
> at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c:770
> #14 0xffffffff81660070 in dsl_scan_visitbp (bp=0xffffff800359b900,
> zb=0xffffff810c3d4cb0, dnp=0xfffffe0008076000, pbuf=Variable "pbuf" is not available.
> )
> at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c:718
> #15 0xffffffff8165fd99 in dsl_scan_visitbp (bp=0xffffff80033e5380,
> zb=0xffffff810c3d4e10, dnp=0xfffffe0008076000, pbuf=Variable "pbuf" is not available.
> )
> at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c:684
> #16 0xffffffff8165fd99 in dsl_scan_visitbp (bp=0xffffff80033df000,
> zb=0xffffff810c3d4f70, dnp=0xfffffe0008076000, pbuf=Variable "pbuf" is not available.
> )
> at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c:684
> #17 0xffffffff8165fd99 in dsl_scan_visitbp (bp=0xffffff80033db000,
> zb=0xffffff810c3d50d0, dnp=0xfffffe0008076000, pbuf=Variable "pbuf" is not available.
> )
> at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c:684
> #18 0xffffffff8165fd99 in dsl_scan_visitbp (bp=0xffffff8003451000,
> zb=0xffffff810c3d5230, dnp=0xfffffe0008076000, pbuf=Variable "pbuf" is not available.
> )
> at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c:684
> #19 0xffffffff8165fd99 in dsl_scan_visitbp (bp=0xffffff80033d7000,
> zb=0xffffff810c3d5390, dnp=0xfffffe0008076000, pbuf=Variable "pbuf" is not available.
> )
> at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c:684
> #20 0xffffffff8165fd99 in dsl_scan_visitbp (bp=0xfffffe0008076040,
> zb=0xffffff810c3d5420, dnp=0xfffffe0008076000, pbuf=Variable "pbuf" is not available.
> )
> at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c:684
> #21 0xffffffff81660c84 in dsl_scan_visitdnode (scn=0xfffffe001523dc00,
> ds=0xfffffe0037abf400, ostype=DMU_OST_ZFS, dnp=0xfffffe0008076000,
> buf=0xfffffe00375996e8, object=0, tx=0xfffffe00151fc400)
> at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c:770
> #22 0xffffffff8165ff9a in dsl_scan_visitbp (bp=0xfffffe003729e280,
> zb=0xffffff810c3d55f0, dnp=0x0, pbuf=Variable "pbuf" is not available.
> )
> at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c:736
> #23 0xffffffff816600d7 in dsl_scan_visit_rootbp (scn=Variable "scn" is not available.
> )
> at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c:872
> #24 0xffffffff81660172 in dsl_scan_visitds (scn=0xfffffe001523dc00, dsobj=21,
> tx=0xfffffe00151fc400)
> at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c:1099
> #25 0xffffffff81660695 in dsl_scan_sync (dp=0xfffffe0037335000,
> tx=0xfffffe00151fc400)
> at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c:1355
> #26 0xffffffff81667e30 in spa_sync (spa=0xfffffe0008161000, txg=97010)
> at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c:5711
> #27 0xffffffff81678749 in txg_sync_thread (arg=Variable "arg" is not available.
> )
> at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/txg.c:423
> #28 0xffffffff808bb4cf in fork_exit (
> callout=0xffffffff81678610 <txg_sync_thread>, arg=0xfffffe0037335000,
> frame=0xffffff810c3d5c40) at /usr/src/sys/kern/kern_fork.c:992
> #29 0xffffffff80bc2e8e in fork_trampoline ()
> at /usr/src/sys/amd64/amd64/exception.S:602
[snip]
--
Andriy Gapon