g_mirror_access() dropping geom topology_lock [Was: Kernel crash trying to import a ZFS pool with log device]
Andriy Gapon
avg at FreeBSD.org
Fri Mar 21 09:39:02 UTC 2014
on 20/03/2014 23:13 Andreas Longwitz said the following:
[snip]
> But if I now run "zpool export" and "zpool import" the kernel crashes:
> ...
> vdev_geom_open_by_path:554[1]: Found provider by name /dev/label/C325BL31.
> vdev_geom_attach:102[1]: Attaching to label/C325BL31.
> g_access(0xffffff0096b23a00(label/C325BL31), 1, 0, 1)
> open delta:[r1w0e1] old:[r0w0e0] provider:[r0w0e0]
> 0xffffff0002c0d400(label/C325BL31)
> g_access(0xffffff0002ba7300(da2), 1, 0, 2)
> open delta:[r1w0e2] old:[r0w0e0] provider:[r0w0e0] 0xffffff0002a23800(da2)
> g_disk_access(da2, 1, 0, 2)
> vdev_geom_attach:123[1]: Created geom and consumer for label/C325BL31.
> vdev_geom_read_config:248[1]: Reading config from label/C325BL31...
> vdev_geom_open_by_path:569[1]: guid match for provider /dev/label/C325BL31.
> vdev_geom_open_by_path:554[1]: Found provider by name /dev/label/C330CJHW.
> vdev_geom_attach:102[1]: Attaching to label/C330CJHW.
> g_access(0xffffff00969a5280(label/C330CJHW), 1, 0, 1)
> open delta:[r1w0e1] old:[r0w0e0] provider:[r0w0e0]
> 0xffffff0002c02100(label/C330CJHW)
> g_access(0xffffff0002b96280(da3), 1, 0, 2)
> open delta:[r1w0e2] old:[r0w0e0] provider:[r0w0e0] 0xffffff0002a23b00(da3)
> g_disk_access(da3, 1, 0, 2)
> vdev_geom_attach:143[1]: Created consumer for label/C330CJHW.
> vdev_geom_read_config:248[1]: Reading config from label/C330CJHW...
> vdev_geom_open_by_path:569[1]: guid match for provider /dev/label/C330CJHW.
> vdev_geom_open_by_path:554[1]: Found provider by name /dev/mirror/gm0p3.
> vdev_geom_attach:102[1]: Attaching to mirror/gm0p3.
> g_access(0xffffff0096b24180(mirror/gm0p3), 1, 0, 1)
> open delta:[r1w0e1] old:[r0w0e0] provider:[r0w0e0]
> 0xffffff00969c7e00(mirror/gm0p3)
> g_part_access(mirror/gm0p3,1,0,1)
> g_access(0xffffff0096c0f800(mirror/gm0), 1, 0, 1)
> open delta:[r1w0e1] old:[r8w8e16] provider:[r8w8e16]
> 0xffffff00969c7d00(mirror/gm0)
> GEOM_MIRROR[2]: Access request for mirror/gm0: r1w0e1.
The following part of the log is very informative.
Thank you for capturing it.
> vdev_geom_attach:143[1]: Created consumer for mirror/gm0p3.
> vdev_geom_read_config:248[1]: Reading config from mirror/gm0p3...
> vdev_geom_open_by_path:569[1]: guid match for provider /dev/mirror/gm0p3.
I read the above as thread A entering vdev_geom_open_by_path and trying to
"taste" /dev/mirror/gm0p3.
> g_post_event_x(0xffffffff80b16830, 0xffffff0096b24180, 2, 0)
> vdev_geom_detach:163[1]: Closing access to mirror/gm0p3.
> g_access(0xffffff0096b24180(mirror/gm0p3), -1, 0, -1)
> open delta:[r-1w0e-1] old:[r1w0e1] provider:[r1w0e1]
> 0xffffff00969c7e00(mirror/gm0p3)
Simultaneously thread B is closing access /dev/mirror/gm0p3.
It is not clear from the quoted log when and how this thread B got access to the
device in the first place.
> g_part_access(mirror/gm0p3,-1,0,-1)
> g_access(0xffffff0096c0f800(mirror/gm0), -1, 0, -1)
> open delta:[r-1w0e-1] old:[r9w8e17] provider:[r9w8e17]
> 0xffffff00969c7d00(mirror/gm0)
> GEOM_MIRROR[2]: Access request for mirror/gm0: r-1w0e-1.
> vdev_geom_open_by_path:554[1]: Found provider by name /dev/mirror/gm0p3.
> vdev_geom_attach:102[1]: Attaching to mirror/gm0p3.
> vdev_geom_attach:128[1]: Found consumer for mirror/gm0p3.
> g_access(0xffffff0096b24180(mirror/gm0p3), 1, 0, 1)
> open delta:[r1w0e1] old:[r1w0e1] provider:[r1w0e1]
> 0xffffff00969c7e00(mirror/gm0p3)
Thread A is accessing the device.
> g_part_access(mirror/gm0p3,1,0,1)
> g_access(0xffffff0096c0f800(mirror/gm0), 1, 0, 1)
> Open delta:[r1w0e1] old:[r9w8e17] provider:[r9w8e17]
> 0xffffff00969c7d00(mirror/gm0)
> GEOM_MIRROR[2]: Access request for mirror/gm0: r1w0e1.
> vdev_geom_detach:167[1]: Destroyed consumer to mirror/gm0p3.
> g_detach(0xffffff0096b24180)
> g_destroy_consumer(0xffffff0096b24180)
Thread B is destroying a special ZFS "taster" consumer.
> vdev_geom_attach:147[1]: Used existing consumer for mirror/gm0p3.
Thread A is trying to re-use the taster consumer, which has just been destroyed.
> vdev_geom_read_config:248[1]:
>
> Fatal trap 12: page fault while in kernel mode
Boom!
I see two issues here.
First, the ZFS tasting code could be made more robust. If it never tried to
re-use the consumer and always created a new one, then most likely this crash
could be avoided. But there is no bug in the code. The code is correct and it
it uses GEOM topology lock to avoid any concurrency issues.
But GEOM mirror code breaks a contract on which the ZFS code relies.
g_access() must be called with the topology lock hold.
I extend this requirement to a requirement that access method of any GEOM
provider must operate under the topology lock and must never drop it.
In other words, if a caller must acquire g_topology_lock before calling
g_access, then in return it must have a guarantee that the GEOM topology stays
unchanged across the call to g_access().
g_mirror_access() breaks the above contract.
So, the code in vdev_geom_attach() obtains g_topology_lock, then it finds an
existing valid consumer and calls g_access() on it. It reasonably expects that
the consumer remains valid, but because g_mirror_access() drops and requires the
topology lock, there is a chance that the topology can change and the consumer
may become invalid.
I am not very familiar with gmirror code, so I am not sure how to fix the
problem from that end.
> cpuid = 1; apic id = 01
> fault virtual address = 0x0
> fault code = supervisor read data, page not present
> instruction pointer = 0x20:0xffffffff80b16f01
> stack pointer = 0x28:0xffffff82452325b0
> frame pointer = 0x28:0xffffff8245232650
> code segment = base 0x0, limit 0xfffff, type 0x1b
> = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags = interrupt enabled, resume, IOPL = 0
> current process = 15494 (initial thread)
> [thread pid 15494 tid 100151 ]
> Stopped at vdev_geom_read_config+0x71: movq (%rdx),%rsi
>
> (kgdb) where
> ...
> #9 0xffffffff805dce1b in trap (frame=0xffffff8245232500) at
> /usr/src/sys/amd64/amd64/trap.c:457
> #10 0xffffffff805c3024 in calltrap () at
> /usr/src/sys/amd64/amd64/exception.S:228
> #11 0xffffffff80b16f01 in vdev_geom_read_config (cp=0xffffff0096b24180,
> config=0xffffff8245232670)
> at
> /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c:248
> #12 0xffffffff80b17194 in vdev_geom_read_guid (cp=<value optimized out>)
> at
> /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c:454
> #13 0xffffffff80b172f0 in vdev_geom_open_by_path (vd=0xffffff0002b2f000,
> check_guid=1)
> at
> /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c:559
> #14 0xffffffff80b17528 in vdev_geom_open (vd=0xffffff0002b2f000,
> psize=0xffffff8245232760, max_psize=0xffffff8245232758,
> ashift=0xffffff8245232750) at
> /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c:608
> #15 0xffffffff80aca87a in vdev_open (vd=0xffffff0002b2f000)
> at
> /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c:1153
> #16 0xffffffff80acac5e in vdev_reopen (vd=0xffffff0002b2f000)
> at
> /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c:1514
> #17 0xffffffff80ab84e0 in spa_load (spa=0xffffff0002b85000,
> state=SPA_LOAD_TRYIMPORT, type=SPA_IMPORT_EXISTING,
> mosconfig=<value optimized out>) at
> /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c:1654
> #18 0xffffffff80abaa40 in spa_tryimport (tryconfig=0xffffff00024b2260)
> at
> /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c:4184
> #19 0xffffffff80afb486 in zfs_ioc_pool_tryimport (zc=0xffffff8001f1d000)
> at
> /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c:1630
> #20 0xffffffff80afea7f in zfsdev_ioctl (dev=<value optimized out>,
> zcmd=<value optimized out>, arg=0xffffff00966154c0 "\003",
> flag=3, td=<value optimized out>) at
> /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c:5945
> #21 0xffffffff8037729b in devfs_ioctl_f (fp=0xffffff0002c98960,
> com=3222821382, data=<value optimized out>,
> cred=<value optimized out>, td=0xffffff0096017000) at
> /usr/src/sys/fs/devfs/devfs_vnops.c:700
> #22 0xffffffff80444b22 in kern_ioctl (td=<value optimized out>,
> fd=<value optimized out>, com=3222821382,
> data=0xffffff00966154c0 "\003") at file.h:277
> #23 0xffffffff80444d5d in ioctl (td=0xffffff0096017000,
> uap=0xffffff8245232bb0) at /usr/src/sys/kern/sys_generic.c:679
> #24 0xffffffff805dbca4 in amd64_syscall (td=0xffffff0096017000,
> traced=0) at subr_syscall.c:114
> #25 0xffffffff805c331c in Xfast_syscall () at
> /usr/src/sys/amd64/amd64/exception.S:387
> #26 0x0000000180fcec2c in ?? ()
>
> (kgdb) f 11
> #11 0xffffffff80b16f01 in vdev_geom_read_config (cp=0xffffff0096b24180,
> config=0xffffff8245232670)
> at
> /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c:248
> 248 ZFS_LOG(1, "Reading config from %s...", pp->name);
> (kgdb) list
> 243 int error, l, len;
> 244
> 245 g_topology_assert_not();
> 246
> 247 pp = cp->provider;
> 248 ZFS_LOG(1, "Reading config from %s...", pp->name);
> 249
> 250 psize = pp->mediasize;
> 251 psize = P2ALIGN(psize, (uint64_t)sizeof(vdev_label_t));
> 252
> (kgdb) p *cp
> $1 = {geom = 0xffffff0002c0dd00, consumer = {le_next =
> 0xffffff00969a5280, le_prev = 0xffffff0002c0dd20}, provider = 0x0,
> consumers = {le_next = 0xffffff009607bb00, le_prev =
> 0xffffff00969c7e20}, acr = 1, acw = 0, ace = 1, spoiled = 0,
> stat = 0xffffff0002bed5a0, nstart = 17, nend = 17, private = 0x0,
> index = 0}
> (kgdb) p *cp->provider
> Cannot access memory at address 0x0
>
> (kgdb) f 12
> #12 0xffffffff80b17194 in vdev_geom_read_guid (cp=<value optimized out>)
> at
> /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c:454
> 454 if (vdev_geom_read_config(cp, &config) == 0) {
> (kgdb) p *cp
> $2 = {geom = 0xffffff0002c0dd00, consumer = {le_next =
> 0xffffff00969a5280, le_prev = 0xffffff0002c0dd20}, provider = 0x0,
> consumers = {le_next = 0xffffff009607bb00, le_prev =
> 0xffffff00969c7e20}, acr = 1, acw = 0, ace = 1, spoiled = 0,
> stat = 0xffffff0002bed5a0, nstart = 17, nend = 17, private = 0x0,
> index = 0}
> (kgdb) info local
> config = (nvlist_t *) 0xffffff0002b85000
> guid = 0
> (kgdb) list
> 449 uint64_t guid;
> 450
> 451 g_topology_assert_not();
> 452
> 453 guid = 0;
> 454 if (vdev_geom_read_config(cp, &config) == 0) {
> 455 guid = nvlist_get_guid(config);
> 456 nvlist_free(config);
> 457 }
> 458 return (guid);
>
> (kgdb) f 13
> #13 0xffffffff80b172f0 in vdev_geom_open_by_path (vd=0xffffff0002b2f000,
> check_guid=1)
> at
> /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c:559
> 559 guid = vdev_geom_read_guid(cp);
> (kgdb) list
> 554 ZFS_LOG(1, "Found provider by name %s.",
> vd->vdev_path);
> 555 cp = vdev_geom_attach(pp);
> 556 if (cp != NULL && check_guid &&
> ISP2(pp->sectorsize) &&
> 557 pp->sectorsize <= VDEV_PAD_SIZE) {
> 558 g_topology_unlock();
> 559 guid = vdev_geom_read_guid(cp);
> 560 g_topology_lock();
> 561 if (guid != vd->vdev_guid) {
> 562 vdev_geom_detach(cp, 0);
> 563 cp = NULL;
> (kgdb) info local
> pp = (struct g_provider *) 0xffffff00969c7e00
> cp = (struct g_consumer *) 0xffffff0096b24180
> guid = <value optimized out>
> __func__ = "ÿÿ\000\000H\213uÀ\211Ø\211\235üþÿÿ\203À\001\205ÀH\213"
> (kgdb) p *cp
> $3 = {geom = 0xffffff0002c0dd00, consumer = {le_next =
> 0xffffff00969a5280, le_prev = 0xffffff0002c0dd20}, provider = 0x0,
> consumers = {le_next = 0xffffff009607bb00, le_prev =
> 0xffffff00969c7e20}, acr = 1, acw = 0, ace = 1, spoiled = 0,
> stat = 0xffffff0002bed5a0, nstart = 17, nend = 17, private = 0x0,
> index = 0}
> (kgdb) p *pp
> $4 = {name = 0xffffff00969c7e88 "mirror/gm0p3", provider = {le_next =
> 0xffffff009698a100, le_prev = 0xffffff0002c03208},
> geom = 0xffffff0002c68000, consumers = {lh_first =
> 0xffffff009607bb00}, acr = 1, acw = 0, ace = 1, error = 0, orphan = {
> tqe_next = 0x0, tqe_prev = 0x0}, mediasize = 8589934592, sectorsize
> = 512, stripesize = 0, stripeoffset = 226575360,
> stat = 0xffffff0002be7120, nstart = 157, nend = 157, flags = 0,
> private = 0xffffff00969c7c00, index = 2}
> (kgdb) quit
>
> The technical reason for the crash is that in "f 11" we have pp =
> cp->provider = 0.
> I can give more information from the kernel dump, also I can easy repeat
> the crash.
>
--
Andriy Gapon
More information about the freebsd-geom
mailing list