ZFS crashing during snapdir lookup for non-existent snapshot...
Andriy Gapon
avg at FreeBSD.org
Thu Oct 11 09:32:57 UTC 2012
on 11/10/2012 01:45 Andriy Gapon said the following:
>
> [restoring mailing list cc]
>
> on 11/10/2012 00:58 Sean Chittenden said the following:
>>>> I don't have a dump from this particular system, only the backtrace from the crash. The system is ZFS only and I only have a ZFS swapdir. :-/
>>>>
>>>> I have the kernel still so I can poke at the code and the compiled kernel (kernel.symbols). ? What are you looking for? -sc
>>>>
>>>
>>> list *zfsctl_snapdir_lookup+0x124 in kgdb
>>
>> (kgdb) list *zfsctl_snapdir_lookup+0x124
>> 0xffffffff816e9384 is in zfsctl_snapdir_lookup (/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ctldir.c:992).
>> 987 *direntflags = ED_CASE_CONFLICT;
>> 988 #endif
>> 989 }
>> 990
>> 991 mutex_enter(&sdp->sd_lock);
>> 992 search.se_name = (char *)nm;
>> 993 if ((sep = avl_find(&sdp->sd_snaps, &search, &where)) != NULL) {
>> 994 *vpp = sep->se_root;
>> 995 VN_HOLD(*vpp);
>> 996 err = traverse(vpp, LK_EXCLUSIVE | LK_RETRY);
>
> It seems that the problem is in Solaris-ism that remained in the code.
> I think that zfsctl_snapdir_inactive should not destroy sdp, that should be a
> job of vop_reclaim. Otherwise, if the vnode is re-activated its v_data points
> to freed memory.
>
In particular, I have this scenario in mind:
- one thread, T1, performs a vput-ish operation that leads to vop_inactive on the
vnode that currently represents ".zfs/snapshot"
- at the same time, T2 executes a lookup that goes into zfsctl_root_lookup
- let's assume that at some point T1 is at the very start of
zfsctl_snapdir_inactive, holding only the vnode lock
- at the same time, T2 is in gfs_dir_lookup->gfs_dir_lookup_static and holds
gfs_dir_lock
- so T2 finds the 'snapshot' static entry in gfsd_static[]
- T2 finds the cached vnode and adds a reference
- T2 does gfs_dir_unlock and returns the vnode
- now T1 proceeds through zfsctl_snapdir_inactive and destroys the v_data
(without even clearing the pointer)
- T2 uses the vnode and crashes
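To make the interleaving above concrete, here is a small single-threaded user-space model that replays the steps in the order listed. It is only a sketch: every name in it (model_vnode, model_lookup, model_inactive_unsafe, model_replay_race) is hypothetical and none of it is the actual zfs_ctldir.c or gfs.c code. The "free" is modelled by a flag so that the stale state can be inspected without touching freed memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; these are not kernel types. */
struct model_private {
	int dummy;
};

struct model_vnode {
	int   usecount;    /* models v_usecount */
	void *data;        /* models v_data */
	bool  data_valid;  /* false once the private data has been "freed" */
};

/* T2: the lookup finds the cached vnode and takes a new reference. */
static struct model_vnode *
model_lookup(struct model_vnode *cached)
{
	cached->usecount++;
	return (cached);
}

/*
 * T1: inactive as described in the scenario. It tears down the private
 * data unconditionally and leaves the pointer in place.
 */
static void
model_inactive_unsafe(struct model_vnode *vp)
{
	vp->data_valid = false;  /* stands in for freeing v_data */
	/* vp->data is deliberately not cleared, mirroring the scenario. */
}

/*
 * Replays the steps in the order given above. Returns true if T2 ends
 * up holding a referenced vnode whose private data is already gone.
 */
static bool
model_replay_race(void)
{
	static struct model_private priv;
	struct model_vnode vn = {
		.usecount = 1, .data = &priv, .data_valid = true
	};
	struct model_vnode *got;

	vn.usecount--;               /* T1: last reference dropped */
	got = model_lookup(&vn);     /* T2: finds cached vnode, adds a ref */
	model_inactive_unsafe(&vn);  /* T1: proceeds through inactive */

	return (got->usecount > 0 && got->data != NULL && !got->data_valid);
}
```

In this model, model_replay_race() returns true: T2 holds a reference and a non-NULL data pointer, but the data behind it has already been torn down.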
Possible resolutions:
- make vop_inactive a no-op and have vop_reclaim call the current inactive methods
- check v_usecount in gfs_file_inactive after gfs_dir_lock is obtained and bail
out if it is > 0 (somewhat similar to what zfs_zinactive does)
- something else?
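The first two options can be sketched in a user-space model as follows. This is an illustration under stated assumptions, not a patch: the types and functions (model_dir, model_vnode, model_inactive_checked, model_inactive_noop, model_reclaim) are hypothetical, the lock is a plain flag rather than a real mutex, and "freeing" is reduced to clearing the pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; the lock is a flag, not a real kernel mutex. */
struct model_dir {
	bool locked;           /* models gfs_dir_lock being held */
};

struct model_vnode {
	int               usecount;  /* models v_usecount */
	void             *data;      /* models v_data */
	struct model_dir *parent;
};

static void
model_dir_lock(struct model_dir *d)
{
	assert(!d->locked);
	d->locked = true;
}

static void
model_dir_unlock(struct model_dir *d)
{
	assert(d->locked);
	d->locked = false;
}

/*
 * Second option: re-check the use count once the directory lock is
 * held and bail out if a concurrent lookup has re-referenced the vnode.
 * Returns true if the private data was torn down.
 */
static bool
model_inactive_checked(struct model_vnode *vp)
{
	bool destroyed = false;

	model_dir_lock(vp->parent);
	if (vp->usecount == 0) {
		vp->data = NULL;  /* stands in for freeing; pointer cleared */
		destroyed = true;
	}
	model_dir_unlock(vp->parent);
	return (destroyed);
}

/* First option: inactive does nothing; teardown is left to reclaim. */
static void
model_inactive_noop(struct model_vnode *vp)
{
	(void)vp;
}

static void
model_reclaim(struct model_vnode *vp)
{
	vp->data = NULL;  /* stands in for the current inactive teardown */
}
```

With usecount at 1, model_inactive_checked() leaves data alone and returns false; with usecount at 0 it clears data and returns true. With the first option, data survives model_inactive_noop() and goes away only in model_reclaim().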
An easy way to reproduce the problem in one form or another is to run many of the
following in parallel:
while true; do ls -l /pool/fs/.zfs/ >/dev/null; done
Here is another panic that is a variation of the above scenario. A duplicate
gfs_vop_inactive is called after a "harmless" vop_pathconf call (one that doesn't
touch the vnode). In this case the "shares" entry appears to be a random victim:
Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address = 0x18
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff825fe7dd
stack pointer = 0x28:0xffffff80e040b800
frame pointer = 0x28:0xffffff80e040b830
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, IOPL = 0
current process = 712 (ls)
trap number = 12
panic: page fault
cpuid = 1
curthread: 0xfffffe0003d8a9a0
KDB: stack backtrace:
db_trace_self_wrapper() at 0xffffffff802d2bba = db_trace_self_wrapper+0x2a
kdb_backtrace() at 0xffffffff805596fa = kdb_backtrace+0x3a
panic() at 0xffffffff8051c2a6 = panic+0x266
trap_fatal() at 0xffffffff8070741d = trap_fatal+0x3ad
trap_pfault() at 0xffffffff8070756c = trap_pfault+0x12c
trap() at 0xffffffff80707d19 = trap+0x4f9
calltrap() at 0xffffffff806ef903 = calltrap+0x8
--- trap 0xc, rip = 0xffffffff825fe7dd, rsp = 0xffffff80e040b800, rbp = 0xffffff80e040b830 ---
gfs_vop_inactive() at 0xffffffff825fe7dd = gfs_vop_inactive+0x1d
VOP_INACTIVE_APV() at 0xffffffff80782fb4 = VOP_INACTIVE_APV+0x114
vinactive() at 0xffffffff805c84ad = vinactive+0x15d
vputx() at 0xffffffff805ca962 = vputx+0x4d2
vput() at 0xffffffff805ca9ce = vput+0xe
kern_pathconf() at 0xffffffff805cd44e = kern_pathconf+0x10e
sys_lpathconf() at 0xffffffff805cd4aa = sys_lpathconf+0x1a
amd64_syscall() at 0xffffffff80706953 = amd64_syscall+0x313
Xfast_syscall() at 0xffffffff806efbe7 = Xfast_syscall+0xf7
--
Andriy Gapon