Processes get stuck in "ufs" state
Oleg Derevenetz
oleg at vsi.ru
Sun Mar 25 23:09:03 UTC 2007
Quoting Oleg Derevenetz <oleg at vsi.ru>:
> On Wed, Mar 07, 2007 at 05:22:38AM +0300, Oleg Derevenetz wrote:
>
> >> Sometimes (approximately once a week) I have a problem with the same
> >> symptoms as described here, on SMP FreeBSD 6.2-STABLE with dual AMD
> >> Opteron(tm) Processor 850:
> >>
> >> http://www.freebsd.org/cgi/query-pr.cgi?pr=104406&cat=
> >>
> >> Sometimes (apparently when CPU load suddenly goes up) all processes
> >> that interact with the disk get stuck in the "ufs" state, but in my
> >> case SIGSTOP/SIGCONT does not seem to help.
> >
> > See the Deadlock Debugging chapter of the developer handbook for
> > instructions on what information should be gathered to debug the
> > problem.
>
> OK, I built a kernel with debug options and will wait for the next hang.
> By the way, with the debug options turned on, I see this message on
> every boot while nullfs is being mounted:
>
> acquiring duplicate lock of same type: "vnode interlock"
> 1st vnode interlock @ /usr/src/sys/kern/vfs_vnops.c:806
> 2nd vnode interlock @ /usr/src/sys/kern/vfs_subr.c:2040
> KDB: stack backtrace:
> kdb_backtrace(3,cfc60300,c05926d0,c05926d0,c05542c4,...) at
> kdb_backtrace+0x29
> witness_checkorder(cfd5c4dc,9,c051cf1e,7f8) at witness_checkorder+0x578
> _mtx_lock_flags(cfd5c4dc,0,c051cf1e,7f8,cfb28b90,...) at
> _mtx_lock_flags+0x78
> vrefcnt(cfd5c414) at vrefcnt+0x20
> null_checkvp(cff5eae0,c050c4a6,215) at null_checkvp+0x56
> null_lock(f02f1a68) at null_lock+0x66
> VOP_LOCK_APV(c054d540,f02f1a68) at VOP_LOCK_APV+0x87
> vn_lock(cff5eae0,1002,cfc60300,cff5eae0,cff5ed04,...) at vn_lock+0xac
> nullfs_root(cff76b90,2,f02f1ae0,cfc60300,0,8,0,c05cfca0,0,c051c79c,407)
> at nullfs_root+0x26
> vfs_domount(cfc60300,cfe3d340,cfe3d130,d,cfe3d3f0,c05817e0,0,c051c79c,2bf)
> at vfs_domount+0x975
> vfs_donmount(cfc60300,d,cfe73080,cfe73080,0,...) at vfs_donmount+0x3f9
> nmount(cfc60300,f02f1d04) at nmount+0x8b
> syscall(3b,3b,3b,bf7fe5f5,bf7feea0,...) at syscall+0x25b
> Xint0x80_syscall() at Xint0x80_syscall+0x1f
> --- syscall (378, FreeBSD ELF32, nmount), eip = 0x280bc0e7, esp =
> 0xbf7fe5bc, ebp = 0xbf7fee38 ---
>
> This host has nullfs filesystems. Could this be related to the deadlock?
FYI: after replacing the nullfs filesystems with unionfs (using the new unionfs
implementation):
http://people.freebsd.org/~daichi/unionfs/
all deadlocks are gone. It seems to be a problem in the current nullfs
implementation, but I can't debug it properly because the deadlocks are
relatively rare and the machine that uses nullfs is heavily loaded, so the
WITNESS and DEBUG options lead to an unacceptable performance penalty.