panic: sleeping thread on r352386
Konstantin Belousov
kostikbel at gmail.com
Tue Sep 17 08:07:19 UTC 2019
On Tue, Sep 17, 2019 at 02:42:51PM +0900, Masachika ISHIZUKA wrote:
> >> This panic happens on 1300047 (both r352239 and r352386) with a Core
> >> i5-7500, as follows. This panic does not happen on r351728 (1300044).
> >> (The following lines were typed by hand, so they might contain some
> >> typos.)
> >>
> >> ==
> >> Sleeping thread (tid 100177, pid 1814) owns a non-sleepable lock
> >> KDB: stack backtrace of thread 100177:
> >
> >
> > https://svnweb.freebsd.org/base?view=revision&revision=352393
>
> Thank you for the reply.
>
> I updated to r352431 and it no longer panics. Thank you very much.
> But 'make buildworld' now fails with a segmentation fault, as shown below.
> (buildworld is running over an NFS file system.)
>
> --- modules-all ---
> --- ath_hal_ar5211.ko.debug ---
> objcopy --only-keep-debug ath_hal_ar5211.ko.full ath_hal_ar5211.ko.debug
> Segmentation fault (core dumped)
> *** [ath_hal_ar5211.ko.debug] Error code 139
> make[4]: stopped in /usr/altlocal/freebsd-current/src/sys/modules/ath_hal_ar5211
> 1 error
>
> The location of the segmentation fault is different each time.
> Below is the output of another 'make buildworld' run.
>
> --- kernel.full ---
> Segmentation fault (core dumped)
> *** [kernel.full] Error code 139
> make[2]: stopped in /usr/altlocal/freebsd-current/obj/usr/altlocal/freebsd-current/src/amd64.amd64/sys/GENERIC
>
> /var/log/messages shows the following.
>
> Sep 17 11:22:56 okra kernel: Failed to fully fault in a core file segment at VA
> 0x800a00000 with size 0x163000 to be written at offset 0x84a000 for process nm
> Sep 17 11:22:56 okra kernel: pid 53593 (nm), jid 0, uid 16220: exited on signal
> 11 (core dumped)
> Sep 17 11:22:57 okra kernel: Failed to fully fault in a core file segment at VA
> 0x800a00000 with size 0x163000 to be written at offset 0x88b000 for process objcopy
> Sep 17 11:22:57 okra kernel: pid 53603 (objcopy), jid 0, uid 16220: exited on signal 11 (core dumped)
>
> Retry 'make buildworld'
>
> Sep 17 12:24:05 okra kernel: Failed to fully fault in a core file segment at VA
> 0x8002f6000 with size 0x93000 to be written at offset 0x239000 for process nm
> Sep 17 12:24:05 okra kernel: pid 96873 (nm), jid 0, uid 16220: exited on signal
> 11 (core dumped)
> Sep 17 12:24:05 okra kernel: Failed to fully fault in a core file segment at VA
> 0x80035f000 with size 0x93000 to be written at offset 0x281000 for process objcopy
> Sep 17 12:24:06 okra kernel: pid 96889 (objcopy), jid 0, uid 16220: exited on signal 11 (core dumped)
>
> Retry 'make buildworld'
>
> Sep 17 14:01:39 okra kernel: Failed to fully fault in a core file segment at VA
> 0x8048da000 with size 0x112000 to be written at offset 0x1a33000 for process ld.lld
> Sep 17 14:01:51 okra kernel: Failed to fully fault in a core file segment at VA
> 0x8117cc000 with size 0x1e7000 to be written at offset 0xe925000 for process ld.lld
> Sep 17 14:01:53 okra kernel: pid 50292 (ld.lld), jid 0, uid 16220: exited on signal 11 (core dumped)
>
> I can run 'make buildworld' successfully on r351728 (1300044).
Try the following change, which is more careful about avoiding
unnecessary vnode_pager_setsize() calls. Fixing the real cause requires
much more extensive changes.
diff --git a/sys/fs/nfsclient/nfs_clport.c b/sys/fs/nfsclient/nfs_clport.c
index 63ea4736707..16dc7745c77 100644
--- a/sys/fs/nfsclient/nfs_clport.c
+++ b/sys/fs/nfsclient/nfs_clport.c
@@ -414,12 +414,11 @@ nfscl_loadattrcache(struct vnode **vpp, struct nfsvattr *nap, void *nvaper,
struct nfsnode *np;
struct nfsmount *nmp;
struct timespec mtime_save;
- u_quad_t nsize;
- int setnsize, error, force_fid_err;
+ u_quad_t nsize, osize;
+ int error, force_fid_err;
+ bool setnsize;
error = 0;
- setnsize = 0;
- nsize = 0;
/*
* If v_type == VNON it is a new node, so fill in the v_type,
@@ -439,6 +438,7 @@ nfscl_loadattrcache(struct vnode **vpp, struct nfsvattr *nap, void *nvaper,
nmp = VFSTONFS(vp->v_mount);
vap = &np->n_vattr.na_vattr;
mtime_save = vap->va_mtime;
+ osize = vap->va_size;
if (writeattr) {
np->n_vattr.na_filerev = nap->na_filerev;
np->n_vattr.na_size = nap->na_size;
@@ -511,8 +511,7 @@ nfscl_loadattrcache(struct vnode **vpp, struct nfsvattr *nap, void *nvaper,
* zero np->n_attrstamp to indicate that
* the attributes are stale.
*/
- nsize = vap->va_size = np->n_size;
- setnsize = 1;
+ vap->va_size = np->n_size;
np->n_attrstamp = 0;
KDTRACE_NFS_ATTRCACHE_FLUSH_DONE(vp);
} else if (np->n_flag & NMODIFIED) {
@@ -526,22 +525,9 @@ nfscl_loadattrcache(struct vnode **vpp, struct nfsvattr *nap, void *nvaper,
np->n_size = vap->va_size;
np->n_flag |= NSIZECHANGED;
}
- nsize = np->n_size;
- setnsize = 1;
- } else if (vap->va_size < np->n_size) {
- /*
- * When shrinking the size, the call to
- * vnode_pager_setsize() cannot be done
- * with the mutex held, so delay it until
- * after the mtx_unlock call.
- */
- nsize = np->n_size = vap->va_size;
- np->n_flag |= NSIZECHANGED;
- setnsize = 1;
} else {
- nsize = np->n_size = vap->va_size;
+ np->n_size = vap->va_size;
np->n_flag |= NSIZECHANGED;
- setnsize = 1;
}
} else {
np->n_size = vap->va_size;
@@ -579,6 +565,21 @@ nfscl_loadattrcache(struct vnode **vpp, struct nfsvattr *nap, void *nvaper,
if (np->n_attrstamp != 0)
KDTRACE_NFS_ATTRCACHE_LOAD_DONE(vp, vap, error);
#endif
+ nsize = vap->va_size;
+ if (nsize == osize) {
+ setnsize = false;
+ } else if (nsize > osize) {
+ vnode_pager_setsize(vp, nsize);
+ setnsize = false;
+ } else {
+ /*
+ * When shrinking the size, the call to
+ * vnode_pager_setsize() cannot be done with the mutex
+ * held, because we might need to wait for a busy
+ * page. Delay it until after the node is unlocked.
+ */
+ setnsize = true;
+ }
NFSUNLOCKNODE(np);
if (setnsize)
vnode_pager_setsize(vp, nsize);
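
The shape of the change above, reduced to a userspace sketch: compare the
old and new sizes while the lock is held, handle the growing case right
away, and postpone the shrinking case (which may have to wait) until after
the unlock. Everything here is a hypothetical stand-in, not kernel code:
struct node, pager_setsize() and update_size() are illustrative names, and
a pthread mutex takes the place of the NFS node mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the NFS node; not the kernel's struct nfsnode. */
struct node {
	pthread_mutex_t lock;
	uint64_t size;		/* cached file size */
	uint64_t pager_size;	/* size last reported to the "pager" */
};

/*
 * Hypothetical stand-in for vnode_pager_setsize(). In the kernel the
 * shrinking case may need to wait for a busy page, which is why it must
 * not run with a non-sleepable lock held.
 */
static void
pager_setsize(struct node *n, uint64_t nsize)
{
	n->pager_size = nsize;
}

/*
 * Record the new size. Returns true if the pager update had to be
 * deferred until after the lock was dropped (the shrinking case).
 */
static bool
update_size(struct node *n, uint64_t nsize)
{
	uint64_t osize;
	bool deferred;

	assert(n != NULL);
	pthread_mutex_lock(&n->lock);
	osize = n->size;
	n->size = nsize;
	if (nsize == osize) {
		deferred = false;		/* nothing to tell the pager */
	} else if (nsize > osize) {
		pager_setsize(n, nsize);	/* growing: done under the lock */
		deferred = false;
	} else {
		deferred = true;		/* shrinking: may wait, so delay */
	}
	pthread_mutex_unlock(&n->lock);
	if (deferred)
		pager_setsize(n, nsize);
	return (deferred);
}
```

Calling update_size() with an unchanged size touches the pager not at all,
which is the point of remembering osize before the attributes are updated.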