debugging process in bovlbx state
Benjamin Kaduk
kaduk at MIT.EDU
Mon Dec 20 00:25:10 UTC 2010
Hi all,
I'm working on bringing the out-of-tree OpenAFS network filesystem
up-to-date for FreeBSD 7.3-RELEASE, and I think I need some help to fix
this bug.
I should preface my discourse with the fact that there is a whole slew of
lock order reversals that I haven't even tried to track down, but I do not
believe that this hang is a deadlock, since 'show alllocks' in DDB does not
show anything that seems interesting.
Any pointers for things to look at would be appreciated; more details of
the failing case below.
In order to get the AFS kernel module to load, I needed to tweak a few
lines of code in getpages(), as I had previously cribbed a bunch of
changes/updates from the experimental NFS client while getting AFS to work
on current FreeBSD. In particular, vm_page_set_valid() is not present in
7.3, so I am currently running with:
--- a/src/afs/FBSD/osi_vnodeops.c
+++ b/src/afs/FBSD/osi_vnodeops.c
@@ -890,12 +890,8 @@ afs_vop_getpages(struct vop_getpages_args *ap)
* Read operation filled a partial page.
*/
m->valid = 0;
- vm_page_set_valid(m, 0, size - toff);
-#ifndef AFS_FBSD80_ENV
- vm_page_undirty(m);
-#else
+ vm_page_set_validclean(m, 0, size - toff);
KASSERT(m->dirty == 0, ("afs_getpages: page %p is dirty", m));
-#endif
}
But my knowledge of vm_page_* is approximately nil, so there's no reason
to think everything was correct even before that patch.
Anyway, my test case is running libarchive's configure script with source
and destination directories in (different places in) AFS. It only gets
twenty lines in, ending with:
checking for gcc option to accept ISO C89... none needed
checking for style of include used by make... GNU
checking dependency style of gcc...
^Tload: 0.04 cmd: cp 1250 [bovlbx] 0.00u 0.00
procstat -kk reports:
mega-man# procstat -kk 1250
PID TID COMM TDNAME KSTACK
1250 100060 cp - mi_switch+0x233
sleepq_switch+0xe9 sleepq_wait+0x44 _sleep+0x3a0 vm_object_pip_wait+0x4e
bufobj_invalbuf+0x10e afs_GetVCache+0x2f7
The call to vinvalbuf in afs_GetVCache is here:
    iheldthelock = VOP_ISLOCKED(vp, curthread);
    if (!iheldthelock)
        vn_lock(vp, LK_EXCLUSIVE | LK_RETRY, curthread);
    AFS_GUNLOCK();
    vinvalbuf(vp, V_SAVE, curthread, PINOD, 0);
    AFS_GLOCK();
    if (!iheldthelock)
        VOP_UNLOCK(vp, LK_EXCLUSIVE, curthread);
That is not very enlightening. I suspect that some flags on the bufobj
were set erroneously elsewhere and the problem is only now surfacing.
afs_GetVCache is in this source file:
http://git.openafs.org/?p=openafs.git;a=blob;f=src/afs/afs_vcache.c;h=26ed2c2be271048509425583f0cc2de6c4166c4b;hb=HEAD
and {get,put}pages in this:
http://git.openafs.org/?p=openafs.git;a=blob;f=src/afs/FBSD/osi_vnodeops.c;h=7ae6571adb74d69cfe25e3190ade3b22dc8cdab8;hb=HEAD
Thanks,
Ben Kaduk