Packet loss every 30.999 seconds
Kostik Belousov
kostikbel at gmail.com
Sat Dec 22 12:16:26 PST 2007
On Sun, Dec 23, 2007 at 04:08:09AM +1100, Bruce Evans wrote:
> On Sat, 22 Dec 2007, Kostik Belousov wrote:
> >Yes, rewriting the syncer is the right solution. It probably cannot be done
> >quickly enough. If the yield workaround provides mitigation for now, it
> >shall go in.
>
> I don't think rewriting the syncer just for this is the right solution.
> Rewriting the syncer so that it schedules actual i/o more efficiently
> might involve a solution. Better scheduling would probably take more
> CPU and increase the problem.
I think that we can easily predict which vnode(s) become dirty at the
places where we do vn_start_write().
>
> Note that MNT_VNODE_FOREACH() is used 17 times, so the yielding fix is
> needed in 17 places if it isn't done internally in MNT_VNODE_FOREACH().
> There are 4 places in vfs and 13 places in 6 file systems:
>
> % ./ufs/ffs/ffs_snapshot.c: MNT_VNODE_FOREACH(xvp, mp, mvp) {
> % ./ufs/ffs/ffs_snapshot.c: MNT_VNODE_FOREACH(vp, mp, mvp) {
> % ./ufs/ffs/ffs_vfsops.c: MNT_VNODE_FOREACH(vp, mp, mvp) {
> % ./ufs/ffs/ffs_vfsops.c: MNT_VNODE_FOREACH(vp, mp, mvp) {
> % ./ufs/ufs/ufs_quota.c: MNT_VNODE_FOREACH(vp, mp, mvp) {
> % ./ufs/ufs/ufs_quota.c: MNT_VNODE_FOREACH(vp, mp, mvp) {
> % ./ufs/ufs/ufs_quota.c: MNT_VNODE_FOREACH(vp, mp, mvp) {
> % ./fs/msdosfs/msdosfs_vfsops.c: MNT_VNODE_FOREACH(vp, mp, nvp) {
> % ./fs/coda/coda_subr.c: MNT_VNODE_FOREACH(vp, mp, nvp) {
> % ./gnu/fs/ext2fs/ext2_vfsops.c: MNT_VNODE_FOREACH(vp, mp, mvp) {
> % ./gnu/fs/ext2fs/ext2_vfsops.c: MNT_VNODE_FOREACH(vp, mp, mvp) {
> % ./kern/vfs_default.c: MNT_VNODE_FOREACH(vp, mp, mvp) {
> % ./kern/vfs_subr.c: MNT_VNODE_FOREACH(vp, mp, mvp) {
> % ./kern/vfs_subr.c: MNT_VNODE_FOREACH(vp, mp, mvp) {
> % ./nfs4client/nfs4_vfsops.c: MNT_VNODE_FOREACH(vp, mp, mvp) {
> % ./nfsclient/nfs_subs.c: MNT_VNODE_FOREACH(vp, mp, nvp) {
> % ./nfsclient/nfs_vfsops.c: MNT_VNODE_FOREACH(vp, mp, mvp) {
>
> Only file systems that support writing need it (for VOP_SYNC() and for
> MNT_RELOAD), else there would be many more places. There would also
> be more places if MNT_RELOAD support were not missing for some file
> systems.
Ok, since you talked about this first :). I already made the following
patch, but did not publish it since I have not yet inspected all
callers of MNT_VNODE_FOREACH() for the safety of dropping the mount
interlock. It should be safe, but it is better to check. Also, I
postponed the check until it is reported that yielding does solve the
original problem.
diff --git a/sys/kern/vfs_mount.c b/sys/kern/vfs_mount.c
index 14acc5b..046af82 100644
--- a/sys/kern/vfs_mount.c
+++ b/sys/kern/vfs_mount.c
@@ -1994,6 +1994,12 @@ __mnt_vnode_next(struct vnode **mvp, struct mount *mp)
 	mtx_assert(MNT_MTX(mp), MA_OWNED);
 	KASSERT((*mvp)->v_mount == mp, ("marker vnode mount list mismatch"));
+	if ((*mvp)->v_yield++ == 500) {
+		MNT_IUNLOCK(mp);
+		(*mvp)->v_yield = 0;
+		uio_yield();
+		MNT_ILOCK(mp);
+	}
 	vp = TAILQ_NEXT(*mvp, v_nmntvnodes);
 	while (vp != NULL && vp->v_type == VMARKER)
 		vp = TAILQ_NEXT(vp, v_nmntvnodes);
diff --git a/sys/sys/vnode.h b/sys/sys/vnode.h
index dc70417..6e3119b 100644
--- a/sys/sys/vnode.h
+++ b/sys/sys/vnode.h
@@ -131,6 +131,7 @@ struct vnode {
 		struct socket	*vu_socket;	/* v unix domain net (VSOCK) */
 		struct cdev	*vu_cdev;	/* v device (VCHR, VBLK) */
 		struct fifoinfo	*vu_fifoinfo;	/* v fifo (VFIFO) */
+		int		vu_yield;	/* yield count (VMARKER) */
 	} v_un;
 	/*
@@ -185,6 +186,7 @@ struct vnode {
 #define	v_socket	v_un.vu_socket
 #define	v_rdev		v_un.vu_cdev
 #define	v_fifoinfo	v_un.vu_fifoinfo
+#define	v_yield		v_un.vu_yield
 /* XXX: These are temporary to avoid a source sweep at this time */
 #define	v_object	v_bufobj.bo_object
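
To play with the idea outside the kernel, here is a rough userland sketch
of the same pattern: the iterator's "next" step counts calls in the
cursor and, every so often, drops the lock, yields, and retakes the lock.
Nothing here is kernel code. struct table, struct cursor, table_next()
and walk_and_count_yields() are made-up names, a pthread mutex stands in
for the mount interlock, and sched_yield() stands in for uio_yield().

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

#define YIELD_EVERY 500	/* threshold borrowed from the patch */

/* Hypothetical stand-in for a mount point and its vnode list. */
struct table {
	pthread_mutex_t lock;	/* plays the role of the mount interlock */
	int locked;		/* lets us assert ownership, like mtx_assert() */
	size_t nitems;		/* number of entries to walk */
	unsigned long yields;	/* how many times the walker yielded */
};

/* Hypothetical stand-in for the marker vnode: position plus counter. */
struct cursor {
	size_t pos;		/* next index to hand out */
	int steps;		/* plays the role of v_yield */
};

static void
table_lock(struct table *t)
{
	pthread_mutex_lock(&t->lock);
	t->locked = 1;
}

static void
table_unlock(struct table *t)
{
	t->locked = 0;
	pthread_mutex_unlock(&t->lock);
}

/*
 * Call with the table locked. Stores the next index in *idx and
 * returns 1, or returns 0 when the walk is finished. The lock may be
 * dropped and retaken inside, so callers must tolerate that.
 */
static int
table_next(struct table *t, struct cursor *c, size_t *idx)
{
	assert(t->locked);
	if (c->steps++ == YIELD_EVERY) {
		t->yields++;		/* bookkeeping, done under the lock */
		table_unlock(t);
		c->steps = 0;
		sched_yield();		/* let other threads run */
		table_lock(t);
	}
	if (c->pos >= t->nitems)
		return (0);
	*idx = c->pos++;
	return (1);
}

/* Walk nitems entries; report how many were seen and how many yields. */
static unsigned long
walk_and_count_yields(size_t nitems, size_t *visited)
{
	struct table t = { .locked = 0, .nitems = nitems, .yields = 0 };
	struct cursor c = { 0, 0 };
	size_t idx;

	pthread_mutex_init(&t.lock, NULL);
	*visited = 0;
	table_lock(&t);
	while (table_next(&t, &c, &idx))
		(*visited)++;
	table_unlock(&t);
	pthread_mutex_destroy(&t.lock);
	return (t.yields);
}
```

Keeping the counter in the cursor means the callers of the iterator do
not change at all, which is the point of doing it inside
MNT_VNODE_FOREACH(). Note that with the post-increment comparison the
lock is dropped on every 501st call rather than every 500th; for a
workaround that should not matter.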