Packet loss every 30.999 seconds
Mark Fullmer
maf at eng.oar.net
Tue Dec 18 13:49:31 PST 2007
A little progress.
I have a machine with a KTR enabled kernel running.
Another machine is running David's ffs_vfsops.c patch.
I left two other machines (GENERIC kernels) running the packet loss test
overnight. At ~32480 seconds of uptime the problem starts. This is really
close to a signed 16-bit overflow (2^15 = 32768)... See
http://www.eng.oar.net/~maf/bsd6/p1.png and
http://www.eng.oar.net/~maf/bsd6/p2.png. The missing impulses at the 31
second marks are the intervals between test runs. The window of missing
packets (timestamps between two packets where a sequence number is missing)
is usually less than 4us, although I'm not sure gettimeofday() can be
trusted for measuring this. See https://www.eng.oar.net/~maf/bsd6/p3.png
Things I'll try tonight:
o Check on the patched kernel.
o Try KTR debugging enabled before and after an expected high
  latency period.
o Dump all files to /dev/null to trigger the behavior.
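The last item could be done with a small program along these lines. It is
only a sketch under assumptions: touch_tree() and read_and_discard() are
hypothetical names, the idea that reading every file grows the number of
allocated vnodes is the working theory here rather than something shown,
and unlike a real test it does not stop at filesystem boundaries.

```c
#include <dirent.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Read one file to the end and throw the data away. */
static int read_and_discard(const char *path)
{
    char buf[65536];
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;              /* unreadable: caller skips it */
    while (read(fd, buf, sizeof(buf)) > 0)
        ;                       /* discard, like cat file > /dev/null */
    close(fd);
    return 0;
}

/*
 * Recursively read every regular file under dir. Symlinks are not
 * followed and special files are skipped. Returns the number of files
 * that were read.
 */
unsigned long touch_tree(const char *dir)
{
    unsigned long count = 0;
    struct dirent *e;
    DIR *d = opendir(dir);

    if (d == NULL)
        return 0;
    while ((e = readdir(d)) != NULL) {
        char path[4096];
        struct stat sb;
        int len;

        if (strcmp(e->d_name, ".") == 0 || strcmp(e->d_name, "..") == 0)
            continue;
        len = snprintf(path, sizeof(path), "%s/%s", dir, e->d_name);
        if (len < 0 || (size_t)len >= sizeof(path))
            continue;           /* path too long: skip */
        if (lstat(path, &sb) != 0)
            continue;
        if (S_ISDIR(sb.st_mode))
            count += touch_tree(path);
        else if (S_ISREG(sb.st_mode) && read_and_discard(path) == 0)
            count++;
    }
    closedir(d);
    return count;
}
```

Calling touch_tree("/usr") or similar shortly before an expected sync
interval would be one way to use it.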
I would expect the vnode problem to look a little different on the
packet loss graphs over time. If this leads anywhere I'll add a counter
before the msleep() and see how often it's getting there.
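As a userland illustration of that counter (not kernel code, and not part
of the patch), the loop below mirrors the yield test from the quoted diff
and counts how often the yield point is reached. struct sync_stats,
process_with_yield() and the callback parameters are hypothetical names;
in the kernel the yield point would be the msleep() call.

```c
#include <stddef.h>

#define YIELD_EVERY 500  /* threshold taken from the quoted patch */

struct sync_stats {
    unsigned long processed;  /* items walked */
    unsigned long yields;     /* times the yield point was reached */
};

/*
 * Walk n items, calling work(i) for each one and yield_fn() at the point
 * where the patch would call msleep(). Either callback may be NULL.
 */
void process_with_yield(size_t n, void (*work)(size_t),
                        void (*yield_fn)(void), struct sync_stats *st)
{
    int flushed_count = 0;

    st->processed = 0;
    st->yields = 0;
    for (size_t i = 0; i < n; i++) {
        if (work != NULL)
            work(i);
        st->processed++;
        /* Same test as in the patch. */
        if (flushed_count++ > YIELD_EVERY) {
            flushed_count = 0;
            st->yields++;         /* the counter placed before the sleep */
            if (yield_fn != NULL)
                yield_fn();
        }
    }
}
```

One thing this makes visible: because the test is a post-increment
compared with "> 500", the yield point is reached on every 502nd item
rather than every 500th. For 40000 items that is 79 yields, so at one
tick each the walk would spend at least 79 ticks sleeping.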
On Dec 17, 2007, at 5:24 AM, David G Lawrence wrote:
> I noticed this as well some time ago. The problem has to do with the
> processing (syncing) of vnodes. When the total number of allocated vnodes
> in the system grows to tens of thousands, the ~31 second periodic sync
> process takes a long time to run. Try this patch and let people know if
> it helps your problem. It will periodically wait for one tick (1ms) every
> 500 vnodes of processing, which will allow other things to run.
>
> Index: ufs/ffs/ffs_vfsops.c
> ===================================================================
> RCS file: /home/ncvs/src/sys/ufs/ffs/ffs_vfsops.c,v
> retrieving revision 1.290.2.16
> diff -c -r1.290.2.16 ffs_vfsops.c
> *** ufs/ffs/ffs_vfsops.c 9 Oct 2006 19:47:17 -0000 1.290.2.16
> --- ufs/ffs/ffs_vfsops.c 25 Apr 2007 01:58:15 -0000
> ***************
> *** 1109,1114 ****
> --- 1109,1115 ----
>   	int softdep_deps;
>   	int softdep_accdeps;
>   	struct bufobj *bo;
> + 	int flushed_count = 0;
>
>   	fs = ump->um_fs;
>   	if (fs->fs_fmod != 0 && fs->fs_ronly != 0) {		/* XXX */
> ***************
> *** 1174,1179 ****
> --- 1175,1184 ----
>   			allerror = error;
>   		vput(vp);
>   		MNT_ILOCK(mp);
> + 		if (flushed_count++ > 500) {
> + 			flushed_count = 0;
> + 			msleep(&flushed_count, MNT_MTX(mp), PZERO, "syncw", 1);
> + 		}
>   	}
>   	MNT_IUNLOCK(mp);
>   	/*
>
> -DG
>
> David G. Lawrence
> President
> Download Technologies, Inc. - http://www.downloadtech.com - (866) 399 8500
> The FreeBSD Project - http://www.freebsd.org
> Pave the road of life with opportunities.