Packet loss every 30.999 seconds
Bruce Evans
brde at optusnet.com.au
Tue Dec 18 04:37:02 PST 2007
On Mon, 17 Dec 2007, Scott Long wrote:
> Bruce Evans wrote:
>> On Mon, 17 Dec 2007, David G Lawrence wrote:
>>
>>> One more comment on my last email... The patch that I included is not
>>> meant as a real fix - it is just a bandaid. The real problem appears to
>>> be that a very large number of vnodes (all of them?) are getting synced
>>> (i.e. calling ffs_syncvnode()) every time. This should normally only
>>> happen for dirty vnodes. I suspect that something is broken with this
>>> check:
>>>
>>>             if (vp->v_type == VNON || ((ip->i_flag &
>>>                 (IN_ACCESS | IN_CHANGE | IN_MODIFIED | IN_UPDATE)) == 0 &&
>>>                 vp->v_bufobj.bo_dirty.bv_cnt == 0)) {
>>>                     VI_UNLOCK(vp);
>>>                     continue;
>>>             }
>>
>> Isn't it just the O(N) algorithm with N quite large? Under ~5.2, on
> Right, it's a non-optimal loop when N is very large, and that's a fairly
> well understood problem. I think what DG was getting at, though, is
> that this massive flush happens every time the syncer runs, which
> doesn't seem correct. Sure, maybe you just rsynced 100,000 files 20
> seconds ago, so the upcoming flush is going to be expensive. But the
> next flush 30 seconds after that shouldn't be just as expensive, yet it
> appears to be so.
I'm sure it doesn't cause many bogus flushes: iostat shows zero writes
while sync is called incessantly using "while :; do sync; done".
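To make the shape of the cost concrete, here is a minimal user-level sketch
(not the kernel code; the struct, the flag value and the counts are all made
up) of why the per-mount scan is O(N) in the number of cached vnodes even when
almost nothing is dirty: every vnode is visited and tested, and only the few
dirty ones would go on to ffs_syncvnode().

    /*
     * Minimal user-level sketch (not the kernel code) of the per-mount
     * sync scan: every cached vnode is visited and tested, but only the
     * few dirty ones would be handed to ffs_syncvnode().  The struct,
     * flag value and counts below are made up for illustration.
     */
    #include <stdio.h>
    #include <stdlib.h>

    #define	IN_MODIFIED	0x0001	/* stand-in for the IN_* dirty flags */

    struct fake_vnode {
    	int	i_flag;			/* inode flags (IN_* bits) */
    	int	bo_dirty_cnt;		/* number of dirty buffers */
    };

    int
    main(void)
    {
    	struct fake_vnode *vp, *vnodes;
    	size_t flushed, i, nvnodes, scanned;

    	nvnodes = 64000;		/* roughly desiredvnodes in the report */
    	vnodes = calloc(nvnodes, sizeof(*vnodes));
    	if (vnodes == NULL)
    		return (1);
    	/* Pretend only a handful of vnodes are actually dirty. */
    	for (i = 0; i < nvnodes; i += 10000)
    		vnodes[i].i_flag |= IN_MODIFIED;

    	flushed = scanned = 0;
    	for (i = 0; i < nvnodes; i++) {
    		vp = &vnodes[i];
    		scanned++;		/* this cost is paid for every vnode */
    		if (vp->i_flag == 0 && vp->bo_dirty_cnt == 0)
    			continue;	/* clean: skip, but the scan cost stands */
    		flushed++;		/* dirty: would call ffs_syncvnode() here */
    	}
    	printf("scanned %zu vnodes, flushed %zu\n", scanned, flushed);
    	free(vnodes);
    	return (0);
    }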
> This is further supported by the original poster's
> claim that it takes many hours of uptime before the problem becomes
> noticeable. If vnodes are never truly getting cleaned, or never getting
> their flags cleared so that this loop knows that they are clean, then
> it's feasible that they'll accumulate over time, keep on getting flushed
> every 30 seconds, keep on bogging down the loop, and so on.
Using "find / >/dev/null" to grow the problem and make it bad after a
few seconds of uptime, and profiling of a single sync(2) call to show
that nothing much is done except the loop containing the above:
under ~5.2, on a 2.2GHz A64 UP in i386 mode:
after booting, with about 700 vnodes:
%    %  cumulative      self               self     total
%  time   seconds   seconds    calls   ns/call   ns/call  name
%  30.8     0.000     0.000        0   100.00%            mcount [4]
%  14.9     0.001     0.000        0   100.00%            mexitcount [5]
%   5.5     0.001     0.000        0   100.00%            cputime [16]
%   5.0     0.001     0.000        6     13312     13312  vfs_msync [18]
%   4.3     0.001     0.000        0   100.00%            user [21]
%   3.5     0.001     0.000        5     11321     11993  ffs_sync [23]
after "find / >/dev/null" was stopped after saturating at 64000 vnodes
(desiredvodes is 70240):
%    %  cumulative      self               self     total
%  time   seconds   seconds    calls   ns/call   ns/call  name
%  50.7     0.008     0.008        5   1666427   1667246  ffs_sync [5]
%  38.0     0.015     0.006        6   1041217   1041217  vfs_msync [6]
%   3.1     0.015     0.001        0   100.00%            mcount [7]
%   1.5     0.015     0.000        0   100.00%            mexitcount [8]
%   0.6     0.015     0.000        0   100.00%            cputime [22]
%   0.6     0.016     0.000       34      2660      2660  generic_bcopy [24]
%   0.5     0.016     0.000        0   100.00%            user [26]
vfs_msync() is a problem too.  It uses an almost identical loop to skip
vnodes that are not dirty (though with a different test for dirtiness).
ffs_sync() is called 5 times because there are 5 ffs file systems
mounted r/w.  There is another ffs file system mounted r/o; that,
combined with a missing r/o optimization, might explain the extra call
to vfs_msync().  With 64000 vnodes, the calls take 1-2 ms each.  That is
already quite a lot, and there are many calls.  Each call only looks at
vnodes under its own mount point, so the number of mounted file systems
doesn't affect the total time much.
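Continuing the illustrative model above (and assuming a made-up
obj_might_be_dirty field on the fake vnode), vfs_msync()'s pass has the same
shape; only the skip test differs, since the real code looks at the vnode's
VM object rather than at the inode flags:

    	/*
    	 * Same shape as the sketch above, but with the dirtiness test
    	 * replaced: vfs_msync() checks a hint on the vnode's VM object
    	 * instead of the inode flags.  The field name is made up.
    	 */
    	for (i = 0; i < nvnodes; i++) {
    		vp = &vnodes[i];
    		if (vp->obj_might_be_dirty == 0)
    			continue;	/* clean object: skip, scan cost already paid */
    		/* possibly-dirty object: would push its pages out here */
    	}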
ffs_sync() is taking 125 ns per vnode.  That is more than I would have
expected.
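(Rough arithmetic from the profile above: 5 calls at about 1.67 ms each is
roughly 8.3 ms, spread over about 64000 cached vnodes, i.e. on the order of
125-130 ns per vnode, nearly all of it spent on vnodes that turn out to be
clean.)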
Bruce