cvs commit: src/sys/kern vfs_subr.c src/sys/sys buf.h bufobj.h vnode.h

Don Lewis truckman at FreeBSD.org
Wed Oct 27 02:15:37 PDT 2004


On 27 Oct, Poul-Henning Kamp wrote:
> In message <200410270824.i9R8OqGc019841 at gw.catspoiler.org>, Don Lewis writes:
>>On 27 Oct, Poul-Henning Kamp wrote:
>>> phk         2004-10-27 08:05:03 UTC
>>> 
>>>   FreeBSD src repository
>>> 
>>>   Modified files:
>>>     sys/kern             vfs_subr.c 
>>>     sys/sys              buf.h bufobj.h vnode.h 
>>>   Log:
>>>   Move the syncer linkage from vnode to bufobj.
>>>   
>>>   This is not quite a perfect separation: the syncer still think it knows
>>>   that everything is a vnode.
>>
>>This change strikes me as wrong.  The syncer has to handle things like
>>inode timestamps (utimes(2)) and fifos, which I wouldn't expect to have
>>bufobjs.
> 
> The syncers job is to push dirty buffers onto disk.  In the process it
> will need to call back into whoever owns the buffer so they can do their
> private housekeeping as necessary.
> 
> So the syncer doesn't deal with timestamps, the filesystems do.

One item on my to-do list is to handle timestamp updates by putting
them on the syncer worklist.

I stumbled across a bug a while back that causes files being written to
be synced twice as often as they should be.  The file gets synced once
when the syncer comes across it on the worklist, and it gets synced
again when the syncer encounters the file system syncer vnode, which
results in a call to ffs_sync(), which syncs all the vnodes that have
pending inode timestamp changes.
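
As a rough illustration of the double sync described above, here is a
toy model in C.  It is not kernel code: struct toy_vnode and the toy_*
functions are hypothetical names invented for this sketch, and it
assumes the file is still being written between the two passes, so
neither pass clears the other's dirty state.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a vnode; NOT the kernel's struct vnode. */
struct toy_vnode {
    bool on_worklist;     /* has dirty buffers queued for the syncer */
    bool timestamp_dirty; /* has a pending inode timestamp change */
    int  sync_count;      /* how many times this vnode was synced */
};

/* Pass 1: the syncer reaches each vnode that sits on its worklist. */
static void toy_worklist_pass(struct toy_vnode *v, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (v[i].on_worklist)
            v[i].sync_count++;
}

/*
 * Pass 2: the syncer reaches the file system syncer vnode, which
 * (like ffs_sync() in the text) syncs every vnode with a pending
 * timestamp change, whether or not pass 1 already handled it.
 */
static void toy_fs_sync_pass(struct toy_vnode *v, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (v[i].timestamp_dirty) {
            v[i].sync_count++;
            v[i].timestamp_dirty = false;
        }
}

/* One syncer interval in this toy model: both passes run. */
static void toy_syncer_interval(struct toy_vnode *v, size_t n)
{
    toy_worklist_pass(v, n);
    toy_fs_sync_pass(v, n);
}
```

In this model, a vnode with both flags set is counted twice per
interval, while a vnode with only a pending timestamp change is
counted once.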

This second sync can result in a large burst of activity if there are a
lot of pending timestamp changes.  This is quite noticeable when
unpacking a large tarball or doing something similar that writes to a
lot of files.  There are large bursts of disk activity every 30 seconds
and the machine gets noticeably sluggish.  If you monitor the length of
the syncer worklist, it will vary in a sawtooth manner.

I discussed this privately with mckusick a while back and he told me
that the original intent was to not walk the vnode list in ffs_sync() in
the MNT_LAZY case.  The problem was that timestamp updates could end up
being deferred indefinitely if no buffers were dirtied.

Skipping the vnode list traversal in ffs_sync() in the MNT_LAZY case
would also be a nice optimization just in terms of CPU time because this
list can be quite long.
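
A minimal sketch of how the two ideas could fit together, under stated
assumptions: a timestamp change also queues the vnode for the syncer,
so the lazy sync can skip the list walk without deferring the update
indefinitely.  Every name here (toy_vnode, toy_touch, toy_fs_sync,
TOY_MNT_*) is hypothetical and chosen for illustration; this is not
the actual ffs_sync() code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the MNT_WAIT / MNT_LAZY sync modes. */
enum toy_waitfor { TOY_MNT_WAIT, TOY_MNT_LAZY };

/* Toy stand-in for a vnode; NOT the kernel's struct vnode. */
struct toy_vnode {
    bool on_worklist;     /* queued for the syncer */
    bool timestamp_dirty; /* pending inode timestamp change */
};

/*
 * Assumption: a timestamp update puts the vnode on the syncer
 * worklist even when no buffers were dirtied.
 */
static void toy_touch(struct toy_vnode *vp)
{
    vp->timestamp_dirty = true;
    vp->on_worklist = true;
}

/* Returns the number of vnodes examined by this sync call. */
static size_t toy_fs_sync(struct toy_vnode *v, size_t n,
                          enum toy_waitfor waitfor)
{
    size_t examined = 0;

    /* Lazy case: skip the walk; the worklist covers these vnodes. */
    if (waitfor == TOY_MNT_LAZY)
        return 0;

    for (size_t i = 0; i < n; i++) {
        examined++;
        if (v[i].timestamp_dirty) {
            v[i].timestamp_dirty = false;
            v[i].on_worklist = false;
        }
    }
    return examined;
}
```

The lazy call examines no vnodes regardless of list length, which is
where the CPU saving would come from; the non-lazy call still walks
the whole list.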

It also makes sense to sync the timestamps stored in the inode and the
file data blocks at the same time because the block pointers stored in
the inode may need to be updated.


