Change default VFS timestamp precision?
John Baldwin
jhb at freebsd.org
Wed Dec 17 14:58:54 UTC 2014
On Wednesday, December 17, 2014 12:38:44 AM Jilles Tjoelker wrote:
> On Tue, Dec 16, 2014 at 01:48:41PM -0500, John Baldwin wrote:
> > We still ship with vfs.timestamp_precision=0 by default meaning that
> > VFS timestamps have a granularity of one second. It is not unusual on
> > modern systems for multiple updates to a file or directory to occur
> > within a single second (and thus share the same effective timestamp).
> > This can break things that depend on timestamps to know when something
> > has changed or is stale (such as make(1) or NFS clients). On hardware
> > that has a cheap timecounter, I think we should use the most precise
> > timestamps (vfs.timestamp_precision=3). However, I'm less sure of
> > what to do for other cases such as i386/amd64 when not using TSC, or
> > on other platforms. OTOH, perhaps you aren't doing lots of heavy I/O
> > access on a system with a slow timecounter (or if you are doing heavy
> > I/O, slow timecounter access won't be your bottleneck)?
> >
> > I can think of a few options:
> > 1) Change vfs.timestamp_precision default to 3 for all systems.
> >
> > 2) Only change vfs.timestamp_precision default to 3 for amd64/i386 using
> > an #ifdef.
> >
> > 3) Something else?
>
> Although some breakage may be caused, increasing precision sounds fine
> to me, but only to the level of microseconds, since there is no way to
> set a timestamp to the nanosecond (this would be futimens/utimensat). It
> is easy to be surprised when cp -p creates a file that appears older
> than the original.
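
The cp -p surprise described above can be sketched in userland C. This is only
an illustration: it assumes the copy restores times through a microsecond
struct timeval interface such as utimes(2), and the helper names are
hypothetical.

```c
#include <sys/time.h>
#include <time.h>

/*
 * Convert a nanosecond timestamp to the microsecond struct timeval that
 * a utimes(2)-style interface accepts. Hypothetical helper.
 */
static struct timeval
sketch_to_timeval(struct timespec ts)
{
	struct timeval tv;

	tv.tv_sec = ts.tv_sec;
	tv.tv_usec = ts.tv_nsec / 1000;	/* sub-microsecond digits are lost */
	return (tv);
}

/*
 * Return nonzero if a copy whose mtime was restored through a timeval
 * would compare as older than the original.
 */
static int
sketch_copy_looks_older(struct timespec orig)
{
	struct timeval tv = sketch_to_timeval(orig);
	long copy_nsec = (long)tv.tv_usec * 1000;

	return (copy_nsec < orig.tv_nsec);
}
```

For an original mtime of 100.123456789 the restored value is 100.123456000, so
the copy compares as older than its source.
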
Note that vfs_timestamp() always returns a timespec, but a setting of 2 would
limit it to microsecond precision. The important difference for settings >= 2
is that it queries the timecounter on each call rather than using a global
value that is only updated once a second or once a millisecond or so.
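
A rough userland model of the four settings, for illustration only. This is
not the kernel's vfs_timestamp(); the enum and function names are hypothetical,
and the "cached" argument stands in for the periodically updated global value
while "fresh" stands in for a direct timecounter read.

```c
#include <time.h>

/* Hypothetical names mirroring vfs.timestamp_precision values 0..3. */
enum sketch_precision {
	SKETCH_SEC = 0,		/* whole seconds from the cached value */
	SKETCH_TICK = 1,	/* cached value, refreshed every millisecond or so */
	SKETCH_USEC = 2,	/* fresh clock read, truncated to microseconds */
	SKETCH_NSEC = 3		/* fresh clock read, full nanoseconds */
};

static struct timespec
sketch_timestamp(int precision, struct timespec cached, struct timespec fresh)
{
	struct timespec ts;

	switch (precision) {
	case SKETCH_SEC:
		ts.tv_sec = cached.tv_sec;
		ts.tv_nsec = 0;
		break;
	case SKETCH_TICK:
		ts = cached;
		break;
	case SKETCH_USEC:
		ts = fresh;
		ts.tv_nsec -= ts.tv_nsec % 1000;	/* drop sub-microsecond part */
		break;
	case SKETCH_NSEC:
	default:
		ts = fresh;
		break;
	}
	return (ts);
}
```

Only the last two cases consult the fresh reading, which is where the cost of
a slow timecounter would show up.
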
> To avoid cross-arch surprises with applications that use
> second-resolution APIs, either all or no architectures should generate
> timestamps more accurate than seconds.
Actually, it will improve our interoperability with other OSes that already
use sub-second timestamps when sharing filesystems over NFS, for example.
> There is no benefit for the particular case of make(1), since it only
> uses timestamps in seconds.
My bad for not checking that further and instead assuming make would be impacted.
The use case I _am_ familiar with is NFS servers and NFS v3 clients that
depend on the mtime of a directory to know when the lookup cache for a
directory can be invalidated. Our NFS client now defaults to only trusting
cached lookups for 60 seconds to work around races due to seconds-granularity
timestamps from some NFS servers, at the cost of reducing the lookup cache's
effectiveness by a fair amount. Note that Isilon already defaults
vfs.timestamp_precision to 3 on their appliances, and I recently convinced the
folks at TrueNAS to do the same. However, it would also make stock FreeBSD NFS
servers more reliable for NFS v3 if we changed our default.
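
To make the race concrete, here is a minimal sketch of a client that trusts
its cached lookups while the directory mtime is unchanged. It is not the
actual NFS client logic, and both helper names are hypothetical.

```c
#include <stdbool.h>
#include <time.h>

/*
 * Hypothetical client-side check: the lookup cache is considered valid
 * only while the directory mtime reported by the server is unchanged.
 */
static bool
sketch_dir_changed(struct timespec cached_mtime, struct timespec server_mtime)
{
	return (cached_mtime.tv_sec != server_mtime.tv_sec ||
	    cached_mtime.tv_nsec != server_mtime.tv_nsec);
}

/* Model a server that only records mtime with one-second granularity. */
static struct timespec
sketch_seconds_only(struct timespec ts)
{
	ts.tv_nsec = 0;
	return (ts);
}
```

Two directory updates at 100.2 and 100.7 seconds are distinguishable with
sub-second timestamps, but after sketch_seconds_only() both report 100.0 and
the client cannot tell that the second update happened.
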
--
John Baldwin