amd64/74811: df, nfs mount, negative Avail -> 32/64-bit confusion
Bruce Evans
bde at zeta.org.au
Thu Dec 9 16:30:30 PST 2004
The following reply was made to PR amd64/74811; it has been noted by GNATS.
From: Bruce Evans <bde at zeta.org.au>
To: Palle Girgensohn <girgen at freebsd.org>
Cc: FreeBSD-gnats-submit at freebsd.org, freebsd-amd64 at freebsd.org
Subject: Re: amd64/74811: df, nfs mount, negative Avail -> 32/64-bit confusion
Date: Fri, 10 Dec 2004 11:25:42 +1100 (EST)
On Tue, 7 Dec 2004, Palle Girgensohn wrote:
> >Description:
>
> using FreeBSD 5.3 amd64 as nfs client
> FreeBSD 4.10 i386 as nfs server
The combination of client and server is critical for demonstrating this
bug. Broken servers don't implement negative avail counts. FreeBSD-5's
server was broken in rev.1.140 of nfs_serv.c to "fix" the problem
reported in this PR. FreeBSD-4's server remains unbroken.
> when a disk is filled up over 100%, Avail becomes negative on the
> server, but hugely positive on the 64-bit platform. Not very
> surprising, but still a bug... :)
>
> 4.10 i386 server:
> Filesystem 1K-blocks Used Avail Capacity Mounted on
> /dev/da6s1f 17388202 16153532 -156386 101% /dumps/0
>
> 5.3 amd64 client:
> Filesystem 1K-blocks Used Avail Capacity Mounted on
> banan:/dumps/0 17388202 16153532 18014398509325598 0% /mnt
This is caused by sign extension/overflow bugs in nfs_vfsops.c. From
the version in FreeBSD-5.3 (rev.1.158):
% u_quad_t tquad;
% ...
% tquad = fxdr_hyper(&sfp->sf_abytes);
% if (((long)(tquad / bsize) > LONG_MAX) ||
%      ^^^^^^^^^^^^^^^^^^^^^
%     ((long)(tquad / bsize) < LONG_MIN))
%      ^^^^^^^^^^^^^^^^^^^^^
%     continue;
% sbp->f_bavail = tquad / bsize;
%                 ^^^^^^^^^^^^^
-156386 1K-blocks is passed by the server as (uint64_t)(-156386 * 1024) =
(2**64 - 156386 * 1024). It needs to be converted back to a signed
quantity before dividing it by bsize, but this is not done. tquad is
still (2**64 - 156386 * 1024). bsize is always 512 in FreeBSD-5.3
(*). The division gives the wrong value (2**55 - 156386 * 2). This
is passed back to userland. It is a block count in 512-blocks, so df
divides it by 2 to convert to 1K-blocks. The final value printed is
(2**54 - 156386) = 18014398509325598.
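
The arithmetic above can be checked with a small userland sketch. It is
not the kernel code: the helper names (wire_abytes, bavail_unsigned,
bavail_signed) are hypothetical, and bsize = 512 is taken from the text.
The signed variant assumes a two's complement machine for the cast from
uint64_t to int64_t.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the byte count as it travels in the 64-bit field. */
static uint64_t
wire_abytes(int64_t avail_bytes)
{
	/* A negative count wraps modulo 2**64. */
	return (uint64_t)avail_bytes;
}

/* Unsigned division, as in the quoted assignment to f_bavail. */
static uint64_t
bavail_unsigned(uint64_t tquad, uint64_t bsize)
{
	assert(bsize != 0);
	return tquad / bsize;
}

/*
 * Convert back to a signed quantity before dividing.  The cast is
 * implementation-defined in older C standards; it gives the expected
 * result on two's complement machines.
 */
static int64_t
bavail_signed(uint64_t tquad, int64_t bsize)
{
	assert(bsize != 0);
	return (int64_t)tquad / bsize;
}
```

With the value from the PR, bavail_unsigned(wire_abytes(-156386 * 1024), 512)
halved for df's 1K-block display gives 18014398509325598, while
bavail_signed() halved the same way gives -156386.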
The magic number 18014398509325598 is easy to recognize. 2**64 is
1844..., so huge values starting with the digits 18 are often
misrepresentations of small negative values converted to uint64_t.
Here the value is 1801... instead of 1844..., and on closer examination
has 3 fewer digits. It is just the corresponding 1844... value divided
by 2**10 = 1024 to convert to 1K-blocks. Applications like df could
recognize such magic numbers in a similar (though less simple) way and
fix them up, but shouldn't have to.
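
For illustration only, such a fix-up might look like the sketch below.
This is a hypothetical heuristic, not something df does. It assumes the
value is a 1K-block count derived from a 64-bit byte count, so it lies
below 2**54, and that anything in the upper half of that range is really
a wrapped negative number. The name unwrap_1k_blocks is made up.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical heuristic: map a 1K-block count that may be a wrapped
 * negative value back to a signed count.
 */
static int64_t
unwrap_1k_blocks(uint64_t blocks_1k)
{
	const uint64_t wrap = UINT64_C(1) << 54;	/* 2**64 / 1024 */

	/* Assumption: a 64-bit byte count / 1024 cannot reach 2**54. */
	assert(blocks_1k < wrap);
	if (blocks_1k >= wrap / 2)
		return -(int64_t)(wrap - blocks_1k);
	return (int64_t)blocks_1k;
}
```

Under these assumptions, unwrap_1k_blocks(18014398509325598) recovers
-156386, and ordinary sizes such as 17388202 pass through unchanged. The
guess would be wrong for a filesystem that really has 2**53 or more
1K-blocks available, which is one reason to fix this in the kernel rather
than in applications.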
(*) Other aspects of this bug include the code that doubles bsize
actually being executed in some versions of FreeBSD on some machines,
including -current on i386's. It is broken and gives a kernel panic
for division by bsize = 0 for about half of all possible values for
negative available space, including all values that are likely to occur
(small negative ones). See PR 56606 for more details of older aspects
of this bug suite. There are many newer ones.
Bruce