amd64/74811: df, nfs mount, negative Avail -> 32/64-bit confusion

Bruce Evans bde at zeta.org.au
Thu Dec 9 16:25:46 PST 2004


On Tue, 7 Dec 2004, Palle Girgensohn wrote:

> >Description:
>
> using FreeBSD 5.3 amd64 as nfs client
>       FreeBSD 4.10 i386 as nfs server

The combination of client and server is critical for demonstrating this
bug.  Broken servers don't implement negative avail counts.  FreeBSD-5's
server was broken in rev.1.140 of nfs_serv.c to "fix" the problem
reported in this PR.  FreeBSD-4's server remains unbroken.

> when a disk is filled up over 100%, Avail becomes negative on the
> server, but hugely positive on the 64-bit platform. Not very
> surprising, but still a bug... :)
>
> 4.10 i386 server:
> Filesystem         1K-blocks     Used   Avail Capacity  Mounted on
> /dev/da6s1f         17388202 16153532 -156386   101%    /dumps/0
>
> 5.3  amd64 client:
> Filesystem                    1K-blocks     Used             Avail Capacity  Mounted on
> banan:/dumps/0                 17388202 16153532 18014398509325598     0%    /mnt

This is caused by sign extension/overflow bugs in nfs_vfsops.c.  From
the version in FreeBSD-5.3 (rev.1.158):

% 	u_quad_t tquad;
% ...
% 			tquad = fxdr_hyper(&sfp->sf_abytes);
% 			if (((long)(tquad / bsize) > LONG_MAX) ||
  			     ^^^^^^^^^^^^^^^^^^^^^
% 			    ((long)(tquad / bsize) < LONG_MIN))
  			     ^^^^^^^^^^^^^^^^^^^^^
% 				continue;
% 			sbp->f_bavail = tquad / bsize;
% 			                ^^^^^^^^^^^^^

-156386 1K-blocks is passed by the server as (uint64_t)(-156386 * 1024) =
(2**64 - 156386 * 1024).  It needs to be converted back to a signed
quantity before dividing it by bsize, but this is not done.  tquad is
still (2**64 - 156386 * 1024).  bsize is always 512 in FreeBSD-5.3
(*).  The division gives the wrong value (2**55 - 156386 * 2).  This
is passed back to userland.  It is a block count in 512-blocks, so df
divides it by 2 to convert to 1K-blocks.  The final value printed is
(2**54 - 156386) = 18014398509325598.
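
The arithmetic above can be checked with a small userland sketch.  This
is not the kernel code: the function names are made up for illustration,
and the signed variant only shows the effect of converting back to a
signed quantity before dividing.  It is not claimed to be the fix that
was or should be committed.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: the byte count that a server with a negative
 * avail count puts on the wire, i.e. (uint64_t)(kblocks * 1024).
 */
static uint64_t
wire_abytes(int64_t avail_kblocks)
{
	return ((uint64_t)(avail_kblocks * 1024));
}

/*
 * Unsigned division, as in the quoted client code: the sign is lost
 * and a small negative count becomes a huge positive one.
 */
static uint64_t
client_bavail_unsigned(uint64_t tquad, uint64_t bsize)
{
	assert(bsize != 0);
	return (tquad / bsize);
}

/*
 * Convert back to a signed quantity before dividing.  The cast assumes
 * the usual two's complement conversion from uint64_t to int64_t.
 */
static int64_t
client_bavail_signed(uint64_t tquad, int64_t bsize)
{
	assert(bsize > 0);
	return ((int64_t)tquad / bsize);
}

/* df's conversion from 512-blocks to 1K-blocks. */
static uint64_t
df_kblocks(uint64_t blocks512)
{
	return (blocks512 / 2);
}
```

With wire_abytes(-156386) and bsize = 512, the unsigned path followed
by df_kblocks() yields 18014398509325598, while the signed path yields
-312772 512-blocks, i.e. -156386 1K-blocks.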

The magic number 18014398509325598 is easy to recognize.  2**64 is
1844..., so huge values starting with the digits 18 are often
misrepresentations of small negative values converted to uint64_t.
Here the value is 1801... instead of 1844..., and on closer examination
has 3 fewer digits.  It is just the corresponding 1844... value divided
by 2**10 = 1024 to convert to 1K-blocks.  Applications like df could
recognize such magic numbers in a similar (if less simple) way and fix
them up, but shouldn't have to.
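
For illustration only, such a fix-up might look like the sketch below.
fixup_wrapped(), its shift argument and its slack threshold are all
hypothetical.  Nothing like this exists in df, and the caller would
have to guess both how far the value was scaled down and how negative
the original could plausibly have been.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical fix-up: if v looks like a small negative count that was
 * converted to uint64_t and then divided by 2**shift, recover the
 * negative value; otherwise return v unchanged.  'slack' bounds how far
 * below 2**(64 - shift) a value may lie and still be treated as wrapped.
 */
static int64_t
fixup_wrapped(int64_t v, unsigned shift, int64_t slack)
{
	int64_t top;

	assert(shift >= 2 && shift <= 63);
	assert(slack >= 0);
	top = (int64_t)1 << (64 - shift);
	if (v < top && v >= top - slack)
		return (v - top);
	return (v);
}
```

With shift = 10 (bytes to 1K-blocks) the value 18014398509325598 maps
back to -156386, while ordinary counts such as 17388202 pass through
unchanged.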

(*) Other aspects of this bug include the code that doubles bsize
actually being executed in some versions of FreeBSD on some machines,
including -current on i386's.  It is broken and gives a kernel panic
for division by bsize = 0 for about half of all possible values for
negative available space, including all values that are likely to occur
(small negative ones).  See PR 56606 for more details of older aspects
of this bug suite.  There are many newer ones.

Bruce


More information about the freebsd-amd64 mailing list