NFS 4.1 RECLAIM_COMPLETE FS failed error
NAGY Andreas
Andreas.Nagy at
Mon Jul 9 06:48:12 UTC 2018
Hi! Sorry, I have not forgotten the traces, but I have had no time so far, and as I am currently setting up several servers on the system I don't want to break anything by running tests. I will send them as soon as I have finished my current work, which will be the end of this week at the earliest.
As I am currently setting up/cloning 80 VMs that are stored on the NFS datastore, I can report that the setup performs well and seems to be stable. The only thing that happened twice while working with ZFS snapshots/clones was that the ESXi host lost the connection to the NFS datastore. I don't know whether it was while creating or deleting a clone, but the only way to recover was to restart nfsd or to switch over HAST/CARP; in both cases no VM crashed.
-----Original Message-----
From: owner-freebsd-stable at [mailto:owner-freebsd-stable at] On Behalf Of Rick Macklem
Sent: Monday, 9 July 2018 04:11
To: Daniel Engel <daniel at>; freebsd-stable at
Subject: Re: NFS 4.1 RECLAIM_COMPLETE FS failed error
Daniel Engel wrote:
[stuff snipped]
>I traced the commits that Rick has made since that thread and merged them from 'head' into 'stable':
> 'svnlite checkout'
> 'svnlite merge -c 332790'
> 'svnlite merge -c 333508'
> 'svnlite merge -c 333579'
> 'svnlite merge -c 333580'
> 'svnlite merge -c 333592'
> 'svnlite merge -c 333645'
> 'svnlite merge -c 333766'
> 'svnlite merge -c 334396'
> 'svnlite merge -c 334492'
> 'svnlite merge -c 327674'
Yes, you have all the commits to head related to the 4.1 server that might affect the ESXi client, plus a bunch that should be harmless and that I don't think affect the ESXi client mounts. (Most of these will get MFC'd to stable/11, but I haven't gotten around to it yet.)
The ESXi client issues that might still be in 6.7 (they were in 6.5) and that may bite you are:
- The client does an OpenDowngrade with all OPEN_SHARE_ACCESS and
OPEN_SHARE_DENY bits set for something it calls a "drive lock".
(Adding bits is supposed to be done via an Open/ClaimNull and not
OpenDowngrade.) I'd really like to know if this still happens for 6.7?
- Something about "directory modified too often" when doing deletion of a bunch
of files. (I have no idea what this one means, but apparently it was seen for
other NFSv4.1 servers.)
- Some warnings about "wrong reason for not issuing a delegation". I have a fix
for this one in PR#226650, but they are just warnings and don't seem to
matter much.
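To illustrate the first item: a downgrade is only supposed to remove share bits, never add them. The following is a minimal sketch of that rule, not the FreeBSD server code. The struct open_state and the function downgrade_is_valid() are hypothetical names made up for this example, and the check is simplified to a plain subset test on the current bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* NFSv4 share bit values (READ = 1, WRITE = 2 for both access and deny). */
#define OPEN4_SHARE_ACCESS_READ   0x1u
#define OPEN4_SHARE_ACCESS_WRITE  0x2u
#define OPEN4_SHARE_DENY_READ     0x1u
#define OPEN4_SHARE_DENY_WRITE    0x2u

/* Hypothetical per-open state, for illustration only. */
struct open_state {
	uint32_t access;	/* share_access bits currently held */
	uint32_t deny;		/* share_deny bits currently held */
};

/*
 * Return true if the requested bits are a subset of the bits the open
 * already holds, i.e. the request keeps or removes bits but adds none.
 */
static bool
downgrade_is_valid(const struct open_state *cur, uint32_t access,
    uint32_t deny)
{
	if ((access & ~cur->access) != 0)
		return (false);		/* would add access bits */
	if ((deny & ~cur->deny) != 0)
		return (false);		/* would add deny bits */
	return (true);
}
```

With an open that holds only read access and no deny bits, a request carrying every access and deny bit (the "drive lock" case described above) fails this check, while a request for read access with no deny bits passes.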
The rest of the really nasty stuff happens after a server reboot. The recovery code seemed to be badly broken in the 6.5 client. (All sorts of fun stuff like the client looping doing ExchangeID operations forever, VM crashes...)
>That completely fixed the connection instability, but the NFS share was still mounting read-only with a RECLAIM_COMPLETE error. So, I manually applied the first patch from the previous thread and everything started working:
> --- fs/nfsserver/nfs_nfsdserv.c.savrecl 2018-02-10 20:34:31.166445000 -0500
> +++ fs/nfsserver/nfs_nfsdserv.c 2018-02-10 20:36:07.947490000 -0500
> @@ -4226,10 +4226,9 @@ nfsrvd_reclaimcomplete(struct nfsrv_desc
> goto nfsmout;
> }
> NFSM_DISSECT(tl, uint32_t *, NFSX_UNSIGNED);
> + nd->nd_repstat = nfsrv_checkreclaimcomplete(nd);
> if (*tl == newnfs_true)
> - nd->nd_repstat = NFSERR_NOTSUPP;
> - else
> - nd->nd_repstat = nfsrv_checkreclaimcomplete(nd);
> + nd->nd_repstat = 0;
I think this patch is ok to use, since no other extant client does a ReclaimComplete with "one_fs == true". It does kinda violate the RFC.
The problem is that FreeBSD exports a hierarchy of file systems and telling the server that one of them has been reclaimed is useless. (This hack just assumes the client meant to say "one_fs == false".) There was also a case (I think it was after a server reboot) where the client would do one of these after doing a ReclaimComplete with "one_fs == false". That is definitely bogus, since the "one_fs == false" operation means all file systems have been reclaimed. (Without the above hack, the server would reply NFS4ERR_COMPLETE_ALREADY.)
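The effect of the patch on the reply status can be modelled roughly as follows. This is a simplified sketch, not the server code: struct client_state, the ST_* status values and all three functions are hypothetical stand-ins, and check_reclaim_complete() only assumes that nfsrv_checkreclaimcomplete() marks the client as done and returns the "already complete" error on a repeat.

```c
#include <stdbool.h>

/* Hypothetical status codes for this sketch only. */
enum { ST_OK = 0, ST_NOTSUPP = 1, ST_COMPLETE_ALREADY = 2 };

/* Hypothetical per-client state. */
struct client_state {
	bool reclaim_done;	/* a ReclaimComplete has been accepted */
};

/* Stand-in for nfsrv_checkreclaimcomplete() (assumed behaviour). */
static int
check_reclaim_complete(struct client_state *c)
{
	if (c->reclaim_done)
		return (ST_COMPLETE_ALREADY);
	c->reclaim_done = true;
	return (ST_OK);
}

/* Before the patch: "one_fs == true" is rejected outright. */
static int
reclaim_complete_orig(struct client_state *c, bool one_fs)
{
	if (one_fs)
		return (ST_NOTSUPP);
	return (check_reclaim_complete(c));
}

/*
 * After the patch: the check always runs, so "one_fs == true" is
 * treated like "one_fs == false", and its status is forced to success.
 */
static int
reclaim_complete_patched(struct client_state *c, bool one_fs)
{
	int status = check_reclaim_complete(c);

	if (one_fs)
		status = ST_OK;
	return (status);
}
```

In this model, a "one_fs == true" request always succeeds after the patch, including when it follows an earlier "one_fs == false" request, while a repeated "one_fs == false" request still gets the "already complete" status.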
Anyhow, once I get some packet traces from Andreas for 6.7, I'll try and figure out how to handle at least some of the outstanding issues.
Good luck with it, rick