ESXi NFSv4.1 client id is nasty
Steve Wills
swills at FreeBSD.org
Mon Jun 18 21:21:26 UTC 2018
Would it be possible or reasonable to use the client ID to log a message
telling the admin to enable a sysctl to enable the hacks?
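
To make the idea concrete, here is a minimal sketch in Python (not FreeBSD kernel code) of what I mean. The sysctl name and the substrings used to recognize ESXi are assumptions made up for illustration; the real implementation ID would have to be taken from a packet trace. Since this only logs a message and never changes what the server does, it is arguably closer to the diagnostic use the RFC describes.

```python
import logging

log = logging.getLogger("nfsd")

# Hypothetical sysctl name, for illustration only.
HACKS_SYSCTL = "vfs.nfsd.esxi_workarounds"


def looks_like_esxi(nii_domain, nii_name):
    """Guess from the implementation ID whether the client is ESXi.

    The substrings are assumptions; check a real ExchangeID trace first.
    """
    text = f"{nii_domain} {nii_name}".lower()
    return "vmware" in text or "esx" in text


def esxi_hint(nii_domain, nii_name, hacks_enabled, warned):
    """Log a one-time hint per implementation ID and return the message.

    Returns None when nothing is logged. Protocol behavior is never
    changed here; only the admin gets told about the sysctl.
    """
    if hacks_enabled or not looks_like_esxi(nii_domain, nii_name):
        return None
    key = (nii_domain, nii_name)
    if key in warned:
        return None
    warned.add(key)
    msg = (f"NFSv4.1 client '{nii_name}' ({nii_domain}) looks like ESXi; "
           f"consider setting {HACKS_SYSCTL}=1")
    log.warning(msg)
    return msg
```

The `warned` set keeps the log from filling up when the same client sends many ExchangeIDs.
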
Steve
On 06/17/18 08:35, Rick Macklem wrote:
> Hi,
>
> Andreas Nagy has been doing a lot of testing of the NFSv4.1 client in ESXi 6.5u1
> (VMware) against the FreeBSD server. I have given him a bunch of hackish patches
> to try, and some of them do help. However, not all issues are resolved.
> The problem is that these hacks pretty obviously violate the NFSv4.1 RFC (5661).
> (Details on these come later, for those interested in such things.)
>
> I can think of three ways to deal with this:
> 1 - Just leave the server as is and point people to the issues that should be
>     addressed in the ESXi client.
> 2 - Put the hacks in, but only enable them based on a sysctl not enabled by default.
>     (The main problem with this is when the server also has non-ESXi mounts.)
> 3 - Enable the hacks for ESXi client mounts only, using the implementation ID
>     it presents at mount time in its ExchangeID arguments.
>     - This is my preferred solution, but the RFC says:
>         An example use for implementation identifiers would be diagnostic
>         software that extracts this information in an attempt to identify
>         interoperability problems, performance workload behaviors, or general
>         usage statistics. Since the intent of having access to this
>         information is for planning or general diagnosis only, the client and
>         server MUST NOT interpret this implementation identity information in
>         a way that affects interoperational behavior of the implementation.
>         The reason is that if clients and servers did such a thing, they
>         might use fewer capabilities of the protocol than the peer can
>         support, or the client and server might refuse to interoperate.
>
> Note the "MUST NOT" w.r.t. doing this. Of course, I could argue that, since the
> hacks violate the RFC, then why not enable them in a way that violates the RFC.
>
> Anyhow, I would like to hear from others w.r.t. how they think this should be handled.
>
> Here are details on the breakage and workarounds for those interested, from looking
> at packet traces in Wireshark:
> Fairly benign ones:
> - The client does a ReclaimComplete with one_fs == false and then does a
>   ReclaimComplete with one_fs == true. The server returns
>   NFS4ERR_COMPLETE_ALREADY for the second one, which the ESXi client
>   doesn't like.
>   Workaround: Don't return an error for the one_fs == true case and just treat
>       it the same as "one_fs == false".
>   There is also a case where the client only does the
>   ReclaimComplete with one_fs == true. Since FreeBSD exports a hierarchy of
>   file systems, this doesn't indicate to the server that all reclaims are done.
>   (Other extant clients never do the "one_fs == true" variant of
>   ReclaimComplete.)
>   This case of just doing the "one_fs == true" variant is actually a limitation
>   of the server which I don't know how to fix. However, the same workaround
>   as listed above gets around it.
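
The ReclaimComplete workaround above can be sketched roughly as follows. This is a simplified Python model, not the FreeBSD server code; `ClientState`, `reclaim_done` and the `esxi_workaround` flag are names invented for the example, and the non-workaround branch is only my reading of the behavior described.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    NFS4_OK = "NFS4_OK"
    NFS4ERR_COMPLETE_ALREADY = "NFS4ERR_COMPLETE_ALREADY"


@dataclass
class ClientState:
    # True once the server considers all reclaims for this client done.
    reclaim_done: bool = False


def reclaim_complete(client, one_fs, esxi_workaround=False):
    """Model of ReclaimComplete handling, with and without the workaround."""
    if one_fs and esxi_workaround:
        # Workaround: treat one_fs == true like one_fs == false and
        # never report an error for it.
        client.reclaim_done = True
        return Status.NFS4_OK
    if client.reclaim_done:
        return Status.NFS4ERR_COMPLETE_ALREADY
    if not one_fs:
        client.reclaim_done = True
    # Without the workaround, one_fs == true alone does not tell the
    # server that all reclaims are done.
    return Status.NFS4_OK
```

With the flag off, the false-then-true sequence gets NFS4ERR_COMPLETE_ALREADY on the second call; with it on, both calls succeed and a lone one_fs == true call also ends the reclaim phase.
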
>
> - The client puts random garbage in the delegate_type argument for
>   Open/ClaimPrevious.
>   Workaround: Since the client sets OPEN4_SHARE_ACCESS_WANT_NO_DELEG, it doesn't
>       want a delegation, so assume OPEN_DELEGATE_NONE or OPEN_DELEGATE_NONE_EXT
>       instead of garbage. (Not sure which of the two values makes it happier.)
>
> Serious ones:
> - The client does an OpenDowngrade with arguments set to OPEN4_SHARE_ACCESS_BOTH
>   and OPEN4_SHARE_DENY_BOTH.
>   Since OpenDowngrade is supposed to decrease share_access and share_deny,
>   the server returns NFS4ERR_INVAL. OpenDowngrade is not supposed to ever
>   conflict with another Open, so it is not expected to fail with
>   NFS4ERR_SHARE_DENIED. (A conflict happens when another Open has
>   set an OPEN4_SHARE_DENY value that denies the result of the OpenDowngrade.)
>   I believe this one is done by the client for something it calls a
>   "device lock", and it really doesn't like this failing.
>   Workaround: All I can think of is to ignore the check for new bits not being set
>       and reply NFS_OK, when no conflicting Open exists.
>       When there is a conflicting Open, returning NFS4ERR_INVAL seems to be the
>       only option, since NFS4ERR_SHARE_DENIED isn't listed for OpenDowngrade.
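
A rough Python sketch of that OpenDowngrade check, under the assumption that share_access and share_deny are plain bitmasks and that the other Opens on the file are available as (access, deny) pairs. The function and parameter names are invented for illustration and this is not the server's actual code.

```python
OK = "NFS4_OK"
INVAL = "NFS4ERR_INVAL"


def open_downgrade(cur_access, cur_deny, new_access, new_deny,
                   other_opens, esxi_workaround=False):
    """Return the status for an OpenDowngrade request.

    other_opens is a list of (access, deny) bitmask pairs held by other
    Opens on the same file.
    """
    # A real downgrade may only clear bits, never set new ones.
    adds_bits = bool((new_access & ~cur_access) or (new_deny & ~cur_deny))
    if not adds_bits:
        return OK
    if not esxi_workaround:
        return INVAL
    # Workaround: allow the "upgrade" unless it conflicts with another Open.
    for access, deny in other_opens:
        if (new_access & deny) or (new_deny & access):
            # NFS4ERR_SHARE_DENIED is not listed for OpenDowngrade,
            # so NFS4ERR_INVAL is the only reply left.
            return INVAL
    return OK
```

For example, going from access READ (1) with deny NONE (0) to BOTH/BOTH (3, 3) is rejected normally, accepted with the workaround when no other Open exists, and rejected again when another Open holds READ access.
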
>
> - When a server reboots, the client does not serialize ExchangeID/CreateSession.
>   When the server reboots, a client needs to do a serialized set of RPCs
>   with ExchangeID followed by CreateSession to confirm it. The reply to
>   ExchangeID has a sequence number (eir_sequenceid) in it and the
>   CreateSession needs to have the same value in its csa_sequence argument
>   to confirm the clientid issued by the ExchangeID.
>   The client sends many ExchangeIDs and CreateSessions, so they end up failing
>   many times due to the sequence number not matching the last ExchangeID.
>   (This might only happen in the trunked case.)
>   Workaround: Nothing that I can think of.
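
To illustrate why the unserialized ExchangeID/CreateSession traffic fails, here is a toy Python model. It is a sketch only: `ToyServer`, the way sequence numbers are allocated, and the choice of NFS4ERR_SEQ_MISORDERED as the failure status are assumptions for the example, not a description of the FreeBSD server.

```python
class ToyServer:
    """Remembers only the sequence number from the latest ExchangeID."""

    def __init__(self):
        self._expected = {}   # client owner -> sequence from last ExchangeID
        self._next_seq = 1

    def exchange_id(self, owner):
        """Issue a fresh sequence number, replacing any earlier one."""
        seq = self._next_seq
        self._next_seq += 1
        self._expected[owner] = seq
        return seq

    def create_session(self, owner, csa_sequence):
        """Confirm the client only if csa_sequence matches the last ExchangeID."""
        if self._expected.get(owner) != csa_sequence:
            return "NFS4ERR_SEQ_MISORDERED"
        return "NFS4_OK"


server = ToyServer()

# Serialized: ExchangeID, then CreateSession with the sequence it returned.
seq = server.exchange_id("esxi-host")
serialized = server.create_session("esxi-host", seq)

# Not serialized: a second ExchangeID lands before the first CreateSession.
first = server.exchange_id("esxi-host")
second = server.exchange_id("esxi-host")
stale = server.create_session("esxi-host", first)
fresh = server.create_session("esxi-host", second)
```

In this model `serialized` and `fresh` succeed while `stale` fails, which is the pattern the traces seem to show when many ExchangeIDs are in flight.
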
>
> - ExchangeID sometimes sends eia_clientowner.co_verifier argument as all zeros.
>   Sometimes the client bogusly fills in the eia_clientowner.co_verifier
>   argument to ExchangeID with all 0s instead of the correct value.
>   This indicates to the server that the client has rebooted (it has not)
>   and results in the server discarding any state for the client and
>   re-initializing the clientid.
>   Workaround: The server can ignore the verifier changing and make the recovery
>       work better. This clearly violates RFC5661 and can only be done for
>       ESXi clients, since ignoring this breaks a Linux client hard reboot.
>
> - The client doesn't seem to handle NFS4ERR_GRACE errors correctly.
>   These occur when any non-reclaim operations are done during the grace
>   period after a server boot.
>   (A client needs to delay a while and then retry the operation, repeating
>   for as long as NFS4ERR_GRACE is received from the server. This client
>   does not do this.)
>   Workaround: Nothing that I can think of.
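
For reference, the client behavior described in the parenthetical (delay, retry, repeat while NFS4ERR_GRACE comes back) amounts to a loop like the Python sketch below. The function name, the 5 second default delay and the string status values are assumptions for illustration only.

```python
import time


def call_with_grace_retry(op, delay=5.0, sleep=time.sleep):
    """Call op() until it returns something other than NFS4ERR_GRACE.

    op is any callable that performs one non-reclaim operation and
    returns its NFSv4 status as a string.
    """
    while True:
        status = op()
        if status != "NFS4ERR_GRACE":
            return status
        # Server is still in its grace period: wait, then try again.
        sleep(delay)
```

Passing `sleep` as a parameter is just a convenience so the loop can be exercised without actually waiting.
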
>
> Thanks in advance for any comments, rick