Re: NFS Mount Hangs
Scheffenegger, Richard
Richard.Scheffenegger at netapp.com
Fri Mar 19 16:14:04 UTC 2021
Hi Rick,
I did some reshuffling of socket-upcalls recently in the TCP stack, to prevent some race conditions with our $work in-kernel NFS server implementation.
Just mentioning this, as it may slightly change the timing: the upcall is now mostly delayed until TCP processing is all done, whereas before an in-kernel consumer could register for a socket upcall and do some fancy stuff with the data sitting in the socket buffers before returning to the TCP processing.
But I think there is no socket data handling being done in the upstream in-kernel NFS server (and I have not even checked whether it actually registers a socket-upcall handler).
https://reviews.freebsd.org/R10:4d0770f1725f84e8bcd059e6094b6bd29bed6cc3
If you can reproduce this easily, perhaps back out this change and see if that has an impact...
The NFS server is, to my knowledge, the only upstream in-kernel TCP consumer which may be impacted by this.
Richard Scheffenegger
-----Original Message-----
From: owner-freebsd-net at freebsd.org <owner-freebsd-net at freebsd.org> On Behalf Of Rick Macklem
Sent: Friday, 19 March 2021 16:58
To: tuexen at freebsd.org
Cc: Scheffenegger, Richard <Richard.Scheffenegger at netapp.com>; freebsd-net at freebsd.org; Alexander Motin <mav at FreeBSD.org>
Subject: Re: NFS Mount Hangs
Michael Tuexen wrote:
>> On 18. Mar 2021, at 21:55, Rick Macklem <rmacklem at uoguelph.ca> wrote:
>>
>> Michael Tuexen wrote:
>>>> On 18. Mar 2021, at 13:42, Scheffenegger, Richard <Richard.Scheffenegger at netapp.com> wrote:
>>>>
>>>>>> Output from the NFS Client when the issue occurs # netstat -an |
>>>>>> grep NFS.Server.IP.X
>>>>>> tcp 0 0 NFS.Client.IP.X:46896 NFS.Server.IP.X:2049 FIN_WAIT2
>>>>> I'm no TCP guy. Hopefully others might know why the client would
>>>>> be stuck in FIN_WAIT2 (I vaguely recall this means it is waiting
>>>>> for a fin/ack, but could be wrong?)
>>>>
>>>> FIN_WAIT2 is the state the client ends up in after the client side actively close()s the TCP session and the server has ACKed the FIN.
>> Jason noted:
>>
>>> When the issue occurs, this is what I see on the NFS Server.
>>> tcp4 0 0 NFS.Server.IP.X.2049 NFS.Client.IP.X.51550 CLOSE_WAIT
>>>
>>> which corresponds to the state on the client side. The server
>>> received the FIN from the client and acked it.
>>> The server is waiting for a close call to happen.
>>> So the question is: Is the server also closing the connection?
>> Did you mean to say "client closing the connection here?"
>Yes.
>>
>> The server should call soclose() { it never calls soshutdown() } when
>> soreceive(with MSG_WAIT) returns 0 bytes or an error that indicates
>> the socket is broken.
Btw, I looked and the soreceive() is done with MSG_DONTWAIT, but the EWOULDBLOCK is handled appropriately.
>> --> The soreceive() call is triggered by an upcall for the rcv side of the socket.
>> So, are you saying the FreeBSD NFS server did not call soclose() for this case?
>Yes. If the state at the server side is CLOSE_WAIT, no close call has happened yet.
>The FIN from the client was received, it was ACKED, but no close() call
>(or shutdown(..., SHUT_WR) or shutdown(..., SHUT_RDWR)) was issued.
>Therefore, no FIN was sent and the client should be in the FINWAIT-2
>state. This was also reported. So the reported states are consistent.
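
To make the state argument above concrete, here is a minimal sketch of the close handshake as a lookup table. It is a simplified model, not kernel code: only the states discussed in this thread are included, and the names `TRANSITIONS`, `step` and `simulate` are made up for illustration.

```python
# Simplified model of the TCP close handshake (illustrative only).
# Keys are (current state, event); values are the next state.
TRANSITIONS = {
    ("ESTABLISHED", "close"): "FIN_WAIT_1",     # active close: send FIN
    ("ESTABLISHED", "recv_fin"): "CLOSE_WAIT",  # passive side: ACK the FIN
    ("FIN_WAIT_1", "recv_ack"): "FIN_WAIT_2",   # our FIN was ACKed
    ("FIN_WAIT_2", "recv_fin"): "TIME_WAIT",    # peer finally sent its FIN
    ("CLOSE_WAIT", "close"): "LAST_ACK",        # application closed: send FIN
    ("LAST_ACK", "recv_ack"): "CLOSED",         # our FIN was ACKed
}


def step(state, event):
    """Return the next state for one event in the simplified model."""
    return TRANSITIONS[(state, event)]


def simulate(server_calls_close):
    """Client closes first; return the final (client, server) states."""
    client = server = "ESTABLISHED"
    client = step(client, "close")         # client sends FIN
    server = step(server, "recv_fin")      # server receives FIN, sends ACK
    client = step(client, "recv_ack")      # client sees the ACK
    if server_calls_close:
        server = step(server, "close")     # server sends its own FIN
        client = step(client, "recv_fin")  # client ACKs it
        server = step(server, "recv_ack")  # server sees the ACK
    return client, server
```

With `simulate(False)` the model stays at FIN_WAIT_2 on the client and CLOSE_WAIT on the server, which matches the netstat output reported in this thread. With `simulate(True)` the server reaches CLOSED and the client moves on to TIME_WAIT.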
For a test, I commented out the soclose() call in the server side krpc and, when I dismounted, it did leave the server socket in CLOSE_WAIT.
For the FreeBSD client, it did the dismount and the socket was in FIN_WAIT2 for a little while and then disappeared (someone mentioned a short timeout and that seems to be the case).
I might argue that the Linux client should not get hung when this occurs, but there does appear to be an issue on the FreeBSD end.
So it does appear you have a case where the soclose() call is not happening on the FreeBSD NFS server. I am a little surprised, since I don't think I've heard of this before and the code is at least 10 years old (at least the parts related to this).
For the soclose() to not happen, the reference count on the socket structure cannot have gone to zero (i.e. a SVC_RELEASE() was missed). Upon code inspection, I was not able to spot a reference counting bug.
(Not too surprising, since a reference counting bug should have shown up long ago.)
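
As a rough illustration of why a single missed release would leave the socket open, here is a small sketch of a reference-counted object that only closes when the last reference is dropped. The `Transport` class is hypothetical and is not the krpc data structure; the `closed` flag merely stands in for the soclose() call.

```python
import threading


class Transport:
    """Hypothetical stand-in for a reference-counted server transport."""

    def __init__(self):
        self._lock = threading.Lock()
        self._refs = 1          # the creator holds the first reference
        self.closed = False     # stands in for "soclose() has been called"

    def acquire(self):
        """Take an additional reference."""
        with self._lock:
            self._refs += 1

    def release(self):
        """Drop one reference; the last release closes the transport."""
        with self._lock:
            self._refs -= 1
            last = self._refs == 0
        if last:
            self.closed = True
```

If one `release()` is skipped, the count never reaches zero and `closed` stays False, which in this analogy is the socket sitting in CLOSE_WAIT.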
The only thing I spotted that could conceivably explain this is that the function svc_vc_stat() which returns the indication that the socket has been closed at the other end did not bother to do any locking when it checked the status. (I am not yet sure if this could result in the status of XPRT_DIED being missed by the call, but if so, that would result in the soclose() call not happening.)
I have attached a small patch, which I think is safe, that adds locking to svc_vc_stat(), which I am hoping you can try at some point.
(I realize this is difficult for a production server, but...) I have tested it a little and will test it some more, to try and ensure it does not break anything.
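
The general idea of reading the status under the same lock that protects its update can be sketched as follows. This is not the attached patch and not the krpc code: the `Connection` class and its methods are hypothetical, and only the status names XPRT_DIED and XPRT_IDLE are borrowed from the RPC transport status values.

```python
import threading


class Connection:
    """Hypothetical transport whose status is read and written under one lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._peer_closed = False

    def mark_peer_closed(self):
        """Called from the receive path once the peer's close is seen."""
        with self._lock:
            self._peer_closed = True

    def stat(self):
        """Report the status while holding the lock that guards the flag."""
        with self._lock:
            return "XPRT_DIED" if self._peer_closed else "XPRT_IDLE"
```

Python's interpreter lock makes the race hard to provoke here, so the sketch only shows the pattern: the reader and the writer agree on one lock, so a status change cannot be missed between the update and the check.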
I have also cc'd mav@, since he's the guy who last worked on this code, in case he has any insight w.r.t. how the soclose() might get missed (or any other way the server socket gets stuck in CLOSE_WAIT).
rick
ps: I'll create a PR for this, so that it doesn't get forgotten.
Best regards
Michael
>
> rick
>
> Best regards
> Michael
>> This will last for ~2 min or so, but is asynchronous. However, the same 4-tuple can not be reused during this time.
>>
>> In other words, from the socket / TCP perspective, a properly executed
>> active close() will end up in this state. (If the other side initiated
>> the close, i.e. a passive close, it will not end up in this state.)
>>
>>
>> _______________________________________________
>> freebsd-net at freebsd.org mailing list
>> https://lists.freebsd.org/mailman/listinfo/freebsd-net
>> To unsubscribe, send any mail to "freebsd-net-unsubscribe at freebsd.org"