NFS Mount Hangs
tuexen at freebsd.org
Sat Apr 10 16:12:44 UTC 2021
> On 10. Apr 2021, at 17:56, Rick Macklem <rmacklem at uoguelph.ca> wrote:
>
> Scheffenegger, Richard <Richard.Scheffenegger at netapp.com> wrote:
>>> Rick wrote:
>>> Hi Rick,
>>>
>>>> Well, I have some good news and some bad news (the bad is mostly for Richard).
>>>>
>>>> The only message logged is:
>>>> tcpflags 0x4<RST>; tcp_do_segment: Timestamp missing, segment processed normally
>>>>
> Btw, I did get one additional message during further testing (with r367492 reverted):
> tcpflags 0x4<RST>; syncache_chkrst: Our SYN|ACK was rejected, connection attempt aborted
> by remote endpoint
>
> This only happened once in several test cycles.
That is OK.
>
>>>> But...the RST battle no longer occurs. Just one RST that works and then the SYN gets SYN,ACK'd by the FreeBSD end and off it goes...
>>>>
>>>> So, what is different?
>>>>
>>>> r367492 is reverted from the FreeBSD server.
>>>> I did the revert because I think it might be what is causing otis@'s hang. (In his case, the Recv-Q grows on the socket for the stuck Linux client, while others work.)
>>>>
>>>> Why does reverting fix this?
>>>> My only guess is that the krpc gets the upcall right away and sees an EPIPE when it does soreceive(), which results in soshutdown(SHUT_WR).
> This was bogus and incorrect. The diagnostic printf() I saw was generated for the
> back channel, and that would have occurred after the socket was shut down.
>
>>>
>>> With r367492 you don't get the upcall with the same error state? Or you don't get an error on a write() call, when there should be one?
> If Send-Q is 0 when the network is partitioned, after healing, the krpc sees no activity on
> the socket (until it acquires/processes an RPC it will not do a sosend()).
> Without the 6 minute timeout, the RST battle goes on "forever" (I've never actually
> waited more than 30 minutes, which is close enough to "forever" for me).
> --> With the 6 minute timeout, the "battle" stops after 6 minutes, when the timeout
> causes a soshutdown(..SHUT_WR) on the socket.
> (Since the soshutdown() patch is not yet in "main" (I got comments, but no "reviewed"
> on it), the 6 minute timer won't help if enabled in main. The soclose() won't happen
> for TCP connections with the back channel enabled, such as Linux 4.1/4.2 ones.)
I'm confused. So you are saying that if the Send-Q is empty when you partition the
network, and the peer starts to send SYNs after the healing, FreeBSD responds
with a challenge ACK which triggers the sending of a RST by Linux. This RST is
ignored multiple times.
Is that true? Even with my patch for the bug I introduced?
What version of the kernel are you using?
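
To make the sequence above concrete, here is a minimal sketch of the RFC 5961 rules for a SYN or RST arriving on an established connection. It is an illustration in Python, not FreeBSD's tcp_do_segment(); the function name `classify_segment` and its parameters are hypothetical.

```python
def classify_segment(flags, seq, rcv_nxt, rcv_wnd):
    """Return the RFC 5961 action for a segment on an ESTABLISHED connection.

    flags   -- set of flag names, e.g. {"SYN"} or {"RST"} (hypothetical encoding)
    seq     -- sequence number of the incoming segment
    rcv_nxt -- next sequence number the receiver expects
    rcv_wnd -- current receive window size
    """
    # Distance from rcv_nxt in 32-bit sequence space, so wraparound is handled.
    offset = (seq - rcv_nxt) & 0xFFFFFFFF
    if "RST" in flags:
        if offset == 0:
            return "reset"          # exact match: the connection is torn down
        if offset < rcv_wnd:
            return "challenge-ack"  # in window but not exact: ask the peer to confirm
        return "drop"               # outside the window: silently discarded
    if "SYN" in flags:
        return "challenge-ack"      # SYN on a synchronized connection is never accepted
    return "normal"
```

Under these rules a RST only ends the connection when its sequence number matches rcv_nxt exactly, which is why the RST sent in reply to the challenge ACK is expected to work on the first try.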
Best regards
Michael
>
> If Send-Q is non-empty when the network is partitioned, the battle will not happen.
>
>>
>> My understanding is that he needs this error indication when calling shutdown().
> There are several ways the krpc notices that a TCP connection is no longer functional.
> - An error return like EPIPE from either sosend() or soreceive().
> - A return of 0 from soreceive() with no data (normal EOF from other end).
> - A 6 minute timeout on the server end, when no activity has occurred on the
> connection. This timer is currently disabled for NFSv4.1/4.2 mounts in "main",
> but I enabled it for this testing, to stop the "RST battle goes on forever"
> during testing. I am thinking of enabling it on "main", but this crude bandaid
> shouldn't be thought of as a "fix for the RST battle".
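
The three detection paths in the list above can be illustrated from userspace. This is only an analogy in Python, not krpc code: `connection_state` and `IDLE_TIMEOUT` are hypothetical names, and recv() on an ordinary non-blocking socket stands in for soreceive().

```python
import errno
import socket

IDLE_TIMEOUT = 6 * 60  # seconds; mirrors the 6 minute server-side timer


def connection_state(sock, idle_seconds):
    """Classify a non-blocking stream socket as alive, eof, error or idle-timeout."""
    if idle_seconds >= IDLE_TIMEOUT:
        return "idle-timeout"  # no activity for too long: give up on the connection
    try:
        # MSG_PEEK leaves any pending data in the receive buffer.
        data = sock.recv(4096, socket.MSG_PEEK)
    except BlockingIOError:
        return "alive"  # nothing to read and no pending error
    except OSError as e:
        # For example EPIPE or ECONNRESET reported by the stack.
        return "error:" + errno.errorcode.get(e.errno, str(e.errno))
    if data == b"":
        return "eof"  # a return of 0 bytes: normal EOF from the other end
    return "alive"
```

Note that a socket with nothing queued and no error pending looks "alive" here no matter what has happened on the wire, so only the idle timer can end such a connection.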
>
>>>
>>> From what you describe, this is on writes, isn't it? (I'm asking, as the original problem that was fixed with r367492 occurs in the read path: draining of the so_rcv buffer in the upcall right away, which subsequently influences the ACK sent by the stack.)
>>>
>>> I only added the so_snd buffer after some discussion about whether the WAKESOR shouldn't have a symmetric equivalent in WAKESOW....
>>>
>>> Thus a partial backout (leaving the WAKESOR part inside, but reverting the WAKESOW part) would still fix my initial problem about erroneous DSACKs (which can also lead to extremely poor performance with Linux clients), but possibly address this issue...
>>>
>>> Can you perhaps take MAIN and apply https://reviews.freebsd.org/D29690 for the revert only on the so_snd upcall?
> Since the krpc only uses receive upcalls, I don't see how reverting the send side would have
> any effect?
>
>> Since the release of 13.0 is almost done, can we try to fix the issue instead of reverting the commit?
> I think it has already shipped broken.
> I don't know if an errata is possible, or if it will be broken until 13.1.
>
> --> I am much more concerned with the otis@ stuck client problem than this RST battle that only
> occurs after a network partitioning, especially if it is 13.0 specific.
> I did this testing to try to reproduce Jason's stuck client (with connection in CLOSE_WAIT)
> problem, which I failed to reproduce.
>
> rick
>
> Rs: agreed, a good understanding of where the interaction between stack, socket and in-kernel TCP user breaks is needed;
>
>>
>> If this doesn't help, some major surgery will be necessary to prevent NFS sessions with SACK enabled from transmitting DSACKs...
>
> My understanding is that the problem is related to getting a local error indication after
> receiving a RST segment too late or not at all.
>
> Rs: but the move of the upcall should not materially change that; I don't have a PC here to see if any upcall actually happens on RST...
>
> Best regards
> Michael
>>
>>
>>> I know from a printf that this happened, but whether it caused the RST battle to not happen, I don't know.
>>>
>>> I can put r367492 back in and do more testing if you'd like, but I think it probably needs to be reverted?
>>
>> Please, I don't quite understand why the exact timing of the upcall would be that critical here...
>>
>> A comparison of the soxxx calls and errors between the "good" and the "bad" would be perfect. I don't know if this is easy to do though, as these calls appear to be scattered all around the RPC / NFS source paths.
>>
>>> This does not explain the original hung Linux client problem, but does shed light on the RST war I could create by doing a network partitioning.
>>>
>>> rick
>>
>> _______________________________________________
>> freebsd-net at freebsd.org mailing list
>> https://lists.freebsd.org/mailman/listinfo/freebsd-net
>> To unsubscribe, send any mail to "freebsd-net-unsubscribe at freebsd.org"