Re: -current dropping ssh connections

From: Michael Gmelin <grembo_at_freebsd.org>
Date: Wed, 21 Jun 2023 19:33:46 UTC

> On 21. Jun 2023, at 20:03, bob prohaska <fbsd@www.zefox.net> wrote:
> 
> On Wed, Jun 21, 2023 at 10:45:25AM -0700, Mark Millard wrote:
>>> On Jun 21, 2023, at 10:24, bob prohaska <fbsd@www.zefox.net> wrote:
>>> 
>>> I've got a Pi4 running -current that seems to selectively drop ssh connections.
>> 
>> Only when the ssh has text streaming over it? Even when it
>> is idle? Any other types of context differences that lead
>> to observable differences of some type related to the
>> disconnects (vs. lack of them)?
> 
> I can't detect any consistent pattern. For a while I thought load on the
> sshd-host end made a difference, but the latest disconnect was on an idle
> system with serial console output the only traffic on the dropped connection. 
> 
>>> Connections running a shell seem to stay up, but a session running tip to a
>>> usb-serial adapter (FTDI TTL232R-3V3) seems to go away within a few hours.
>> 
>> The way that reads, ssh to a shell and then running tip in
>> that shell would stay up. (Does it?) tip is being run
>> without ssh running a shell? May be more detail about the
>> two contexts of establishing the connection is needed here?
>> 
> 
> No, other way 'round. In both cases an ssh connection was made which
> started a shell. In one a tip session was started, which seems prone 
> to dropping. In the other an active shell (typically running buildworld, 
> or maybe idle) kept running. This makes me think (perhaps wrongly) that 
> tip is involved with the disconnection. Both shells are started as a
> regular user and then su-d to root.
> 
> I'm fairly confident this isn't a client-side or NAT problem, simply because
> there are a dozen or so other ssh sessions running from the ssh client to the
> various Pi2/3/4 hosts in my collection which stay up basically until they're
> taken down deliberately.
> 
> I seem to (vaguely) recall a discussion of ssh problems over NAT some months 
> ago, something about tolerating missing ts (timestamps?). Is that still possible?

You can check whether the sysctl net.inet.tcp.tolerate_missing_ts is set to 1.

It should be set to 1 by default since 13.1, but maybe it’s different in -current.
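
A minimal sketch of that check, assuming a stock sysctl(8) on the Pi4. The
variable name "oid" is only for illustration, and the fallback branch just
covers systems where the OID does not exist:

```sh
#!/bin/sh
# OID named above; the variable name is hypothetical.
oid=net.inet.tcp.tolerate_missing_ts

# -n prints only the value, without the "name: " prefix.
if val=$(sysctl -n "$oid" 2>/dev/null); then
    echo "$oid=$val"
else
    echo "$oid is not available on this system"
fi

# If it reports 0, it can be changed at runtime as root with:
#   sysctl net.inet.tcp.tolerate_missing_ts=1
# and, assuming the usual boot-time handling, kept across reboots by
# adding the same name=value line to /etc/sysctl.conf.
```

A value of 1 would mean the setting is already on, so the cause of the drops
would have to be looked for elsewhere.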

Best