Re: Pi3 answers ssh only if outbound ping is running on -current
Date: Sat, 12 Feb 2022 21:32:25 UTC
On 2022-Feb-12, at 10:56, bob prohaska <fbsd@www.zefox.net> wrote:

> For a few weeks now a Pi3 running -current will not respond to
> an incoming ssh connection unless an outbound ping process is running.
>
> Once the outbound ping is started via the serial console, incoming
> ssh connections are answered normally. Uname -a reports
> FreeBSD www.zefox.org 14.0-CURRENT FreeBSD 14.0-CURRENT #10 main-n253073-6db44b0158c: Sat Feb 12 04:30:21 PST 2022 bob@www.zefox.org:/usr/obj/usr/src/arm64.aarch64/sys/GENERIC arm64
>
> A Pi4 running -current of a few days ago exhibits no such problems.
>
> Another Pi3 running stable/13 has been behaving in the same way.
>
> Both Pi3s successfully set time via ntp on reboot and will
> very briefly (one or two minutes) prompt for an ssh password,
> but no further progress is made and the login attempt times out.
> If the ssh login is attempted a second time, not even a password
> prompt comes back.
>
> Ping times (to an adjacent machine on the same subnet are
> 64 bytes from 50.1.20.26: icmp_seq=2 ttl=64 time=0.978 ms
> 64 bytes from 50.1.20.26: icmp_seq=3 ttl=64 time=0.967 ms
> 64 bytes from 50.1.20.26: icmp_seq=4 ttl=64 time=1.088 ms
> 64 bytes from 50.1.20.26: icmp_seq=5 ttl=64 time=0.983 ms
> 64 bytes from 50.1.20.26: icmp_seq=6 ttl=64 time=1.007 ms
> 64 bytes from 50.1.20.26: icmp_seq=7 ttl=64 time=1.075 ms
> 64 bytes from 50.1.20.26: icmp_seq=8 ttl=64 time=1.020 ms
> 64 bytes from 50.1.20.26: icmp_seq=9 ttl=64 time=1.044 ms
> 64 bytes from 50.1.20.26: icmp_seq=10 ttl=64 time=1.026 ms
> 64 bytes from 50.1.20.26: icmp_seq=11 ttl=64 time=0.908 ms
>
> That might be considered slow, but the correspondent machine
> is only a Pi2 running
> FreeBSD www.zefox.com 14.0-CURRENT FreeBSD 14.0-CURRENT #3 main-71d2d5adfe: Tue Dec 21 00:23:51 PST 2021 bob@www.zefox.com:/usr/obj/usr/freebsd-src/arm.armv7/sys/GENERIC arm
>
> If the outbound ping is started, an incoming ssh connection established
> and the outbound ping subsequently stopped the running ssh connection
> silently freezes; no disconnect, but no response, not even echo. Some
> tens of seconds later, all inputs were responded to. Tried a second time,
> the stoppage recurred, restarting the outbound ping eventually restored
> responsiveness.
>
> With the outbound ping stopped, an inbound ssh attempt silently failed:
>
> bob@raspberrypi:~ $ ssh -vvv 50.1.20.28
> OpenSSH_7.9p1 Raspbian-10+deb10u2+rpt1, OpenSSL 1.1.1d 10 Sep 2019
> debug1: Reading configuration data /etc/ssh/ssh_config
> debug1: /etc/ssh/ssh_config line 19: Applying options for *
> debug2: resolve_canonicalize: hostname 50.1.20.28 is address
> debug2: ssh_connect_direct
> debug1: Connecting to 50.1.20.28 [50.1.20.28] port 22.
> [enter key echoed]
> debug1: connect to address 50.1.20.28 port 22: Connection timed out
> ssh: connect to host 50.1.20.28 port 22: Connection timed out
> bob@raspberrypi:~ $
>
> Thanks for reading and any insights. If I've omitted useful
> details or tests please indicate.
>

You have made multiple reports to the arm list for this issue
without anyone having managed to help. This report does have
more comparative context, which might help someone help.

It may be time to try other lists like freebsd-net and,
possibly, freebsd-hackers or freebsd-stable or freebsd-current.

However, the best thing no matter where you go would be to
(approximately) bisect toward the back-to-back FreeBSD
version-pair on, say, stable/13 at which the problem goes
from not-there to happening. (stable/13 changes more slowly
and so has fewer versions to deal with. Also its KBI may grow
but is constrained to otherwise be more stable [relative to
releng/13.0]. So you are less likely to run into version
compatibility problems for the below suggestion.)

I'd recommend using kernel and world materials from:

https://artifact.ci.freebsd.org/snapshot/stable-13/?C=M&O=D

on a separate microsd card updated from a normal context,
avoiding builds.
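
As a rough illustration of the approximate bisect described above,
here is a minimal Python sketch. It is not tied to any FreeBSD
tooling: the snapshot labels and the is_bad callback are hypothetical
placeholders. In practice is_bad would mean "boot that snapshot's
kernel and check whether inbound ssh fails without the outbound
ping", done by hand. The list is assumed to be ordered oldest to
newest, to start at a known-good snapshot, and to end at a known-bad
one.

```python
from typing import Callable, Sequence, Tuple


def narrow_range(snapshots: Sequence[str],
                 is_bad: Callable[[str], bool]) -> Tuple[str, str]:
    """Return (last_good, first_bad) among the available snapshots.

    Assumes snapshots[0] is known good and snapshots[-1] is known bad,
    and that once the problem appears it stays in later snapshots.
    Only snapshots strictly between the two ends are tested.
    """
    if len(snapshots) < 2:
        raise ValueError("need at least one good and one bad snapshot")

    lo, hi = 0, len(snapshots) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(snapshots[mid]):
            hi = mid   # problem already present: look earlier
        else:
            lo = mid   # still fine: look later
    return snapshots[lo], snapshots[hi]


if __name__ == "__main__":
    # Hypothetical labels standing in for snapshot directories.
    available = [f"snap-{i}" for i in range(10)]
    first_bad_index = 6  # pretend the problem first shows up here
    pair = narrow_range(available,
                        lambda s: available.index(s) >= first_bad_index)
    print(pair)
```

The pair returned is adjacent only with respect to the snapshots that
are actually available, so it may still span several commits.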

Remember that older stable/13 worlds can run on newer kernels
generally. So you might only need to update the kernel after
getting an initial, somewhat older context in place. (It is
not obvious if it is a kernel-only problem or not.) If it is
a kernel problem, you might be able to put down a releng/13.0
world and never change it during the approximate bisect
activity.

For what https://artifact.ci.freebsd.org/snapshot/ has
available, this avoids having to build the versions. It
also allows checking if your builds are behaving differently
than the official snapshots do.

https://artifact.ci.freebsd.org/snapshot/ may not be able
to get you to the back-to-back FreeBSD version-pair: the
range might be wider. Sometimes the wider range is enough
by inspection of the types of commits in the range. So
I'd report whatever range you find without having done
any builds.

I'll note that I have no problem with connecting via ssh
to a RPi3B running my build of (line split for readability):

# uname -apKU
FreeBSD Rock64_RPi_4_3_2v1p2 14.0-CURRENT FreeBSD 14.0-CURRENT #28 main-n252475-e76c0108990b-dirty: Sat Jan 15 23:39:27 PST 2022
root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA53-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA53
arm64 aarch64 1400047 1400047

I have no stable/13 context set up for a RPi3B, only
stable/13's that have an untuned ZFS context. Still, I
wonder if that might operate well enough to test the
issue, despite the 1 GiByte of RAM limitation. I may
test that later today.

===
Mark Millard
marklmi at yahoo.com