Re: USB-serial adapter suggestions needed

From: Mark Millard <marklmi_at_yahoo.com>
Date: Wed, 10 Jan 2024 19:30:28 UTC
On Jan 10, 2024, at 11:21, Mark Millard <marklmi@yahoo.com> wrote:

> On Jan 10, 2024, at 10:16, bob prohaska <fbsd@www.zefox.net> wrote:
> 
>> On Tue, Jan 09, 2024 at 05:03:42PM -0800, Mark Millard wrote:
>>> On Jan 9, 2024, at 14:47, bob prohaska <fbsd@www.zefox.net> wrote:
>>> 
>> [transcript of ssh-tip disconnect omitted]
>>> 
>>> Interesting.
>>> 
>>> www.zefox.org is the aarch64 that is not configured in config.txt
>>> in the normal aarch64 manner. As I've requested before, please test
>>> using a config.txt that instead has:
>>> 
>>> QUOTE
>>> [all]
>>> arm_64bit=1
>>> dtparam=audio=on,i2c_arm=on,spi=on
>>> dtoverlay=mmc
>>> dtoverlay=disable-bt
>>> device_tree_address=0x4000
>>> kernel=u-boot.bin
>>> 
>>> [pi4]
>>> hdmi_safe=1
>>> armstub=armstub8-gic.bin
>>> 
>>> # Local addition:
>>> [all]
>>> force_mac_address=b8:27:eb:71:46:4f
>>> END QUOTE
>>> 
>>> Please do not use a configuration based in part on armv7 FreeBSD
>>> config.txt material any more for the testing activity: Just use
>>> the FreeBSD normal aarch64 configuration with the force_mac_address
>>> related material added at the end.
>>> 
>>> I want to know if this also fails when powerd is not in
>>> use anywhere.
>>> 
>> 
>> I'd like to let the present OS build/install cycle complete.
>> Then I'll replace config.txt on www.zefox.org and reboot.
>> That should be done in another day or two. The remote console
>> disconnect reported above hasn't happened again, all consoles
>> have stayed connected and responsive.
>> 
>> 
>>> [I assume that the "The Pi4 workstation" is the "pi4 RasPiOS
>>> workstation". True? Presuming yes: Is the RasPiOS the 64 bit
>>> variation? (Just my curiosity.)]
>>> 
>> Yes. Uname -a reports 
>> Linux raspberrypi 6.1.21-v8+ #1642 SMP PREEMPT Mon Apr  3 17:24:16 
>> BST 2023 aarch64 GNU/Linux
>> 
>>> Do you run the buildworld on www.zefox.org and such via the
>>> tip session on pelorus.zefox.org ? Via an ssh session from the
>>> "pi4 RasPiOS workstation" to www.zefox.org ? More generally:
>>> 
>>> A) What runs (if anything) via the tip session started from
>>>   pelorus.zefox.org ?
>>> 
>>> B) What runs (if anything) via the ssh session connected to
>>>   www.zefox.org ?
>>> 
>> 
>> In general the tip session is used only for observation or
>> troubleshooting. Ssh connections are used for other activity, 
>> including OS build/install cycles, poudriere, etc. They are
>> usually placed in the background, writing to log files so that
>> accidental disconnects from the workstation don't stop them.
> 
> Are you using:
> 
> NAME
>     nohup – invoke a utility immune to hangups
> 
> SYNOPSIS
>     nohup [--] utility [arguments]
> 
> DESCRIPTION
>     The nohup utility invokes utility with its arguments and at this time
>     sets the signal SIGHUP to be ignored.  If the standard output is a
>     terminal, the standard output is appended to the file nohup.out in the
>     current directory.  If standard error is a terminal, it is directed to
>     the same place as the standard output.
> 
>     Some shells may provide a builtin nohup command which is similar or
>     identical to this utility.  Consult the builtin(1) manual page.
> 
> ?
> 
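
For the "background job writing to a log file" arrangement described above,
here is a minimal Python sketch of one way to start such a job so that a
dropped ssh session should not take it down. This is not nohup: instead of
ignoring SIGHUP it starts the command in its own session, so the command is
not in the session that gets the hangup when the terminal goes away. The
function name, the log path and the example command are placeholders of
mine, not anything taken from the actual setup.

```python
import subprocess


def start_detached(command, log_path):
    """Start command in its own session, appending its output to log_path.

    start_new_session=True makes the child call setsid(), so it has no
    controlling terminal and is not in the session that receives SIGHUP
    when the ssh connection (and its pty) goes away.
    """
    log = open(log_path, "ab")
    try:
        proc = subprocess.Popen(
            command,
            stdin=subprocess.DEVNULL,   # no reads from the lost terminal
            stdout=log,                 # stdout goes to the log file
            stderr=subprocess.STDOUT,   # stderr goes to the same place
            start_new_session=True,
        )
    finally:
        # The child keeps its own copy of the descriptor.
        log.close()
    return proc


# Hypothetical use (command and path are placeholders):
#   start_detached(["make", "-C", "/usr/src", "buildworld"],
#                  "/tmp/buildworld.log")
```

The caller gets the Popen object back but does not have to wait on it; the
log file is the only record of the run, much like nohup.out would be.
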
>>> A useful test would be to not have the tip command running
>>> on polaris.zefox.org and to just use the ssh to www.zefox.org
>>> instead to start the buildworld/buildkernel. So: No use
>>> of the serial connection when the buildworld is started or
>>> during the build(s). Using tip before that but quitting tip
>>> before starting to load the RPi4B would be okay for this type
>>> of test. The question would be if the:
>>> 
>>> client_loop: send disconnect: Broken pipe
>>> 
>>> still happens.
>>> 
>>> (I'm not claiming that recovery if it fails would be nice. But
>>> finding out if it fails looks to be important.)
>>> 
>>> The contrasting useful test would be to start the buildworld
>>> from the tip session on polaris.zefox.org and to not have any
>>> ssh session to www.zefox.org . The question would be if a
>>> failure of some kind still happens. (The tip session does not
>>> have a pipe in use as far as I know so the detail for
>>> identifying failure would likely be different.)
>>> 
>> 
>> Normal practice is to leave the tip sessions displaying the 
>> console host's login prompt. So long as the console login is 
>> responsive I can assume that host isn't hung.
>> 
>>> Another question would be: do both such tests fail? Just one
>>> (which)? None? So trying both tests eventually would be
>>> important.
>> 
>> In general, ssh sessions behave completely independently. 
>> Ssh connections to tip sessions commonly fail but no other 
>> ssh connection to that terminal server is disturbed visibly.

Silly me: The above is already the "ssh without tip" case and
so is already covered.

That just leaves the "tip not inside an ssh session" case to
experiment with.

>>> It is important to have only one of the 2 types of connections
>>> in use during the buildworld/buildkernel and such activity for
>>> this type of test --and only the one instance of which ever
>>> type the active test is for.
>>> 
>>> 
>> 
>> Apologies if I didn't answer your question; I'm missing the gist.
> 
> I only want one source of hangups/failure, no worries about which
> one (network vs. serial) led to the activity if a failure happens.
> 
> That only ssh sessions that in turn run tip fail suggests that the
> tip session gets the initial problem and then things propagate. I
> want more than a suggestion. For example: direct tip runs that
> are not in any ssh session: still get some form of failures?

Only the above needs a new experiment.

> For
> another: no tip use, just ssh: still get failures?

The just above is already covered.

> Do both ways
> still get failures?

Already known: no.

> Yes, the implication is that some experiments that do not have
> your normal structure are involved and there may be risk of not
> being able to use a tip session as a responsiveness test during
> such an experiment. I'm not suggesting any such thing for normal
> operation once such experiments are finished.
 
Actually the no-tip case is already covered so the responsiveness
testing is not an issue.

>> It remains unclear where the disconnects to tip originate.
> 
> That is part of what I'm requesting exploration of via
> different techniques than past attempts that did not provide
> the information.
> 
>> If the tip
>> session is stopped by typing ~~. from the originating ssh instance I'm 
>> returned to the shell on the terminal server. Ssh isn't disturbed. If 
>> I type ~. the ssh session terminates and I'm back to the workstation's 
>> shell. Would it be informative to start a tip session, then ssh in 
>> separately and try to kill tip?
> 
> A question is if SIGHUP is happening. If it is, then the kill that would
> simulate the issue would be via sending SIGHUP. But this may be only one
> of however many alternatives there may be. I prefer to explore what
> is actually happening than attempted simulations via guesses at what is
> happening.
> 
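
One way to observe, rather than guess, how a tip run ends would be a small
wrapper that records the exit status of the command it ran. A minimal Python
sketch follows, under some loud assumptions: the function names, the log
path and the tip arguments are hypothetical, and it only distinguishes
"terminated by a signal" from "exited normally". If tip catches SIGHUP itself
and then exits on its own, this would show a normal exit, so a negative
result here would not rule SIGHUP out.

```python
import signal
import subprocess
import time


def describe_exit(returncode):
    """Turn a subprocess return code into a short description."""
    if returncode < 0:
        # subprocess reports death-by-signal N as return code -N.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = "signal %d" % -returncode
        return "terminated by " + name
    return "exited with status %d" % returncode


def run_and_log(command, log_path):
    """Run command in the foreground; append how it ended to log_path."""
    seen = []
    # Install a handler (not SIG_IGN): the wrapper survives a SIGHUP, while
    # the child still starts with the default disposition after exec.
    previous = signal.signal(
        signal.SIGHUP, lambda signum, frame: seen.append(signum))
    try:
        returncode = subprocess.run(command).returncode
    finally:
        if previous is not None:
            signal.signal(signal.SIGHUP, previous)
    line = "%s %s: %s; wrapper saw SIGHUP: %s\n" % (
        time.strftime("%Y-%m-%d %H:%M:%S"),
        " ".join(command),
        describe_exit(returncode),
        bool(seen),
    )
    # The terminal may be gone by now, so the result goes to a file.
    with open(log_path, "a") as log:
        log.write(line)
    return line


# Hypothetical use (the tip target name is a placeholder):
#   run_and_log(["tip", "ucom"], "/tmp/tip-exit.log")
```

The "wrapper saw SIGHUP" field is there because the wrapper sits in the same
foreground process group as the command, so it should see the same hangup.
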
>> I'd expect the ssh part of the link
>> to remain up. If not, would it be significant? 
>> 
>> Occasionally warnings like
>> Jan 10 00:23:30 ns1 sshd[925]: error: beginning MaxStartups throttling
>> show up in console messages. Might those be relevant in some way?  
> 
> Hmm. Interesting. Looking around I see notation like:
> 
> MaxStartups 10:30:100
> 
> where (mostly copy/pasted wording from an example, other than detailed formatting):
> 
> 10: concurrent unauthenticated sessions before it begins rejecting some subsequent connections
> 30: The percent of subsequent connections that are rejected [but see below]
> 100: At this many concurrent unauthenticated sessions, sshd rejects all subsequent connections
> 
> Looking, "man sshd_config" reports:
> 
>     MaxStartups
>             Specifies the maximum number of concurrent unauthenticated
>             connections to the SSH daemon.  Additional connections will be
>             dropped until authentication succeeds or the LoginGraceTime
>             expires for a connection.  The default is 10:30:100.
> 
>             Alternatively, random early drop can be enabled by specifying the
>             three colon separated values start:rate:full (e.g. "10:30:60").
>             sshd(8) will refuse connection attempts with a probability of
>             rate/100 (30%) if there are currently start (10) unauthenticated
>             connections.  The probability increases linearly and all
>             connection attempts are refused if the number of unauthenticated
>             connections reaches full (60).
> 
> 
> It does suggest that testing isolated from the source(s) of
> unauthenticated sessions could be worthwhile in case handling
> the load from such sessions when already heavily loaded with
> buildworld/buildkernel or the like leads to other problems (and
> denial of service consequences?).
> 
> I do not expect that this issue is all that likely but
> expectations are not evidence of their own accuracy/inaccuracy.



===
Mark Millard
marklmi at yahoo.com