Re: Shutdown -r under -current hangs on RPi3

From: Mark Millard <marklmi_at_yahoo.com>
Date: Sat, 23 Sep 2023 21:00:47 UTC

On Sep 23, 2023, at 12:41, Mark Millard <marklmi@yahoo.com> wrote:

> On Sep 23, 2023, at 12:16, bob prohaska <fbsd@www.zefox.net> wrote:
> 
>> On Sat, Sep 23, 2023 at 11:34:41AM -0700, Mark Millard wrote:
>>> On Sep 23, 2023, at 11:26, Mark Millard <marklmi@yahoo.com> wrote:
>>> 
>>>> On Sep 23, 2023, at 08:52, bob prohaska <fbsd@www.zefox.net> wrote:
>>>> 
>>>>> From time to time, but seemingly more often lately, a Pi3
>>>>> gets stuck during shutdown -r. The machine is running -current
>>>>> from a mechanical usb hard disk through a powered hub. No micro
>>>>> SD card is used. Once up it's quite stable.
>>>>> 
>>>>> The console reports
>>>>> 
>>>>> login: Sep 23 08:20:37 pelorus shutdown[224]: reboot by bob:
>>>>> Stopping sshd.
>>>>> Waiting for PIDS: 1063.
>>>>> Stopping cron.
>>>>> Waiting for PIDS: 1073.
>>>>> Stopping powerd.
>>>>> Waiting for PIDS: 1002.
>>>>> Stopping devd.
>>>>> Waiting for PIDS: 752.
>>>>> Writing entropy file: .
>>>>> Writing early boot entropy file: .
>>>>> .
>>>>> Terminated
>>>>> Sep 23 08:20:43 pelorus syslogd: exiting on signal 15
>>>>> Waiting (max 60 seconds) for system process `vnlru' to stop... done
>>>>> 
>>>>> Waiting (max 60 seconds) for system process `syncer' to stop... Syncing disks, vnodes remaining... 4 0 0 0 done
>>>>> All buffers synced.
>>>>> Uptime: 23h31m45s
>>>>> Khelp module "ertt" can't unload until its refcount drops from 1 to 0.
>>>> 
>>>> I've gotten the above message on rare occasions over the years on
>>>> the amd64 system (ThreadRipper 1950X). (But reboots and shutdowns
>>>> are not frequent for this system normally.)
>>>> 
>>>> There is a recent bug report:
>>>> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=271677
>>>> 
>>>> So the issue is not limited to your context.
>>>> 
>>>>> Resetting system ...
>>>>> 
>>>>> At that point all activity ceases. The only clue I can recognize is that
>>>>> the red power LED remains off, as if FreeBSD never relinquishes control
>>>>> to the Pi firmware, which turns the LED back on at powerup. Power-cycling
>>>>> results in a normal reboot.
>>>> 
>>> 
>>> The system might produce more messages about the
>>> shutdown activity if it has been booted via:
>>> 
>>> boot -v
>>> 
>>> This might narrow down the context some for someone
>>> familiar with the messages, if you are lucky enough
>>> to eventually get an example from a boot -v context.
>> 
>> I'm confused here. Boot is normal, and it doesn't
>> seem to even reach the boot stage during reboot. Can
>> boot -v affect the _next_ boot?
> 
> boot -v can add messages both to boot-time and to the
> later reboot/shutdown-time.
> 
> You would have to keep booting with "boot -v", hoping
> that the later reboot/shutdown would report extra
> messages but would also include the 'Khelp module "ertt"
> can't unload' message in the sequence. Once you had
> such a boot, you would report the text around the
> 'Khelp module "ertt" can't unload' message to show the
> extra context.

Actually, the extra messages before the hangup may be
useful even if no 'Khelp module "ertt" can't unload'
message occurs: it still gives a better idea of the
staging.
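
If the serial console output is being captured to a file on
another machine, something like the following could pull out
the part worth reporting. This is only a sketch: the function
name and the line counts are my own choices for illustration,
not anything FreeBSD provides. It prints the text around the
Khelp message when the message is present and otherwise just
the last lines before the hang.

```sh
#!/bin/sh
# Sketch only: show_shutdown_context is a hypothetical helper name.
# Argument: path to a saved console capture (plain text).
show_shutdown_context() {
    log="$1"
    pattern='Khelp module "ertt"'

    if grep -F -q -- "$pattern" "$log"; then
        # Message present: show 30 lines before and 5 lines after it.
        grep -F -B 30 -A 5 -- "$pattern" "$log"
    else
        # No ertt message: the tail still shows how far shutdown got.
        tail -n 40 "$log"
    fi
}
```

Usage would be along the lines of
"show_shutdown_context console.log", where console.log is
whatever file the capture was saved to.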

>> It isn't clear to me that the bug report is related. It
>> seems to focus on the ertt message, while my problems
>> wait until the system claims to be resetting and then
>> gets stuck.
> 
> I've had to force poweroff after some of the messages.
> 
> The "can't unload until its refcount drops from 1 to 0"
> suggests that it is waiting to unload. (But I've no
> low level detail establishing that is actually what
> was going on.)
> 
> Have you recently had it get stuck without first having
> the 'Khelp module "ertt" can't unload' message? (I've
> not had other reboot hangups except for other known
> problems that were later fixed. I've not had other
> reboot/shutdown hangups in a very long time.)
> 
>> Maybe the ertt error is related, but the
>> association isn't consistent.
> 
> Are you saying that in recent times you have had hangups
> that did not first show the 'Khelp module "ertt" can't
> unload' message?
> 
> The message suggests that the refcount decreasing would
> lead to it not waiting any longer. (But I've no low
> level detail establishing that is actually what was
> going on when I did not have to force power off.)
> 
>> Unfortunately the hang does not seem easily reproducible.
> 
> Consistent with my "rare".
> 
>> Several consecutive shutdown -r reboots were successful.
> 
> To my knowledge/memory, I've never gotten the message after
> being booted for only a short time (little activity).
> 
>> The hangs seen so far have all followed OS build/install
>> sessions.
> 
> So a "boot -v" before such sessions might, eventually, prove
> useful.
> 
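
Since the hang is rare, typing "boot -v" at the loader prompt
before every build/install session is easy to forget. One
alternative, sketched here on the assumption that the usual
/boot/loader.conf is what the Pi3 setup reads, is to make
verbose booting the default until an example is captured:

```sh
# /boot/loader.conf
# Equivalent to "boot -v" at the loader prompt on every boot.
boot_verbose="YES"
```

The line can be removed again once a verbose example of the
hang has been recorded.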



===
Mark Millard
marklmi at yahoo.com