Date: Tue, 27 Jun 2023 10:16:57 -0700
From: bob prohaska
To: Mark Millard
Cc: freebsd-arm@freebsd.org
Subject: Re: -current on armv7 stuck with flashing disk light
References: <066FD282-1637-448C-99FF-BA62718386F0@yahoo.com>
In-Reply-To: <066FD282-1637-448C-99FF-BA62718386F0@yahoo.com>
List-Id: Porting FreeBSD to ARM processors
List-Archive: https://lists.freebsd.org/archives/freebsd-arm

On Tue, Jun 27, 2023 at 09:59:40AM -0700, Mark Millard wrote:
> On Jun 27, 2023, at 09:47, Mark Millard wrote:
>
> > On Jun 27, 2023, at 09:29, bob prohaska wrote:
> >
> >> On Mon, Jun 26, 2023 at 07:57:05PM -0700, Mark Millard wrote:
> >>> On Jun 26, 2023, at 19:12, bob prohaska wrote:
> >>>
> >>>> A Pi2 freshly updated to
> >>>> FreeBSD 14.0-CURRENT #41 main-c3e58ace31: Mon Jun 26 17:06:01 PDT 2023
> >>>> bob@www.zefox.com:/usr/obj/usr/src/arm.armv7/sys/GENERIC arm
> >>>> got stuck with a flashing USB disk LED after starting a -j3 buildworld.
> >>>> No response to debugger escape, had to pull the plug.
> >
> > I'm confused.
> >
> > That says "stuck with a flashing USB disk LED". But:
> >
> > http://nemesis.zefox.com/~bob/fbsd/rpi2/20230623/readme
> >
> > says: "the disk had gone to sleep mode. Both LEDs were off"
> >
> > Are these two different examples with variable behavior
> > across the examples?
> >

Yes, I got mixed up. There have been several failures, some belated
and the most recent one which was prompt (with a new kernel).

> >>> If I understand right, the LED flashing means the disk
> >>> had not stopped doing I/O: the system was still running,
> >>> doing disk activity. (But I do not have a description
> >>> of what your drive documentation says about how the
> >>> drive handles the LED and what various patterns/colors
> >>> may mean.)
> >>>
> >>> If the processes associated with processing input that
> >>> would identify the debugger escape had the kernel stacks
> >>> involved swapped out to swap space, I doubt that the
> >>> debugger escape would work until/unless the kernel
> >>> stacks are brought back into kernel RAM.
> >>>
> >>> Avoiding the specific way of losing control is why I
> >>> have in /etc/sysctl.conf :
> >>>
> >>> #
> >>> # Together this pair avoids swapping out the process kernel stacks.
> >>> # This avoids processes for interacting with the system from being
> >>> # hung-up by such.
> >>> vm.swap_enabled=0
> >>> vm.swap_idle_enabled=0
> >>>
> >>
> >> This combination was tried and didn't seem to have any consistent
> >> effect. It's commented out at the moment.
> >
> > By not having them, we have no way to know if the
> > relevant kernel stacks had been moved to swap space.
> > Having them is part of problem isolation/identification
> > even when other forms of loss of control happen.
> >
> > The 2 lines serve more than one goal.
> >
> >>> (No claim such is the only way to lose control.)
> >>>
> >>> You might be able to get a clue if there was disk I/O going
> >>> on based on modification times on files you know would have
> >>> been modified periodically for some time (minutes) before
> >>> you pulled the plug --but not modified on reboot and later
> >>> activity. Maybe a log file that would only be modified by
> >>> the build that you had been trying to do?
> >>>
> >>
> >> There are log files for build and disk activity (for a cold
> >> hang, no disk activity at all) at
> >> http://nemesis.zefox.com/~bob/fbsd/rpi2/20230623/
> >
> > So this is a different hangup?
>
> j4swapscript.log has internal timestamp pairs:
>
> Wed Jun 21 16:34:06 PDT 2023
> . . .
> Fri Jun 23 07:26:10 PDT 2023
>
> It would be interesting to know if "Jun 23 07:26:10"
> was after the apparent hangup was identified vs.
> before.
>
> >> In this case the top window was via ssh. Lately I've
> >> taken to running top on the serial console in hopes
> >> that will help distinguish system hangs from USB hangs.
> >
> > If you want to identify system hangs, please
> > put back:
> >
> > vm.swap_enabled=0
> > vm.swap_idle_enabled=0
> >

They're reinstated now, but I don't want to disturb the system
while it seems to be building world acceptably.

Sorry for mixing things up!

bob prohaska
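
For reference, the two checks discussed above (are both sysctl.conf
lines active, and which files were written last before the plug was
pulled) could be scripted roughly as follows. This is only a sketch
under stated assumptions: the function names swap_settings_active and
newest_entries are made up for illustration, and the directory passed
to newest_entries is whatever location holds the build and swap logs
on the machine in question.

```shell
#!/bin/sh
# Sketch only: function names and default paths are illustrative assumptions,
# not part of FreeBSD or of the scripts mentioned in this thread.

# Return 0 only if both settings quoted in the thread are present,
# uncommented, and set to 0 in the given sysctl.conf-style file.
swap_settings_active() {
    conf=${1:-/etc/sysctl.conf}
    grep -Eq '^[[:space:]]*vm\.swap_enabled[[:space:]]*=[[:space:]]*0[[:space:]]*(#.*)?$' "$conf" &&
    grep -Eq '^[[:space:]]*vm\.swap_idle_enabled[[:space:]]*=[[:space:]]*0[[:space:]]*(#.*)?$' "$conf"
}

# Print the N most recently modified entries of a directory, newest first,
# so the last write before a hang can be compared with the time the hang
# was noticed.
newest_entries() {
    dir=$1
    count=${2:-5}
    ls -t "$dir" | head -n "$count"
}
```

Something like "swap_settings_active && echo both lines active" would
confirm the settings are in place, and "newest_entries /path/to/logs 3"
(path hypothetical) would show which log was touched last. A line that
is commented out is deliberately reported as not active.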