Re: latest current fails to boot.
- In reply to: Tomoaki AOKI : "Re: latest current fails to boot."
Date: Sat, 25 Sep 2021 02:20:14 UTC
On Sat, Sep 25, 2021 at 11:00:50AM +0900, Tomoaki AOKI wrote:
> On Fri, 24 Sep 2021 01:33:33 +0300
> Konstantin Belousov <kostikbel@gmail.com> wrote:
>
> > On Thu, Sep 23, 2021 at 09:20:51PM +0200, Johan Hendriks wrote:
> > >
> > > On 23/09/2021 19:52, Konstantin Belousov wrote:
> > > > On Fri, Sep 24, 2021 at 12:43:01AM +0900, Tomoaki AOKI wrote:
> > > > > On Wed, 22 Sep 2021 23:09:05 +0900
> > > > > Tomoaki AOKI <junchoon@dec.sakura.ne.jp> wrote:
> > > > >
> > > > > > On Wed, 22 Sep 2021 05:47:46 -0700
> > > > > > David Wolfskill <david@catwhisker.org> wrote:
> > > > > >
> > > > > > > On Wed, Sep 22, 2021 at 02:39:37PM +0200, Johan Hendriks wrote:
> > > > > > > > I did a git pull this morning and it fails to boot.
> > > > > > > > I hangs at Setting hostid : 0x917bf354
> > > > > > > >
> > > > > > > > This is a vm running on vmware.
> > > > > > > > If i boot the old kernel from yesterday it boots normally.
> > > > > > > >
> > > > > > > > uname -a
> > > > > > > > FreeBSD varnish-cdn-node03 14.0-CURRENT FreeBSD 14.0-CURRENT #0
> > > > > > > > main-n249518-5572fda3a2f: Tue Sep 21 14:40:22 CEST 2021
> > > > > > > > root@varnish-cdn-node03:/usr/obj/usr/src/amd64.amd64/sys/KRNL amd64
> > > > > > > > ....
> > > > > > > I had no issues with my build machine or either of two laptops, either
> > > > > > > from yesterday:
> > > > > > >
> > > > > > > FreeBSD g1-55.catwhisker.org 14.0-CURRENT FreeBSD 14.0-CURRENT #358 main-n249518-5572fda3a2f3: Tue Sep 21 05:15:22 PDT 2021 root@g1-55.catwhisker.org:/common/S4/obj/usr/src/amd64.amd64/sys/CANARY amd64 1400033 1400033
> > > > > > >
> > > > > > > or today:
> > > > > > >
> > > > > > > FreeBSD g1-55.catwhisker.org 14.0-CURRENT FreeBSD 14.0-CURRENT #359 main-n249556-c96da1994587: Wed Sep 22 04:24:17 PDT 2021 root@g1-55.catwhisker.org:/common/S4/obj/usr/src/amd64.amd64/sys/CANARY amd64 1400033 1400033
> > > > > > >
> > > > > > > [uname strings from my main laptop shown, but I keep the machines
> > > > > > > in sync rather aggressively.]
> > > > > > >
> > > > > > > Perhaps the issue you are encountering involves things not in my
> > > > > > > environment (such as VMs or ZFS)?
> > > > > > >
> > > > > > > Peace,
> > > > > > > david
> > > > > > > --
> > > > > > > David H. Wolfskill david@catwhisker.org
> > > > > > > Life is not intended to be a zero-sum game.
> > > > > > >
> > > > > > > See https://www.catwhisker.org/~david/publickey.gpg for my public key.
> > > > > > For me, on bare metal (non-vm) amd64 with root-on-ZFS,
> > > > > >
> > > > > > Fails to boot to multiuser at git: 8db1669959ce
> > > > > > Boot fine at git: 0b79a76f8487
> > > > > >
> > > > > > Boot to singleuser is fine even with failed revision.
> > > > > >
> > > > > > Failure mode:
> > > > > > Hard hangup or spinning and non-operable. Hard power-off needed.
> > > > > > Seems to happen after starting rc.conf processing and before setting
> > > > > > hostid.
> > > > > >
> > > > > > --
> > > > > > Tomoaki AOKI <junchoon@dec.sakura.ne.jp>
> > > > > >
> > > > > Additional info and correction.
> > > > > *Hung up before setting hostuuid, not hostid.
> > > > >
> > > > > *^T doesn't respond at all, only hard power off worked.
> > > > >
> > > > > *`kldload nvidia-modeset.ko` on single user mode sanely work.
> > > > >
> > > > >
> > > > > Why I could know rc.conf is started to be processed:
> > > > >
> > > > > I have lines below at the end of /etc/rc.conf and its output is always
> > > > > the first line related to /etc/rc.conf, at least for non-verbose boot.
> > > > > The next line is normally "Setting hostuuid: " line, which was not
> > > > > displayed when boot hung up.
> > > > >
> > > > >
> > > > > kldstat -q -n nvidia.ko
> > > > > if [ 0 -ne $? ] ; then
> > > > >   echo "Loading nvidia-driver modules via rc.conf."
> > > > >   if [ -e /boot/modules/nvidia-modeset.ko ] ; then
> > > > >     kld_list="${kld_list} nvidia-modeset.ko"
> > > > >   else
> > > > >     kld_list="${kld_list} nvidia.ko"
> > > > >   fi
> > > > > fi
> > > > If you do not load nvidia-modeset.ko at all, does the boot proceed?
> > > >
> > > > When the boot hangs, can you enter into ddb?
> > > >
> > > >
> > > I do not load a nvidia-modeset.ko kernel module and it will not boot. It
> > > hangs with Setting hostid : as the last message. Then only a powercycle gets
> > > me back. If i boot in single user mode all is fine, but as soon as i exit
> > > single user mode it hangs at the same spot.
> >
> > Can you enter ddb at the hang point?
>
> It depends. In most cases, nothing other than power cycle works, but I
> could get into ddb by ctrl-alt-esc only once. `bt` was like below.
> Converted from photo using Google Lens, and hand-fixed mis-conversion
> as much as possible, but there can be remaining mis-conversion.
>
>
> ===== `bt` output =====
>
> KDB: enter: manual escape to debugger
> [ thread pid 12 tid 100041 ]
> Stopped at kdb_enter+0x37: movq $0,0x103aa1e(%rip)
> db> bt
> Tracing pid 12 tid 100041 td 0xfffffe00e32c0000
> kdb_enter() at kdb_enter+0x37/frame 0xfffffe00e2e80d40
> vt_kbdevent() at vt_kbdevent+0x22f/frame 0xfffffe00e2e80da0
> kbdmux_intr() at kbdmux_intr+0x45/frame 0xfffffe00e2e80dc0
> taskqueue_run_locked() at taskqueue_run_locked+0x197/frame 0xfffffe00e2e80e40
> taskqueue_run() at taskqueue_run+0x68/frame 0xfffffe00e2e80e60
> ithread_loop() at ithread_loop+0x25f/frame 0xfffffe00e2e80ef0
> fork_exit() at fork_exit+0x8e/frame 0xfffffe00e2e80f30
> fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00e2e80f30
> --- trap 0, rip = 0x301700000000000, rsp = 0, rbp= 0xffffffff81d047d0---
> ??() at 0x301700000000000/frame 0xffffffff81d047d0
> ??() at 0xfffff80001a59b80/frame 0xfffff80001a59c00
> taskqueue_swi_run() at taskqueue_swi_run
>
> ===== End `bt` output =====

This is software interrupt thread processing task for keyboard, nothing
unexpected or unusual.

Next time please do 'ps' at least. You might note some 'interesting'
processes in the ps output right away, in which case please do bt for
them as well. I think it would be fine to post images somewhere.

> >
> > Do you load any other modules besides nvidia, from rc.conf?
>
> Yes, but doesn't seem to be loaded when hung up.
>
> ddb says...
>
>
> ===== `kldstat` output by ddb =====
>
> db> kldstat
> Id Refs Address                Size Name
>  1   56 0xffffffff80200000  1f31a70 kernel
>  2    1 0xffffffff82132000     2b88 acpi_call.ko
>  3    1 0xffffffff82135000   290940 iwm9000fw.ko
>  4    1 0xffffffff823c6000     8248 acpi_ibm.ko
>  5    1 0xffffffff823cf000     6260 filemon.ko
>  6    1 0xffffffff823d7000     2cf8 udf_iconv.ko
>  7    4 0xffffffff823da000     9388 libiconv.ko
>  8    2 0xffffffff823e4000     9f48 udf.ko
>  9    1 0xffffffff823ee000    26228 if_iwm.ko
> 10    1 0xffffffff82415000     2d58 msdosfs_iconv.ko
> 11    1 0xffffffff82418000     2d40 cd9660_iconv.ko
> 12    1 0xffffffff828c7000     63e0 usbhid.ko
> 13    2 0xffffffff828ce000     6db0 hidbus.ko
> 14    1 0xffffffff828d5000   5b4ec8 zfs.ko
> 15    1 0xffffffff82e8a000     2e48 nvram.ko
> 16    1 0xffffffff82e8d000     3ca0 smb.ko
> 17    2 0xffffffff82e91000     3d50 smbus.ko
> 18    1 0xffffffff82e95000     4358 cpuctl.ko
> db>
>
> ===== End `kldstat` output by ddb =====
>
> All are loaded by /boot/loader.conf.
> fdescfs.ko seems to missing compared to `kldstat` from single user
> shell, but it could be loaded on remounting (for rw access) fs.
>
> Modules loaded via rc.conf on sane boot is as follows.
> It should include modules automatically loaded by devd.
> Address should be different, as revision is different.
> Manually kldload'ing No.20 through 28 (including auto-loaded as
> dependency: 21,25 and 27) didn't cause hang up on single user sh of
> affected revision.
>
> ===== Additional modules =====
>
> 20    1 0xffffffff83614000    174c8 smbfs.ko
> 21    2 0xffffffff8362c000     3090 libmchain.ko
> 22    1 0xffffffff83630000     5658 tpm.ko
> 23    1 0xffffffff83636000    11ea0 fusefs.ko
> 24    1 0xffffffff83648000   106310 nvidia-modeset.ko
> 25    1 0xffffffff83800000  1fa1a48 nvidia.ko
> 26    2 0xffffffff8374f000    2cce0 linux.ko
> 27    6 0xffffffff8377c000     9ea8 linux_common.ko
> 28    1 0xffffffff83786000     4350 acpi_video.ko
> 29    1 0xffffffff8378b000     3378 acpi_wmi.ko
> 30    2 0xffffffff8378f000     21d8 hconf.ko
> 31    1 0xffffffff83792000     21e8 hcons.ko
> 32    3 0xffffffff83795000     30a8 hidmap.ko
> 33    1 0xffffffff83799000     21e8 hms.ko
> 34    1 0xffffffff8379c000     32c0 hmt.ko
> 35    1 0xffffffff837a0000     21e8 hpen.ko
> 36    1 0xffffffff837a3000     3250 ichsmb.ko
> 37    1 0xffffffff837a7000     6c9c ig4.ko
> 38    1 0xffffffff837ae000     433c iicbus.ko
> 39    1 0xffffffff837b3000     2110 pchtherm.ko
> 40    1 0xffffffff837b6000    28f40 linux64.ko
> 41    1 0xffffffff837df000     2260 pty.ko
> 42    1 0xffffffff837e2000     639c linprocfs.ko
> 43    1 0xffffffff837e9000     3284 linsysfs.ko
> 44    1 0xffffffff837ed000     4c20 ng_ubt.ko
> 45    6 0xffffffff837f2000     aac8 netgraph.ko
> 46    2 0xffffffff857a2000     9238 ng_hci.ko
> 47    3 0xffffffff837fd000     25a8 ng_bluetooth.ko
> 48    1 0xffffffff857ac000     d250 ng_l2cap.ko
> 49    1 0xffffffff857ba000    1bef8 ng_btsocket.ko
> 50    1 0xffffffff857d6000     39d0 ng_socket.ko
> 51    1 0xffffffff857da000    27040 ipfw.ko
>
> ===== End additional modules =====
>
>
> --
> Tomoaki AOKI <junchoon@dec.sakura.ne.jp>