For snapshot builds: armv7 chroot on aarch64 has kyua test -k /usr/tests/Kyuafile sys/kern/kern_copyin hung up [in getpid?], unkillable, prevents reboot
Date: Mon, 26 Jun 2023 00:16:09 UTC
Using the likes of:

FreeBSD-14.0-CURRENT-arm64-aarch64-ROCK64-20230622-b95d2237af40-263748.img

and:

FreeBSD-14.0-CURRENT-arm-armv7-GENERICSD-20230622-b95d2237af40-263748.img

I have shown the following behavior after setting up storage media
based on them. (This was a test that my builds were not odd for the
issue.)

Boot the aarch64 media and log in. (Note: I logged in as root.)
Mount the armv7 media (-noatime is just my habit) and then put it
to use:

# mount -onoatime /dev/da1s2a /mnt
# chroot /mnt/
# kyua test -k /usr/tests/Kyuafile sys/kern/kern_copyin
sys/kern/kern_copyin:kern_copyin ->

On the serial console:

# ps -xu
USER  PID   %CPU %MEM   VSZ  RSS TT STAT STARTED      TIME COMMAND
root   11 1498.4  0.0     0  256 -  RNL  23:24   542:52.92 [idle]
root 1174  100.0  0.0     0   16 -  Rs   23:37     0:00.00 /usr/tests/sys/kern/kern_copyin -vunprivileged-user=tests -r/tmp/kyua.9YUttj/2/result.atf kern_copyin
root    0    0.0  0.0     0 1616 -  DLs  23:24     0:00.50 [kernel]
root    1    0.0  0.0 11704 1288 -  ILs  23:24     0:00.02 /sbin/init
root    2    0.0  0.0     0  256 -  WL   23:24     0:00.26 [clock]
root    3    0.0  0.0     0  272 -  DL   23:24     0:00.00 [crypto]
root    4    0.0  0.0     0   80 -  DL   23:24     0:00.95 [cam]
root    5    0.0  0.0     0   16 -  DL   23:24     0:00.00 [busdma]
root    6    0.0  0.0     0   16 -  DL   23:24     0:00.03 [rand_harvestq]
root    7    0.0  0.0     0   48 -  DL   23:24     0:00.06 [pagedaemon]
root    8    0.0  0.0     0   16 -  DL   23:24     0:00.00 [vmdaemon]
root    9    0.0  0.0     0  160 -  DL   23:24     0:00.38 [bufdaemon]
root   10    0.0  0.0     0   16 -  DL   23:24     0:00.00 [audit]
root   12    0.0  0.0     0  880 -  WL   23:24     0:11.81 [intr]
root   13    0.0  0.0     0   48 -  DL   23:24     0:00.04 [geom]
root   14    0.0  0.0     0   16 -  DL   23:24     0:00.00 [sequencer 00]
root   15    0.0  0.0     0  160 -  DL   23:24     0:06.42 [usb]
root   16    0.0  0.0     0   16 -  DL   23:24     0:00.10 [acpi_thermal]
root   17    0.0  0.0     0   16 -  DL   23:24     0:00.00 [acpi_cooling0]
root   18    0.0  0.0     0   16 -  DL   23:24     0:00.04 [syncer]
root   19    0.0  0.0     0   16 -  DL   23:24     0:00.00 [vnlru]
root  671    0.0  0.0 13260 2600 -  Is   23:25     0:00.00 dhclient: system.syslog (dhclient)
root  674    0.0  0.0 13260 2752 -  Is   23:25     0:00.00 dhclient: dpni0 [priv] (dhclient)
root  761    0.0  0.0 14572 3972 -  Ss   23:25     0:00.02 /sbin/devd
root  964    0.0  0.0 12832 2764 -  Is   23:25     0:00.02 /usr/sbin/syslogd -s
root 1033    0.0  0.0 13012 2604 -  Ss   23:25     0:00.01 /usr/sbin/cron -s
root 1058    0.0  0.0 21052 8308 -  Is   23:25     0:00.01 sshd: /usr/sbin/sshd [listener] 0 of 10-100 startups (sshd)
root 1078    0.0  0.0 21288 9304 -  Is   23:26     0:00.09 sshd: root@pts/0 (sshd)
root 1175    0.0  0.0 21288 9496 -  Is   23:37     0:00.04 sshd: root@pts/1 (sshd)
root 1074    0.0  0.0 13380 3008 u0 Is   23:25     0:00.01 login [pam] (login)
root 1075    0.0  0.0 13460 3292 u0 S    23:25     0:00.02 -sh (sh)
root 1233    0.0  0.0 13588 3016 u0 R+   00:00     0:00.00 ps -xu
root 1081    0.0  0.0 13460 3328 0  Is   23:26     0:00.02 -sh (sh)
root 1170    0.0  0.0  5788 2884 0  I    23:36     0:00.02 /bin/sh -i
root 1172    0.0  0.0 10408 7192 0  I+   23:37     0:00.30 kyua test -k /usr/tests/Kyuafile sys/kern/kern_copyin
root 1178    0.0  0.0 13460 3320 1  Is+  23:38     0:00.01 -sh (sh)

1174 is stuck, even if one waits for 30min+. kill and kill -9
will not kill 1174. "shutdown -r now" hangs before the reboot
happens and reports: "some processes would not die".

An interesting property is that ps and top disagree about 1174's
CPU usage: ps 100%, top 0%. But top also indicates 1174 always
has CPU0 as its "STATE". (Across tests CPUn varies but within a
test it has a fixed n.) I have also seen ps "STAT" being RXs.

The following is from my earlier activity with my own builds
involved, here 1119, not the 1174 from above. The last thing
truss reports for the stuck process is "getpid()".

. . .
1119: 0.588983953 fstatat(AT_FDCWD,"/usr/tests/sys/kern/kern_copyin",{ mode=-r-xr-xr-x ,inode=111756,size=9776,blksize=10240 },AT_SYMLINK_NOFOLLOW) = 0 (0x0)
1119: 0.589065030 mmap(0x0,20480,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON|MAP_ALIGNED(12),-1,0x0) = 1074188288 (0x4006d000)
1119: 0.589227544 openat(AT_FDCWD,"/tmp/kyua.aBQv6E/2/result.atf",O_WRONLY|O_CREAT|O_TRUNC,0644) = 3 (0x3)
1119: 0.589276503 getpid() = 1119 (0x45f)

For reference, from inside an armv7 chroot session before doing
such a test:

# uname -apKU
FreeBSD generic 14.0-CURRENT FreeBSD 14.0-CURRENT #0 main-n263748-b95d2237af40: Thu Jun 22 11:10:50 UTC 2023 root@releng1.nyi.freebsd.org:/usr/obj/usr/src/arm64.aarch64/sys/GENERIC arm armv7 1400090 1400090

===
Mark Millard
marklmi at yahoo.com
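
For anyone trying to reproduce this, the "kill -9 will not kill" observation above can be checked in a scripted way. What follows is only a minimal sketch, assuming a POSIX /bin/sh; the helper name still_alive_after_kill and the wait time are hypothetical, chosen for illustration, and are not part of FreeBSD or kyua.

```shell
#!/bin/sh
# Sketch: send SIGKILL to a PID, then poll to see whether it is still present.
# still_alive_after_kill is a hypothetical helper name used for illustration.
# Returns 0 if the PID still exists after the wait, 1 if it went away.
still_alive_after_kill() {
    pid=$1
    wait_secs=${2:-5}          # assumed default wait; adjust as needed

    kill -9 "$pid" 2>/dev/null

    i=0
    while [ "$i" -lt "$wait_secs" ]; do
        # kill -0 delivers no signal; it only tests that the PID exists.
        # Note that an unreaped zombie still counts as existing here.
        kill -0 "$pid" 2>/dev/null || return 1
        sleep 1
        i=$((i + 1))
    done

    # Final check decides the return status.
    kill -0 "$pid" 2>/dev/null
}
```

Usage would look something like the following, with the PID taken from ps as in the listing above:

    # still_alive_after_kill 1174 30 && echo "1174 survived SIGKILL"
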