main aarch64: poudriere-devel [UFS context] cpdup stuck in pgnslp state
Date: Fri, 22 Mar 2024 00:34:30 UTC
Note, more recent process creations towards top, older ones towards bottom:

  PID  JID USERNAME  PRI NICE    SIZE    RES STATE   C  TIME    CPU COMMAND
. . .
33693   19 root       68    0  6524Ki 3252Ki wait    3  0:00  0.00% /usr/bin/make -C /usr/ports/lang/gcc13 build
33692    0 root       68    0 15728Ki 3552Ki wait    0  0:00  0.00% sh: poudriere[main-CA7-default][02]: build_pkg (gcc13-13.2.0_4) (sh)
30174    0 root       68    0 15728Ki 3564Ki select  3  0:00  0.00% sh: poudriere[main-CA7-default][02]: build_pkg (gcc13-13.2.0_4) (sh)
26338    0 root       66    0 17740Ki 5044Ki pgnslp  0  0:01  0.00% cpdup -i0 -s0 -f -x ref 01
26308    0 root       68    0 15728Ki 3556Ki wait    0  0:00  0.00% sh: poudriere[main-CA7-default][01]: build_pkg (boost-libs-1.84.0) (sh)
33592    0 root       26    0 15728Ki 3388Ki piperd  2  0:01  0.00% sh: poudriere[main-CA7-default]: pkg_cacher_main (sh)
29205    0 root       68    0 15728Ki 3392Ki nanslp  2  1:52  0.14% sh: poudriere[main-CA7-default]: html_json_main (sh)
28834    0 root       20    0 15728Ki 3548Ki select  3  0:01  0.00% /usr/local/libexec/poudriere/sh -e /usr/local/share/poudriere/bulk.sh -jmain-CA7 -c -f /root/origins/CA7-origins.txt
28833    0 root       20    0 13560Ki 1924Ki wait    3  0:00  0.00% /bin/sh /root/build-ports-main-CA7.sh -c
. . .

pgnslp seems to be from:

vm_page_acquire_unlocked in sys/vm/vm_page.c .

That in turn looks to be using vm_page_grab_sleep :

        if (!vm_page_grab_sleep(object, m, pindex, "pgnslp",
            allocflags, false))
                return (false);

and:

/*
 * vm_page_grab_sleep
 *
 * Sleep for busy according to VM_ALLOC_ parameters.  Returns true
 * if the caller should retry and false otherwise.
 *
 * If the object is locked on entry the object will be unlocked with
 * false returns and still locked but possibly having been dropped
 * with true returns.
 */
static bool
vm_page_grab_sleep(vm_object_t object, vm_page_t m, vm_pindex_t pindex,
    const char *wmesg, int allocflags, bool locked)
{

        if ((allocflags & VM_ALLOC_NOWAIT) != 0)
                return (false);

        /*
         * Reference the page before unlocking and sleeping so that
         * the page daemon is less likely to reclaim it.
         */
        if (locked && (allocflags & VM_ALLOC_NOCREAT) == 0)
                vm_page_reference(m);

        if (_vm_page_busy_sleep(object, m, pindex, wmesg, allocflags,
            locked) && locked)
                VM_OBJECT_WLOCK(object);
        if ((allocflags & VM_ALLOC_WAITFAIL) != 0)
                return (false);

        return (true);
}

. . .
[10:08:06] [01] [00:00:00] Building devel/boost-libs | boost-libs-1.84.0
. . .

# poudriere status -b
[main-CA7-default] [2024-03-21_06h23m31s] [parallel_build] Queued: 265 Built: 213 Failed: 0 Skipped: 0 Ignored: 0 Fetched: 0 Tobuild: 52 Time: 10:50:40
 ID  TOTAL    ORIGIN           | PKGNAME           PHASE    TIME     TMPFS      CPU% MEM%
[01] 00:42:40 devel/boost-libs | boost-libs-1.84.0 starting 00:42:40 951.54 MiB
. . .

Unfortunately:

A) The booted kernel is my personal build based on
   -mcpu=cortex-a76 and LSE_ATOMICS . (It is in use on a
   RPi5 booted via EDK2.)

B) The booted world is a PkgBase world.

C) The poudriere jail's world directory tree is my personal
   armv7 world build based on -mcpu=cortex-a7 .

All are based on: main-n268827-75464941dc17 . (Well, PkgBase
commit identification/verification for world does not exist.
I happened to update PkgBase during a long lull for commits
to main. In the context, the boot-world seems unlikely to be
involved here.)

The boot media is a U2 Optane 960 GB used via a USB3 adaptor.

I've done bunches of builds in the (A)-(C) context on the
RPi5 and have not seen this before, so: does not look to be
readily repeatable.

(Unfortunately, the purpose of the build was to find out how
long the particular build configuration took to finish building
the 265 packages from scratch, for comparison to other builds.)

I may wait for the system to become fairly idle and then see
about forcing a crash dump. It may be a while before the
poudriere bulk runs out of packages it can build, absent
building boost-libs .

Side note:

As far as I can tell, how to identify a context that allows
identification of what commit vintage a PkgBase world is
based on is unspecified so far.
For a PkgBase kernel uname -apKU may well report the
kernel-commit identification well. (Hard to verify.)

===
Mark Millard
marklmi at yahoo.com