qemu-user-static aarch64 lockup/race? (was Re: Python failure in poudriere on arm64 (via qemu-user-static cross compiling))
- Reply: Guido Falsi : "Re: qemu-user-static aarch64 lockup/race? (was Re: Python failure in poudriere on arm64 (via qemu-user-static cross compiling))"
- In reply to: Guido Falsi : "Python failure in poudriere on arm64 (via qemu-user-static cross compiling)"
Date: Sun, 28 Jan 2024 14:15:56 UTC
Hi all, again,

I have some more findings about this. I'm top posting because the old message is not really that relevant anymore.

I'm now running a machine with head (commit b32d49cfbaa0437d08e65e7cd7c82c5951b1a852, Jan 25th) with poudriere installed on it. The machine is amd64, with an arm64 jail running 14.0-RELEASE, installed from the official distribution binaries (https download method), with cross tools.

To make sure everything is aligned I rebuilt everything: updated head, rebuilt the cross tools in the jail, recompiled all ports for the host architecture and force reinstalled them (especially qemu-user-static), and cleaned up all packages for the arm64 jail. If I missed something important please point it out.

I have run some more tests and I'm getting python failures in poudriere like the one described below from time to time (I don't have hard stats, but it feels like a 50% chance).

If I get past that, it is usually able to build all of the (not many) packages, but then locks up at:

Creating repository in /tmp/packages: 0%

with nCPUs processes like this:

> ps -ax | grep -i pkg
91287  1  I+J  0:00.02 /usr/local/bin/qemu-aarch64-static /.p/pkg-static repo -o /tmp/packages /packages /tmp/repo.key
91288  1  I+J  0:00.02 /usr/local/bin/qemu-aarch64-static /.p/pkg-static repo -o /tmp/packages /packages /tmp/repo.key
91289  1  I+J  0:00.00 /usr/local/bin/qemu-aarch64-static /.p/pkg-static repo -o /tmp/packages /packages /tmp/repo.key
91290  1  I+J  0:00.00 /usr/local/bin/qemu-aarch64-static /.p/pkg-static repo -o /tmp/packages /packages /tmp/repo.key
91291  1  I+J  0:00.00 /usr/local/bin/qemu-aarch64-static /.p/pkg-static repo -o /tmp/packages /packages /tmp/repo.key
91292  1  I+J  0:00.00 /usr/local/bin/qemu-aarch64-static /.p/pkg-static repo -o /tmp/packages /packages /tmp/repo.key
91293  1  I+J  0:00.00 /usr/local/bin/qemu-aarch64-static /.p/pkg-static repo -o /tmp/packages /packages /tmp/repo.key
91294  1  I+J  0:00.00 /usr/local/bin/qemu-aarch64-static /.p/pkg-static repo -o /tmp/packages /packages /tmp/repo.key
91295  1  I+J  0:00.00 /usr/local/bin/qemu-aarch64-static /.p/pkg-static repo -o /tmp/packages /packages /tmp/repo.key
91296  1  I+J  0:00.00 /usr/local/bin/qemu-aarch64-static /.p/pkg-static repo -o /tmp/packages /packages /tmp/repo.key
91297  1  I+J  0:00.00 /usr/local/bin/qemu-aarch64-static /.p/pkg-static repo -o /tmp/packages /packages /tmp/repo.key
91298  1  I+J  0:00.00 /usr/local/bin/qemu-aarch64-static /.p/pkg-static repo -o /tmp/packages /packages /tmp/repo.key
91299  1  I+J  0:00.00 /usr/local/bin/qemu-aarch64-static /.p/pkg-static repo -o /tmp/packages /packages /tmp/repo.key
91300  1  I+J  0:00.00 /usr/local/bin/qemu-aarch64-static /.p/pkg-static repo -o /tmp/packages /packages /tmp/repo.key

And this has hit me 100% of the time up to now.

It looks like pkg is spawning ncpu processes; I'm looking at reducing them, just in case this can sidestep the race/lockup.

My suspicion is that there is some race in qemu-user-static or in the APIs it is using, which is triggered by pkg-repo.

How can I investigate this? I'm able to reproduce it 100% of the time.

BTW these are the pkgs I'm building at present:

dns/unbound
net-mgmt/vmutils
net/kea
sysutils/htop
sysutils/node_exporter
sysutils/tmux

(vmutils and node_exporter are go packages and are being skipped since go fails, but I keep them in the list since I can grab binaries from the official repos; htop I'm going to drop in the near future)

Thanks in advance, any help appreciated, especially any suggestions on where to look and what to investigate to understand whether this is a local problem or some issue with base/qemu.

On 24/01/24 22:10, Guido Falsi wrote:
> Hi,
>
> I recently see a strange failure with python 3.9 in poudriere, it was
> not happening a few weeks ago.
>
> I'm building in poudriere on a head machine running amd64, with a
> poudriere jail for arm64, via qemu-user-static. The jail is running 14.0.
>
> I'm not sure what is going on.
>
> It fails in the packaging phase with a bunch of errors like:
>
> ===========================================================================
> =======================<phase: package >============================
> ===== env: 'PKG_NOTES=build_timestamp ports_top_git_hash
> ports_top_checkout_unclean port_git_hash port_checkout_unclean built_by'
> 'PKG_NOTE_build_timestamp=2024-01-24T17:07:52+0000'
> 'PKG_NOTE_ports_top_git_hash=0816fdcb6ce8'
> 'PKG_NOTE_ports_top_checkout_unclean=no'
> 'PKG_NOTE_port_git_hash=0816fdcb6ce8'
> 'PKG_NOTE_port_checkout_unclean=no'
> 'PKG_NOTE_built_by=poudriere-git-3.4.1' NO_DEPENDS=yes USER=root UID=0
> GID=0
> ===> Building packages for python39-3.9.18
> ===> Building python39-3.9.18
> pkg-static: Unable to access file
> /wrkdirs/usr/ports/lang/python39/work/stage/usr/local/lib/python3.9/__pycache__/imaplib.cpython-39.opt-2.pyc:No such file or directory
> pkg-static: Unable to access file
> /wrkdirs/usr/ports/lang/python39/work/stage/usr/local/lib/python3.9/__pycache__/imghdr.cpython-39.opt-2.pyc:No such file or directory
> pkg-static: Unable to access file
> /wrkdirs/usr/ports/lang/python39/work/stage/usr/local/lib/python3.9/__pycache__/imp.cpython-39.opt-2.pyc:No such file or directory
> pkg-static: Unable to access file
> /wrkdirs/usr/ports/lang/python39/work/stage/usr/local/lib/python3.9/__pycache__/inspect.cpython-39.opt-2.pyc:No such file or directory
> pkg-static: Unable to access file
> /wrkdirs/usr/ports/lang/python39/work/stage/usr/local/lib/python3.9/__pycache__/io.cpython-39.opt-2.pyc:No such file or directory
> pkg-static: Unable to access file
> /wrkdirs/usr/ports/lang/python39/work/stage/usr/local/lib/python3.9/__pycache__/ipaddress.cpython-39.opt-2.pyc:No such file or directory
> pkg-static: Unable to access file
> /wrkdirs/usr/ports/lang/python39/work/stage/usr/local/lib/python3.9/__pycache__/mailbox.cpython-39.opt-2.pyc:No such file or directory
> pkg-static: Unable to access file
> /wrkdirs/usr/ports/lang/python39/work/stage/usr/local/lib/python3.9/__pycache__/mailcap.cpython-39.opt-2.pyc:No such file or directory
> pkg-static: Unable to access file
> /wrkdirs/usr/ports/lang/python39/work/stage/usr/local/lib/python3.9/__pycache__/mimetypes.cpython-39.opt-2.pyc:No such file or directory
>
> (it's all about 'opt-2.pyc' files)
>
> What could have changed? Maybe I'm doing something wrong? Maybe I'm
> hitting some qemu-user-static issue on head?
>
> Any help appreciated.
>
> (full log available if needed)
>

--
Guido Falsi <mad@madpilot.net>