Message-ID: <8fc35459-c85e-417c-8b6c-de08cf9907a9@madpilot.net>
Date: Mon, 29 Jan 2024 17:02:36 +0100
Subject: Re: qemu-user-static aarch64 lockup/race? (was Re: Python failure
 in poudriere on arm64 (via qemu-user-static cross compiling))
To: Warner Losh
Cc: Nathan Reilly-list, emulation@freebsd.org, "freebsd-arm@freebsd.org",
 freebsd-pkg@freebsd.org
References: <6a33726b-eb6f-418e-9fbd-6d0b9b4bfaa8@madpilot.net>
 <0fc7f929-6e5b-4a33-97d2-8a9c0c07d524@madpilot.net>
 <79a5eb0f-d04e-4c1a-9d8a-185e1fb4e4a2@madpilot.net>
 <5ef2ab66-25ef-45f1-aa5a-4b614eab2f40@madpilot.net>
 <990427ae-0491-463e-92c7-c74700deb6fa@madpilot.net>
From: Guido Falsi <mad@madpilot.net>
List-Id: Porting FreeBSD to ARM processors
List-Archive: https://lists.freebsd.org/archives/freebsd-arm

On 29/01/24 16:53, Warner Losh wrote:
>
> On Mon, Jan 29, 2024, 8:48 AM Guido Falsi wrote:
>
>     On 29/01/24 09:26, Guido Falsi wrote:
> > On 29/01/24 02:10, Warner Losh wrote:
> >>
> >> On Sun, Jan 28, 2024 at 4:45 PM Nathan Reilly-list wrote:
> >>
> >>>     On 29 Jan 2024, at 8:43 am, Guido Falsi wrote:
> >>>     On 28/01/24 22:34, Guido Falsi wrote:
> >>>>     On 28/01/24 22:23, Warner Losh wrote:
> >>>>>     On Sun, Jan 28, 2024, 12:38 PM Guido Falsi wrote:
> >>>>>
> >>>>>         On 28/01/24 15:15, Guido Falsi wrote:
> >>>>>         [snip]
> >>>>>          > Creating repository in /tmp/packages:   0%
> >>>>>          >
> >>>>>
> >>>>>         BTW, forgot to mention last time this worked without
> >>>>>         issue was around 20th December.
> >>>>>
> >>>>>     I think this is a bsd-user issue. There is a race somewhere in
> >>>>>     that code that causes the hangs. I'd love a reproducible test
> >>>>>     case that is somewhat smaller than python... there are bigger
> >>>>>     races with the newer stuff and I've not had the time to chase it
> >>>>>     there either. 😞
> >>>>     First of all thanks for your feedback. It encourages me having
> >>>>     someone else with better knowledge about this confirm that a race
> >>>>     condition is actually a possible cause!
> >>>>     Strange this has not been happening up to mid December.
> >>>>     My main and fully reproducible use case is actually mostly with
> >>>>     pkg. At the end of the run poudriere runs `pkg repo` to create the
> >>>>     meta files and sign the repo. It forks itself (ncpus + 2 I guess,
> >>>>     even forcing it to 1 worker I see three processes), and then
> >>>>     locks up, with all the processes stopping using CPU (ps output is
> >>>>     in my message)
> >>>>     I guess this can be reproduced with any poudriere repo with at
> >>>>     least more than ncpus packages in it. It can also be reproduced
> >>>>     using `poudriere pkgclean -u `
> >>>>     If that does not work I'm not sure how to reproduce it in other
> >>>>     ways, but I can try writing some code mocking what pkg seems to
> >>>>     be doing, not an expert at such things, though.
> >>>
> >>>     In case it helps further narrow things down, it looks like the
> >>>     lockup is happening somewhere around here:
> >>>
> >>>     https://github.com/freebsd/pkg/blob/56fa3f87d9d9644348b89680dfd8af47a860ee82/libpkg/pkg_repo_create.c#L778
> >>>
> >>>     and/or in the pkg_create_repo_worker() function here:
> >>>
> >>>     https://github.com/freebsd/pkg/blob/56fa3f87d9d9644348b89680dfd8af47a860ee82/libpkg/pkg_repo_create.c#L341
> >>>
> >>>     (I'm trying to spare you the time needed to find the actual code
> >>>     being executed, I guess you would have identified this in a few
> >>>     minutes yourself, but I'm trying to make myself useful)
> >>
> >>     There appears to be a GitHub issue for poudriere with this, but
> >>     seems to be looking in another direction.
> >>
> >>     https://github.com/freebsd/poudriere/issues/1009
> >>
> >
> > This one looks quite similar.
> >
> > In my case the ports/pkg are aligned between host and jail, in fact I
> > have built them from the exact same git checkout.
> >
> > I noticed pkg head has been converted to using pthreads instead of fork,
> > maybe that could help. I will make time to perform some testing.
>
>     Thanks for pointing me here, it looks like this was "it", in that by
>     fixing this issue it uses native pkg-static, and sidesteps the issue.
>
>     Unluckily there ARE qemu races and lockups that prevent the arm64
>     pkg-static binary from being correctly emulated by qemu-user-static.
>     Such conditions also cause sporadic failures in some ports being built.
>
>     I filed a PR with a fix for that issue:
>
>     https://github.com/freebsd/poudriere/pull/1115
>
>
> Ok. This dodges the problem. But it papers over things.

Definitely, but this is actually also what was happening in the past. It
stopped using native (host) pkg-static due to the pkg port gaining a
PORTREVISION, which caused the "same version" check to fail.

I agree the underlying issue should be fixed.
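
As a possible starting point for a test case smaller than a full poudriere run, the fork-and-collect pattern described earlier in this thread (a parent forking a few workers and then waiting on them) could be mocked roughly as below. This is only a sketch under assumptions: it is not pkg's actual code, it uses plain pipes and poll(2) as a stand-in for whatever pkg uses internally, and the names run_workers, worker, MAX_WORKERS and the two-int record layout are made up for illustration. There is no guarantee it triggers the same hang under qemu-user-static.

```c
#include <errno.h>
#include <poll.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_WORKERS 16	/* hypothetical upper bound for this sketch */

/* Child side: write `nitems` small records to the parent, then exit. */
static void
worker(int fd, int id, int nitems)
{
	for (int i = 0; i < nitems; i++) {
		int rec[2] = { id, i };

		/* Records are far below PIPE_BUF, so each write is atomic. */
		if (write(fd, rec, sizeof(rec)) != (ssize_t)sizeof(rec))
			_exit(1);
	}
	close(fd);
	_exit(0);
}

/*
 * Fork `nworkers` children, collect their records with poll(2) and reap
 * them. Returns the number of records received, or -1 on any failure.
 * Error paths do not clean up children; this is a sketch, not a tool.
 */
int
run_workers(int nworkers, int nitems)
{
	struct pollfd pfd[MAX_WORKERS];
	pid_t pids[MAX_WORKERS];
	int open_fds = 0, received = 0, failed = 0;

	if (nworkers < 1 || nworkers > MAX_WORKERS)
		return (-1);

	for (int w = 0; w < nworkers; w++) {
		int p[2];

		if (pipe(p) == -1)
			return (-1);
		pids[w] = fork();
		if (pids[w] == -1)
			return (-1);
		if (pids[w] == 0) {
			close(p[0]);
			worker(p[1], w, nitems);	/* never returns */
		}
		close(p[1]);	/* parent keeps only the read end */
		pfd[w].fd = p[0];
		pfd[w].events = POLLIN;
		open_fds++;
	}

	/* A hang in emulation would show up as this loop never finishing. */
	while (open_fds > 0) {
		if (poll(pfd, (nfds_t)nworkers, -1) == -1) {
			if (errno == EINTR)
				continue;
			return (-1);
		}
		for (int w = 0; w < nworkers; w++) {
			int rec[2];
			ssize_t n;

			if (pfd[w].fd == -1 || pfd[w].revents == 0)
				continue;
			n = read(pfd[w].fd, rec, sizeof(rec));
			if (n == (ssize_t)sizeof(rec)) {
				received++;
				continue;
			}
			if (n != 0)	/* 0 is a clean EOF from the worker */
				failed = 1;
			close(pfd[w].fd);
			pfd[w].fd = -1;	/* poll(2) ignores negative fds */
			open_fds--;
		}
	}

	for (int w = 0; w < nworkers; w++) {
		int status;

		if (waitpid(pids[w], &status, 0) == -1 ||
		    !WIFEXITED(status) || WEXITSTATUS(status) != 0)
			failed = 1;
	}
	return (failed ? -1 : received);
}
```

Calling run_workers(3, 100) from a small main(), building it statically for the target architecture and running it in a loop inside the emulated jail would be one way to check whether the pattern alone is enough to hit the race; a run that never returns, with the processes idle, would match the symptom described above.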
>
> Any chance you could give me the state of pkg before + the package added
> as a test case for qemu?

Not sure I understand what you are asking for, can you elaborate?

What I did was run poudriere, asking it to compile a few packages. The
lockup, when trying to use the target arch pkg-static via qemu-user, is
100% reproducible in my experience.

It does not really depend on the number of packages. I get it by starting
with an empty build. I'm building these packages (and obviously their
dependencies):

dns/unbound
net/kea
sysutils/tmux

(I guess building only tmux could suffice)

With poudriere you can get it to use the target arch pkg-static by
modifying the ensure_pkg_installed function in
/usr/local/share/poudriere/common.sh, making sure the check here fails:

https://github.com/freebsd/poudriere/blob/e00503d846dc7a3b661aac84a6657f15e0f4b702/src/share/poudriere/common.sh#L5687

-- 
Guido Falsi