From: Mark Millard
Date: Thu, 24 Aug 2023 13:07:22 -0700
Subject: Re: poudriere bulk with ZFS and USE_TMPFS=no on main [14-ALPHA2 based]: extensive vlruwk for cpdup's on new builders after pkg builds in first builder
To: Mateusz Guzik
Cc: Current FreeBSD
List-Id: Discussions about the use of FreeBSD-current
List-Archive: https://lists.freebsd.org/archives/freebsd-current

On Aug 24, 2023, at 00:22, Mark Millard wrote:

> On Aug 23, 2023, at 22:54, Mateusz Guzik wrote:
>
>> On 8/24/23, Mark Millard wrote:
>>> On Aug 23, 2023, at 15:10, Mateusz Guzik wrote:
>>>
>>>> On 8/23/23, Mark Millard wrote:
>>>>> [Forked off the ZFS deadlock 14 discussion, per feedback.]
>>>>> . . .
>>>>
>>>> This is a known problem, but it is unclear if you should be running
>>>> into it in this setup.
>>>
>>> The change fixed the issue: so I do run into the issue
>>> for this setup. See below.
>>>
>>>> Can you try again but this time *revert*
>>>> 138a5dafba312ff39ce0eefdbe34de95519e600d, like so:
>>>> git revert 138a5dafba312ff39ce0eefdbe34de95519e600d
>>>>
>>>> may want to switch to a different branch first, for example: git
>>>> checkout -b vfstesting
>>>
>>> # git -C /usr/main-src/ diff sys/kern/vfs_subr.c
>>> diff --git a/sys/kern/vfs_subr.c b/sys/kern/vfs_subr.c
>>> index 0f3f00abfd4a..5dff556ac258 100644
>>> --- a/sys/kern/vfs_subr.c
>>> +++ b/sys/kern/vfs_subr.c
>>> @@ -3528,25 +3528,17 @@ vdbatch_process(struct vdbatch *vd)
>>>  	MPASS(curthread->td_pinned > 0);
>>>  	MPASS(vd->index == VDBATCH_SIZE);
>>>
>>> +	mtx_lock(&vnode_list_mtx);
>>>  	critical_enter();
>>> -	if (mtx_trylock(&vnode_list_mtx)) {
>>> -		for (i = 0; i < VDBATCH_SIZE; i++) {
>>> -			vp = vd->tab[i];
>>> -			vd->tab[i] = NULL;
>>> -			TAILQ_REMOVE(&vnode_list, vp, v_vnodelist);
>>> -			TAILQ_INSERT_TAIL(&vnode_list, vp, v_vnodelist);
>>> -			MPASS(vp->v_dbatchcpu != NOCPU);
>>> -			vp->v_dbatchcpu = NOCPU;
>>> -		}
>>> -		mtx_unlock(&vnode_list_mtx);
>>> -	} else {
>>> -		for (i = 0; i < VDBATCH_SIZE; i++) {
>>> -			vp = vd->tab[i];
>>> -			vd->tab[i] = NULL;
>>> -			MPASS(vp->v_dbatchcpu != NOCPU);
>>> -			vp->v_dbatchcpu = NOCPU;
>>> -		}
>>> +	for (i = 0; i < VDBATCH_SIZE; i++) {
>>> +		vp = vd->tab[i];
>>> +		TAILQ_REMOVE(&vnode_list, vp, v_vnodelist);
>>> +		TAILQ_INSERT_TAIL(&vnode_list, vp, v_vnodelist);
>>> +		MPASS(vp->v_dbatchcpu != NOCPU);
>>> +		vp->v_dbatchcpu = NOCPU;
>>>  	}
>>> +	mtx_unlock(&vnode_list_mtx);
>>> +	bzero(vd->tab, sizeof(vd->tab));
>>>  	vd->index = 0;
>>>  	critical_exit();
>>>  }
>>>
>>> Still with:
>>>
>>> # grep USE_TMPFS= /usr/local/etc/poudriere.conf
>>> # EXAMPLE: USE_TMPFS="wrkdir data"
>>> #USE_TMPFS=all
>>> #USE_TMPFS="data"
>>> USE_TMPFS=no
>>>
>>>
>>> That allowed the other builders to eventually reach "Builder started"
>>> and later activity, [00:05:50] [27] [00:02:29] Builder started
>>> being the first non-[01] to do so, no vlruwk's observed in what
>>> I saw in top:
>>>
>>> . . .
>>>
>>> Now testing for the zfs deadlock issue should be possible for
>>> this setup.
>>>
>>
>> Thanks for testing, I wrote a fix:
>>
>> https://people.freebsd.org/~mjg/vfs-recycle-fix.diff
>>
>> Applies to *stock* kernel (as in without the revert).
>
> I'm going to leave the deadlock test running for when
> I sleep tonight. So it is going to be a while before
> I get to testing this. $ work will likely happen first
> as well. (No deadlock observed yet, by the way. 6+ hrs
> and 3000+ ports built so far.)
>
> I can easily restore the sys/kern/vfs_subr.c to then
> do normal 14.0-ALPHA2-ish based patching with: so not
> a problem. Thanks.
>

I stopped the deadlock experiment, cleaned out the partial
bulk -a, put back the modern sys/kern/vfs_subr.c, applied
your patch, built, installed, rebooted, and started another
bulk -a run.

It made progress on all the builders, up to and past
"Builder started":

. . .
[00:01:34] Building 34042 packages using up to 32 builders
[00:01:34] Hit CTRL+t at any time to see build progress and stats
[00:01:34] [01] [00:00:00] Builder starting
[00:01:57] [01] [00:00:23] Builder started
[00:01:57] [01] [00:00:00] Building ports-mgmt/pkg | pkg-1.20.4
[00:03:09] [01] [00:01:12] Finished ports-mgmt/pkg | pkg-1.20.4: Success
[00:03:22] [01] [00:00:00] Building print/indexinfo | indexinfo-0.3.1
[00:03:22] [02] [00:00:00] Builder starting
[00:03:22] [03] [00:00:00] Builder starting
[00:03:22] [04] [00:00:00] Builder starting
[00:03:22] [05] [00:00:00] Builder starting
[00:03:22] [06] [00:00:00] Builder starting
[00:03:22] [07] [00:00:00] Builder starting
[00:03:22] [08] [00:00:00] Builder starting
[00:03:22] [09] [00:00:00] Builder starting
[00:03:22] [10] [00:00:00] Builder starting
[00:03:22] [11] [00:00:00] Builder starting
[00:03:22] [12] [00:00:00] Builder starting
[00:03:22] [13] [00:00:00] Builder starting
[00:03:22] [14] [00:00:00] Builder starting
[00:03:22] [15] [00:00:00] Builder starting
[00:03:22] [16] [00:00:00] Builder starting
[00:03:22] [17] [00:00:00] Builder starting
[00:03:22] [18] [00:00:00] Builder starting
[00:03:22] [19] [00:00:00] Builder starting
[00:03:22] [20] [00:00:00] Builder starting
[00:03:22] [21] [00:00:00] Builder starting
[00:03:22] [22] [00:00:00] Builder starting
[00:03:22] [23] [00:00:00] Builder starting
[00:03:22] [24] [00:00:00] Builder starting
[00:03:22] [25] [00:00:00] Builder starting
[00:03:22] [26] [00:00:00] Builder starting
[00:03:22] [27] [00:00:00] Builder starting
[00:03:22] [28] [00:00:00] Builder starting
[00:03:22] [29] [00:00:00] Builder starting
[00:03:22] [30] [00:00:00] Builder starting
[00:03:22] [31] [00:00:00] Builder starting
[00:03:22] [32] [00:00:00] Builder starting
[00:03:30] [01] [00:00:08] Finished print/indexinfo | indexinfo-0.3.1: Success
[00:03:30] [01] [00:00:00] Building devel/gettext-runtime | gettext-runtime-0.22
[00:04:42] [01] [00:01:12] Finished devel/gettext-runtime | gettext-runtime-0.22: Success
[00:04:48] [01] [00:00:00] Building devel/libtextstyle | libtextstyle-0.22
[00:05:46] [19] [00:02:24] Builder started
[00:05:46] [15] [00:02:24] Builder started
[00:05:46] [19] [00:00:00] Building graphics/libpotrace | libpotrace-1.16
[00:05:46] [15] [00:00:00] Building devel/libdaemon | libdaemon-0.14_1
[00:05:46] [25] [00:02:24] Builder started
[00:05:46] [25] [00:00:00] Building audio/speexdsp | speexdsp-1.2.1
[00:05:46] [29] [00:02:24] Builder started
[00:05:46] [29] [00:00:00] Building devel/opencl | opencl-3.0.14
. . .

Thanks.

I'll let it run as another deadlock test. The prior
run built over 9400 in about 18.5 hr before I stopped
it (no deadlocks observed).

===
Mark Millard
marklmi at yahoo.com
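
For comparing runs like the ones quoted above (stock kernel, revert, and patch), a small helper along these lines could pull the "Builder started" entries out of a saved poudriere bulk log. This is only a sketch: the names `builder_start_times`, `LINE_RE`, and `SAMPLE` are made up for illustration, and reading the second bracketed time as the builder's startup duration is an inference from the quoted lines (00:03:22 starting, 00:05:46 started, 00:02:24 shown), not something taken from poudriere documentation.

```python
import re

# Matches lines shaped like the ones quoted in this message, e.g.:
#   [00:05:46] [19] [00:02:24] Builder started
# "Builder starting" lines are deliberately not matched.
LINE_RE = re.compile(
    r"^\[(\d+):(\d\d):(\d\d)\] \[(\d+)\] \[(\d+):(\d\d):(\d\d)\] Builder started$"
)


def to_seconds(hours, minutes, seconds):
    """Convert HH, MM, SS strings to a total number of seconds."""
    return int(hours) * 3600 + int(minutes) * 60 + int(seconds)


def builder_start_times(log_text):
    """Map builder id -> (run-clock seconds, startup seconds).

    Assumption: the first bracket is the overall run clock and the third
    bracket is how long that builder spent starting up.
    """
    result = {}
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            h1, m1, s1, builder, h2, m2, s2 = match.groups()
            result[builder] = (to_seconds(h1, m1, s1), to_seconds(h2, m2, s2))
    return result


# A few lines copied from the log excerpt above.
SAMPLE = """\
[00:01:57] [01] [00:00:23] Builder started
[00:03:22] [19] [00:00:00] Builder starting
[00:05:46] [19] [00:02:24] Builder started
[00:05:46] [15] [00:02:24] Builder started
"""

if __name__ == "__main__":
    for builder, (clock, startup) in sorted(builder_start_times(SAMPLE).items()):
        print(f"builder {builder}: started at +{clock}s, startup took {startup}s")
```

A builder that never appears in the result never reached "Builder started" in the captured log, which is the symptom the original report described for the non-[01] builders.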