From: Mark Millard
Date: Sat, 15 Apr 2023 10:44:27 -0700
Subject: Re: git: 2a58b312b62f - main - zfs: merge openzfs/zfs@431083f75
To: FreeBSD User
Cc: Cy Schubert, Charlie Li, Pawel Jakub Dawidek, Mateusz Guzik, dev-commits-src-main@freebsd.org, Current FreeBSD
List-Id: Commit messages for the main branch of the src repository
List-Archive: https://lists.freebsd.org/archives/dev-commits-src-main
Message-Id: <5A47F62D-0E78-4C3E-84C0-45EEB03C7640@yahoo.com>
In-Reply-To: <20230415143625.99388387@slippy.cwsent.com>

On Apr 15, 2023, at 07:36, Cy Schubert wrote:

> In message <20230415115452.08911bb7@thor.intern.walstatt.dynvpn.de>,
> FreeBSD User writes:
>> On Thu, 13 Apr 2023 22:18:04 -0700
>> Mark Millard wrote:
>>
>>> On Apr 13, 2023, at 21:44, Charlie Li wrote:
>>>
>>>> Mark Millard wrote:
>>>>> FYI: in my original report for a context that has never had
>>>>> block_cloning enabled, I reported BOTH missing files and
>>>>> file content corruption in the poudriere-devel bulk build
>>>>> testing. This predates:
>>>>> https://people.freebsd.org/~pjd/patches/brt_revert.patch
>>>>> but had the changes from:
>>>>> https://github.com/openzfs/zfs/pull/14739/files
>>>>> The files were missing from packages installed to be used
>>>>> during a port's build. No other types of examples of missing
>>>>> files happened. (But only 11 ports failed.)
>>>> I also don't have block_cloning enabled. "Missing files" prior to
>>>> brt_revert may actually be present, but as the corruption also
>>>> messes with the file(1) signature, some tools like ldconfig
>>>> report them as missing.
>>>
>>> For reference, the specific messages that were not explicit
>>> null-byte complaints were (some shown with a little context):
>>>
>>>
>>> ===> py39-lxml-4.9.2 depends on shared library: libxml2.so - not found
>>> ===> Installing existing package /packages/All/libxml2-2.10.3_1.pkg
>>> [CA72_ZFS] Installing libxml2-2.10.3_1...
>>> [CA72_ZFS] Extracting libxml2-2.10.3_1: .......... done
>>> ===> py39-lxml-4.9.2 depends on shared library: libxml2.so - found (/usr/local/lib/libxml2.so)
>>> . . .
>>> [CA72_ZFS] Extracting libxslt-1.1.37: .......... done
>>> ===> py39-lxml-4.9.2 depends on shared library: libxslt.so - found (/usr/local/lib/libxslt.so)
>>> ===> Returning to build of py39-lxml-4.9.2
>>> . . .
>>> ===> Configuring for py39-lxml-4.9.2
>>> Building lxml version 4.9.2.
>>> Building with Cython 0.29.33.
>>> Error: Please make sure the libxml2 and libxslt development packages are installed.
>>>
>>>
>>> [CA72_ZFS] Extracting libunistring-1.1: .......... done
>>> ===> libidn2-2.3.4 depends on shared library: libunistring.so - not found
>>>
>>>
>>> [CA72_ZFS] Extracting gmp-6.2.1: .......... done
>>> ===> mpfr-4.2.0,1 depends on shared library: libgmp.so - not found
>>>
>>>
>>> ===> nettle-3.8.1 depends on shared library: libgmp.so - not found
>>> ===> Installing existing package /packages/All/gmp-6.2.1.pkg
>>> [CA72_ZFS] Installing gmp-6.2.1...
>>> the most recent version of gmp-6.2.1 is already installed
>>> ===> nettle-3.8.1 depends on shared library: libgmp.so - not found
>>> *** Error code 1
>>>
>>>
>>> autom4te: error: need GNU m4 1.4 or later: /usr/local/bin/gm4
>>>
>>>
>>> checking for GNU M4 that supports accurate traces... configure: error: no acceptable m4 could be found in $PATH.
>>> GNU M4 1.4.6 or later is required; 1.4.16 or newer is recommended.
>>> GNU M4 1.4.15 uses a buggy replacement strstr on some systems.
>>> Glibc 2.9 - 2.12 and GNU M4 1.4.11 - 1.4.15 have another strstr bug.
>>>
>>>
>>> ld: error: /usr/local/lib/libblkid.a: unknown file type
>>>
>>>
>>> ===
>>> Mark Millard
>>> marklmi at yahoo.com
>>>
>>
>> Hello,
>>
>> What is the current status of fixing or mitigating this disastrous bug?
>> Especially for those with the new option enabled on ZFS pools. Any advice?
>>
>> As a precaution (or call it panic) I shut down several servers to
>> prevent irreversible damage to databases and data storage. On one host
>> with /usr/ports residing on ZFS we always see errors on the same files
>> created while staging (using portmaster, which leaves the system with
>> uninstalled software, i.e. www/apache24 in our case). Deleting the work
>> folder doesn't seem to change anything, even after starting a scrub of
>> the entire pool (RAIDZ1 pool). It is unknown why it is always the same
>> files that get corrupted. Same with devel/ruby-gems.
>>
>> Poudriere has been shut down for the time being to avoid further issues.
>>
>> Is there any advice on how to proceed, apart from conserving the boxes
>> via shutdown?
>>
>> Thank you ;-)
>> oh
>>
>> --
>> O. Hartmann
>
> With an up-to-date tree + pjd@'s "Fix data corruption when cloning embedded
> blocks. #14739" patch I didn't have any issues, except for email messages
> with corruption in my sent directory, nowhere else. I'm still investigating
> the email messages issue. IMO one is generally safe to run poudriere on the
> latest ZFS with the additional patch.

My poudriere testing failed when I tested such (14739 included),
per what I reported, with block_cloning never having been enabled.
Others have also reported poudriere bulk build failures with
block_cloning not involved and 14739 in place. My tests do
predate:

https://people.freebsd.org/~pjd/patches/brt_revert.patch

and I'm not sure whether Cy's activity had brt_revert.patch in
place or not.

Others' notes include Mateusz Guzik's:

https://lists.freebsd.org/archives/dev-commits-src-main/2023-April/014534.html

which said:

QUOTE
There is corruption with the recent import, with the
https://github.com/openzfs/zfs/pull/14739/files patch applied and
block cloning disabled on the pool.

There is no corruption with top of main with zfs merge reverted altogether.

Which commit results in said corruption remains to be seen, a variant
of the tree with just block cloning support reverted just for testing
purposes is about to be evaluated.
END QUOTE

Charlie Li's later related notes that help interpret that were in:

https://lists.freebsd.org/archives/dev-commits-src-main/2023-April/014545.html

QUOTE
Testing with mjg@ earlier today revealed that block_cloning was not the
cause of poudriere bulk build (and similar cp(1)/install(1)-based)
corruption, although may have exacerbated it.
END QUOTE

Mateusz later indicated that he hoped to have the cause(s)
sorted out sometime Friday:

https://lists.freebsd.org/archives/dev-commits-src-main/2023-April/014551.html

QUOTE
I'm going to narrow down the non-blockcopy corruption after my testjig
gets off the ground. Basically I expect to have it sorted out on Friday.
END QUOTE

But the lack of later related messages suggests that did not
happen.

> My tests of the additional patch

(I'm guessing that is a reference to 14739, not to
brt_revert.patch .)

> concluded that it resolved my last
> problems, except for the sent email problem I'm still investigating. I'm
> sure there's a simple explanation for it, i.e. the email thread was
> corrupted by the EXDEV regression which cannot be fixed by anything, even
> reverting to the previous ZFS -- the data in those files will remain
> damaged regardless.

Again: my tests jumped from prior to the import to after the EXDEV
changes, including having 14739. I still had poudriere bulk
produce file corruptions.
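
For anyone wanting to check an installed tree for the kinds of damage
reported in this thread (zero-filled files, shared libraries that
ldconfig and file(1) no longer recognize, a static archive that ld calls
an "unknown file type"), the following is a minimal sketch in Python. It
is not something used in the testing reported here. The function names
and the suffix-to-magic table are made up for illustration, and it only
knows about ELF shared objects and ar(1) archives. Some .so files are
legitimately text linker scripts, so a "bad magic" hit is a hint to look
closer, not proof of corruption.

```python
import os

# Leading bytes that identify two of the file types named in the
# failures quoted above. These are the standard ELF and ar(1) magics.
ELF_MAGIC = b"\x7fELF"
AR_MAGIC = b"!<arch>\n"


def expected_magic(name):
    """Return the leading bytes expected for this file name, or None.

    Hypothetical helper: the choice is made purely from the name.
    """
    if name.endswith(".a"):
        return AR_MAGIC
    # Versioned shared objects look like libfoo.so.2
    if name.endswith(".so") or ".so." in name:
        return ELF_MAGIC
    return None


def check_file(path):
    """Return a list of problems found in one file (empty if none)."""
    with open(path, "rb") as f:
        data = f.read()
    problems = []
    magic = expected_magic(os.path.basename(path))
    if magic is not None and not data.startswith(magic):
        problems.append("bad magic")
    # A non-empty file consisting only of NUL bytes
    if data and not data.strip(b"\0"):
        problems.append("all NUL bytes")
    return problems


def scan(root):
    """Walk root and map each suspicious regular file to its problems."""
    report = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks (e.g. libfoo.so -> libfoo.so.2) and non-files
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            problems = check_file(path)
            if problems:
                report[path] = problems
    return report
```

Calling scan() on a directory such as /usr/local/lib inside a builder
returns a dictionary of paths and the problems found for each. It reads
every file whole, which is fine for a one-off check but slow on a large
tree.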

> I cannot speak to the others who have had poudriere and other issues. I
> never had any problems with poudriere on top of the new ZFS.

Part of the mess is the variability. As I remember, I had 252
ports build fine in my test before the 11th failure meant that
the rest (213) had all been classified as skipped. It is not
like most of the port builds failed: failures were relatively
uncommon.

Also, one port built on a retry, indicating random/racy behavior
is involved. (The original failure was not from a file from
installing build dependencies but something that the builder
generated during the build. The 2nd try did not fail there or
anywhere.)

> WRT reverting block_cloning pools to without, your only option is to backup
> your pool and recreate it without block_cloning. Then restore your data.
>

Given what has been reported by multiple people and Cy's own
example of unexplained corruptions in email handling, I'd be
cautious about risking important data until reports from testing
environment activity consistently show no corruptions.

Another thing: my activity does not include any testing of the
suggestion in:

https://lists.freebsd.org/archives/dev-commits-src-main/2023-April/014607.html

to use "-o sync=disabled" in a clone, reporting:

QUOTE
With this workaround I was able to build thousands of packages without
panics or failures due to data corruption.
END QUOTE

If reliable, that consequence of the change might help folks that
are trying to isolate the problem(s) figure out what is involved.

===
Mark Millard
marklmi at yahoo.com