Subject: Re: Troubles building world on stable/13
From: Mark Millard
Date: Tue, 25 Jan 2022 16:08:15 -0800
Cc: Free BSD
List-Id: Porting FreeBSD to ARM processors
List-Archive: https://lists.freebsd.org/archives/freebsd-arm
Message-Id: <58DF1E04-98F4-496C-AFEC-B80EADFF8A74@yahoo.com>
In-Reply-To: <20220125221753.GA44654@www.zefox.net>
References: <8595CFBD-DC65-4472-A0A1-8A7BE1C031D6@yahoo.com>
 <20220124165449.GA39982@www.zefox.net>
 <5FAC2B2C-7740-435E-A183-FB3EF1FCE7F9@yahoo.com>
 <1CB4EDCD-0998-4363-8CEA-14854EB76FA3@yahoo.com>
 <20220125162245.GA43635@www.zefox.net>
 <61A3CF79-552C-4884-A8EA-85003B249856@yahoo.com>
 <20220125180823.GB43635@www.zefox.net>
 <35046946-7FE4-4E44-950F-BF9CCA72D8F0@yahoo.com>
 <20220125221753.GA44654@www.zefox.net>
To: bob prohaska

On 2022-Jan-25, at 14:17, bob prohaska wrote:

> On Tue, Jan 25, 2022 at 12:49:02PM -0800, Mark Millard wrote:
>> On 2022-Jan-25, at 10:08, bob prohaska wrote:
>>
>>> On Tue, Jan 25, 2022 at 09:13:08AM -0800, Mark Millard wrote:
>>>>
>>>> -DBATCH ? I'm not aware of there being any use of that symbol.
>>>> Do you have a documentation reference for it so that I could
>>>> read about it?
>>>>
>>> It's a switch to turn off dialog4ports. I can't find the reference
>>> now. Perhaps it's been deprecated?
>>> A name like -DUSE_DEFAULTS would
>>> be easier to understand anyway.
>>
>> I've never had buildworld buildkernel or the like try to use
>> dialog4ports. I've only had port building use it. buildworld
>> and buildkernel can be done with no ports installed at all.
>> dialog4ports is a port.
>>
>
> The attempt to build devel/llvm13 under stable/13 was done under ports.
> Thus the -DBATCH, to avoid manual intervention.

I missed the later reference to devel/llvm13 as applying to
the above and then later confused the contexts, effectively
ignoring devel/llvm13 completely. Sorry.

>> I think -DBATCH was ignored for the activity at hand.
>>
>>> On a whim, I tried building devel/llvm13 on a Pi4 running -current with
>>> 8 GB of RAM and 8 GB of swap. To my surprise, that stopped with:
>>> nemesis.zefox.com kernel log messages:
>>> +FreeBSD 14.0-CURRENT #26 main-5025e85013: Sun Jan 23 17:25:31 PST 2022
>>> +swap_pager: indefinite wait buffer: bufobj: 0, blkno: 1873450, size: 4096
>>> +swap_pager: indefinite wait buffer: bufobj: 0, blkno: 521393, size: 4096
>>> +swap_pager: indefinite wait buffer: bufobj: 0, blkno: 209826, size: 12288
>>> +swap_pager: indefinite wait buffer: bufobj: 0, blkno: 1717218, size: 24576
>>> +pid 56508 (c++), jid 0, uid 0, was killed: failed to reclaim memory
>>>
>>> On an 8GB machine, that seems strange.
>>
>> -j build? -j4 ?
>>
> Since this too was a port build, I let ports decide. It settled on 4.
>
>> Were you watching the swap usage in top (or some such)?
>>
>
> Top was running but the failure happened overnight. Not expecting
> it to fail, I didn't keep a log of swapping activity. The message
> above was in the next morning's log email.
>
>> Note: The "was killed" related notices have been improved
>> in main, but there is a misnomer case about "out of swap"
>> (last I checked).
>>
>
>> An environment that gets "swap_pager: indefinite wait buffer"
>> notices is problematical and the I/O delays for the virtual
>> memory subsystem can lead to kills, if I understand right.
>>
>> But, if I remember right, the actual message for a directly
>> I/O related kill is now different.
>>
>
> In this case the message was "unable to reclaim memory", a
> message I've not seen before.

Yes, it is one more-accurate wording of the old "out of swap"
notices, probably covering most occurrences.

>> I think that being able to reproduce this case could be
>> important. I probably can not because I'd not get the
>> "swap_pager: indefinite wait buffer" in my hardware
>> context.

I was thinking buildworld buildkernel here. I got the context
wrong.

I'll eventually do a devel/llvm13 build on the 8 GiByte RPi4B
with my patched top monitoring various "maximum observed"
figures.

> If it's relevant, the case of /usr/ports/devel/llvm13 seems like
> the most expedient test, since it did fail with realistic amounts
> of memory and swap. I gather that there's a certain amount of
> self-recompilation in buildworld, is that true of the port version?
> Does it matter?
>
>>> Per the failure message I restarted the build of devel/llvm13 with
>>> make -DBATCH MAKE_JOBS_UNSAFE=YES > make.log &
>>
>> Just like -DBATCH is for ports, not buildworld buildkernel,
>> MAKE_JOBS_UNSAFE= is for ports, not buildworld buildkernel,
>> at least if I understand right.
>>
> This was a ports build on the Pi4. The restart is running single-thread
> and quite slow, I'm tempted to stop it unless a failure would be useful.

Again an example of my not switching context correctly. Sorry.

>>>>> However, restarting buildworld using -j1 appears to have worked past
>>>>> the former point of failure.
>>>>
> [this on stable/13 pi3]
>>>> Hmm.
>>>> That usually means one (or both) of two things was involved
>>>> in the failure:
>>>>
>>>> A) a build race where something is not (fully) ready when
>>>> it is used
>>>>
>>>> B) running out of resources, such as RAM+SWAP
>>>>
>>>
>>> The stable/13 machine is short of swap; it has only 2 GB, which
>>> used to be enough.
>>
>> So RAM+SWAP is 1 GiByte + 2 GiByte, so 3 GiByte on that
>> RPi3*? (That would have been good to know earlier, such
>> as for my attempts at reproduction.)
>>
> Correct, 3GB RAM+swap. Didn't realize it would turn out to
> be important, sorry!

I do not know yet whether it would have helped reproduction of
the problem. But I now know that I should try for something that
would give evidence about getting near or over 3 GiBytes.

>> -j for the RPi3* when it was failing?
>>
> -j4, but I think it also failed at -j2.

>> Did you have failures with the .cpp and .sh (so no
>> make use involved) in the RAM+SWAP context?
>>
> Using the .cpp and .sh file on a Pi3 with 2 GB swap
> running stable/13 there was a consistent failure.

Ahh, a simpler, quicker test context/case. So that is
likely what I'd look into.

> Using the .cpp and .sh files on a Pi3 with 7GB swap
> there was no failure.
>
> Using a build of /usr/ports/devel/llvm13 as a test the
> build failed even with 8 GB of RAM and 8 GB of swap.
>
>>> Maybe that's the problem, but having an error
>>> report that says it's a segfault is a confusing diagnostic.
>>>
>>>> But, as I understand, you were able to use a .cpp and
>>>> .sh file pair that had been produced to repeat the
>>>> problem on the RPi3B, and that would not have been a
>>>> parallel-activity context.
>>>>
>>>
>>> To be clear, the reproduction was on the same stable/13 that
>>> reported the original failure. An attempt at reproduction
>>> on a different Pi3 running -current ran without any errors.
>>> Come to think of it, that machine had more swap, too.
>>
>> How much swap?
>>
> Two swap partitions, 3.6 GB and 4 GB, both in use.

So that is the devel/llvm13 example, not buildworld buildkernel,
not the .cpp and .sh combination.

>>
>> At this point, I expect that the failure was tied to the
>> RAM+SWAP totaling to 3 GiBytes.
>>
>
> That seems likely, or at least a reasonable suspicion.
>
>> Knowing that context we might have a reproducible report
>> that can be made based on the .cpp and .sh files, where
>> restricting the RAM+SWAP use allowed is part of the
>> report.
>>
>
> There seem to be some other reports of clang using unreasonable
> amounts of memory, for example
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=261341
>
> A much older report that looks vaguely similar (out of memory
> reported as segfault)
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=172576
> It's not arm-related and dates from 2012 but is still open.
>
> I'll try to repeat some of the tests using the logging script
> used previously. Right now it contains:
>
> #!/bin/sh
> while true
> sysctl hw.regulator.5v0.min_uvolt ; do vmstat ; gstat -abd -I 10s ; date ; swapinfo ; tail \
> -n 2 /var/log/messages ; netstat -m | grep "mbuf clusters" ; ps -auxd -w -w
> done
>
> Changes to the script are welcome, the output is voluminous.

I'll probably not get to experimenting with this for some time.

===
Mark Millard
marklmi at yahoo.com
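
On the request for script changes that cut the log volume: one possible direction, offered only as a sketch, is to record peak swap use rather than everything on every pass. The example below assumes swap use is the main quantity of interest and that swapinfo -k prints its usual header line, one line per device, and a Total line when there is more than one device. The function names swap_used_kb and log_swap_peak and the 10 second default interval are illustrative and not part of the existing script.

```shell
#!/bin/sh
# Sketch only: report new maxima of swap use instead of logging every sample.

# Sum the "Used" column (third field) of `swapinfo -k` style output read
# from stdin, skipping the header line and any "Total" summary line.
swap_used_kb() {
    awk 'NR > 1 && $1 != "Total" { used += $3 } END { print used + 0 }'
}

# Hypothetical helper: poll every $1 seconds (default 10) and print a
# timestamped line only when a new maximum is observed.
log_swap_peak() {
    interval=${1:-10}
    peak=0
    while true; do
        now=$(swapinfo -k | swap_used_kb)
        if [ "$now" -gt "$peak" ]; then
            peak=$now
            echo "$(date) peak swap used: ${peak} KiB"
        fi
        sleep "$interval"
    done
}
```

Starting it with something like "log_swap_peak 10 >> swap-peak.log &" before the build should leave one line per new maximum. If more context is wanted at each new peak, commands from the original script (for example the tail of /var/log/messages) could be placed inside the same if-block.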