From: Mark Millard
Date: Thu, 27 Jan 2022 13:35:04 -0800
Subject: Re: devel/llvm13 failed to reclaim memory on 8 GB Pi4 running -current
To: bob prohaska
Cc: freebsd-arm@freebsd.org
Message-Id: <61BC9C0F-7B43-4022-9D52-571E85F2C2CD@yahoo.com>
In-Reply-To: <2C7E741F-4703-4E41-93FE-72E1F16B60E2@yahoo.com>
References: <20220127164512.GA51200@www.zefox.net> <2C7E741F-4703-4E41-93FE-72E1F16B60E2@yahoo.com>
List-Id: Porting FreeBSD to ARM processors
List-Archive: https://lists.freebsd.org/archives/freebsd-arm

On 2022-Jan-27, at 12:12, Mark Millard wrote:

> On 2022-Jan-27, at 11:31, Mark Millard wrote:
>
>> On 2022-Jan-27, at 08:45, bob prohaska wrote:
>>
>>> Attempts to compile devel/llvm13 on a Pi4 running -current (updated
>>> on 20220126) with 8 GB of RAM and 8 GB of swap has failed on two occasions using
>>> make -DBATCH > make.log &
>>> in /usr/ports/devel/llvm13 using the system compiler. The system is
>>> self-hosted.
>
> Context question: ZFS? UFS?
>
> (In things involving memory usage issues, knowing which is
> always appropriate because of differences in memory use
> patterns.)
>
>>> The first failure reported clang error 139, but the second
>>> was different, reporting only:
>>> FAILED: tools/flang/lib/Evaluate/CMakeFiles/obj.FortranEvaluate.dir/check-expression.cpp.o
>>> along with a console report of
>>> +swap_pager: indefinite wait buffer: bufobj: 0, blkno: 1258432, size: 4096
>>> +swap_pager: indefinite wait buffer: bufobj: 0, blkno: 627221, size: 8192
>>> +swap_pager: indefinite wait buffer: bufobj: 0, blkno: 240419, size: 4096
>>> +swap_pager: out of swap space
>>
>> In recent builds, such as yours, the above "out of swap" is a
>> misnomer but is very interesting for what it is actually about.
>>
>> Mark Johnston later wrote on 2022-Jan-15 about his "git:
>> 4a864f624a70 - main - vm_pageout: Print a more accurate message
>> to the console before an OOM kill" that produced the above report
>> of "out of swap space":
>>
>> QUOTE
>> Hmm, those cases should likely be changed from "out of swap space" to
>> "failed to allocate swap metadata" or something like that.
>> END QUOTE
>>
>> Your context proves the metadata problem really happens, so
>> the messaging should be fixed to not be misleading.
>>
>> In my builds I have code that is more explicit:
>>
>> diff --git a/sys/vm/swap_pager.c b/sys/vm/swap_pager.c
>> index 01cf9233329f..280621ca51be 100644
>> --- a/sys/vm/swap_pager.c
>> +++ b/sys/vm/swap_pager.c
>> @@ -2091,6 +2091,7 @@ swp_pager_meta_build(vm_object_t object, vm_pindex_t pindex, daddr_t swapblk)
>>                                      0, 1))
>>                                          printf("swap blk zone exhausted, "
>>                                              "increase kern.maxswzone\n");
>> +                                printf("swp_pager_meta_build: swap blk uma zone exhausted\n");
>>                                  vm_pageout_oom(VM_OOM_SWAPZ);
>>                                  pause("swzonxb", 10);
>>                          } else
>> @@ -2121,6 +2122,7 @@ swp_pager_meta_build(vm_object_t object, vm_pindex_t pindex, daddr_t swapblk)
>>                                      0, 1))
>>                                          printf("swap pctrie zone exhausted, "
>>                                              "increase kern.maxswzone\n");
>> +                                printf("swp_pager_meta_build: swap pctrie uma zone exhausted\n");
>>                                  vm_pageout_oom(VM_OOM_SWAPZ);
>>                                  pause("swzonxp", 10);
>>                          } else
>>
>> The "metadata" is the "swap blk uma zone" and "swap pctrie
>> uma zone". Unfortunately, which one got the failure is still
>> not indicated in the standard builds.
>>
>>> +swp_pager_getswapspace(12): failed
>>> +pid 61012 (c++), jid 0, uid 0, was killed: failed to reclaim memory
>>
>> Absent being able to swap, it tries to reclaim -- and that
>> too failed. That finally leads to the kills.
>>
>>> Swap use peaked a little over 50%.
>>
>> So at around 50% "swap blk uma zone" and/or "swap pctrie uma zone"
>> had problems, probably fragmentation related problems.
>>
>>> After the first failure a restart
>>> of make using MAKE_JOBS_UNSAFE=yes ran to completion with one thread.
>>>
>>> A copy of the build log, logging script and other notes is at
>>> http://www.zefox.net/~fbsd/rpi4/20220127/
>>>
>>> Clang error 139 has been seen several times during make buildworld on a Pi3 running
>>> stable/13 with 2 GB of swap as well. Perhaps the two failures are related.
>>> The Pi3 failures didn't report out of swap, all were clang error 139 with "failed to reclaim
>>> memory". Even with only 1 thread (j1) the failure reproduced.

So far as I know, stable/13 does not yet have the changes to the
messaging about kills for failures to reclaim memory: it is still
like it used to be for so long. Only main has the changes.

This makes unmodified stable/13 messages not nearly so
interesting when they are produced. It will be this way
until something based on "git: 4a864f624a70 - main -
vm_pageout: Print a more accurate message to the console
before an OOM kill" is in place in stable/13 (or somewhat
analogous local changes are in place).

I'm updating the media for my 8 GiByte RPi4B configuration to
be based on the bectl environment for main being nearly a copy
of (line split for readability):

# uname -apKU
FreeBSD CA72_16Gp_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #37 main-n252475-e76c0108990b-dirty: Sat Jan 15 21:53:08 PST 2022
root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72
arm64 aarch64 1400047 1400047

That has my variant of Mark Johnston's new messaging and my
additional messages as well. So on failure, it should report
which metadata got the problem.

The update also includes adding an 8 GiByte swap partition
as an alternative. I'll temporarily have it configured to
boot using just that swap partition.

Another thing is that I'll remove my usual options for
devel/llvm13 so that just defaults are used, including
building of flang.

So I hope to reproduce the problem in my context and to
be able to report which of the two metadata zones caused the
metadata driven messaging.

It does take a while to synchronize the media involved
to be based on the CA72_16Gp_ZFS media and building
devel/llvm13 on a RPi4B takes a while. But I'll report
once I have the console messages (or whatever happens).
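
For sorting through the console messages once they show up,
something like the following might help. It is only a sketch:
the patterns are copied from the console lines quoted in this
thread, the two swp_pager_meta_build patterns only ever match
on a kernel with the local patch quoted above, and the labels
and the function names (classify, summarize) are made up for
illustration.

```python
import re

# Patterns come from console lines quoted in this thread. The labels are
# illustrative only. The two swp_pager_meta_build messages are produced
# only by the local patch quoted above, not by a stock kernel.
PATTERNS = [
    ("swblk zone exhausted (local patch)",
     re.compile(r"swp_pager_meta_build: swap blk uma zone exhausted")),
    ("swpctrie zone exhausted (local patch)",
     re.compile(r"swp_pager_meta_build: swap pctrie uma zone exhausted")),
    ("indefinite wait on swap buffer",
     re.compile(r"swap_pager: indefinite wait buffer")),
    ("'out of swap space' report (possibly swap metadata)",
     re.compile(r"swap_pager: out of swap space")),
    ("swap block allocation failed",
     re.compile(r"swp_pager_getswapspace\(\d+\): failed")),
    ("process killed",
     re.compile(r"pid \d+ \([^)]+\).*was killed: ")),
]


def classify(line):
    """Return the label of the first matching pattern, or None."""
    for label, pattern in PATTERNS:
        if pattern.search(line):
            return label
    return None


def summarize(lines):
    """Count how many lines fall under each label; unmatched lines are skipped."""
    counts = {}
    for line in lines:
        label = classify(line)
        if label is not None:
            counts[label] = counts.get(label, 0) + 1
    return counts
```

Feeding it the lines from the original report would give a
per-category count, which makes it easier to see at a glance
whether a run hit the metadata path or only the kill.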

>> Note in your report above: obj.FortranEvaluate.dir
>>
>> If you use the options to disable building flang (a.k.a.,
>> the Fortran compiler build), your builds on the RPi4B
>> will likely work in the current configuration.
>>
>> But it looks like you have identified a test context
>> for the "swap blk uma zone" and "swap pctrie uma zone"
>> handling.

===
Mark Millard
marklmi at yahoo.com
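
For watching the two zones during a build, output saved from
"vmstat -z" could be filtered with something like the sketch
below. Assumptions, none of which come from this thread: that
the zones show up under the names "swblk" and "swpctrie", and
that each line is "NAME: SIZE, LIMIT, USED, FREE, REQ, FAIL, ...".
The SAMPLE text uses made-up numbers purely to exercise the parser.

```python
# Made-up sample in the assumed "vmstat -z" layout; the numbers are not
# from any real system.
SAMPLE = """\
ITEM                   SIZE   LIMIT     USED     FREE      REQ     FAILSLEEP XDOMAIN
mbuf:                   256,      0,     500,     100,    9000,   0,   0,   0
swpctrie:               144,  61955,    1200,     300,    5000,   3,   0,   0
swblk:                  136, 123910,    9000,    1000,   20000,   0,   0,   0
"""


def parse_vmstat_z(text, zones=("swblk", "swpctrie")):
    """Extract limit/used/free/fail for the named zones from vmstat -z text."""
    stats = {}
    for line in text.splitlines():
        # Zone lines look like "name: n, n, n, ..."; the header has no colon.
        name, sep, rest = line.rpartition(":")
        name = name.strip()
        if not sep or name not in zones:
            continue
        fields = [f.strip() for f in rest.split(",")]
        if len(fields) < 6:
            continue
        try:
            size, limit, used, free, req, fail = (int(f) for f in fields[:6])
        except ValueError:
            continue
        stats[name] = {"limit": limit, "used": used, "free": free, "fail": fail}
    return stats
```

A nonzero "fail" count for either zone, sampled around the time
of the kill, would be one hint as to which of the two ran into
trouble on a kernel without the extra messages.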