From: Mark Millard
Date: Fri, 17 Feb 2023 16:23:59 -0800
To: bob prohaska
Cc: freebsd-arm@freebsd.org
Subject: Re: fsck segfaults on rpi3 running 13-stable (and on 14-CURRENT analyzing the same file system that resulted from the 13-STABLE crash)
List-Id: Porting FreeBSD to ARM processors
List-Archive: https://lists.freebsd.org/archives/freebsd-arm

On Feb 17, 2023, at 15:25, bob prohaska wrote:

> On Wed, Feb 15, 2023 at 11:39:13AM -0800, Mark Millard wrote:
>> On Feb 15, 2023, at 11:08, bob prohaska wrote:
>>
>>> On Wed, Feb 15, 2023 at 09:40:51AM -0800, Mark Millard wrote:
>>>>
>>>> Looking in my /usr/main-src/sbin/fsck_ffs/inode.c
>>>> I see that the original file has a leading tab
>>>> instead of spaces.
>>>>
>>>> The following mostly ignores the 1st column that
>>>> should have a space, -, or + in the diff output for
>>>> the file-content lines. It is mostly about the text
>>>> after the first column.
>>>>
>>>> So, if you have spaces instead after the first column
>>>> for the lines that start with a space, those lines
>>>> will not match, leading to a rejection for the
>>>> context matching done by patch.
>>>
>>> Replacing spaces with tabs allowed patch to find the
>>> location, but it still fails with
>>> patch: **** malformed patch at line 5: printf("SIZE=%ju ", (uintmax_t)DIP(dp, di_size));
>>
>> My guess is that when you made the adjustment to have
>> the tabs, the leading space was also removed on this
>> line. The first column is not part of the original
>> text but is instead a directive to the tool. The
>> missing space would be that directive and it needs to
>> be there. So:
>>
>>  printf("SIZE=%ju ", (uintmax_t)DIP(dp, di_size));
>>
>> The space indicates to use the rest of the line just
>> for context identification.
>>
>> Of course, since I've no access to the file to check my
>> hypothesis, it is just a guess.
>>
>>> Editing by hand looks like a good way to drive myself crazy 8-)
>
> Turns out to be true, but not in the manner expected. Editing in
> the changes by hand seems to have worked, in that fsck_ffs recompiled
> and no longer segfaults when examining the -stable filesystem.
>
> However, repeated runs of fsck continue to emit errors starting with
> root@www:/usr/src # fsck -y /dev/da1s2d
> ** /dev/da1s2d
> ** Last Mounted on /usr
> ** Phase 1 - Check Blocks and Sizes
> 7912408300994173476 BAD I=69393345
> 4313599915630302063 BAD I=69393345
> -4473632163892877928 BAD I=69393345
> 8068741989830080453 BAD I=69393345
> ....
> This continues through a succession of I values,
> ending with
>
> .....
>
> 3857159125896022134 BAD I=74682090
> -4354179704011695453 BAD I=74682090
> 7611175298055105740 BAD I=74682090
> 3985638883347136889 BAD I=74682090
> -2495754894521232470 BAD I=74682090
> 7739654885841380823 BAD I=74682090
> ** Phase 2 - Check Pathnames
> ** Phase 3 - Check Connectivity
> ** Phase 4 - Check Reference Counts
> LINK COUNT FILE I=69316035 OWNER=root MODE=100644
> SIZE=36680 MTIME=Feb 11 12:06 2023 COUNT 2 SHOULD BE 1
> ADJUST? yes
>
> BAD/DUP FILE I=69393345 OWNER=root MODE=100644
> SIZE=720896 MTIME=Jul 22 23:00 2022
>
> CLEAR? yes
>
> fsck_ffs: cglookup: out of range cylinder group 175966913
> root@www:/usr/src

Looks like that is one of the messages for problems
fsck_ffs does not attempt to deal with (probably for
good reasons in each case/context).
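
As a rough model of that behavior (a Python sketch, not the
fsck_ffs C source): a lookup that is handed an out-of-range
cylinder group number reports it and exits instead of attempting
a repair. The message text is the one shown in the output above.
The parameter name ncg, the example group count of 100, and the
EEXIT value of 8 are assumptions made for illustration.

```python
import sys

EEXIT = 8  # assumed exit status for fatal fsck_ffs errors


def errx(status, message):
    """Rough stand-in for errx(3): print the message, then exit."""
    print(f"fsck_ffs: {message}", file=sys.stderr)
    raise SystemExit(status)


def cglookup(cg, ncg):
    """Return cg if it names a valid cylinder group, else exit.

    ncg is a hypothetical count of cylinder groups in the file
    system. Nothing is repaired here: a bad value just ends the run.
    """
    if cg < 0 or cg >= ncg:
        errx(EEXIT, "cglookup: out of range cylinder group %d" % cg)
    return cg


if __name__ == "__main__":
    cglookup(3, 100)          # in range: returns normally
    cglookup(175966913, 100)  # out of range: prints the message and exits
```

In this model, a re-run over the same on-disk data reaches the
same exit, provided nothing was written earlier in the run.
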
The below does not show the specific conditions, just
the calls with the message texts used for the various
exits of the "errx(EEXIT" form:

# grep -r "errx(EEXIT," /usr/main-src/sbin/fsck_ffs/ | more
/usr/main-src/sbin/fsck_ffs/pass5.c: errx(EEXIT, "BAD STATE %d FOR INODE I=%ju",
/usr/main-src/sbin/fsck_ffs/inode.c: errx(EEXIT, "bad inode number %ju to ginode",
/usr/main-src/sbin/fsck_ffs/inode.c: errx(EEXIT, "bad inode number %ju to nextinode",
/usr/main-src/sbin/fsck_ffs/inode.c: errx(EEXIT, "cannot allocate space for inode buffer");
/usr/main-src/sbin/fsck_ffs/inode.c: errx(EEXIT, "cannot increase directory list");
/usr/main-src/sbin/fsck_ffs/inode.c: errx(EEXIT, "cannot increase directory list");
/usr/main-src/sbin/fsck_ffs/inode.c: errx(EEXIT, "BAD STATE %d TO BLKERR", inoinfo(ino)->ino_state);
/usr/main-src/sbin/fsck_ffs/dir.c: errx(EEXIT, "wrong type to dirscan %d", idesc->id_type);
/usr/main-src/sbin/fsck_ffs/fsutil.c: errx(EEXIT, "inoinfo: inumber %ju out of range",
/usr/main-src/sbin/fsck_ffs/fsutil.c: errx(EEXIT, "Initial malloc(%d) failed", sblock.fs_bsize);
/usr/main-src/sbin/fsck_ffs/fsutil.c: errx(EEXIT, "%s", failreason);
/usr/main-src/sbin/fsck_ffs/fsutil.c: errx(EEXIT, "cglookup: out of range cylinder group %d", cg);
/usr/main-src/sbin/fsck_ffs/fsutil.c: errx(EEXIT, "Cannot allocate cylinder group buffers");
/usr/main-src/sbin/fsck_ffs/fsutil.c: errx(EEXIT,"Ran out of memory during journal recovery");
/usr/main-src/sbin/fsck_ffs/fsutil.c: errx(EEXIT, "Excessive buffer size %ld > %d\n", size,
/usr/main-src/sbin/fsck_ffs/fsutil.c: errx(EEXIT, "panic: lost %d buffers", numbufs - cnt);
/usr/main-src/sbin/fsck_ffs/fsutil.c: errx(EEXIT, "ABORTING DUE TO READ ERRORS");
/usr/main-src/sbin/fsck_ffs/fsutil.c: errx(EEXIT, "cannot allocate buffer pool");
/usr/main-src/sbin/fsck_ffs/fsutil.c: errx(EEXIT, "UNKNOWN INODESC FIX MODE %d", idesc->id_fix);
/usr/main-src/sbin/fsck_ffs/pass4.c: errx(EEXIT, "BAD STATE %d FOR INODE I=%ju",
/usr/main-src/sbin/fsck_ffs/pass1.c: errx(EEXIT, "cannot alloc %u bytes for inoinfo",
/usr/main-src/sbin/fsck_ffs/pass1.c: errx(EEXIT, "cannot alloc %u bytes for inoinfo",
/usr/main-src/sbin/fsck_ffs/setup.c: errx(EEXIT, "cannot allocate space for snapshot "
/usr/main-src/sbin/fsck_ffs/setup.c: errx(EEXIT, "cannot allocate space for superblock");
/usr/main-src/sbin/fsck_ffs/setup.c: errx(EEXIT, "calcsb: cannot allocate recovery buffer");
/usr/main-src/sbin/fsck_ffs/main.c: errx(EEXIT, "cannot do level %d conversion",
/usr/main-src/sbin/fsck_ffs/main.c: errx(EEXIT, "bad mode to -m: %o", lfmode);
/usr/main-src/sbin/fsck_ffs/main.c: errx(EEXIT, "-%c flag requires a %s", flag, req);
/usr/main-src/sbin/fsck_ffs/pass2.c: errx(EEXIT, "CANNOT ALLOCATE ROOT INODE");
/usr/main-src/sbin/fsck_ffs/pass2.c: errx(EEXIT, "CANNOT ALLOCATE ROOT INODE");
/usr/main-src/sbin/fsck_ffs/pass2.c: errx(EEXIT, "CANNOT ALLOCATE ROOT INODE");
/usr/main-src/sbin/fsck_ffs/pass2.c: errx(EEXIT, "BAD STATE %d FOR ROOT INODE",
/usr/main-src/sbin/fsck_ffs/pass2.c: errx(EEXIT, "BAD STATE %d FOR INODE I=%ju",

> It's unclear whether the patch is preventing fsck
> from repairing the filesystem, or the problems are
> inherently beyond fixing.

Looks like it is in the do-not-fix category. If no
prior adjustments were made in the run, then things
have stayed as they were. (These messages could be
clearer about the status that they imply and what
one should do in response.)

> Repeated fsck runs seem
> to just reproduce the same output.

So, apparently, no prior adjustments either for the
re-runs.

> There's no prompt
> to re-run fsck.

I expect that is true of all the above "errx(EEXIT"
lines: the report is of a "did not fix" issue that
blocks progress.

> Thanks to both Marks for the patch and essential
> help in making it stick. If anything else is
> worth trying I'm game, there's little to lose.

I've no clue if there is more to try.
But, even if there is, there may be other
issues/constraints that lead to not bothering to try?

Beyond that, things with floating-point use in
multi-threading contexts look to be significantly
broken in main [so: 14] for now. (This was involved
in your FreeBSD crash, based on what the backtrace
showed.) If you try to set up another armv7 context,
I suggest, for now, staying before:

commit 6926e2699ae55080f860488895a2a9aa6e6d9b4d
Author:     Kornel Dulęba
AuthorDate: 2023-02-04 12:59:30 +0000
Commit:     Kornel Dulęba
CommitDate: 2023-02-04 19:21:43 +0000

    arm: Add support for using VFP in kernel

This would be until a list of issues has been addressed.
I've reported how to produce 3 distinct failures, 2
of which hit KASSERT panics, and the other one is
for ending up with floating-point values from the
wrong thread (but same process). More may be
identified and fixed before things generally work
again for main for armv7 FreeBSD.

===
Mark Millard
marklmi at yahoo.com