From: John F Carr
To: Mark Millard, bob prohaska
CC: "freebsd-arm@freebsd.org"
Subject: Re: fsck segfaults on rpi3 running 13-stable (and on 14-CURRENT analyzing the same file system that resulted from the 13-STABLE crash)
Date: Mon, 20 Feb 2023 11:47:30 +0000
Message-ID: <1DB17CD4-63B5-4FA2-ADC6-6ED817A09CCB@mit.edu>
In-Reply-To: <268392B4-58FE-49EE-9B1D-6DA632757DFA@yahoo.com>
List-Id: Porting FreeBSD to ARM processors
List-Archive: https://lists.freebsd.org/archives/freebsd-arm

> On Feb 20, 2023, at 01:00, Mark Millard wrote:
>
> On Feb 19, 2023, at 21:50, Mark Millard wrote:
>
>> On Feb 19, 2023, at 20:45, bob prohaska wrote:
>>
>>> On Sun, Feb 19, 2023 at 02:35:15PM -0800, Mark Millard wrote:
>>>>
>>>> Kirk likely monitors the freebsd-fs list.
>>>
>>> I didn't notice there was such a list 8-\
>>>
>>>> Kirk likely does not monitor the freebsd-arm list.
>>>> None of us thought to switch to freebsd-fs at the
>>>> time. The only part of your context that ended up
>>>> to be arm specific was original buildworld crash.
>>>> You definitely started in an appropriate place
>>>> (freebsd-arm). After the crash, the rest was more
>>>> general relative to platforms and more specific
>>>> relative to file system handling (UFS support).
>>>>
>>>> I do not see any reason for any of this exchange
>>>> to go to any lists, given the current status.
>>>
>>> Alas, the story's not over yet 8-(
>>>
>>> After getting the disk fsck'd and booting once more,
>>> an attempt to buildworld using a fresh /usr/src
>>> and empty /usr/obj crashed again,
>>
>> I'm confused. The original crash was reported to be
>> on a RPi2B using a armv7 kernel, or so I thought.
>> (The RPi3B was for later fsck_ffs activity for the
>> media's UFS.)
>>
>> This new material indicates a RPi3B arm64 (aarch64)
>> context for this buildworld failure. Is it the same
>> media as for the prior buildworld failure?
>>
>>> in I think the
>>> same way. This time some notes have been collected
>>> at
>>> http://www.zefox.net/~fbsd/rpi3/scsi_status_error/readme
>>>
>>> To a casual glance, it looks like a hardware error.
>>> But, the machine seems to work fine until it's running
>>> buildworld, and then crashes during a relatively easy
>>> part of buildworld. The initial error message is:
>>>
>>> bob@pelorus:/usr/src % (da0:umass-sim0:0:0:0): READ(10). CDB: 28 00 43 29 d6 40 00 00 40 00
>>> (da0:umass-sim0:0:0:0): CAM status: SCSI Status Error
>>> (da0:umass-sim0:0:0:0): SCSI status: Check Condition
>>> (da0:umass-sim0:0:0:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error)
>>> (da0:umass-sim0:0:0:0): Error 5, Unretryable error
>>
>> A description of "Media Error" from seagate is:
>>
>> Medium Error - Indicates the command terminated with a nonrecovered error condition, probably caused by a flaw in the medium or an error in the recorded data.
>>
>> To compare/contrast with other alternatives, see:
>>
>> https://www.seagate.com/support/kb/scsi-sense-key-chart-196259en/
>>
>> A more extensive list with asc/ascq involved as well is at:
>>
>> https://en.wikipedia.org/wiki/Key_Code_Qualifier/
>>
>> Allowing more comparison/contrast with other classifications.
>>
>> It indicates:
>>
>> 3 11 00 Medium Error - unrecovered read error
>>
>> (matching the reported text).
>>
>>> SCSI errors are not unknown, but they usually succeed on retry.
>>> It's not obvious why this is treated as un-retryable.
>>
>> Because that is what the "3 11 00" combination involved
>> means. The drive is reporting that. It is not a FreeBSD
>> driver choice of handling.
>>
>> (I'm not expert at drive internals, so I take it at face
>> value.)
>>
>>> Are there any simple tests that might help decide what's wrong?
>>> It's likely that re-running buildworld will reproduce the crash.
>>
>> See the https://en.wikipedia.org/wiki/Key_Code_Qualifier/
>> description material for some background information?
>>
>>> I've placed the results of smartctl -a at the end of the notes.
>>> The interpretation isn't self evident, hopefully someone else
>>> can lend an eye.
>>> I'll try smartctl -t after a good night's sleep.
>>
>> man smartctl reports:
>>
>> UNC: UNCorrectable Error in Data
>>
>> The 3 examples of:
>>
>> After command completion occurred, registers were:
>> ER ST SC SN CL CH DH
>> -- -- -- -- -- -- --
>> 40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455
>>
>> indicate UNC. All 3 list the same LBA value.
>
> Turns out that the LBA value is likely garbage, given the
> size of your drive (> 128 GiBytes):

But we have an address from the SCSI command:

READ(10). CDB: 28 00 43 29 d6 40 00 00 40 00

Decoded, that says: read, starting at block 0x4329d640, length 0x40 blocks.  If the block size is 512 bytes, that is about half a terabyte (roughly 577 GB) into the disk.

This shell command should replicate the read:

# dd if=/dev/da0 of=/dev/null bs=32768 count=1 skip=17606489

The device name (if=) comes from the error message "da0:umass-sim0:0:0:0".  The block size (bs=) matches the read request in the failed SCSI command: 0x40 = 64 blocks of 512 bytes is 32768 bytes.  The skip count is 0x4329d640 (disk block) / 64 (number of disk blocks per dd block) = 17606489.

If you reproduce the error with dd, you can try a binary search over the 64-block range until you find the block that failed.
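
In case it helps, here is a rough sketch of the decoding and the binary search as a shell script.  It is untested against your drive.  The function names (decode_read10, dd_probe, first_bad_block) are made up for this example, and the 512-byte block size and /dev/da0 are the same assumptions as above.  The search finds the first failing block in the range and assumes a failing block fails every time; a marginal sector may read correctly on one pass and not on the next.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of any FreeBSD tool.

DEV=${DEV:-/dev/da0}      # assumption: device from "da0:umass-sim0:0:0:0"
BLKSZ=${BLKSZ:-512}       # assumption: 512-byte logical blocks

# Print "LBA COUNT" (decimal) for a READ(10) CDB given as ten hex bytes.
decode_read10() {
    [ "$#" -eq 10 ] || { echo "need 10 CDB bytes" >&2; return 1; }
    [ "$1" = "28" ] || { echo "not a READ(10) opcode: $1" >&2; return 1; }
    lba=$(( 0x$3$4$5$6 ))   # bytes 2-5: starting block, big-endian
    len=$(( 0x$8$9 ))       # bytes 7-8: transfer length in blocks
    echo "$lba $len"
}

# Read COUNT blocks starting at block START; exit status is dd's.
dd_probe() {
    dd if="$DEV" of=/dev/null bs="$BLKSZ" skip="$1" count="$2" 2>/dev/null
}

# Binary search for the first unreadable block in [START, START+COUNT).
# PROBE is called as: PROBE start count (exit 0 means the range is readable).
# Returns 1 and prints nothing if the whole range reads cleanly.
first_bad_block() {
    probe=$1 lo=$2 n=$3
    "$probe" "$lo" "$n" && return 1
    while [ "$n" -gt 1 ]; do
        half=$(( n / 2 ))
        if "$probe" "$lo" "$half"; then
            lo=$(( lo + half ))   # lower half is fine, failure is above it
            n=$(( n - half ))
        else
            n=$half               # failure is in the lower half
        fi
    done
    echo "$lo"
}

# Decode the CDB from the error message and show the dd parameters.
out=$(decode_read10 28 00 43 29 d6 40 00 00 40 00)
start=${out% *}
count=${out#* }
echo "lba=$start blocks=$count dd_skip=$(( start / count )) dd_bs=$(( count * BLKSZ ))"

# To run the search against the real disk (as root), uncomment:
# first_bad_block dd_probe "$start" "$count"
```

As written, the script only does the arithmetic and prints the dd parameters; it does not touch the disk until the last line is uncommented.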