From: bob prohaska
To: Mark Millard
Cc: freebsd-arm@freebsd.org
Date: Sun, 19 Feb 2023 20:45:44 -0800
Subject: Re: fsck segfaults on rpi3 running 13-stable (and on 14-CURRENT analyzing the same file system that resulted from the 13-STABLE crash)
Message-ID: <20230220044544.GB57936@www.zefox.net>
References: <202302192054.31JKsq7w079295@chez.mckusick.com> <3DD8EEC2-6135-42A0-A80C-F195CAAC025E@yahoo.com> <20230219222328.GA55941@www.zefox.net> <2F5B20E9-AFF8-42F6-9E1F-50BBDF4E1B79@yahoo.com>
List-Id: Porting FreeBSD to ARM processors
List-Archive: https://lists.freebsd.org/archives/freebsd-arm
In-Reply-To: <2F5B20E9-AFF8-42F6-9E1F-50BBDF4E1B79@yahoo.com>

On Sun, Feb 19, 2023 at 02:35:15PM -0800, Mark Millard wrote:
>
> Kirk likely monitors the freebsd-fs list.

I didn't notice there was such a list 8-\

> Kirk likely does not monitor the freebsd-arm list.
> None of us thought to switch to freebsd-fs at the
> time. The only part of your context that ended up
> to be arm specific was original buildworld crash.
> You definitely started in an appropriate place
> (freebsd-arm). After the crash, the rest was more
> general relative to platforms and more specific
> relative to file system handling (UFS support).
>
> I do not see any reason for any of this exchange
> to go to any lists, given the current status.

Alas, the story's not over yet 8-(

After getting the disk fsck'd and booting once more,
an attempt to buildworld using a fresh /usr/src and empty
/usr/obj crashed again, I think in the same way.
This time some notes have been collected at
http://www.zefox.net/~fbsd/rpi3/scsi_status_error/readme

At a casual glance, it looks like a hardware error. But the machine
seems to work fine until it's running buildworld, and then it crashes
during a relatively easy part of buildworld.

The initial error message is:

bob@pelorus:/usr/src % (da0:umass-sim0:0:0:0): READ(10). CDB: 28 00 43 29 d6 40 00 00 40 00
(da0:umass-sim0:0:0:0): CAM status: SCSI Status Error
(da0:umass-sim0:0:0:0): SCSI status: Check Condition
(da0:umass-sim0:0:0:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error)
(da0:umass-sim0:0:0:0): Error 5, Unretryable error

SCSI errors are not unknown, but the failed operation usually
succeeds on retry. It's not obvious why this one is treated as
un-retryable.

Are there any simple tests that might help decide what's wrong?
It's likely that re-running buildworld will reproduce the crash.

I've placed the results of smartctl -a at the end of the notes.
The interpretation isn't self-evident; hopefully someone else
can lend an eye. I'll try smartctl -t after a good night's sleep.

Thanks for reading!

bob prohaska
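
P.S. For anyone reading along, the CDB in the error message above can be decoded by hand to see which blocks the failed read was asking for. The sketch below assumes the standard 10-byte READ(10) layout (opcode 0x28, a 4-byte big-endian LBA in bytes 2-5, a 2-byte big-endian transfer length in bytes 7-8). The function name decode_read10 is made up for illustration and is not part of any FreeBSD tool.

```python
def decode_read10(cdb_hex: str) -> dict:
    """Decode a READ(10) CDB given as space-separated hex bytes.

    Assumes the standard layout: byte 0 opcode, byte 1 flags,
    bytes 2-5 LBA (big-endian), byte 6 group number,
    bytes 7-8 transfer length in blocks (big-endian), byte 9 control.
    """
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10:
        raise ValueError("READ(10) CDB must be exactly 10 bytes")
    if cdb[0] != 0x28:
        raise ValueError("opcode is not READ(10) (0x28)")
    return {
        "opcode": cdb[0],
        "lba": int.from_bytes(cdb[2:6], "big"),      # first block requested
        "blocks": int.from_bytes(cdb[7:9], "big"),   # number of blocks requested
    }


if __name__ == "__main__":
    # CDB copied from the kernel message quoted in this mail.
    info = decode_read10("28 00 43 29 d6 40 00 00 40 00")
    print(f"LBA {info['lba']} (0x{info['lba']:08x}), {info['blocks']} blocks")
```

For the CDB above this gives a starting LBA of 0x4329d640 and a transfer length of 64 blocks. Assuming the drive reports 512-byte logical blocks, which is not confirmed here, re-reading just that range with something like "dd if=/dev/da0 of=/dev/null bs=512 skip=1126815296 count=64" might show whether the error is tied to that spot on the disk or only appears under load.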