Re: fsck segfaults on rpi3 running 13-stable (and on 14-CURRENT analyzing the same file system that resulted from the 13-STABLE crash)

From: bob prohaska <fbsd_at_www.zefox.net>
Date: Mon, 20 Feb 2023 04:45:44 UTC

On Sun, Feb 19, 2023 at 02:35:15PM -0800, Mark Millard wrote:
> 
> Kirk likely monitors the freebsd-fs list.

I didn't notice there was such a list 8-\
 
> Kirk likely does not monitor the freebsd-arm list.
> None of us thought to switch to freebsd-fs at the
> time. The only part of your context that ended up
> to be arm specific was original buildworld crash.
> You definitely started in an appropriate place
> (freebsd-arm). After the crash, the rest was more
> general relative to platforms and more specific
> relative to file system handling (UFS support).
> 
> I do not see any reason for any of this exchange
> to go to any lists, given the current status.

Alas, the story's not over yet 8-(  

After getting the disk fsck'd and booting once more,
an attempt to buildworld using a fresh /usr/src
and empty /usr/obj crashed again, I think in the
same way. This time some notes have been collected
at
http://www.zefox.net/~fbsd/rpi3/scsi_status_error/readme

At a casual glance, it looks like a hardware error.
But, the machine seems to work fine until it's running
buildworld, and then crashes during a relatively easy
part of buildworld. The initial error message is:

bob@pelorus:/usr/src % (da0:umass-sim0:0:0:0): READ(10). CDB: 28 00 43 29 d6 40 00 00 40 00 
(da0:umass-sim0:0:0:0): CAM status: SCSI Status Error
(da0:umass-sim0:0:0:0): SCSI status: Check Condition
(da0:umass-sim0:0:0:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error)
(da0:umass-sim0:0:0:0): Error 5, Unretryable error
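
For reference, the CDB bytes in the first line of that message can be
decoded to see which blocks the failing read asked for. The following is
only a sketch, assuming the standard READ(10) layout (opcode 0x28, a
4-byte big-endian LBA in bytes 2-5, a 2-byte transfer length in bytes
7-8). The function name decode_read10 is made up for illustration.

```python
def decode_read10(cdb_hex):
    """Return (lba, block_count) from a READ(10) CDB given as hex text."""
    cdb = bytes.fromhex(cdb_hex)  # whitespace between byte pairs is ignored
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a 10-byte READ(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")    # starting logical block address
    count = int.from_bytes(cdb[7:9], "big")  # number of blocks requested
    return lba, count


# CDB copied from the console message above.
lba, count = decode_read10("28 00 43 29 d6 40 00 00 40 00")
print(f"LBA {lba} (0x{lba:08x}), {count} blocks")
```

If the same LBA shows up on every crash, that would point at one bad
spot on the disk rather than at a flaky USB link.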

SCSI errors are not unknown, but the failed command usually succeeds on retry.
It's not obvious why this is treated as un-retryable. 

Are there any simple tests that might help decide what's wrong?
It's likely that re-running buildworld will reproduce the crash.
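
One possible test that avoids a full buildworld is to re-read just the
blocks named in the error. Under the standard READ(10) layout the CDB
above asks for 0x40 blocks starting at LBA 0x4329d640. What follows is a
rough sketch, not a vetted procedure: the helper name reread_blocks is
hypothetical, the 512-byte sector size is an assumption, and reading
/dev/da0 needs root.

```python
def reread_blocks(path, lba, count, sector_size=512):
    """Read `count` sectors starting at `lba`; return the bytes read.

    sector_size=512 is an assumption; check the real value for the disk.
    A persistent medium error should surface as OSError with errno 5
    (EIO), matching the "Error 5" in the console message.
    """
    with open(path, "rb", buffering=0) as dev:
        dev.seek(lba * sector_size)
        data = dev.read(count * sector_size)
    return len(data)


# Hypothetical usage, as root, against the disk from the error message:
#   reread_blocks("/dev/da0", 0x4329D640, 0x40)
```

A clean read here would suggest the error is intermittent; a repeatable
OSError would suggest a genuinely unreadable region.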

I've placed the results of smartctl -a at the end of the notes. 
The interpretation isn't self-evident; hopefully someone else
can lend an eye. I'll try smartctl -t after a good night's sleep. 

Thanks for reading!

bob prohaska