Re: Error detection for microSD-based swap, buildworld failures on pi3

From: MJ <mafsys1234_at_gmail.com>
Date: Wed, 02 Feb 2022 00:47:48 UTC

On 2/02/2022 3:18 am, bob prohaska wrote:
> [new subject, different emphasis, old problem]
> 
> On Mon, Jan 31, 2022 at 03:06:01PM -0800, Mark Millard wrote:
>>
>> One thing that could fit the behavior is if small part(s)
>> of the system c++ compiler (or libraries it uses) were
>> corrupted on that specific media. In that case, nothing
>> elsewhere would replicate the failures but a lot might
>> work without using the corrupted part(s), making the
>> failures not random.
> 
> [spaced for emphasis]
> 
>> Checking on that is part of why
>> I'd hoped to get a lldb report for a .sh/.cpp pair
>> leading to failure on your RPi3* in question.
>>
> 
> If/when the stable/13 Pi3 finishes its -j1 single-user
> build/install cycle I'll make a point of trying the
> .sh/.cpp test under lldb.
> 
> For most of their operational history both troublesome Pi3
> systems have had some of their swap on microSD. If there
> is no error detection at all for microSD-based storage

Is this true? I would have thought a microSD card used some form of error detection, either
in its firmware or in the controller.

> then undetected corruption of data from swap is a real
> possibility. I expected that storage errors would be
> reported but maybe not, especially outside file systems.

If your suppositions are indeed correct, would a swap file be more prudent, since reads and
writes to swap would then have to go through the file system (UFS/VFS)?

> 
> Mechanical disks have some internal error detection and
> report explicitly when data can't be retrieved. As I think
> back on it at least one flash device (a USB thumb drive)
> failed silently, no reported errors but also no-write.
> That was on a filesystem, so the OS noticed and so did I.

But this could "simply" be because one of the NAND blocks had failed, not because the device
could not detect an error. Does the driver handling USB thumb drives lack error detection, or
fail to report errors back to the kernel? I do not know.
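
One rough way to check whether a particular device drops or corrupts writes without
reporting anything is a write-then-verify pass over a scratch file. This is only a minimal
sketch, assuming Python 3 is available on the machine. The function names write_pattern and
verify_pattern are made up for illustration and are not part of any FreeBSD tool. It tests a
file on a mounted file system, not a raw swap partition.

```python
import hashlib
import os


def write_pattern(path, blocks=64, block_size=4096, seed=b"swap-test"):
    """Write deterministic pseudo-random blocks; return their SHA-256 digests."""
    digests = []
    with open(path, "wb") as f:
        for i in range(blocks):
            # Derive each block from the seed and block index so the data is
            # reproducible and no two blocks are alike.
            block = hashlib.shake_128(seed + i.to_bytes(8, "big")).digest(block_size)
            f.write(block)
            digests.append(hashlib.sha256(block).hexdigest())
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the data out to the device
    return digests


def verify_pattern(path, digests, block_size=4096):
    """Return the indices of blocks whose contents no longer match their digest."""
    bad = []
    with open(path, "rb") as f:
        for i, expected in enumerate(digests):
            block = f.read(block_size)
            if hashlib.sha256(block).hexdigest() != expected:
                bad.append(i)
    return bad
```

Call write_pattern() on a file that lives on the suspect media, keep the returned digests,
and later pass them to verify_pattern(). An empty list means every block read back intact.
A read done straight after the write may be served from the buffer cache rather than the
card, so for a meaningful test the file system should be unmounted and remounted (or the
machine rebooted) between the two steps, with the digests saved somewhere else meanwhile.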

> 
> Is there any error detection/correction employed by the
> virtual memory system as it reads and writes mass storage?

You would think there should be.

> 
> Thanks for reading!
> 
> 
MJ