Re: 15-aarch64-RPI-snap
- In reply to: Mark Millard : "Re: 15-aarch64-RPI-snap"
Date: Sun, 29 Oct 2023 01:25:18 UTC
On Oct 28, 2023, at 09:40, Mark Millard <marklmi@yahoo.com> wrote: > On Oct 27, 2023, at 23:00, Mark Millard <marklmi@yahoo.com> wrote: > >> On Oct 27, 2023, at 22:24, Mark Millard <marklmi@yahoo.com> wrote: >> >>> On Oct 27, 2023, at 21:34, Glen Barber <gjb@FreeBSD.org> wrote: >>> >>>>>> . . . >>>>>> ^ >>>>>> ./offset.inc:16:19: error: null character ignored [-Werror,-Wnull-character] >>>>>> <U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+00 >>>>>> 00><U+0000>#undef _SA >>>>>> ^ >>> >>> Are the above from a ZFS file system? UFS? Something else? >>> >>> Back in 2021-Nov (15..21) I had problems where ZFS was leading >>> to blocks of such on aarch64, not specifically RPi*'s, various >>> files but not the same ones from test to test. When I updated >>> past some zfs updates on the 23rd the problem stopped. >>> >>> I also have notes from 2022-Mar (19..22) about replicating >>> another example problem someone was having with files ending >>> up with such blocks of bytes but the testing was on the >>> ThreadRipper 1950X. 
>>> (The replication showed that ccache did
>>> not need to be involved since I've never used it.) Again
>>> ZFS was part of the environment that got the replication.
>>> Mark Johnson fixed sys/contrib/openzfs/module/zfs/dnode.c
>>> during this and my ability to replicate the issue then
>>> stopped when I tested the patch.
>>>
>>> Which ever file system it is that holds the bad bytes, some
>>> attempted testing for repeatability of the problem could
>>> be of interest, some of that being on aarch64 but not on
>>> RPi*'s, some of it not on aarch64 at all. But it might take
>>> information about the context to know better what/how to
>>> test. That could include information about both the host and
>>> the jail OS versions if such is involved.
>>
>> Those last notes are likely too generic, in that normally
>> official buildworld buildkernel activity is done on amd64
>> for all target platforms (last I knew). (Not that running
>> such builds on other platforms would be a bad problem-scope
>> isolation test.)
>>
>> Any notes that help delimit what sort of test context
>> would be a reasonable partial replication of the original
>> context could prove useful.
>>
>>> . . .
>
> If the file system is ZFS, I'll note that main [so: 15] already has
> a zpool feature that is not part of openzfs-2.2 and so not part of
> releng/14.0 or stable/14 . So what zpool features are enabled could
> be relevant to problems that only happen in main and might need to
> be involved in efforts to replicate the problem.
>
> But I've not evaluated if redaction_list_spill would be likely to
> possibly be involved for the specific type of file corruptions.

I'll note that the upstream openzfs master commit for the data
corruption issue:

"Zpool can start allocating from metaslab before TRIMs have completed"

was on 2023-Oct-12, so not long ago.
If the official builds use ZFS and TRIM but are based on a system
version that predates FreeBSD picking up that commit, then there
is a known ZFS data corruption issue present in the official
build environment. Since port->package builds are based on a
HOST/JAIL such as:

Host OSVERSION: 1500000
Jail OSVERSION: 1500002

or:

Host OSVERSION: 1500000
Jail OSVERSION: 1400097

but the Host kernel is the one in use (with the Host kernel commit
not identified), it could have such an issue.

(Because of such issues, I wish that Host OSVERSION related commit
identification was also reported for the package builds. Presuming
ZFS use, I also wish that the zpool features enabled were reported
for similar reasons.)

===
Mark Millard
marklmi at yahoo.com
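
For anyone attempting to replicate the problem, one way to check build
outputs for the kind of damage quoted above (blocks of null bytes in
files such as offset.inc) is a small scan for long runs of NULs. The
following is only a sketch, not something from the thread: the function
name find_nul_runs and the 16-byte threshold are arbitrary choices made
for illustration, and a run of NULs is only a hint of corruption, since
binary files legitimately contain them.

```python
import sys


def find_nul_runs(data: bytes, min_len: int = 16):
    """Return (offset, length) for each run of NUL bytes of at least min_len.

    min_len is an illustrative threshold (an assumption), chosen so that
    isolated NULs are not reported.
    """
    runs = []
    start = None  # offset where the current NUL run began, if any
    for i, b in enumerate(data):
        if b == 0:
            if start is None:
                start = i
        elif start is not None:
            if i - start >= min_len:
                runs.append((start, i - start))
            start = None
    # Handle a run that extends to the end of the data.
    if start is not None and len(data) - start >= min_len:
        runs.append((start, len(data) - start))
    return runs


if __name__ == "__main__":
    # Usage: python3 nulscan.py FILE [FILE ...]
    for path in sys.argv[1:]:
        with open(path, "rb") as f:
            for offset, length in find_nul_runs(f.read()):
                print(f"{path}: {length} NUL bytes at offset {offset}")
```

Running it over generated text files (headers, .inc files) after each
build attempt would show whether the affected files and offsets vary
from test to test, which is the kind of repeatability information
asked about above.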