Re: Troubles building world on stable/13 [an experiment-environment that leaves existing things alone]
Date: Sat, 05 Feb 2022 02:54:03 UTC
On 2022-Feb-4, at 18:06, bob prohaska <fbsd@www.zefox.net> wrote:

> On Fri, Feb 04, 2022 at 05:00:05PM -0800, Mark Millard wrote:
>> On 2022-Feb-4, at 16:08, bob prohaska <fbsd@www.zefox.net> wrote:
>>
>>> On Fri, Feb 04, 2022 at 02:44:01PM -0800, Mark Millard wrote:
>>>> On 2022-Feb-4, at 13:44, bob prohaska <fbsd@www.zefox.net> wrote:
>>>>
>>>
>>> It sounds like I simply have a corrupted c++. Perhaps just
>>> set the old version aside and copy from the chroot directory
>>> to /usr/bin ? Granted, other things might be wrong as well.
>>
>> I'm not so sure. My expectation is that if you first
>> do (presuming not already in place at the time):
>>
>> # sysctl kern.elf64.aslr.enable=0
>>
> On checking, that's already the case. I didn't change it
> knowingly, likely it's been zero all along.

So you get the failures even when:

# sysctl kern.elf64.aslr.enable
kern.elf64.aslr.enable: 0

? That is different than in my context. I've never gotten
the failure for the above type of context.

It may be that for stable/13 's kernel the default is 0 .

I did test and one can actually set:

kern.elf64.aslr.enable

from inside a chroot context, at least when one generally
works as root. It changed the system's overall
kern.elf64.aslr.enable status.

>> and then do your buildworld buildkernel it will just work
>> -- using your existing c++ compiler (system clang/clang++).
>>
> Well, that hasn't happened yet. On the theme that if a
> problem won't get better find out what makes it worse,
> I've set it to 1 and am re-running buildworld with -j1.

Okay. That you get the failures even when
kern.elf64.aslr.enable is 0 means that my existing context
for investigation is still problematical.

>>
>> It seems very odd that such a setting would "uncorrupt"
>> your clang/clang++ build (used under the name c++). I'm
>> not aware of the compiler doing anything like the ntpd
>> did, for which having ASLR enabled was a problem.
>>
>> As far as I can tell, the setting changes the detailed
>> behavior of mmap calls (including implicit ones in
>> library code and such).
>>
>> I've not found a way to look at the context just before
>> the failure (without disturbing things enough via debugger
>> activity that the failure does not happen). It is likely
>> that I'll not manage to get such evidence that includes
>> the failure.
>>
>> I worry that the failures seen with your c++ involve a
>> kernel bug but I do not see a way to investigate that.
>
> I share your feeling that something isn't right but am
> utterly ill equipped to posit what that might be. The
> most obvious recent strangeness with outbound network
> traffic not working unless accompanied by an outbound
> ping is most peculiar.
>
>
> Might this be a reason to try Peter Holm's stress2 suite? I
> haven't played with it in a long time, not sure it'll even
> compile now. "Success" in stress2 terms is a kernel panic.

main [so: 14] has:

# ls -Tld /usr/main-src/tools/test/stress2/
drwxr-xr-x  8 root  wheel  33 Apr 28 15:20:54 2021 /usr/main-src/tools/test/stress2/

But I'm not sure if it would be of any help or not.
It may not have tests for causing vm.aslr_restarts
to increment during operation and then seeing what
works vs. what does not.

stable/13 and before do not seem to have stress2/ .

>> Another option might be to use a copy of the
>> compiler from the chroot area to replace the
>> normal system's copies, possibly renaming the
>> old ones first (various names), including
>> dealing with clang.debug as well. This presumes
>> that the 2 stable/13 builds are sufficiently
>> compatible for such a substitution to work.
>
> That sounds worth a try if no better ideas emerge.
>

===
Mark Millard
marklmi at yahoo.com
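
For comparing runs, a minimal sketch of a helper that records the
ASLR-related state before each buildworld attempt might look like the
following. The function names aslr_state and report_aslr are purely
illustrative (they are not part of FreeBSD); the two sysctl names are the
ones discussed in this message, and the sketch assumes that a missing
sysctl should simply be reported as unknown/unavailable.

```shell
#!/bin/sh
# Hypothetical helper: translate a kern.elf64.aslr.enable value into a word.
aslr_state() {
    case "$1" in
        0) echo "disabled" ;;
        1) echo "enabled" ;;
        *) echo "unknown" ;;   # empty or unexpected value
    esac
}

# Hypothetical helper: print the current ASLR-related settings.
# Assumption: if a sysctl is not present on this kernel, its value is
# treated as missing rather than as an error.
report_aslr() {
    val=$(sysctl -n kern.elf64.aslr.enable 2>/dev/null)
    echo "kern.elf64.aslr.enable: $(aslr_state "$val")"
    restarts=$(sysctl -n vm.aslr_restarts 2>/dev/null)
    echo "vm.aslr_restarts: ${restarts:-unavailable}"
}
```

Calling report_aslr just before and just after a build would show whether
the setting was what was expected and whether vm.aslr_restarts moved
during the run.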