Re: Troubles building world on stable/13
- Reply: Mark Millard : "Re: Troubles building world on stable/13"
- In reply to: bob prohaska : "Re: Troubles building world on stable/13"
Date: Thu, 03 Feb 2022 08:05:42 UTC
On 2022-Feb-2, at 17:51, bob prohaska <fbsd@www.zefox.net> wrote:

> On Wed, Feb 02, 2022 at 04:18:07PM -0800, Mark Millard wrote:
>> On 2022-Feb-2, at 14:32, bob prohaska <fbsd@www.zefox.net> wrote:
>>
>>> The latest Pi3 single-user -j1 buildworld stopped with clang error 139.
>>> Running the .sh and .cpp files produced an error. The .sh was
>>> re-run under lldb and backtraced.
>>
>> I'll see if I can find any interesting information based on
>> the "fault address: 0x5" and source code related to the
>> backtrace reporting.

Between the source code and the assembler, I've yet to see how the
value 0x5 ended up as the address used.

>>> The output is at
>>> http://www.zefox.net/~fbsd/rpi3/clang_trouble/20220202/lldb_session
>>> along with the buildworld log and related files in the same directory.
>>
>> Your 20220202/readme reports:
>>
>> QUOTE
>> The first attempt to run lldb-gtest-all-fe760c.sh finished with exit
>> code zero. A subsequent try produced the expected output reported
>> here in lldb_session.
>> END QUOTE
>>
>> (You had also reported (off list) a recent prior failure where the
>> .sh/.cpp pair did not repeat the failure when you tried it.)
>>
>> Well:
>>
>> http://www.zefox.net/~fbsd/rpi3/clang_trouble/20220202/gtest-all-fe760c.o
>>
>> seems to be a possibly valid compile result to go along with the
>> run (under lldb) that finished normally (exit code zero).
>>
>> I can not see the dates/times on the files over the web interface.
>> Can you report the output of something like a
>>
>> # ls -Tla
>
> Done and added to the web directory as file_dates:
> root@pelorus:/usr/src # ls -Tla *76*
> -rw-r--r--  1 root  wheel        0 Feb  2 13:52:42 2022 gtest-all-fe760c-d2733764.o.tmp
> -rw-r--r--  1 root  wheel  7339473 Feb  2 13:18:34 2022 gtest-all-fe760c.cpp
> -rw-r--r--  1 root  wheel  5246192 Feb  2 13:48:41 2022 gtest-all-fe760c.o
> -rw-r--r--  1 root  wheel     4527 Feb  2 13:18:34 2022 gtest-all-fe760c.sh
> -rw-r--r--  1 root  wheel     1448 Feb  2 13:35:26 2022 lldb-gtest-all-fe760c.sh

Thanks. That ordering fits a successful gtest-all-fe760c.o being
generated by the run in which lldb took the compile to completion with
a zero exit status, followed later by the failing run. The
gtest-all-fe760c.o content is probably good.

>> We will see if I notice anything interesting looking at
>> source code related to the frames of the backtrace for
>> the lldb based failure reporting.

One of the odd things is that the bt command should have reported a lot
more frames than just #0..#5. It should have gone all the way back to
main, __start, and rtld_start. Another oddity that I've no clue about
at this point.
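(For reference, lldb can be asked explicitly for deeper or per-thread
backtraces. A minimal sketch, assuming the same interactive lldb session
that produced the #0..#5 output; the frame number and count shown here
are only illustrative:

    (lldb) bt all                        # backtrace for every thread, not just the selected one
    (lldb) thread backtrace --count 200  # request up to 200 frames for the current thread
    (lldb) frame select 5                # select the last frame that was reported
    (lldb) frame info                    # show its module, function, and pc

If the unwinder still stops at frame #5 after that, it could hint at
missing unwind information or a clobbered frame, but that is not
established here.)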
>> The variable results for the (lldb) .sh/.cpp runs for the
>> same file pair suggest possibilities like race conditions,
>> use of uninitialized memory, use of deallocated-and-reused
>> memory (now in use for something else), or flaky hardware.
>>
>> (That the failure only sometimes happened with the .sh/.cpp
>> pair means that no processing of include files was involved:
>> the .cpp of the pair is self contained.)
>>
>> I'll note that the buildworld.log 's:
>>
>> 1. /usr/obj/usr/src/arm64.aarch64/tmp/usr/include/private/gtest/internal/gtest-type-util.h:899:21: current parser token '{'
>> 2. /usr/obj/usr/src/arm64.aarch64/tmp/usr/include/private/gtest/internal/gtest-type-util.h:58:1: parsing namespace 'testing'
>>
>> are the exact same places in the original source code
>> as were reported in other such logs for failures while
>> processing gtest-all.cc .
>>
>>> Not sure what to try next. It is possible to build kernel-toolchain
>>> and new kernels,
>>
>> kernel-toolchain builds a subset of the toolchain that
>> buildworld builds. I'm unsure if the buildworld
>> completed building what kernel-toolchain builds or not.
>>
>>> if that might be useful.
>>
>> For now, I need to explore source code.
>>
> Ok, I'll refrain from idle tampering.

Without being able to replicate the problem and explore the context, I
may well not get anywhere useful. And nothing about this suggests any
sort of workaround beyond building in a context that is not having the
issue and then installing from there to the media that would be used on
the RPi3*.

>>> At present the machine reports
>>> bob@pelorus:/usr/src % uname -apKU
>>> FreeBSD pelorus.zefox.org 13.0-STABLE FreeBSD 13.0-STABLE #0 stable/13-n249120-dee0854a009: Sat Jan 22 23:32:23 PST 2022     bob@pelorus.zefox.org:/usr/obj/usr/src/arm64.aarch64/sys/GENERIC arm64 aarch64 1300525 1300523
>>
>> Thanks for the reference to stable/13's dee0854a009.
>>
>> (This was still the "boot -s" single user context for
>> the testing. This note is for anyone reading along.)
>>
>> FYI: the variable result makes the corrupted-block
>> hypothesis less likely. You have seen the failure via
>> the original files and via the .cpp of the .sh/.cpp
>> pair. You appear to have had a successful build
>> with the .cpp pair --and an unsuccessful one. Also,
>> no stage under lldb reported illegal instructions or
>> the like.

===
Mark Millard
marklmi at yahoo.com