Re: port binary dumping core on recent head in poudriere

From: Mark Millard <marklmi_at_yahoo.com>
Date: Thu, 21 Nov 2024 15:04:58 UTC
[Just resending, including the original sender's listing of
freebsd-ports.]

On Nov 21, 2024, at 02:22, Mark Millard <marklmi@yahoo.com> wrote:

> I've noticed that recently some ports are dumping core during builds of 
> dependencies in head in poudriere.
> 
> I'm seeing this for example with sassc crashing while trying to build 
> x11-themes/greybird-theme.
> 
> My first suspect was the llvm upgrade in head, but forcing sassc and 
> libsass to build with older clang via USES=llvm:max=18 is not helping.
> 
> I did recompile the offending programs with debug and tried a backtrace 
> and got this:
> 
> ```
> (lldb) bt
> * thread #1, name = 'sassc', stop reason = signal SIGSEGV: invalid 
> permissions for mapped object (fault address: 0x82374a000)
> * frame #0: 0x000000082374a000 libsass.so.1
> frame #1: 0x0000000823865a86 libsass.so.1`_GLOBAL__sub_I_ast.cpp 
> [inlined] double std::__1::__math::acos[abi:se190102]<int, 0>(__x=-1) at 
> inverse_trigonometric_functions.h:40:10
> frame #2: 0x0000000823865a81 libsass.so.1`_GLOBAL__sub_I_ast.cpp 
> [inlined] __cxx_global_var_init at units.hpp:11:21
> frame #3: 0x0000000823865a81 libsass.so.1`_GLOBAL__sub_I_ast.cpp at 
> ast.cpp:0
> frame #4: 0x00001eac6e3f078d ld-elf.so.1
> frame #5: 0x00001eac6e3ef349 ld-elf.so.1
> frame #6: 0x00001eac6e3ec099 ld-elf.so.1`___lldb_unnamed_symbol27 + 25
> ```
> 
> which points me to this upstream line of code: 
> https://github.com/sass/libsass/blob/7037f03fabeb2b18b5efa84403f5a6d7a990f460/src/units.hpp#L11

The gdb sessions below verify that the acos@got.plt in libsass.so.1
ends up holding the bad value that is the failure address, and that
reloc_plt is what stores that bad value into acos@got.plt .

Some of this may be because there are 2 <acos@plt>'s in the
code (from an "info b" in gdb after I'd "b acos@plt"):

2.1                         y   0x00000008004f3da0 <acos@plt>
2.2                         y   0x000000080066e690 <acos@plt>

(gdb) disass 0x00000008004f3da0
Dump of assembler code for function acos@plt:
  0x00000008004f3da0 <+0>: jmp    *0x14362(%rip)        # 0x800508108 <acos@got.plt>
  0x00000008004f3da6 <+6>: push   $0x72
  0x00000008004f3dab <+11>: jmp    0x8004f3670
(gdb) disass 0x000000080066e690
Dump of assembler code for function acos@plt:
  0x000000080066e690 <+0>: jmp    *0x2ada(%rip)        # 0x800671170 <acos@got.plt>
  0x000000080066e696 <+6>: push   $0xb
  0x000000080066e69b <+11>: jmp    0x80066e5d0
End of assembler dump.
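
As a cross-check on the two disassemblies: on amd64 a RIP-relative
operand is relative to the address of the next instruction, which
is the <+6> line of each stub. The following is only a sketch of
that arithmetic, using the addresses and displacements shown above;
the function name is made up for illustration.

```python
def rip_relative_target(next_insn_addr: int, disp: int) -> int:
    """Address named by a RIP-relative operand.

    RIP already points at the instruction after the jmp when the
    displacement is applied, so the target is simply the sum.
    """
    return next_insn_addr + disp


# (address of the <+6> instruction, displacement of the jmp) from the dumps
first_slot = rip_relative_target(0x8004F3DA6, 0x14362)
second_slot = rip_relative_target(0x80066E696, 0x2ADA)

print(hex(first_slot))   # 0x800508108
print(hex(second_slot))  # 0x800671170
```

Both results match the <acos@got.plt> addresses that gdb annotated
on the jmp lines.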

There are also 2 separate <acos@got.plt>'s in the code:

(gdb) x/gx 0x800508108
0x800508108 <acos@got.plt>: 0x0000000800249000
(gdb) x/gx 0x800671170
0x800671170 <acos@got.plt>: 0x000000080066e696

One of these is junk and the other is not:

(gdb) disass 0x0000000800249000
No function contains specified address.

NOTE: At this stage the value in the first acos@got.plt should
have been based on the <+6> push in the first acos@plt:
0x00000008004f3da6

(gdb) disass 0x000000080066e696
Dump of assembler code for function acos@plt:
  0x000000080066e690 <+0>: jmp    *0x2ada(%rip)        # 0x800671170 <acos@got.plt>
  0x000000080066e696 <+6>: push   $0xb
  0x000000080066e69b <+11>: jmp    0x80066e5d0
End of assembler dump.
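
The expectation in the NOTE above can be written as a small check:
before a lazy PLT slot is bound, it should point back at the push
in its own stub (stub address + 6 in these dumps). This is a sketch
under that assumption, with illustrative names; the library labels
come from the gdb output further down.

```python
PUSH_OFFSET = 6  # the push follows the 6-byte indirect jmp in both stubs


def slot_points_at_own_stub(plt_entry: int, slot_value: int) -> bool:
    """True if an unbound GOT slot holds the address of its stub's push."""
    return slot_value == plt_entry + PUSH_OFFSET


# label: (acos@plt address, value read from the matching acos@got.plt)
observed = {
    "libsass.so.1": (0x8004F3DA0, 0x800249000),
    "libm.so.5": (0x80066E690, 0x80066E696),
}

for label, (plt_entry, slot_value) in observed.items():
    if slot_points_at_own_stub(plt_entry, slot_value):
        print(f"{label}: slot is consistent with its stub")
    else:
        expected = plt_entry + PUSH_OFFSET
        print(f"{label}: slot holds {slot_value:#x}, expected {expected:#x}")
```

Only the libm.so.5 slot passes; the libsass.so.1 one reports
0x8004f3da6 as the expected value.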


I built textproc/sassc in poudriere in my personal environment.
Running it fails as described.


First, just look at the failing address and where it fits in the
"info files" output:

(gdb) run
Starting program: /usr/local/bin/sassc  
Program received signal SIGSEGV, Segmentation fault.
Invalid permissions for mapped object.
0x0000000800249000 in ?? ()
(gdb) info files
Symbols from "/usr/local/bin/sassc".
Native process:
Using the running image of child process 80969.
While running this, GDB does not access memory from...
Local exec file:
`/usr/local/bin/sassc', file type elf64-x86-64-freebsd.
Entry point: 0x203440
. . .
0x0000000800225230 - 0x0000000800226058 is .bss in /libexec/ld-elf.so.1
0x00007ffffffff0e8 - 0x00007ffffffff100 is .hash in system-supplied DSO at 0x7ffffffff000
0x00007ffffffff100 - 0x00007ffffffff130 is .dynsym in system-supplied DSO at 0x7ffffffff000
0x00007ffffffff130 - 0x00007ffffffff157 is .dynstr in system-supplied DSO at 0x7ffffffff000
0x00007ffffffff158 - 0x00007ffffffff15c is .gnu.version in system-supplied DSO at 0x7ffffffff000
0x00007ffffffff15c - 0x00007ffffffff194 is .gnu.version_d in system-supplied DSO at 0x7ffffffff000
0x00007ffffffff194 - 0x00007ffffffff1a8 is .eh_frame_hdr in system-supplied DSO at 0x7ffffffff000
0x00007ffffffff1a8 - 0x00007ffffffff214 is .eh_frame in system-supplied DSO at 0x7ffffffff000
0x00007ffffffff218 - 0x00007ffffffff2c8 is .dynamic in system-supplied DSO at 0x7ffffffff000
0x00007ffffffff2d0 - 0x00007ffffffff2e6 is .text in system-supplied DSO at 0x7ffffffff000
0x0000000800249270 - 0x0000000800249288 is .note.tag in /usr/local/lib/libsass.so.1
. . .
0x0000000800630310 - 0x00000008006343f8 is .bss in /lib/libcxxrt.so.1
0x00000008006352a8 - 0x00000008006352c0 is .note.tag in /lib/libm.so.5
. . .
0x000000080066e5d0 - 0x000000080066ee30 is .plt in /lib/libm.so.5


NOTE where 0x0000000800249000 fits in the above: outside any range.
(And is referencing an implementation in /lib/libm.so.5 instead
of /usr/local/lib/libsass.so.1 .)

NOTE where 0x000000080066e696 fits in the above: inside a /lib/libm.so.5 range.
(And is referencing the implementation also in /lib/libm.so.5 .)
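
The same lookup can be done mechanically. The table below holds
only the rows visible in the elided listing above (minus the
system-supplied DSO ones), so a "no match" from it is weaker
evidence than the full "info files" output I was looking at. The
names are made up for the sketch.

```python
# (start, end, section, object) copied from the excerpt above; the listing
# is elided with ". . .", so this table is incomplete by construction.
RANGES = [
    (0x800225230, 0x800226058, ".bss", "/libexec/ld-elf.so.1"),
    (0x800249270, 0x800249288, ".note.tag", "/usr/local/lib/libsass.so.1"),
    (0x800630310, 0x8006343F8, ".bss", "/lib/libcxxrt.so.1"),
    (0x8006352A8, 0x8006352C0, ".note.tag", "/lib/libm.so.5"),
    (0x80066E5D0, 0x80066EE30, ".plt", "/lib/libm.so.5"),
]


def find_section(addr: int):
    """Return (section, object) for the first range containing addr, else None."""
    for start, end, section, obj in RANGES:
        if start <= addr < end:
            return section, obj
    return None


print(find_section(0x800249000))  # None
print(find_section(0x80066E696))  # ('.plt', '/lib/libm.so.5')
```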

I'll note c++/v1/__math/inverse_trigonometric_functions.h having:

template <class _A1, __enable_if_t<is_integral<_A1>::value, int> = 0>
inline _LIBCPP_HIDE_FROM_ABI double acos(_A1 __x) _NOEXCEPT {
 return __builtin_acos((double)__x);
}

This is in use in multiple places but each should end up using
the implementation in /lib/libm.so.5 .



Then rerunning, to track how each acos@got.plt gets set and used,
plus a little other context:

(gdb) set radix 16
Input and output radices now set to decimal 16, hex 10, octal 20.
(gdb) b __cxx_global_var_init
Breakpoint 1 at 0x800364a81: __cxx_global_var_init. (50 locations)
(gdb) b acos@plt
Breakpoint 2 at 0x8004f3da0 (2 locations)
(gdb) watch -l *(unsigned long*)0x800508108
Hardware watchpoint 3: -location *(unsigned long*)0x800508108
(gdb) watch -l *(unsigned long*)0x800671170
Hardware watchpoint 4: -location *(unsigned long*)0x800671170
(gdb) run
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /usr/local/bin/sassc  
Hardware watchpoint 3: -location *(unsigned long*)0x800508108

Old value = 0x0
New value = 0x800249000
reloc_plt (obj=obj@entry=0x80022a808, flags=flags@entry=0x4, lockstate=lockstate@entry=0x0) at /usr/main-src/libexec/rtld-elf/amd64/reloc.c:343
343   break;

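NOTE: The old value being 0x0 is why things are odd: reloc_plt only
adds obj->relocbase to whatever the slot already holds, so a slot
that starts at 0x0 ends up holding just the base address.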
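
A sketch of that arithmetic, modelling only the R_X86_64_JMP_SLOT
line shown in the "list" output (*where += obj->relocbase). The
inputs are values gdb prints in this session; needed_old is derived
here and was never observed, and the names are illustrative.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def relocate_jmp_slot(old_value: int, relocbase: int) -> int:
    """Model of `*where += (Elf_Addr)obj->relocbase` with 64-bit wraparound."""
    return (old_value + relocbase) & MASK64


LIBSASS_BASE = 0x800249000  # obj->relocbase for the first watchpoint hit
LIBSASS_PUSH = 0x8004F3DA6  # acos@plt+6 in the first stub
LIBM_BASE = 0x800635000     # obj->relocbase for the second watchpoint hit

# What was observed: an old value of 0x0 yields the bare load base.
print(hex(relocate_jmp_slot(0x0, LIBSASS_BASE)))         # 0x800249000

# Derived, not observed: the old value that would have produced acos@plt+6.
needed_old = LIBSASS_PUSH - LIBSASS_BASE
print(hex(needed_old))                                   # 0x2aada6
print(hex(relocate_jmp_slot(needed_old, LIBSASS_BASE)))  # 0x8004f3da6

# The libm slot for comparison: a nonzero old value gives a sane result.
print(hex(relocate_jmp_slot(0x39696, LIBM_BASE)))        # 0x80066e696
```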

(gdb) bt
#0  reloc_plt (obj=obj@entry=0x80022a808, flags=flags@entry=0x4, lockstate=lockstate@entry=0x0) at /usr/main-src/libexec/rtld-elf/amd64/reloc.c:343
#1  0x0000000800217763 in relocate_object (obj=obj@entry=0x80022a808, bind_now=0x0, rtldobj=rtldobj@entry=0x800225250 <obj_rtld>, flags=flags@entry=0x4, lockstate=lockstate@entry=0x0)
   at /usr/main-src/libexec/rtld-elf/rtld.c:3331
#2  0x000000080020fc38 in relocate_objects (first=<optimized out>, bind_now=<optimized out>, flags=0x4, lockstate=0x0, rtldobj=<optimized out>) at /usr/main-src/libexec/rtld-elf/rtld.c:3369
#3  _rtld (sp=<optimized out>, exit_proc=0x7fffffffea50, objp=0x7fffffffea58) at /usr/main-src/libexec/rtld-elf/rtld.c:903
#4  0x000000080020cdf9 in rtld_start () at /usr/main-src/libexec/rtld-elf/amd64/rtld_start.S:40
(gdb) list
338 switch(ELF_R_TYPE(rela->r_info)) {
339 case R_X86_64_JMP_SLOT:
340   /* Relocate the GOT slot pointing into the PLT. */
341   where = (Elf_Addr *)(obj->relocbase + rela->r_offset);
342   *where += (Elf_Addr)obj->relocbase;
343   break;
344
345 case R_X86_64_IRELATIVE:
346   obj->irelative = true;
347   break;
(gdb) print obj->relocbase
$1 = (caddr_t) 0x800249000 "\177ELF\002\001\001\t"
(gdb) print rela->r_offset
$2 = 0x2bf108
(gdb) print where
$3 = (Elf_Addr *) 0x800508108 <acos@got[plt]>
(gdb) print *where
$4 = 0x800249000
(gdb) c
Continuing.

Hardware watchpoint 4: -location *(unsigned long*)0x800671170

Old value = 0x39696
New value = 0x80066e696
reloc_plt (obj=obj@entry=0x80022e408, flags=flags@entry=0x4, lockstate=lockstate@entry=0x0) at /usr/main-src/libexec/rtld-elf/amd64/reloc.c:343
343   break;
(gdb) bt
#0  reloc_plt (obj=obj@entry=0x80022e408, flags=flags@entry=0x4, lockstate=lockstate@entry=0x0) at /usr/main-src/libexec/rtld-elf/amd64/reloc.c:343
#1  0x0000000800217763 in relocate_object (obj=obj@entry=0x80022e408, bind_now=0x0, rtldobj=rtldobj@entry=0x800225250 <obj_rtld>, flags=flags@entry=0x4, lockstate=lockstate@entry=0x0)
   at /usr/main-src/libexec/rtld-elf/rtld.c:3331
#2  0x000000080020fc38 in relocate_objects (first=<optimized out>, bind_now=<optimized out>, flags=0x4, lockstate=0x0, rtldobj=<optimized out>) at /usr/main-src/libexec/rtld-elf/rtld.c:3369
#3  _rtld (sp=<optimized out>, exit_proc=0x7fffffffea50, objp=0x7fffffffea58) at /usr/main-src/libexec/rtld-elf/rtld.c:903
#4  0x000000080020cdf9 in rtld_start () at /usr/main-src/libexec/rtld-elf/amd64/rtld_start.S:40
(gdb) list
338 switch(ELF_R_TYPE(rela->r_info)) {
339 case R_X86_64_JMP_SLOT:
340   /* Relocate the GOT slot pointing into the PLT. */
341   where = (Elf_Addr *)(obj->relocbase + rela->r_offset);
342   *where += (Elf_Addr)obj->relocbase;
343   break;
344
345 case R_X86_64_IRELATIVE:
346   obj->irelative = true;
347   break;
(gdb) print obj->relocbase
$5 = (caddr_t) 0x800635000 "\177ELF\002\001\001\t"
(gdb) print rela->r_offset
$6 = 0x3c170
(gdb) print where
$7 = (Elf_Addr *) 0x800671170 <acos@got[plt]>
(gdb) print *where
$8 = 0x80066e696
(gdb) c
Continuing.

Breakpoint 1.50, 0x00000008005beee4 in __cxx_global_var_init () from /lib/libc++.so.1
(gdb) c
Continuing.

Breakpoint 1.1, __cxx_global_var_init () at ./units.hpp:11
warning: 11 ./units.hpp: No such file or directory
(gdb) c
Continuing.

Breakpoint 2.1, 0x00000008004f3da0 in acos@plt () from /usr/local/lib/libsass.so.1
(gdb) bt
#0  0x00000008004f3da0 in acos@plt () from /usr/local/lib/libsass.so.1
#1  0x0000000800364a86 in _ZNSt3__16__math4acosB8se190102IiTnNS_9enable_ifIXsr11is_integralIT_EE5valueEiE4typeELi0EEEdS3_ (__x=0xffffffff)
   at /usr/include/c++/v1/__math/inverse_trigonometric_functions.h:40
#2  __cxx_global_var_init () at ./units.hpp:11
#3  0x0000000800364a86 in _GLOBAL__sub_I_ast.cpp () from /usr/local/lib/libsass.so.1
#4  0x00000008002114ed in objlist_call_init (list=list@entry=0x7fffffffe9e0, lockstate=lockstate@entry=0x7fffffffe7d0) at /usr/main-src/libexec/rtld-elf/rtld.c:3128
#5  0x00000008002100a9 in _rtld (sp=<optimized out>, exit_proc=0x7fffffffea50, objp=0x7fffffffea58) at /usr/main-src/libexec/rtld-elf/rtld.c:974
#6  0x000000080020cdf9 in rtld_start () at /usr/main-src/libexec/rtld-elf/amd64/rtld_start.S:40
(gdb) disass
Dump of assembler code for function acos@plt:
=> 0x00000008004f3da0 <+0>: jmp    *0x14362(%rip)        # 0x800508108 <acos@got.plt>
  0x00000008004f3da6 <+6>: push   $0x72
  0x00000008004f3dab <+11>: jmp    0x8004f3670
End of assembler dump.
(gdb) stepi
0x0000000800249000 in ?? ()
(gdb) bt
#0  0x0000000800249000 in ?? ()
#1  0x0000000800364a86 in _ZNSt3__16__math4acosB8se190102IiTnNS_9enable_ifIXsr11is_integralIT_EE5valueEiE4typeELi0EEEdS3_ (__x=0xffffffff)
   at /usr/include/c++/v1/__math/inverse_trigonometric_functions.h:40
#2  __cxx_global_var_init () at ./units.hpp:11
#3  0x0000000800364a86 in _GLOBAL__sub_I_ast.cpp () from /usr/local/lib/libsass.so.1
#4  0x00000008002114ed in objlist_call_init (list=list@entry=0x7fffffffe9e0, lockstate=lockstate@entry=0x7fffffffe7d0) at /usr/main-src/libexec/rtld-elf/rtld.c:3128
#5  0x00000008002100a9 in _rtld (sp=<optimized out>, exit_proc=0x7fffffffea50, objp=0x7fffffffea58) at /usr/main-src/libexec/rtld-elf/rtld.c:974
#6  0x000000080020cdf9 in rtld_start () at /usr/main-src/libexec/rtld-elf/amd64/rtld_start.S:40
(gdb) stepi

Program received signal SIGSEGV, Segmentation fault.
Invalid permissions for mapped object.
0x0000000800249000 in ?? ()




===
Mark Millard
marklmi at yahoo.com


===
Mark Millard
marklmi at yahoo.com