Re: llvm19 lld issue
- In reply to: Dimitry Andric : "Re: llvm19 lld issue"
Date: Fri, 15 Nov 2024 08:40:44 UTC
On 14.11.2024 22:01, Dimitry Andric wrote:
> On 14 Nov 2024, at 13:44, Michal Meloun <mmel@FreeBSD.org> wrote:
>>
>> While searching for the cause of armv7 kernel corruption after updating to llvm19 lld, I came across an interesting problem.
>>
>> - The linker script does not list all generated sections. Specifically, the data sections created by the linker set are not listed.
>>
>> - The linker can place these orphaned sections in any location (OK, with some restrictions). See https://maskray.me/blog/2024-06-02-understanding-orphan-sections.
>>
>> - Creating symbols outside a section is fragile and subject to error; the linker may place an orphaned section between the symbol definition and the following section.
>>
>> We ran into this problem many years ago, see https://github.com/freebsd/freebsd-src/commit/6e764e36da019837d90e3b4b712871ee4442637a. Unfortunately, we didn't fix it completely then, and we have to address the same corruption again.
>>
>> I think we should be strict in this area and use '--orphan-handling=error' for kernel linking. However, I'm not sure we can handle linker sets gracefully.
>>
>> Any comments, contrary opinion or better solution ? Does anyone know how to properly list all linker sets (mainly but not only 'set_<foo>_set') in linker script and which section is appropriate for them ? .rodata?
>
> I tried adding --orphan-handling=error, and on buildkernel (even for amd64) I get pretty soon:
>
> --- all_subdir_accf_data ---
> ld: error: accf_data.o:(.data) is being placed in '.data'
> ld: error: accf_data.o:(set_modmetadata_set) is being placed in 'set_modmetadata_set'
> ld: error: accf_data.o:(set_sysinit_set) is being placed in 'set_sysinit_set'
> ld: error: accf_data.o:(.debug_loc) is being placed in '.debug_loc'
> ld: error: accf_data.o:(.debug_abbrev) is being placed in '.debug_abbrev'
> ld: error: accf_data.o:(.debug_info) is being placed in '.debug_info'
> ld: error: accf_data.o:(.debug_ranges) is being placed in '.debug_ranges'
> ld: error: accf_data.o:(.debug_str) is being placed in '.debug_str'
> ld: error: accf_data.o:(.comment) is being placed in '.comment'
> ld: error: accf_data.o:(.debug_frame) is being placed in '.debug_frame'
> ld: error: accf_data.o:(.debug_line) is being placed in '.debug_line'
> ld: error: accf_data.o:(.llvm_addrsig) is being placed in '.llvm_addrsig'
> ld: error: accf_data.o:(.SUNW_ctf) is being placed in '.SUNW_ctf'
> ld: error: <internal>:(.note.gnu.build-id) is being placed in '.note.gnu.build-id'
> ld: error: <internal>:(.note.GNU-stack) is being placed in '.note.GNU-stack'
> ld: error: <internal>:(.symtab) is being placed in '.symtab'
> ld: error: <internal>:(.shstrtab) is being placed in '.shstrtab'
> ld: error: <internal>:(.strtab) is being placed in '.strtab'
> --- all_subdir_aic7xxx ---
> --- all_subdir_aic7xxx/ahc ---
> --- machine ---
> machine -> /home/dim/src/freebsd/src/sys/amd64/include
> --- all_subdir_accf_data ---
> *** [accf_data.ko.full] Error code 1
>
> Not sure if those are all really orphaned, though?
>
> -Dimitry

Most of them are not orphaned, and I think they should be explicitly placed. Annoying as it is, we should probably keep a list of the sections used in the kernel (one list is sufficient for all architectures) and include it in the ldscript for each architecture (it's about 24 lines now).

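To make the shared-list idea concrete, here is a rough sketch of what such a common fragment could look like. The file name kernel-sections.ldscript is hypothetical, the section list is only a subset taken from the error output quoted above, and I am assuming the linker accepts INCLUDE inside a SECTIONS block (GNU ld documents this; lld behaviour should be verified before relying on it).

```
/*
 * kernel-sections.ldscript (hypothetical name): non-allocated sections
 * shared by all architectures.  Meant to be pulled in from inside the
 * SECTIONS { } block of each per-architecture ldscript with:
 *
 *     INCLUDE kernel-sections.ldscript
 */
.comment       0 : { *(.comment) }
.debug_abbrev  0 : { *(.debug_abbrev) }
.debug_info    0 : { *(.debug_info) }
.debug_line    0 : { *(.debug_line) }
.debug_loc     0 : { *(.debug_loc) }
.debug_ranges  0 : { *(.debug_ranges) }
.debug_str     0 : { *(.debug_str) }
.debug_frame   0 : { *(.debug_frame) }
.SUNW_ctf      0 : { *(.SUNW_ctf) }

/* Linker-generated tables also count as orphans unless they are named. */
.symtab        0 : { *(.symtab) }
.strtab        0 : { *(.strtab) }
.shstrtab      0 : { *(.shstrtab) }
```

Each entry simply names an output section with the same name as its input section, so the layout does not change; the only effect is that '--orphan-handling=error' no longer complains about these sections.
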
After discussion with jrtc27 (thanks a lot for your patience), I think we have only three options besides explicitly listing all kernel sections:

1) Leave the ldscripts as they are, but prefix each <foo>_start symbol with a guard, i.e. an explicit assignment to the location counter ('. = .' or ALIGN()).
2) Move all <foo>_start/end symbols that are currently defined outside a section into the appropriate sections.
3) Add the linker option '--orphan-handling=error' and declare or discard all compiler-generated sections.

I definitely don't like option 1. It's too fragile and depends on poorly defined linker behavior. Option 2 is easy and robust, and together with the explicit placement of all kernel sections it seems sufficient. In my opinion, we can combine options 2 and 3 to get the most robust solution.

Another problem is that an explicit list of kernel sections could make out-of-tree modules interact badly with the linker script.

What is your preference?

Michal
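
As an illustration only, combining options 2 and 3 might look roughly like the fragment below. The names .foo, foo_start and foo_end are placeholders, not symbols from our tree, and the position chosen for the two linker sets (among the data sections) is an assumption; whether they belong next to .rodata instead is still the open question from the start of this thread.

```
SECTIONS
{
	.text : { *(.text .text.*) }

	/*
	 * Fragile form (what option 2 removes): the symbol is assigned
	 * between output sections, so an orphan section may be placed
	 * between the symbol and the section it is meant to mark.
	 *
	 *     foo_start = .;
	 *     .foo : { *(.foo) }
	 *     foo_end = .;
	 */

	/* Robust form: both symbols are defined inside the output section. */
	.foo : {
		foo_start = .;
		*(.foo)
		foo_end = .;
	}

	/* Linker sets listed explicitly so they are no longer orphans. */
	set_sysinit_set     : { *(set_sysinit_set) }
	set_modmetadata_set : { *(set_modmetadata_set) }

	.data : { *(.data .data.*) }
}
```

With a script along these lines, passing '--orphan-handling=error' to the linker turns any section that is still unlisted into a link-time error instead of letting the linker choose a position for it.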