Re: git: e85eaa930862 - main - Have rtld query the page size from the kernel

From: John Baldwin <jhb_at_FreeBSD.org>
Date: Wed, 20 Nov 2024 14:50:37 UTC
On 11/20/24 22:08, Konstantin Belousov wrote:
> On Tue, Nov 19, 2024 at 10:14:52AM -0500, John Baldwin wrote:
>> On 11/19/24 22:30, Konstantin Belousov wrote:
>>> On Thu, Nov 14, 2024 at 09:47:30AM -0500, John Baldwin wrote:
>>>> On 11/13/24 12:10, Jessica Clarke wrote:
>>>>> On 13 Nov 2024, at 19:44, John Baldwin <jhb@FreeBSD.org> wrote:
>>>>>>
>>>>>> On 4/7/22 07:38, Andrew Turner wrote:
>>>>>>> The branch main has been updated by andrew:
>>>>>>> URL: https://cgit.FreeBSD.org/src/commit/?id=e85eaa930862d5b4dc917bc31e8d7254a693635d
>>>>>>> commit e85eaa930862d5b4dc917bc31e8d7254a693635d
>>>>>>> Author:     Andrew Turner <andrew@FreeBSD.org>
>>>>>>> AuthorDate: 2022-04-04 15:05:40 +0000
>>>>>>> Commit:     Andrew Turner <andrew@FreeBSD.org>
>>>>>>> CommitDate: 2022-04-07 14:37:37 +0000
>>>>>>>        Have rtld query the page size from the kernel
>>>>>>>             To allow for a dynamic page size on arm64 have the runtime linker
>>>>>>>        query the kernel for the currentl page size.
>>>>>>>             Reviewed by:    kib
>>>>>>>        Sponsored by:   The FreeBSD Foundation
>>>>>>>        Differential Revision: https://reviews.freebsd.org/D34765
>>>>>>
>>>>>> This broke relro handling for rtld.  The reason is that init_pagesizes() is
>>>>>> called after parsing the program headers for rltd in init_rtld().  As a result,
>>>>>> page_size is 0 when rtld_round_page() is called so the relro_size is 0.  The
>>>>>> RTLD_INIT_EARLY_PAGESIZES case was for ia64, and in the early case it's probably
>>>>>> not safe to call sysctl?  If it is safe to call sysctl, we could just always
>>>>>> init pagesizes early?
>>>>>
>>>>> It looks like there are a few things going on:
>>>>>
>>>>> 1. relocate_object calls obj_enforce_relro if !obj->mainprog, so will
>>>>> try to enforce RELRO for RTLD itself whilst page_size is 0
>>>>>
>>>>> 2. init_rtld later calls obj_enforce_relro for obj_rtld, after
>>>>> page_size has been initialised
>>>>>
>>>>> 3. init_rtld is careful to avoid using global variables until it’s
>>>>> called relocate_objects for RTLD itself, but by hiding accesses to
>>>>> page_size away in rtld_*_page that’s no longer true (definitely not
>>>>> true in the case of text relocations, for example, though whether it
>>>>> also occurs for other cases we care more about I don’t know)
>>>>>
>>>>> So I think there are a couple of things to fix:
>>>>>
>>>>> 1. Stop accessing page_size prior to relocate_objects returning for
>>>>> RTLD itself
>>>>>
>>>>> 2. Stop enforcing RELRO twice for RTLD (e.g. add && obj != rtldobj to
>>>>> relocate_object’s case)
>>>>>
>>>>> At least, that’s what I’ve inferred from reading the code.
>>>>>
>>>>> Though, to be honest, things might be rather nicer if we just made
>>>>> .rtld_start responsible for relocating RTLD itself prior to calling
>>>>> init_rtld, that’s what we have to do for CHERI, as do arm, powerpc and
>>>>> powerpc64, and it means you can use globals from the start in init_rtld.
>>>>
>>>> I've done 2) locally which fixed my immediate issue.
>>> Can you provide some more info please?
>>
>> I have a local patch that supports mutiple PT_GNU_RELRO segments, and to
>> do that it removes relro_* from Obj_Entry and just walks the list of
>> phdrs in obj_remap_relro().  However, that defeats the order you mention
>> below of the timing of parse_rtld_phdr().  I hadn't realized until your
>> paragraph below that that local change was part of why this was exposed.
>> It would still be more correct, strictly speaking, to skip obj_enforce_relro
>> for rtldobj, and if we ever supported multiple PT_GNU_RELRO upstream we
>> would want that change.
> Ok, please commit this change.

Ok, I will put that up for review.  I wasn't sure if we would want this change
upstream, but it is pretty simple.

-- 
John Baldwin