Re: git: e85eaa930862 - main - Have rtld query the page size from the kernel

From: Konstantin Belousov <kostikbel_at_gmail.com>
Date: Tue, 19 Nov 2024 06:30:20 UTC
On Thu, Nov 14, 2024 at 09:47:30AM -0500, John Baldwin wrote:
> On 11/13/24 12:10, Jessica Clarke wrote:
> > On 13 Nov 2024, at 19:44, John Baldwin <jhb@FreeBSD.org> wrote:
> > > 
> > > On 4/7/22 07:38, Andrew Turner wrote:
> > > > The branch main has been updated by andrew:
> > > > URL: https://cgit.FreeBSD.org/src/commit/?id=e85eaa930862d5b4dc917bc31e8d7254a693635d
> > > > commit e85eaa930862d5b4dc917bc31e8d7254a693635d
> > > > Author:     Andrew Turner <andrew@FreeBSD.org>
> > > > AuthorDate: 2022-04-04 15:05:40 +0000
> > > > Commit:     Andrew Turner <andrew@FreeBSD.org>
> > > > CommitDate: 2022-04-07 14:37:37 +0000
> > > >      Have rtld query the page size from the kernel
> > > >           To allow for a dynamic page size on arm64 have the runtime linker
> > > >      query the kernel for the currentl page size.
> > > >           Reviewed by:    kib
> > > >      Sponsored by:   The FreeBSD Foundation
> > > >      Differential Revision: https://reviews.freebsd.org/D34765
> > > 
> > > This broke relro handling for rtld.  The reason is that init_pagesizes() is
> > > called after parsing the program headers for rltd in init_rtld().  As a result,
> > > page_size is 0 when rtld_round_page() is called so the relro_size is 0.  The
> > > RTLD_INIT_EARLY_PAGESIZES case was for ia64, and in the early case it's probably
> > > not safe to call sysctl?  If it is safe to call sysctl, we could just always
> > > init pagesizes early?
> > 
> > It looks like there are a few things going on:
> > 
> > 1. relocate_object calls obj_enforce_relro if !obj->mainprog, so will
> > try to enforce RELRO for RTLD itself whilst page_size is 0
> > 
> > 2. init_rtld later calls obj_enforce_relro for obj_rtld, after
> > page_size has been initialised
> > 
> > 3. init_rtld is careful to avoid using global variables until it’s
> > called relocate_objects for RTLD itself, but by hiding accesses to
> > page_size away in rtld_*_page that’s no longer true (definitely not
> > true in the case of text relocations, for example, though whether it
> > also occurs for other cases we care more about I don’t know)
> > 
> > So I think there are a couple of things to fix:
> > 
> > 1. Stop accessing page_size prior to relocate_objects returning for
> > RTLD itself
> > 
> > 2. Stop enforcing RELRO twice for RTLD (e.g. add && obj != rtldobj to
> > relocate_object’s case)
> > 
> > At least, that’s what I’ve inferred from reading the code.
> > 
> > Though, to be honest, things might be rather nicer if we just made
> > .rtld_start responsible for relocating RTLD itself prior to calling
> > init_rtld, that’s what we have to do for CHERI, as do arm, powerpc and
> > powerpc64, and it means you can use globals from the start in init_rtld.
> 
> I've done 2) locally which fixed my immediate issue.
Can you provide some more info please?

obj_enforce_relro() only acts if obj_relro_size is > 0, which requires
the call to parse_rtld_phdr() to fill it.  This is indeed called after the
rtld is relocated by relocate_object(s), but so it was before the Andrew'
change.

I can see a problem if plt is mapped ro and requires relro to remap it
rw before relocation, but I do not believe this is the case for rtld and
any static linker.

And what is the arch?

The RTLD_INIT_PAGESIZES_EARLY symbol indeed should be eliminated, I have
the patch. But let's sort out the page size issue as well.

> 
> I agree though that having all arches implement rtld_relocate_nonplt_self()
> which is called before rtld() would be nicer.
> 
> -- 
> John Baldwin