From: Jessica Clarke
Date: Sat, 13 Nov 2021 19:07:53 +0000
Subject: Re: git: 64ba1f4cf3a6 - main - rtld: Implement LD_SHOW_AUXV
To: Konstantin Belousov
Cc: "src-committers@freebsd.org", "dev-commits-src-all@freebsd.org", "dev-commits-src-main@freebsd.org"
List-Id: Commit messages for all branches of the src repository
List-Archive: https://lists.freebsd.org/archives/dev-commits-src-all

On 13 Nov 2021, at 18:58, Konstantin Belousov wrote:
>
> On Sat, Nov 13, 2021 at 06:29:24PM +0000, Jessica Clarke wrote:
>> On 13 Nov 2021, at 17:57, Jessica Clarke wrote:
>>>
>>> On 13 Nov 2021, at 17:54, Jessica Clarke wrote:
>>>>
>>>> On 13 Nov 2021, at 17:33, Konstantin Belousov wrote:
>>>>>
>>>>> The branch main has been updated by kib:
>>>>>
>>>>> URL: https://cgit.FreeBSD.org/src/commit/?id=64ba1f4cf3a6847a1dacf4bab0409d94898fa168
>>>>>
>>>>> commit 64ba1f4cf3a6847a1dacf4bab0409d94898fa168
>>>>> Author:     Konstantin Belousov
>>>>> AuthorDate: 2021-11-13 01:18:13 +0000
>>>>> Commit:     Konstantin Belousov
>>>>> CommitDate: 2021-11-13 17:33:13 +0000
>>>>>
>>>>>     rtld: Implement LD_SHOW_AUXV
>>>>>
>>>>>     It dumps auxv as seen by interpreter, right before starting any user
>>>>>     code.
>>>>>
>>>>>     Copied from:    glibc
>>>>>     Sponsored by:   The FreeBSD Foundation
>>>>>     MFC after:      1 week
>>>>> ---
>>>>>  libexec/rtld-elf/rtld.1 |  7 +++++-
>>>>>  libexec/rtld-elf/rtld.c | 67 +++++++++++++++++++++++++++++++++++++++++++++++++
>>>>>  2 files changed, 73 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/libexec/rtld-elf/rtld.1 b/libexec/rtld-elf/rtld.1
>>>>> index 187dc105667a..66aa2bdabd17 100644
>>>>> --- a/libexec/rtld-elf/rtld.1
>>>>> +++ b/libexec/rtld-elf/rtld.1
>>>>> @@ -28,7 +28,7 @@
>>>>>  .\"
>>>>>  .\" $FreeBSD$
>>>>>  .\"
>>>>> -.Dd August 15, 2021
>>>>> +.Dd November 13, 2021
>>>>>  .Dt RTLD 1
>>>>>  .Os
>>>>>  .Sh NAME
>>>>> @@ -309,6 +309,11 @@ will process the filtee dependencies of the loaded objects immediately,
>>>>>  instead of postponing it until required.
>>>>>  Normally, the filtees are opened at the time of the first symbol resolution
>>>>>  from the filter object.
>>>>> +.It Ev LD_SHOW_AUXV
>>>>> +If set, causes
>>>>> +.Nm
>>>>> +to dump content of the aux vector to standard output, before passing
>>>>> +control to any user code.
>>>>>  .El
>>>>>  .Sh DIRECT EXECUTION MODE
>>>>>  .Nm
>>>>> diff --git a/libexec/rtld-elf/rtld.c b/libexec/rtld-elf/rtld.c
>>>>> index c173c5a6e22e..0475134b0d96 100644
>>>>> --- a/libexec/rtld-elf/rtld.c
>>>>> +++ b/libexec/rtld-elf/rtld.c
>>>>> @@ -104,6 +104,7 @@ static Obj_Entry *dlopen_object(const char *name, int fd, Obj_Entry *refobj,
>>>>>  static Obj_Entry *do_load_object(int, const char *, char *, struct stat *, int);
>>>>>  static int do_search_info(const Obj_Entry *obj, int, struct dl_serinfo *);
>>>>>  static bool donelist_check(DoneList *, const Obj_Entry *);
>>>>> +static void dump_auxv(Elf_Auxinfo **aux_info);
>>>>>  static void errmsg_restore(struct dlerror_save *);
>>>>>  static struct dlerror_save *errmsg_save(void);
>>>>>  static void *fill_search_info(const char *, size_t, void *);
>>>>> @@ -364,6 +365,7 @@ enum {
>>>>>  	LD_TRACE_LOADED_OBJECTS_FMT1,
>>>>>  	LD_TRACE_LOADED_OBJECTS_FMT2,
>>>>>  	LD_TRACE_LOADED_OBJECTS_ALL,
>>>>> +	LD_SHOW_AUXV,
>>>>>  };
>>>>>
>>>>>  struct ld_env_var_desc {
>>>>> @@ -396,6 +398,7 @@ static struct ld_env_var_desc ld_env_vars[] = {
>>>>>  	LD_ENV_DESC(TRACE_LOADED_OBJECTS_FMT1, false),
>>>>>  	LD_ENV_DESC(TRACE_LOADED_OBJECTS_FMT2, false),
>>>>>  	LD_ENV_DESC(TRACE_LOADED_OBJECTS_ALL, false),
>>>>> +	LD_ENV_DESC(SHOW_AUXV, false),
>>>>>  };
>>>>>
>>>>>  static const char *
>>>>> @@ -857,6 +860,9 @@ _rtld(Elf_Addr *sp, func_ptr_type *exit_proc, Obj_Entry **objp)
>>>>>  	if (rtld_verify_versions(&list_main) == -1 && !ld_tracing)
>>>>>  		rtld_die();
>>>>>
>>>>> +	if (ld_get_env_var(LD_SHOW_AUXV) != NULL)
>>>>> +		dump_auxv(aux_info);
>>>>> +
>>>>>  	if (ld_tracing) { /* We're done */
>>>>>  		trace_loaded_objects(obj_main);
>>>>>  		exit(0);
>>>>> @@ -6058,6 +6064,67 @@ print_usage(const char *argv0)
>>>>>  	" Arguments to the executed process\n", argv0);
>>>>>  }
>>>>>
>>>>> +#define AUXFMT(at, xfmt) [at] = { .name = #at, .fmt = xfmt }
>>>>> +static const struct auxfmt {
>>>>> +	const char *name;
>>>>> +	const char *fmt;
>>>>> +} auxfmts[] = {
>>>>> +	AUXFMT(AT_NULL, NULL),
>>>>> +	AUXFMT(AT_IGNORE, NULL),
>>>>> +	AUXFMT(AT_EXECFD, "%d"),
>>>>> +	AUXFMT(AT_PHDR, "%p"),
>>>>> +	AUXFMT(AT_PHENT, "%u"),
>>>>> +	AUXFMT(AT_PHNUM, "%u"),
>>>>> +	AUXFMT(AT_PAGESZ, "%u"),
>>>>> +	AUXFMT(AT_BASE, "%#lx"),
>>>>> +	AUXFMT(AT_FLAGS, "%#lx"),
>>>>> +	AUXFMT(AT_ENTRY, "%p"),
>>>>> +	AUXFMT(AT_NOTELF, NULL),
>>>>> +	AUXFMT(AT_UID, "%d"),
>>>>> +	AUXFMT(AT_EUID, "%d"),
>>>>> +	AUXFMT(AT_GID, "%d"),
>>>>> +	AUXFMT(AT_EGID, "%d"),
>>>>> +	AUXFMT(AT_EXECPATH, "%s"),
>>>>> +	AUXFMT(AT_CANARY, "%p"),
>>>>> +	AUXFMT(AT_CANARYLEN, "%u"),
>>>>> +	AUXFMT(AT_OSRELDATE, "%u"),
>>>>> +	AUXFMT(AT_NCPUS, "%u"),
>>>>> +	AUXFMT(AT_PAGESIZES, "%p"),
>>>>> +	AUXFMT(AT_PAGESIZESLEN, "%u"),
>>>>> +	AUXFMT(AT_TIMEKEEP, "%p"),
>>>>> +	AUXFMT(AT_STACKPROT, "%#x"),
>>>>> +	AUXFMT(AT_EHDRFLAGS, "%#lx"),
>>>>> +	AUXFMT(AT_HWCAP, "%#lx"),
>>>>> +	AUXFMT(AT_HWCAP2, "%#lx"),
>>>>> +	AUXFMT(AT_BSDFLAGS, "%#lx"),
>>>>> +	AUXFMT(AT_ARGC, "%u"),
>>>>> +	AUXFMT(AT_ARGV, "%p"),
>>>>> +	AUXFMT(AT_ENVC, "%p"),
>>>>> +	AUXFMT(AT_ENVV, "%p"),
>>>>> +	AUXFMT(AT_PS_STRINGS, "%p"),
>>>>> +	AUXFMT(AT_FXRNG, "%p"),
>>>>> +};
>>>>> +
>>>>> +static void
>>>>> +dump_auxv(Elf_Auxinfo **aux_info)
>>>>> +{
>>>>> +	Elf_Auxinfo *auxp;
>>>>> +	const struct auxfmt *fmt;
>>>>> +	int i;
>>>>> +
>>>>> +	for (i = 0; i < AT_COUNT; i++) {
>>>>> +		auxp = aux_info[i];
>>>>> +		if (auxp == NULL)
>>>>> +			continue;
>>>>> +		fmt = &auxfmts[i];
>>>>> +		if (fmt->fmt == NULL)
>>>>> +			continue;
>>>>> +		rtld_fdprintf(STDOUT_FILENO, "%s:\t", fmt->name);
>>>>> +		rtld_fdprintfx(STDOUT_FILENO, fmt->fmt, auxp->a_un.a_ptr);
>>>>> +		rtld_fdprintf(STDOUT_FILENO, "\n");
>>>>
>>>> This is undefined behaviour, breaks CHERI, and totally unnecessary. You
>>>> have a handful of cases here, just make an enum and have separate
>>>> rtld_fdprintf calls.
>>
>> In particular, ignoring CHERI, unsigned ints are sign-extended to 64
>> bits on MIPS and RISC-V. Thus by passing a 64-bit value but using a %u,
>> you are violating the calling convention. I can’t currently get GCC or
>> Clang to exploit the fact that varargs arguments are sign-extended, but
>> on MIPS, and RISC-V GCC (Clang is currently stupid and round-trips via
>> memory even when the va_arg calls have no branching surrounding them,
>> rather than just grabbing from the register) there is a redundant
>> sext.w that can legally be optimised out, but would be broken by this
>> calling convention violation.
> I might understand the argument that all non-pointer formats for auxv
> should be longs, i.e. %lu/%ld/%lx, but this is the only problem I see
> there. We do rely on having specific representations for addresses and
> longs, and a low-level component as rtld has full rights to exercise
> this fact, same as VM subsystem or memory allocators.

Integer addresses are not the same thing as pointers. They happen to be
for FreeBSD, but they are not for CHERI.

> In fact ELF spec exercises this as well.
> Our arches are either ILP32 or LP64.

And ours downstream are not. My point is that your approach here is
unnecessarily exploiting that, it could easily have been written to
read the active member of the union rather than adding rtld_fdprintfx
as a kludge that relies on pointers being plain integers. The portable
code is just as easy to write. We upstream various pointer/integer
conflation improvements and this commit makes that worse.

Jess
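
For illustration only, here is a minimal sketch of the "enum plus separate
printf calls" shape described above. It is not the code from the commit or
from any follow-up: aux_entry is a hypothetical stand-in for Elf_Auxinfo,
enum aux_kind and format_aux() are invented names, and snprintf is used in
place of rtld_fdprintf so that the fragment builds outside rtld. The point
is only that each case reads the union member that matches its conversion
specifier, so no pointer is ever passed where an integer is expected.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for Elf_Auxinfo (assumption, not the real header). */
typedef struct {
	long a_type;
	union {
		long a_val;	/* integer-valued entries */
		void *a_ptr;	/* pointer-valued entries */
	} a_un;
} aux_entry;

/* How an entry is read and printed; replaces the per-entry format string. */
enum aux_kind {
	AUX_SKIP,	/* not printed */
	AUX_DEC,	/* a_val as signed decimal */
	AUX_UDEC,	/* a_val as unsigned decimal */
	AUX_HEX,	/* a_val as hexadecimal */
	AUX_PTR,	/* a_ptr as a pointer */
	AUX_STR		/* a_ptr as a NUL-terminated string */
};

/*
 * Format one entry into buf. Every branch passes an argument whose type
 * matches the conversion specifier used in that branch.
 * Returns the snprintf result, or 0 for AUX_SKIP.
 */
int
format_aux(char *buf, size_t len, const char *name, enum aux_kind kind,
    const aux_entry *e)
{
	assert(buf != NULL && len > 0 && name != NULL && e != NULL);

	switch (kind) {
	case AUX_DEC:
		return (snprintf(buf, len, "%s:\t%ld", name, e->a_un.a_val));
	case AUX_UDEC:
		return (snprintf(buf, len, "%s:\t%lu", name,
		    (unsigned long)e->a_un.a_val));
	case AUX_HEX:
		return (snprintf(buf, len, "%s:\t%#lx", name,
		    (unsigned long)e->a_un.a_val));
	case AUX_PTR:
		return (snprintf(buf, len, "%s:\t%p", name, e->a_un.a_ptr));
	case AUX_STR:
		return (snprintf(buf, len, "%s:\t%s", name,
		    (const char *)e->a_un.a_ptr));
	case AUX_SKIP:
	default:
		buf[0] = '\0';
		return (0);
	}
}
```

In this sketch a table would map each AT_* index to an aux_kind instead of
to a format string, and the dump loop would call format_aux() (or the
equivalent rtld_fdprintf calls) once per non-NULL entry.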