Re: git: 034c0856018c - main - modules: fix freebsd32_modstat on big endian platforms

From: Jessica Clarke <jrtc27_at_freebsd.org>
Date: Fri, 07 Jul 2023 17:59:41 UTC
On 7 Jul 2023, at 05:24, Ka Ho Ng <khng@FreeBSD.org> wrote:
> 
> The branch main has been updated by khng:
> 
> URL: https://cgit.FreeBSD.org/src/commit/?id=034c0856018ccf73c0f2d6140825f3edf43f47b2
> 
> commit 034c0856018ccf73c0f2d6140825f3edf43f47b2
> Author:     Ka Ho Ng <khng@FreeBSD.org>
> AuthorDate: 2023-07-07 04:21:01 +0000
> Commit:     Ka Ho Ng <khng@FreeBSD.org>
> CommitDate: 2023-07-07 04:22:59 +0000
> 
>    modules: fix freebsd32_modstat on big endian platforms
> 
>    The layout of modspecific_t on both little endian and big endian are as
>    follows:
>    |0|1|2|3|4|5|6|7|
>    +-------+-------+
>    |uintval|       |
>    +-------+-------+
>    |ulongval       |
>    +-------+-------+
> 
>    For the following code snippet:
>            CP(mod->data, data32, longval);
>            CP(mod->data, data32, ulongval);
>    It only takes care of little endian platforms that it truncates the
>    highest 32bit automatically. However on big endian platforms it takes
>    the highest 32bit instead. This eventually returns a garbage syscall
>    number to the 32bit userland.

This fixes the case where uintval is the active member, but breaks the
case where ulongval is the active member, since it’ll take bytes 0, 1,
2 and 3 which are the high bytes for BE, not the low bytes. This is an
intrinsic issue with BE; LE is very permissive when you access the
wrong union member for different-width integer types, but BE is not and
you have to access the right one. How does userspace know which member
to access, anyway? The kernel needs to repeat that and copy the right
member based on that information, otherwise it is inherently impossible
to have both cases work on BE.

>    Since modspecific_t's usage currently is for the use of syscall modules,
>    we only initialize modspecific32_t with uintval.

Is this saying that uintval is always the active member, never
ulongval? If so, there need to be comments, both in freebsd32_modstat
to justify this and on the type to say “DO NOT USE (u)longval”,
otherwise there is a significant risk someone will start using
(u)longval and we’ll be stuck. Perhaps (u)longval should even be
renamed to be ABI-compatibility padding (and aligning).

Jess

>    Now on both BE and LE
>    64-bit platforms it always pick up the first 4 bytes.
> 
>    Sponsored by:   Juniper Networks, Inc.
>    Reviewed by:    markj
>    Differential Revision:  https://reviews.freebsd.org/D40814
>    MFC after:      1 week
> ---
> sys/kern/kern_module.c | 5 ++---
> 1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/sys/kern/kern_module.c b/sys/kern/kern_module.c
> index 40e47868481c..d4edc64a5ce4 100644
> --- a/sys/kern/kern_module.c
> +++ b/sys/kern/kern_module.c
> @@ -520,10 +520,9 @@ freebsd32_modstat(struct thread *td, struct freebsd32_modstat_args *uap)
> id = mod->id;
> refs = mod->refs;
> name = mod->name;
> - CP(mod->data, data32, intval);
> + _Static_assert(sizeof(data32) == sizeof(data32.uintval),
> +    "bad modspecific32_t size");
> CP(mod->data, data32, uintval);
> - CP(mod->data, data32, longval);
> - CP(mod->data, data32, ulongval);
> MOD_SUNLOCK;
> stat32 = uap->stat;
>