Re: git: 005aa1743b42 - main - modules: bzero the modspecific_t
- In reply to: Mark Millard : "Re: git: 005aa1743b42 - main - modules: bzero the modspecific_t"
- Go to: [ bottom of page ] [ top of archives ] [ this month ]
Date: Mon, 03 Jul 2023 07:11:03 UTC
On Jul 2, 2023, at 09:34, Mark Millard <marklmi@yahoo.com> wrote: > Ka Ho Ng <khng300_at_gmail.com> wrote on > Date: Sun, 02 Jul 2023 10:18:35 UTC : > > On Sun, Jul 2, 2023 at 6:03 AM Ka Ho Ng <khng300@gmail.com> wrote: >> >>> On Sat, Jul 1, 2023 at 7:13 PM Konstantin Belousov <kostikbel@gmail.com> >>> wrote: >>> >>>> On Sat, Jul 01, 2023 at 10:59:22PM +0000, Ka Ho Ng wrote: >>>>> The branch main has been updated by khng: >>>>> >>>>> URL: >>>> https://cgit.FreeBSD.org/src/commit/?id=005aa1743b42b52fbd49b9d5ec44816902b6ee9f >>>>> >>>>> commit 005aa1743b42b52fbd49b9d5ec44816902b6ee9f >>>>> Author: Ka Ho Ng <khng@FreeBSD.org> >>>>> AuthorDate: 2023-07-01 19:41:53 +0000 >>>>> Commit: Ka Ho Ng <khng@FreeBSD.org> >>>>> CommitDate: 2023-07-01 22:58:46 +0000 >>>>> >>>>> modules: bzero the modspecific_t >>>>> >>>>> Per https://reviews.llvm.org/D68115, only the first field is >>>>> zero-initialized, meanwhile other fields are undef. >>>> This is not true. >>>> >>>>> >>>>> The pattern can be observed on clang as well, that when >>>>> -ftrivial-auto-var-init=pattern is specified 0xaa is filled for >>>>> non-active fields, otherwise they are zero-initialized. >>>> What are non-active fields? >>>> All fields with implicit initializers >>>> "shall be initialized implicitly the same as >>>> objects that have static storage duration." >>>> >>>> I do not think this is required for padding. >>>> >>> In that case, modspecific_t ms did become 0xaaaaaaaa00000000 on amd64 if >>> -ftrivial-auto-var-init=pattern was specified. >>> >>> >>>> >>>>> Technically both are acceptable when using clang. However it >>>>> would be good to simply bzero the modspecific_t in such case to >>>>> be strict to the standard. >>>>> >>>>> MFC with: 2cab2d43b83b >>>>> MFC after: 1 day >>>> Min MFC period is 3 days. >>>> >>>>> Sponsored by: Juniper Networks, Inc. >>>>> Reviewed by: delphij >>>>> Differential Revision: https://reviews.freebsd.org/D40830 >>>>> --- >>>>> sys/kern/kern_syscalls.c | 3 ++- >>>>> 1 file changed, 2 insertions(+), 1 deletion(-) >>>>> >>>>> diff --git a/sys/kern/kern_syscalls.c b/sys/kern/kern_syscalls.c >>>>> index 234e51cfd280..0b51edf5e985 100644 >>>>> --- a/sys/kern/kern_syscalls.c >>>>> +++ b/sys/kern/kern_syscalls.c >>>>> @@ -173,9 +173,10 @@ kern_syscall_module_handler(struct sysent >>>> *sysents, struct module *mod, >>>>> int what, void *arg) >>>>> { >>>>> struct syscall_module_data *data = arg; >>>>> - modspecific_t ms = { 0 }; >>>>> + modspecific_t ms; >>>>> int error; >>>>> >>>>> + bzero(&ms, sizeof(ms)); >>>>> switch (what) { >>>>> case MOD_LOAD: >>>>> error = kern_syscall_register(sysents, data->offset, >>> >>> >>> Ka Ho >>> >> Since I missed the reply-all button just now, I resend this email. >> >> Reading on N2716 one of the footnote stated: >> "Note that aggregate type does not include union type because an object >> with union type can only contain one member at a time" >> >> And below at 6.7.9.21 only aggregate was mentioned but not union, which >> matched what I observed with an `cc -ftrivial-auto-var-init=pattern` >> invocation. > > The context for modspecific_t is (looking in my source > tree for main): > > # grep -r modspecific_t /usr/main-src/sys/ | more > /usr/main-src/sys/sys/module.h:} modspecific_t; > /usr/main-src/sys/sys/module.h:void module_setspecific(module_t, modspecific_t *); > /usr/main-src/sys/sys/module.h: modspecific_t data; > /usr/main-src/sys/kern/kern_module.c: modspecific_t data; /* module specific data */ > /usr/main-src/sys/kern/kern_module.c:module_setspecific(module_t mod, modspecific_t *datap) > /usr/main-src/sys/kern/kern_module.c: modspecific_t data; > /usr/main-src/sys/kern/kern_module.c: modspecific_t data; > /usr/main-src/sys/kern/kern_syscalls.c: modspecific_t ms; > > This is C code, not C++ code. The languages are not fully > matching in various details in some similar areas, especially > when the vintages are not from a similar timeframe. Also, > modspecific_t is a union: > > /* > * A module can use this to report module specific data to the user via > * kldstat(2). > */ > typedef union modspecific { > int intval; > u_int uintval; > long longval; > u_long ulongval; > } modspecific_t; > > When I look up C's N2716 in: > > https://www.open-std.org/jtc1/sc22/wg14/www/wg14_document_log.htm > > I find: "N2716 2021/05/09 Tydeman, Numerically equal" with the > link https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2716.htm . > Its summary line is: > > QUOTE > Summary: The phrases "numerically equal" and "numerically equivalent" are used, but never defined. Also, they seem to add no value to just "equal". > END QUOTE > > (There is also: "N2847 2021/10/15 Thomas, C23 proposal - Revised > suggested change from N2716" with the link: > https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2847.pdf .) > > This has nothing to do with what you are writing about. > > For C code, I suggest only using C langauge specification > materials, not C++, unless the same source is to have > dual language use for some reason. (I see no indication > of such here. If such use is the intended context, please > be explicit about that.) > > Also, using material from 2021 when FreeBSD is based on a > older C standard version can have its own problems. If I > remember right, FreeBSD is basically targeting C11 or > so for its C code. > > An example quote from footnote 310 (for memcmp) in ISO/IEC > 9899:2011 is: > > QUOTE > The contents of "holes" used as padding for purposes of > alignment within structure objects are indeterminate. > Strings shorter than the allocated space and unions may > also cause problems in comparison. > END QUOTE > > Initializing .intval and then accessing .ulongval resulting > in 0xaaaaaaaa00000000 as a possible result should be no > surprise for C as far as I can tell. Based on what I'm > reading, code that assumes otherwise is incorrect relative > to the C language standard. > > It looks to me like the bzero use is trying to make changes > to allow non-standard C code "still work". I missed the relationship between the earlier: git: 2cab2d43b83b - main - syscalls: fix modspecific_t stack content leak and: git: 005aa1743b42 - main - modules: bzero the modspecific_t and so did not comment on the implications in that direction. One could read my notes as indicating that the bzero is inappropriate to the memory content leak. What I wrote instead supports use of bzero (or some such) for avoiding memory content leaks. For: typedef union modspecific { int intval; u_int uintval; long longval; u_long ulongval; } modspecific_t; Code like: { . . . modspecific_t ms = { 0 }; . . . assigns to .intval and falls under what I quoted about "holes" and such. Also, ISO/IEC 9899:2011's explicit list of unspecified behavior in J.1 Unspecified behavior reports: -- The value of padding bytes when storing values in structures or unions (6.2.6.1) -- The values of bytes that correspond to union members other than the one last stored into (6.2.6.1) "modspecific_t ms = { 0 };" stores specifically into intval under the language standard's rules, and there are bytes in longval and ulongval that are not in intval on at least some platforms for FreeBSD. Thus the language standard indicates that assignment leaves the values of various bytes in the union unspecified --and that allows the leak of some stack content as a possibility. The bzero avoids that. Sorry if I caused any confusion with my to narrow of a focus. === Mark Millard marklmi at yahoo.com