Re: git: 005aa1743b42 - main - modules: bzero the modspecific_t

From: Brooks Davis <brooks_at_freebsd.org>
Date: Wed, 05 Jul 2023 18:10:45 UTC
On Mon, Jul 03, 2023 at 04:20:41PM -0700, Mark Millard wrote:
> On Jul 3, 2023, at 15:27, Mark Millard <marklmi@yahoo.com> wrote:
> 
> > Brooks Davis <brooks_at_freebsd.org> wrote on
> > Date: Mon, 03 Jul 2023 21:24:11 UTC :
> > 
> >> On Sat, Jul 01, 2023 at 10:59:22PM +0000, Ka Ho Ng wrote:
> >>> The branch main has been updated by khng:
> >>> 
> >>> URL: https://cgit.FreeBSD.org/src/commit/?id=005aa1743b42b52fbd49b9d5ec44816902b6ee9f
> >>> 
> >>> commit 005aa1743b42b52fbd49b9d5ec44816902b6ee9f
> >>> Author: Ka Ho Ng <khng@FreeBSD.org>
> >>> AuthorDate: 2023-07-01 19:41:53 +0000
> >>> Commit: Ka Ho Ng <khng@FreeBSD.org>
> >>> CommitDate: 2023-07-01 22:58:46 +0000
> >>> 
> >>> modules: bzero the modspecific_t
> >>> 
> >>> Per https://reviews.llvm.org/D68115, only the first field is
> >>> zero-initialized, meanwhile other fields are undef.
> >>> 
> >>> The pattern can be observed on clang as well, that when
> >>> -ftrivial-auto-var-init=pattern is specified 0xaa is filled for
> >>> non-active fields, otherwise they are zero-initialized.
> >>> Technically both are acceptable when using clang. However it
> >>> would be good to simply bzero the modspecific_t in such case to
> >>> be strict to the standard.
> >> 
> >> IMO this is a move in the wrong direction. We should see about
> >> switching this file to C17 which IIRC removes this bug in the standard.
> >> 
> >> Ideally we'd be moving to C23 where we can just do foo = {}
> >> to zero things, but we've got a ways to go...
> > 
> > Can you point me to where some (draft?) C?? standard material indicates
> > that:
> > 
> > A) pad bytes are to be determined to have a specific value?
> > 
> > B) union bytes unused by a smaller size field that is the one initialized
> >    are to be determined to have a specific value?
> > 
> > My copy of N2176 for ISO/IEC 9899:2017 still has the J.1 Unspecified
> > behavior wording:
> > 
> > -- The value of padding bytes when storing values in structures
> >    or unions (6.2.6.1)
> > 
> > -- The values of bytes that correspond to union members other
> >    than the one last stored into (6.2.6.1)
> > 
> > As long as those are true, initializer notation is not guaranteed
> > to avoid memory content leakage for the padding bytes and unused
> > bytes for smaller union fields.
> > 
> > (I'll not generate wording to deal with trap representations for such
> > issues, something C++ avoids.)
> > 
> 
> I just got a copy of N3096 for ISO/IEC 9899:2023 and it still
> reports for memcmp (note 379):
> 
> QUOTE
> The unused bytes used as padding for purposes of alignment within
> struture objects take on unspecified values when a value is stored
> in the object (see 6.2.6.1). Strings shorter than their allocated
> space and unions can also cause problems in comparison.
> END QUOTE
> 
> The J.1 Unspecfied behavior items are still there as well. [These
> are numbered in the C23 draft: (10) and (11).]
> 
> Such suggests no "fix" is present in that C23 draft.

I was wrong about padding being corrected :(, however, C23's empty
initializizer (struct foo = {};) does guarantee zeroing and we should
be moving to it as soon as the short list of compilers we care about it
support it.

For the original commit, I think it's entirely harmless to leak the pad
with 0xaa's.  The details of which fields are explicitly initialized is
not a secret.

-- Brooks