Re: widening ticks

From: Warner Losh <imp_at_bsdimp.com>
Date: Sat, 11 Jan 2025 20:12:24 UTC
Why not have jiffiesjust be an alias for tickl at the assembler level, then
just have extern unsigned long jiffies; so the types match and we don't
have fragile macros? At the assembler level, long and unsigned long are the
sane for object definition.

Warner

On Sat, Jan 11, 2025, 12:36 PM Tomoaki AOKI <junchoon@dec.sakura.ne.jp>
wrote:

> On Sat, 11 Jan 2025 11:34:06 -0500
> Mark Johnston <markj@freebsd.org> wrote:
>
> > On Sat, Jan 11, 2025 at 01:11:06PM +0900, Tomoaki AOKI wrote:
> > > On Wed, 8 Jan 2025 18:07:47 -0500
> > > Mark Johnston <markj@freebsd.org> wrote:
> > >
> > > > On Thu, Jan 09, 2025 at 12:18:48AM +0200, Konstantin Belousov wrote:
> > > > > On Wed, Jan 08, 2025 at 04:31:16PM -0500, Mark Johnston wrote:
> > > > > > The global "ticks" variable counts hardclock ticks, it's widely
> used in
> > > > > > the kernel for low-precision timekeeping.  The linuxkpi provides
> a very
> > > > > > similar variable, "jiffies", but there's an incompatibility: the
> former
> > > > > > is a signed int and the latter is an unsigned long.  It's not
> > > > > > particularly easy to paper over this difference, which has been
> > > > > > responsible for some nasty bugs, and modifying drivers to store
> the
> > > > > > jiffies value in a signed int is error-prone and a maintenance
> burden
> > > > > > that the linuxkpi is supposed to avoid.
> > > > > >
> > > > > > It would be nice to provide a compatible implementation of
> jiffies.  I
> > > > > > can see a few approaches:
> > > > > > - Define a 64-bit ticks variable, say ticks64, and make
> hardclock()
> > > > > >   update both ticks and ticks64.  Then #define jiffies ticks64
> on 64-bit
> > > > > >   platforms.  This is the simplest to implement, but it adds
> extra work
> > > > > >   to hardclock() and is somewhat ugly.
> > > > > > - Make ticks an int64_t or a long and convert our native code
> > > > > >   accordingly.  This is cleaner but requires a lot of auditing
> to avoid
> > > > > >   introducing bugs, though perhaps some code could be left
> unmodified,
> > > > > >   implicitly truncating the value to an int.  For example I think
> > > > > >   sched_pctcpu_update() is fine.  I've gotten an amd64 kernel to
> compile
> > > > > >   and boot with this change, but it's hard to be confident in
> it.  This
> > > > > >   approach also has the potential downside of bloating
> structures that
> > > > > >   store a ticks value, and it can't be MFCed.
> > > > > > - Introduce a 64-bit ticks variable, ticks64, and
> > > > > >   #define ticks ((int)ticks64).  This requires renaming any
> struct
> > > > > >   fields and local vars named "ticks", of which there's a decent
> number,
> > > > > >   but that can be done fairly mechanically.
> > > > > >
> > > > > > Is there another solution which avoids these pitfalls?  If not,
> should
> > > > > > we go ahead with one of these approaches?  If so, which one?
> > > > >
> > > > > You cannot do this in C, but can in asm:
> > > > >         .data
> > > > >         .globl  ticksl, ticks
> > > > >         .type   ticksl, @object
> > > > >         .type   ticks, @object
> > > > > ticksl: .quad
> > > > >         .size   ticksl, 8
> > > > > ticks   =ticksl         /* for little-endian */
> > > > > /* ticks        =ticksl + 4  for big-endian */
> > > > >         .size   ticks, 4
> > > > >
> > > > >
> > > > > Then update only ticksl in the hardclock().
> > > >
> > > > I implemented your suggestion here:
> https://reviews.freebsd.org/D48383
> > >
> > > As this is already committed to main, commenting here instead of review
> > > D48383.
> > >
> > > Maybe I'm too paranoid and overlooking something, but...
> > >
> > > *If "jiffies" in LinuxKPI is really unsigned, isn't there any
> > >  possibilities that relies on its value to be larger than
> > >  0x7fffffffffffffff as a threshold?
> > >  (Yes, it should be silly and non-realistic, but theoretically
> > >   possible.)
> >
> > Ideally we would have
> >
> >     #define jiffies ((unsigned long)ticksl)
> >
> > in the linuxkpi, but some Linux code uses "jiffies" as a struct field or
> > local variable name, so this doesn't quite work.
> >
> > In practice, the value is usually assigned to an unsigned long or used
> > as an operand where it would be implicitly promoted to an unsigned type,
> > so we don't see any incompatibilities.
> >
> > When jiffies is an int, code like the following can misbehave:
> >
> >       unsigned long remain, timeout = jiffies + const;
> >       ...
> >       remain = timeout - jiffies;
> >       if ((long)remain < 0)
> >               /* timed out */
> >
> > If (int)timeout and jiffies have different signs, as might happen close
> > to a rollover, the comparison won't work as expected.
> >
> > Linux has some macros (time_after() etc.) which are supposed to be used
> > instead of direct comparisons, but they're not always used.
>
> So ticksl should better be unsigned long if there's no reason to keep
> it signed, isn't it?
>
>
> > > *Is anywhere checking carry (sign) bit for int on LP32?
> > >  Maybe it would be the reason if "jiffies" in LinuxKPI is really
> > >  unsigned.
> >
> > Could you provide an example of what you mean?
>
> Not an example of code, but for example, when ticksl is at
> 0x7fffffffffffffff (positive value), ticks shoule be 0xffffffff
> (negative value), if I read the diff correctly.
> The same thing starts happening ticksl is at 0x0000000080000000 throug
> 0x00000000ffffffff and values alike. So signs (carry bits, usually the
> leftmost bit of each) should be checked separately for ticksl and ticks.
>
> Am I (hopefully) overlooking something?
>
> --
> Tomoaki AOKI    <junchoon@dec.sakura.ne.jp>
>
>