Re: widening ticks

From: Tomoaki AOKI <junchoon_at_dec.sakura.ne.jp>
Date: Sat, 11 Jan 2025 22:42:18 UTC
On Sat, 11 Jan 2025 13:12:24 -0700
Warner Losh <imp@bsdimp.com> wrote:

> Why not have jiffiesjust be an alias for tickl at the assembler level, then
> just have extern unsigned long jiffies; so the types match and we don't
> have fragile macros? At the assembler level, long and unsigned long are the
> sane for object definition.

Basically I like this idea, but why I didn't proposed this is that
"jiffies" is stated (sorry, not read the LinuxKPI code) to be unsigned
long, while tickl is defined as (signed) int in the diff committed.

Yes, I know it's just how the bit pattern of the same size is
interpreted, but would need some safeguards in C part.

> 
> Warner
> 
> On Sat, Jan 11, 2025, 12:36 PM Tomoaki AOKI <junchoon@dec.sakura.ne.jp>
> wrote:
> 
> > On Sat, 11 Jan 2025 11:34:06 -0500
> > Mark Johnston <markj@freebsd.org> wrote:
> >
> > > On Sat, Jan 11, 2025 at 01:11:06PM +0900, Tomoaki AOKI wrote:
> > > > On Wed, 8 Jan 2025 18:07:47 -0500
> > > > Mark Johnston <markj@freebsd.org> wrote:
> > > >
> > > > > On Thu, Jan 09, 2025 at 12:18:48AM +0200, Konstantin Belousov wrote:
> > > > > > On Wed, Jan 08, 2025 at 04:31:16PM -0500, Mark Johnston wrote:
> > > > > > > The global "ticks" variable counts hardclock ticks, it's widely
> > used in
> > > > > > > the kernel for low-precision timekeeping.  The linuxkpi provides
> > a very
> > > > > > > similar variable, "jiffies", but there's an incompatibility: the
> > former
> > > > > > > is a signed int and the latter is an unsigned long.  It's not
> > > > > > > particularly easy to paper over this difference, which has been
> > > > > > > responsible for some nasty bugs, and modifying drivers to store
> > the
> > > > > > > jiffies value in a signed int is error-prone and a maintenance
> > burden
> > > > > > > that the linuxkpi is supposed to avoid.
> > > > > > >
> > > > > > > It would be nice to provide a compatible implementation of
> > jiffies.  I
> > > > > > > can see a few approaches:
> > > > > > > - Define a 64-bit ticks variable, say ticks64, and make
> > hardclock()
> > > > > > >   update both ticks and ticks64.  Then #define jiffies ticks64
> > on 64-bit
> > > > > > >   platforms.  This is the simplest to implement, but it adds
> > extra work
> > > > > > >   to hardclock() and is somewhat ugly.
> > > > > > > - Make ticks an int64_t or a long and convert our native code
> > > > > > >   accordingly.  This is cleaner but requires a lot of auditing
> > to avoid
> > > > > > >   introducing bugs, though perhaps some code could be left
> > unmodified,
> > > > > > >   implicitly truncating the value to an int.  For example I think
> > > > > > >   sched_pctcpu_update() is fine.  I've gotten an amd64 kernel to
> > compile
> > > > > > >   and boot with this change, but it's hard to be confident in
> > it.  This
> > > > > > >   approach also has the potential downside of bloating
> > structures that
> > > > > > >   store a ticks value, and it can't be MFCed.
> > > > > > > - Introduce a 64-bit ticks variable, ticks64, and
> > > > > > >   #define ticks ((int)ticks64).  This requires renaming any
> > struct
> > > > > > >   fields and local vars named "ticks", of which there's a decent
> > number,
> > > > > > >   but that can be done fairly mechanically.
> > > > > > >
> > > > > > > Is there another solution which avoids these pitfalls?  If not,
> > should
> > > > > > > we go ahead with one of these approaches?  If so, which one?
> > > > > >
> > > > > > You cannot do this in C, but can in asm:
> > > > > >         .data
> > > > > >         .globl  ticksl, ticks
> > > > > >         .type   ticksl, @object
> > > > > >         .type   ticks, @object
> > > > > > ticksl: .quad
> > > > > >         .size   ticksl, 8
> > > > > > ticks   =ticksl         /* for little-endian */
> > > > > > /* ticks        =ticksl + 4  for big-endian */
> > > > > >         .size   ticks, 4
> > > > > >
> > > > > >
> > > > > > Then update only ticksl in the hardclock().
> > > > >
> > > > > I implemented your suggestion here:
> > https://reviews.freebsd.org/D48383
> > > >
> > > > As this is already committed to main, commenting here instead of review
> > > > D48383.
> > > >
> > > > Maybe I'm too paranoid and overlooking something, but...
> > > >
> > > > *If "jiffies" in LinuxKPI is really unsigned, isn't there any
> > > >  possibilities that relies on its value to be larger than
> > > >  0x7fffffffffffffff as a threshold?
> > > >  (Yes, it should be silly and non-realistic, but theoretically
> > > >   possible.)
> > >
> > > Ideally we would have
> > >
> > >     #define jiffies ((unsigned long)ticksl)
> > >
> > > in the linuxkpi, but some Linux code uses "jiffies" as a struct field or
> > > local variable name, so this doesn't quite work.
> > >
> > > In practice, the value is usually assigned to an unsigned long or used
> > > as an operand where it would be implicitly promoted to an unsigned type,
> > > so we don't see any incompatibilities.
> > >
> > > When jiffies is an int, code like the following can misbehave:
> > >
> > >       unsigned long remain, timeout = jiffies + const;
> > >       ...
> > >       remain = timeout - jiffies;
> > >       if ((long)remain < 0)
> > >               /* timed out */
> > >
> > > If (int)timeout and jiffies have different signs, as might happen close
> > > to a rollover, the comparison won't work as expected.
> > >
> > > Linux has some macros (time_after() etc.) which are supposed to be used
> > > instead of direct comparisons, but they're not always used.
> >
> > So ticksl should better be unsigned long if there's no reason to keep
> > it signed, isn't it?
> >
> >
> > > > *Is anywhere checking carry (sign) bit for int on LP32?
> > > >  Maybe it would be the reason if "jiffies" in LinuxKPI is really
> > > >  unsigned.
> > >
> > > Could you provide an example of what you mean?
> >
> > Not an example of code, but for example, when ticksl is at
> > 0x7fffffffffffffff (positive value), ticks shoule be 0xffffffff
> > (negative value), if I read the diff correctly.
> > The same thing starts happening ticksl is at 0x0000000080000000 throug
> > 0x00000000ffffffff and values alike. So signs (carry bits, usually the
> > leftmost bit of each) should be checked separately for ticksl and ticks.
> >
> > Am I (hopefully) overlooking something?
> >
> > --
> > Tomoaki AOKI    <junchoon@dec.sakura.ne.jp>


-- 
Tomoaki AOKI    <junchoon@dec.sakura.ne.jp>