Re: widening ticks
- Reply: Tomoaki AOKI : "Re: widening ticks"
- In reply to: Tomoaki AOKI : "Re: widening ticks"
Date: Sat, 11 Jan 2025 23:00:12 UTC
On Sun, Jan 12, 2025 at 07:50:38AM +0900, Tomoaki AOKI wrote:
> On Sat, 11 Jan 2025 17:35:36 -0500
> Mark Johnston <markj@freebsd.org> wrote:
>
> > On Sun, Jan 12, 2025 at 04:35:43AM +0900, Tomoaki AOKI wrote:
> > > On Sat, 11 Jan 2025 11:34:06 -0500
> > > Mark Johnston <markj@freebsd.org> wrote:
> > >
> > > > On Sat, Jan 11, 2025 at 01:11:06PM +0900, Tomoaki AOKI wrote:
> > > > > On Wed, 8 Jan 2025 18:07:47 -0500
> > > > > Mark Johnston <markj@freebsd.org> wrote:
> > > > >
> > > > > > On Thu, Jan 09, 2025 at 12:18:48AM +0200, Konstantin Belousov wrote:
> > > > > > > On Wed, Jan 08, 2025 at 04:31:16PM -0500, Mark Johnston wrote:
> > > > > > > > The global "ticks" variable counts hardclock ticks, it's widely used in
> > > > > > > > the kernel for low-precision timekeeping.  The linuxkpi provides a very
> > > > > > > > similar variable, "jiffies", but there's an incompatibility: the former
> > > > > > > > is a signed int and the latter is an unsigned long.  It's not
> > > > > > > > particularly easy to paper over this difference, which has been
> > > > > > > > responsible for some nasty bugs, and modifying drivers to store the
> > > > > > > > jiffies value in a signed int is error-prone and a maintenance burden
> > > > > > > > that the linuxkpi is supposed to avoid.
> > > > > > > >
> > > > > > > > It would be nice to provide a compatible implementation of jiffies.  I
> > > > > > > > can see a few approaches:
> > > > > > > > - Define a 64-bit ticks variable, say ticks64, and make hardclock()
> > > > > > > >   update both ticks and ticks64.  Then #define jiffies ticks64 on 64-bit
> > > > > > > >   platforms.  This is the simplest to implement, but it adds extra work
> > > > > > > >   to hardclock() and is somewhat ugly.
> > > > > > > > - Make ticks an int64_t or a long and convert our native code
> > > > > > > >   accordingly.  This is cleaner but requires a lot of auditing to avoid
> > > > > > > >   introducing bugs, though perhaps some code could be left unmodified,
> > > > > > > >   implicitly truncating the value to an int.  For example I think
> > > > > > > >   sched_pctcpu_update() is fine.  I've gotten an amd64 kernel to compile
> > > > > > > >   and boot with this change, but it's hard to be confident in it.  This
> > > > > > > >   approach also has the potential downside of bloating structures that
> > > > > > > >   store a ticks value, and it can't be MFCed.
> > > > > > > > - Introduce a 64-bit ticks variable, ticks64, and
> > > > > > > >   #define ticks ((int)ticks64).  This requires renaming any struct
> > > > > > > >   fields and local vars named "ticks", of which there's a decent number,
> > > > > > > >   but that can be done fairly mechanically.
> > > > > > > >
> > > > > > > > Is there another solution which avoids these pitfalls?  If not, should
> > > > > > > > we go ahead with one of these approaches?  If so, which one?
> > > > > > > >
> > > > > > > You cannot do this in C, but can in asm:
> > > > > > > 	.data
> > > > > > > 	.globl	ticksl, ticks
> > > > > > > 	.type	ticksl, @object
> > > > > > > 	.type	ticks, @object
> > > > > > > ticksl:	.quad
> > > > > > > 	.size	ticksl, 8
> > > > > > > ticks	=ticksl	/* for little-endian */
> > > > > > > /* ticks	=ticksl + 4 for big-endian */
> > > > > > > 	.size	ticks, 4
> > > > > > >
> > > > > > >
> > > > > > > Then update only ticksl in the hardclock().
> > > > > >
> > > > > > I implemented your suggestion here: https://reviews.freebsd.org/D48383
> > > > >
> > > > > As this is already committed to main, commenting here instead of review
> > > > > D48383.
> > > > >
> > > > > Maybe I'm too paranoid and overlooking something, but...
> > > > >
> > > > > *If "jiffies" in LinuxKPI is really unsigned, isn't there any
> > > > >  possibilities that relies on its value to be larger than
> > > > >  0x7fffffffffffffff as a threshold?
> > > > >  (Yes, it should be silly and non-realistic, but theoretically
> > > > >  possible.)
> > > >
> > > > Ideally we would have
> > > >
> > > > 	#define jiffies ((unsigned long)ticksl)
> > > >
> > > > in the linuxkpi, but some Linux code uses "jiffies" as a struct field or
> > > > local variable name, so this doesn't quite work.
> > > >
> > > > In practice, the value is usually assigned to an unsigned long or used
> > > > as an operand where it would be implicitly promoted to an unsigned type,
> > > > so we don't see any incompatibilities.
> > > >
> > > > When jiffies is an int, code like the following can misbehave:
> > > >
> > > > 	unsigned long remain, timeout = jiffies + const;
> > > > 	...
> > > > 	remain = timeout - jiffies;
> > > > 	if ((long)remain < 0)
> > > > 		/* timed out */
> > > >
> > > > If (int)timeout and jiffies have different signs, as might happen close
> > > > to a rollover, the comparison won't work as expected.
> > > >
> > > > Linux has some macros (time_after() etc.) which are supposed to be used
> > > > instead of direct comparisons, but they're not always used.
> > >
> > > So ticksl should better be unsigned long if there's no reason to keep
> > > it signed, isn't it?
> >
> > Well, I kept it signed since it's meant to be similar in usage to ticks.
> > With a signed counter, you can test whether a value has passed by
> > looking at the sign of the difference between ticks(l) and that value
> > (modulo rollover).  With an unsigned counter, you need some casting, as
> > in the example above.
> >
> > > > > *Is anywhere checking carry (sign) bit for int on LP32?
> > > > >  Maybe it would be the reason if "jiffies" in LinuxKPI is really
> > > > >  unsigned.
> > > >
> > > > Could you provide an example of what you mean?
> > >
> > > Not an example of code, but for example, when ticksl is at
> > > 0x7fffffffffffffff (positive value), ticks should be 0xffffffff
> > > (negative value), if I read the diff correctly.
> > > The same thing starts happening when ticksl is at 0x0000000080000000 through
> > > 0x00000000ffffffff and values alike.  So signs (carry bits, usually the
> > > leftmost bit of each) should be checked separately for ticksl and ticks.
> >
> > That's true, but I can't see why any code would care about this?
>
> While ticks is defined as (signed) int, it should be turnaround when it
> reaches at 0x7fffffff (as incrementing it causes overflow).
> Is ticks allowed to be minus value? My guess is that it is monotonic
> counter.

Yes, INT_MAX ticks elapse in approximately 25 days at 1000Hz.  In fact,
ticks is initialized to INT_MAX - <small number> in subr_param.c so that
it wraps around shortly after boot, after which it is negative.  Kernel
code should not care about the sign of ticks.
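
To illustrate the last point, here is a minimal userland sketch (not
kernel code) of a deadline check that looks only at the sign of the
difference, so it keeps working across the wrap.  The names
sim_ticks_add(), sim_deadline_passed() and sim_wrap_demo() are
hypothetical, and the offset of 10 is just an assumed stand-in for
<small number>.  The arithmetic goes through unsigned int so that the
wraparound is well defined in portable C.

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical stand-in for advancing a ticks-like counter by n ticks.
 * The addition is done in unsigned arithmetic, where wraparound is well
 * defined; converting back to int is implementation-defined in ISO C
 * but wraps on two's-complement compilers such as clang and gcc.
 */
static int
sim_ticks_add(int t, unsigned int n)
{
	return ((int)((unsigned int)t + n));
}

/*
 * Nonzero once "now" has reached "deadline".  Only the sign of the
 * difference is examined, never the sign of either operand.
 */
static int
sim_deadline_passed(int now, int deadline)
{
	return ((int)((unsigned int)now - (unsigned int)deadline) >= 0);
}

/* Walks a counter across the INT_MAX wrap and checks the comparison. */
static void
sim_wrap_demo(void)
{
	int now = INT_MAX - 10;			/* assumed boot-time value */
	int deadline = sim_ticks_add(now, 100);	/* lands past the wrap */

	assert(deadline < 0);			/* deadline is negative */
	assert(!sim_deadline_passed(now, deadline));

	now = sim_ticks_add(now, 50);		/* counter has wrapped */
	assert(now < 0);
	assert(!sim_deadline_passed(now, deadline));

	now = sim_ticks_add(now, 60);		/* 110 ticks after start */
	assert(sim_deadline_passed(now, deadline));
}
```

Calling sim_wrap_demo() from a main() should run without tripping any
assertion: the counter and the deadline both go negative along the way,
and the comparison is unaffected.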