64bit ticks, was Re: Changing p_swtime and td_slptime to ticks
Bruce Evans
brde at optusnet.com.au
Tue Sep 18 21:57:21 PDT 2007
On Tue, 18 Sep 2007, Julian Elischer wrote:
> Sam Leffler wrote:
>> Jeff Roberson wrote:
>>> On Tue, 18 Sep 2007, Jeff Roberson wrote:
>>>
>>>> On Tue, 18 Sep 2007, Andre Oppermann wrote:
>>>>> ...
>>>>> ticks is 2^31 on x86 and at HZ=1000 is wraps within a reasonable
>>>>> short uptime. You have to make sure that your code handles that
>>>>> correctly or you run into lots of strange effects which are almost
>>>>> impossible to reproduce. In TCP we've got bitten by that.
>>>>
>>>> Thanks Andre, this is a good point. For the td_slptime I don't think
>>>> it's of practical concern. However, for swtime I think I will convert it
>>>> then to seconds from boot.
IIRC, the patch only uses deltas of `ticks', so its wraparound doesn't
matter, provided you actually use it often enough. Often enough is
at least once every 497 days with HZ = 100, or just every 49.7 days
with HZ = 1000.
>>> Is there a good reason for not making ticks 64bit?
I think this would be only a small pessimization, unless accesses to
`ticks' are fully locked or made atomic. The clock interrupt handler
can interrupt almost anything, but almost nothing locks out clock
interrupts. The lock for the write access to `ticks' in hardclock() is
(rather bogusly) callout_lock, and sofclock() uses this lock for read
accesses without really trying, but generic code can't be expected to
know about this and in fact doesn't know about it -- `ticks' seems to
be used mainly in netinet where there is no callout_lock'ing in sight.
Without locking: with a 32-bit `ticks', read accesses not using atomic_read()
are probably atomic, so the race would just give an out of date value,
and this shouldn't matter -- the reader cannot tell the difference
between a value that is out of date when it is read or one that becomes
out of date 1 instruction after it is read; with a 64-bit `ticks' on
32-bit machines, read accesses would not be atomic and the value would
sometimes be garbage.
>>> math involving this
>>> value is relatively infrequent. Bruce? Any comments? It'd sure let us
>>> forget all of these counter wrapping problems.
>>
>> ticks is used a lot. I'd rather set hz back to 100 by default. This
>> approach is a perfect example of ignoring low-end platforms.
I agree. However, one variable isn't going to make much difference. fs
code is full of unecessary 64-bit calculations but no one notices the
real-time pessimization from this, at least on non-low-end platforms (I
only notice the code bloat).
> but it still needs to work on systems with high hz values.
Just changing its type won't fix all such problems, but will cause new
ones. E.g., in tcp_timer():
% int tick = ticks;
This will truncate `ticks'. Truncation is the right thing to do in most
places that only need a small delta, but...
% ...
% CTR6(KTR_NET, "%p %s inp %p active %x tick %i nextc %i",
% tp, __func__, inp, tt->tt_active, tick, tt->tt_nextc);
Easy to detect breakage of the printf format. Old printf format errors:
%p for types other than void *.
% ...
% if (tick < tt->tt_nextc)
% goto rescan;
Comparison of truncated value, OK (since both tick and tt_nextc have
type int and int overflow is benign on all supported machines). Types
should be u_int to avoid undefined behaviour on int overflow.
% ...
% if (tp->t_state != TCPS_TIME_WAIT &&
% (ticks - tp->t_rcvtime) <= tcp_maxidle)
% tcp_timer_activate(tp, TT_2MSL, tcp_keepintvl);
This is already broken on all supported machines, since t_rcvtime has
type u_long which is incompatible with the type of both `ticks' and
tcp_maxidle (both int). All the types should be truncated to a common
unsigned type to get defined behaviour that is not too hard to understand.
I think the current behaviour is benign overflow -- t_rcvtime is set
to an earlier value of `ticks', and the difference never grows very
large. However, changing `ticks' to 64 bits breaks this on non-64-bit
machines, -- `ticks' may grow large, but the copy of it in t_rcvtime
is limited to u_long = 32 bits on non-64-bit machines, so after about
2^32 ticks of uptime (ticks - tp->t_rcvtime) will always exceed
tcp_maxidle.
Bruce
More information about the freebsd-arch
mailing list