cvs commit: src/sys/net bpf.c
Bruce Evans
bde at zeta.org.au
Mon Jul 31 09:11:09 UTC 2006
On Tue, 25 Jul 2006, Jung-uk Kim wrote:
> On Tuesday 25 July 2006 03:01 pm, David Malone wrote:
>>
>> It sounds to me like a reasonable thing to do would be to pass up
>> a raw version of the timestamp (as returned by the hardware). We'd
>> also pass up the regular microtime() timestamp. You can then do any
>> postprocessing to synchronise timestamps later in userland?
>
> Nope. In that case, you actually need to export a few more things,
> i.e., current hardware timecounter value, clock frequency, size of
> the timecounter, etc. Even then, it's going to be hard to get a
> correct timeval without exposing a few kernel internals.
Synchronization is so hard to do that it is not done even in the kernel
where all the variables are directly accessible (modulo locking), even
for cases that are much more important. E.g.:
- synchronization of TSCs across CPUs. This might require IPIs so it
might be very inefficient. If all CPUs are driven by the same
hardware clock then they might stay in sync even when the clock is
throttled. Then IPIs would not be needed and the synchronization
problems reduce to the next one. Otherwise it is difficult to
keep the TSCs perfectly in sync even with IPIs and the next problem
might need to be solved anyway (to keep the TSCs in sync with
something).
- synchronization of TSCs (or other efficient but possibly unstable
timecounters) with "higher" quality timecounters (ones that are
inefficient but possibly more stable). Before timecounters or SMP
or much CPU throttling, the i386 TSC was synced with the i8254 on
every clock tick. This worked OK, but was missing recalibration of
the TSC and smoothing of jumps at sync points, and with CPU throttling
recalibration is necessary else the jumps could be very large and
remain large. Now there is some synchronization of "cpu ticks" with
the active timecounter. This is missing almost the opposite things
-- it has recalibration, and it doesn't need smoothing of jumps, but
only because it doesn't have the jumps necessary for synchronization.
- synchronization of timecounters with themselves. The get*time()
functions are not properly synchronized with the non-get versions,
although this breaks the "get" versions, because proper synchronization
would be less efficient and/or more complicated. Synchronization only
occurs every few msec in tc_windup(), but this is not enough for
proper synchronization. E.g., timestamps made using time_second (as
most file systems do) can be more than 1 second in the past relative
to the current time, since updates of time_second are normally delayed
by several msec. Userland can see this bug using code like
"now = time(NULL); utimes(file, NULL); stat(file, &sb);
assert(sb.st_mtime >= now);" -- time(3) uses microtime(9) and correctly
rounds to seconds, while utimes(2) normally uses time_second which
is the current time incorrectly rounded to seconds. I used to fix
this in the non-SMP case by syncing time_second and other offsets
in every call to a non-get function, using hackish locking that
only works in the non-SMP case.
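
The userland check for the time_second case can be written out as a small
C helper. This is only a sketch: mtime_lag() is a hypothetical name, not
anything in the tree, and the caller picks a scratch file. It assumes
only time(3), utimes(2) and stat(2) as described above.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/time.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: touch `path' and report how far its mtime lags
 * the time(3) value read just before the touch.  Returns 0 and stores
 * (now - st_mtime) in *lag on success, -1 if a system call fails.
 * A positive *lag means the file time is in the past relative to a
 * time that was read earlier.
 */
static int
mtime_lag(const char *path, long *lag)
{
	struct stat sb;
	time_t now;
	int fd;

	/* Make sure the file exists so that utimes(2) has a target. */
	fd = open(path, O_WRONLY | O_CREAT, 0600);
	if (fd == -1)
		return (-1);
	close(fd);

	now = time(NULL);
	if (utimes(path, NULL) == -1)
		return (-1);
	if (stat(path, &sb) == -1)
		return (-1);

	*lag = (long)(now - sb.st_mtime);
	return (0);
}
```

Calling this in a loop and checking for lag > 0 should eventually catch
the case where time(3) has already moved to the next second but the
file time has not. How often that happens depends on how stale the
cached seconds value is allowed to get.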
>>> Okay. But I am worried about timecounter <-> timeval conversion
>>> because I want to know timeval delta from system time, not just
>>> some timer value.
To get the delta, you would have to read the system time (not using a
"get" function) so things might be slower than just reading the system
time for everything. I think only cases where the hardware writes
timestamps using DMA are interesting (if the timestamps involve bus
accesses then they are likely to be slower than ACPI-"fast" ones which
are hundreds of times slower than TSC accesses on most systems). Then
the timestamps would have been made a relatively long time in the past
and you would prefer to know the system time at which they were made,
but it is impossible to know that time precisely. It is only possible
to compare with the current time. The comparison might not need to
be very precise but it should avoid obvious bugs like the ones for
file times:
now = time(NULL); assert(now >= packettime.tv_sec);
Hardware could easily make incoherent timestamps here and then the
system shouldn't just blindly convert them into negative deltas, etc.
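
A minimal sketch of such a check, in userland C for illustration:
packet_age() is a hypothetical helper, and both timevals are assumed to
be normalized (0 <= tv_usec < 1000000). It refuses to produce a negative
delta and tells the caller that the hardware timestamp was incoherent.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/time.h>

/*
 * Hypothetical helper: compute age = now - pkt.  If the packet
 * timestamp is later than `now', the timestamps are incoherent:
 * store a zero age and return false instead of a negative delta.
 */
static bool
packet_age(const struct timeval *now, const struct timeval *pkt,
    struct timeval *age)
{
	if (pkt->tv_sec > now->tv_sec ||
	    (pkt->tv_sec == now->tv_sec && pkt->tv_usec > now->tv_usec)) {
		age->tv_sec = 0;
		age->tv_usec = 0;
		return (false);
	}

	age->tv_sec = now->tv_sec - pkt->tv_sec;
	age->tv_usec = now->tv_usec - pkt->tv_usec;
	if (age->tv_usec < 0) {
		/* Borrow one second to keep tv_usec in range. */
		age->tv_sec--;
		age->tv_usec += 1000000;
	}
	return (true);
}
```

Clamping to zero is only one possible policy. Dropping the timestamp and
falling back to the system time would be another.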
Bruce