clock problem
Matthew Dillon
dillon at apollo.backplane.com
Fri May 11 21:37:27 UTC 2007
:One of our customers has 6 GPS-locked NTP servers. Only problem is
:that two of them are reporting a time that is exactly one second
:different to the other four. You shouldn't rely solely on your
:GPS or DCF receiver - use it as the primary source but have some
:secondary sources for sanity checks. (From experience, I can state
:that ntpd does not behave well when presented with two stratum 1
:servers that differ by 1 second).
:
:--
:Peter Jeremy
ntpd will also become really unhappy when chunky time slips occur
or if the skew rate is more than a few hundred ppm. It will also blow
up if it loses the network link for a long period of time. It will just
give up and stop making corrections entirely, even after the link is
restored. This is particularly true when it is used over a dialup
(I did that for over a year in 1997, so I can tell you how
badly it works).
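
To put "a few hundred ppm" in perspective, here is a minimal arithmetic sketch of how much wall-clock time a given skew rate adds up to per day. The function name and the sample skew values are illustrative assumptions, not anything taken from ntpd itself.

```python
SECONDS_PER_DAY = 86_400


def drift_seconds_per_day(skew_ppm: float) -> float:
    """Seconds gained or lost per day by a clock that is off by skew_ppm."""
    return skew_ppm * 1e-6 * SECONDS_PER_DAY


# Illustrative skew rates only; they are not measurements from any machine.
for ppm in (50, 200, 500):
    print(f"{ppm:4d} ppm -> {drift_seconds_per_day(ppm):6.2f} s/day")
```

A clock that is off by a few hundred ppm therefore drifts by tens of seconds per day if nothing corrects it.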
A slow time slip over a day could still be chunky, which would imply
lost interrupts. Determining whether the problem is due to an 8254
rollover or lost hardclock interrupts is easy... just set 'hz' to
something really high, like 20000, and see if your time goes crazy.
If it does, then you have your culprit.
I don't know if those bugs are still present in FreeBSD, but I do
remember that I had to redo all the timekeeping in DragonFly because
lost interrupts from high 'hz' settings were causing timekeeping to
go nuts. That turned out to mainly be due to the same 8254 timer being
used to generate the hardclock interrupt AND handle time keeping.
That is, at high hz settings one was not getting the full 1/18 second
benefit from the timer. You just can't do that... it doesn't work.
It is almost 100% guaranteed to result in a bad time base.
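
The "1/18 second" figure comes from the 8254's 16-bit counter running off a nominal 1.193182 MHz input clock. The sketch below works out how long the counter runs before it wraps at a full cycle versus when it is reloaded to produce 'hz' interrupts per second. The helper names are hypothetical, and this is only the arithmetic, not how any kernel programs the chip.

```python
PIT_HZ = 1_193_182    # nominal 8254 input clock, in Hz
FULL_CYCLE = 65_536   # counts in a full 16-bit cycle


def reload_value(hz: int) -> int:
    """Counter reload value giving roughly hz interrupts per second."""
    return round(PIT_HZ / hz)


def wrap_period_ms(counts: int) -> float:
    """Time the counter runs before wrapping, in milliseconds."""
    return counts / PIT_HZ * 1000.0


print(f"full cycle: {FULL_CYCLE} counts, {wrap_period_ms(FULL_CYCLE):.2f} ms")
for hz in (100, 1000, 20000):
    counts = reload_value(hz)
    print(f"hz={hz:5d}: {counts:5d} counts, {wrap_period_ms(counts):.3f} ms")
```

At hz=20000 the counter wraps roughly every 50 microseconds, so an interrupt that is held off for longer than that cannot be told apart from one that arrived on time, and the time it represented is lost.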
It is easy to test: just set kern.hz in the boot env, reboot,
and see if things blow up or not. Time keeping should be stable
regardless of what hz is set to (proviso: never set hz less than 100).
Unfortunately, all the timebases in the system have their own quirks.
Blame the hardware manufacturers. The 8254 timer 0 is actually the
MOST consistent of the lot, with the ACPI timer coming a close second.

TSC             Haha. Good luck. Nice wide timer, easy to read,
                but any power savings mode, including the failsafe
                modes that Intel has when a CPU overheats, will
                probably blow it up. Because of that it is not
                really a good idea to use it as a timebase. I shake
                my fist at Intel! $#%$#%$#%

ACPI timer      Despite the hardware bugs this almost always works
                as a timebase, but sometimes the frequency changes
                when the CPU goes into power savings mode or EST,
                and sometimes the frequency is something other
                than what it is supposed to be.

8254 timer 0    Almost always works as a timebase, but only if
                not also used to generate high-speed interrupts
                (because interrupts are lost easily). Set it to
                a full cycle (1/18 second) and you will be fine.
                Set it to anything else and you will lose interrupts.
                The BIOS will sometimes mess with timer 0, but not
                as often as it messes with timer 2.

8254 timer 1    Sometimes works as a time base, but can lock older
                machines up. Can even lock up newer machines.
                Why? Because hardware manufacturers are idiots.

8254 timer 2    Often can be used as a time base, but video BIOS
                calls often try to use it too. #@%$#%$# BIOS makers!
                Still, this is better than losing interrupts when
                timer 0 is set to high speed, so DragonFly uses
                timer 2 for its timebase as a default until the
                ACPI timer becomes available, with a boot option
                to use timer 1 instead. Using timer 2 as a time
                base means you don't get motherboard speaker sound
                (the old beep beep BEEP!). Do I care? No.

LAPIC timer     Dunno. Probably best to use it as a high speed
                clock interrupt, which would free 8254 timer 0 to
                use as a time base.

RTC interrupt   Basically unusable. Stable, but doesn't have
                sufficient resolution to be helpful and takes
                forever to read.
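
As a rough illustration of why a free-running counter makes a workable timebase only if it is sampled at least once per wrap, here is a minimal sketch that extends a 16-bit counter into a running total. The class name is hypothetical, the counter is assumed to count upward for simplicity (the real 8254 counts down), and this is not DragonFly's or FreeBSD's actual code.

```python
class WrappingTimebase:
    """Extend a free-running 16-bit counter into a running total.

    Hypothetical helper for illustration; assumes an up-counting counter.
    """

    MASK = 0xFFFF

    def __init__(self, first_reading: int) -> None:
        self.last = first_reading & self.MASK
        self.total = 0

    def update(self, reading: int) -> int:
        """Fold a new raw reading into the running total and return it."""
        reading &= self.MASK
        # Modular subtraction absorbs at most one wrap between samples.
        self.total += (reading - self.last) & self.MASK
        self.last = reading
        return self.total


# Sampled more often than one full cycle: the total tracks the true count.
tb = WrappingTimebase(0)
for true_count in (30_000, 60_000, 90_000):
    tb.update(true_count & 0xFFFF)
print("sampled in time:", tb.total)

# One sample taken after more than a full cycle: a whole cycle goes missing.
late = WrappingTimebase(0)
late.update(100_000 & 0xFFFF)
print("sampled late:   ", late.total, "of 100000 true counts")
```

The shorter the wrap period, the tighter the deadline for each sample, which is one way to see why a counter left at its full cycle is the more forgiving choice.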
-Matt
Matthew Dillon
<dillon at backplane.com>
More information about the freebsd-stable
mailing list