system time instability

Hrant Dadivanyan hrant at dadivanyan.net
Tue Dec 13 18:17:11 UTC 2016


> On Mon, Dec 12, 2016 at 11:33:59PM +0400, Hrant Dadivanyan wrote:
> > > On Mon, Dec 12, 2016 at 11:04:08PM +0400, Hrant Dadivanyan wrote:
> > > > Now, when you ask, I start to suspect PPS delivery to uart again - cable
> > > > and amplifier, but can't understand how the 100ppm error fits into that.
> > > 
> > > If you disable PPS sync in ntp config, does the machine keep time adequately ?
> > > 
> > 
> > Thanks for reminding - yes, I've tried this as well, the issue persists.
> > So uart shouldn't be in charge.
> > 

This statement seems to be wrong; see below.

> > > There might be relatively long pauses when system management mode handlers
> > > do something in response to hw events.  E.g. if you have USB emulation of
> > > AT keyboard enabled in BIOS, try to disable that.  And update the BIOS.
> > 
> > The USB is switched off in the BIOS. I've removed all changes in sysctl.conf
> > and nice flag from ntpd, recompiled kernel as following:
> > include         GENERIC
> > options         PPS_SYNC
> > device          pf
> > device          pflog
> > and started over. Dmesg is attached.
> > 
> Please show verbose dmesg.
> 

I've updated the BIOS to the latest version. Verbose dmesg is attached.

> > CPU: Intel(R) Core(TM)2 Duo CPU     E4500  @ 2.20GHz (2194.55-MHz K8-class CPU)
> 
> This is relatively old CPU which is known to have some (minor) issues with
> interaction between power saving and cores.  Try the following OS config:
> disable deep C states, allow only C1 (there might be some tweaks in BIOS,
> if possible, disable the Cn, n > 1, there too);

I have never touched Cx states on servers; they were disabled in the BIOS and
sysctl shows C1 as the lowest. Now I've enabled them in the BIOS, but haven't
changed anything in the OS:
dev.cpu.1.cx_usage: 100.00% last 31000us
dev.cpu.1.cx_lowest: C1
dev.cpu.1.cx_supported: C1/1/0
dev.cpu.0.cx_usage: 100.00% last 5569us
dev.cpu.0.cx_lowest: C1
dev.cpu.0.cx_supported: C1/1/0
dev.cpu.0.freq_levels: 2200/35000 2000/31000 1800/27000 1600/23000 1400/19000 1200/16000
dev.cpu.0.freq: 2200
Is this correct?
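
One way to check the C-state output above mechanically is to parse the
dev.cpu.N.cx_supported string and confirm that nothing deeper than C1 is
offered. The sketch below is only an illustration: it assumes each entry has
the layout name/type/latency (latency taken to be in microseconds), and the
names CxState, parse_cx_supported and deepest_state are hypothetical helpers,
not part of any FreeBSD tool.

```python
from typing import List, NamedTuple


class CxState(NamedTuple):
    name: str        # e.g. "C1"
    acpi_type: int   # assumed: ACPI C-state type
    latency_us: int  # assumed: transition latency in microseconds


def parse_cx_supported(value: str) -> List[CxState]:
    """Split a cx_supported string such as "C1/1/0" into its entries."""
    states = []
    for token in value.split():
        name, acpi_type, latency = token.split("/")
        states.append(CxState(name, int(acpi_type), int(latency)))
    return states


def deepest_state(states: List[CxState]) -> str:
    """Return the name of the highest-numbered (deepest) C-state offered."""
    return max(states, key=lambda s: int(s.name[1:])).name


# Value copied from the sysctl output in this post.
states = parse_cx_supported("C1/1/0")
print(states)
print("deepest state offered:", deepest_state(states))
```

If the deepest state reported is C1 and cx_lowest is also C1, the OS has no
deeper state to enter, which matches the "allow only C1" suggestion.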

> use LAPIC for event timer (not HPET);

I have disabled HPET in the BIOS:
kern.eventtimer.periodic: 0
kern.eventtimer.timer: LAPIC
kern.eventtimer.idletick: 0
kern.eventtimer.singlemul: 2
kern.eventtimer.choice: LAPIC(400) i8254(100) RTC(0)
kern.eventtimer.et.i8254.quality: 100
kern.eventtimer.et.i8254.frequency: 1193182
kern.eventtimer.et.i8254.flags: 1
kern.eventtimer.et.RTC.quality: 0
kern.eventtimer.et.RTC.frequency: 32768
kern.eventtimer.et.RTC.flags: 17
kern.eventtimer.et.LAPIC.quality: 400
kern.eventtimer.et.LAPIC.frequency: 99751860
kern.eventtimer.et.LAPIC.flags: 15

> re-check that you use RDTSC for the timecounter;

kern.timecounter.tsc_shift: 1
kern.timecounter.smp_tsc_adjust: 0
kern.timecounter.smp_tsc: 1
kern.timecounter.invariant_tsc: 1
kern.timecounter.fast_gettime: 1
kern.timecounter.tick: 1
kern.timecounter.choice: TSC-low(1000) ACPI-fast(900) i8254(0) dummy(-1000000)
kern.timecounter.hardware: TSC-low
kern.timecounter.alloweddeviation: 5
kern.timecounter.stepwarnings: 0
kern.timecounter.tc.TSC-low.quality: 1000
kern.timecounter.tc.TSC-low.frequency: 1097249250
kern.timecounter.tc.TSC-low.counter: 1335765171
kern.timecounter.tc.TSC-low.mask: 4294967295
kern.timecounter.tc.ACPI-fast.quality: 900
kern.timecounter.tc.ACPI-fast.frequency: 3579545
kern.timecounter.tc.ACPI-fast.counter: 6343821
kern.timecounter.tc.ACPI-fast.mask: 16777215
kern.timecounter.tc.i8254.quality: 0
kern.timecounter.tc.i8254.frequency: 1193182
kern.timecounter.tc.i8254.counter: 2701
kern.timecounter.tc.i8254.mask: 65535
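
As a sanity check on the timecounter values above, the arithmetic can be
sketched as follows. It assumes that a counter wraps after (mask + 1) ticks
and that the TSC-low frequency is the raw TSC frequency shifted right by
kern.timecounter.tsc_shift; the function names are made up for this example.

```python
# (frequency in Hz, mask) pairs copied from the sysctl output in this post.
COUNTERS = {
    "TSC-low": (1097249250, 4294967295),
    "ACPI-fast": (3579545, 16777215),
    "i8254": (1193182, 65535),
}
TSC_SHIFT = 1  # kern.timecounter.tsc_shift


def wrap_seconds(frequency_hz: int, mask: int) -> float:
    """Seconds until a free-running counter wraps, assuming mask + 1 ticks."""
    return (mask + 1) / frequency_hz


def raw_tsc_hz(tsc_low_hz: int, shift: int) -> int:
    """Undo the assumed right shift to estimate the raw TSC frequency."""
    return tsc_low_hz << shift


for name, (freq, mask) in COUNTERS.items():
    print(f"{name}: wraps about every {wrap_seconds(freq, mask):.4f} s")

tsc_hz = raw_tsc_hz(COUNTERS["TSC-low"][0], TSC_SHIFT)
print(f"estimated raw TSC frequency: {tsc_hz / 1e6:.4f} MHz")
```

Under these assumptions the estimated raw TSC frequency comes out close to the
2194.55 MHz that dmesg reports for the CPU, so TSC-low looks consistent with
the hardware.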

> do not enable powerd.
> 

Never did on servers.

> You might also try the stable/11 kernel, which has more changes WRT C-states
> handling and PPS/ntp locking.
> 

The server has now run for almost a day without PPS and looks stable. I am
starting to believe, to my shame, that I made a mistake when testing this
previously. In that case the whole post is wrong and the cable is the prime
suspect again. Even now it's hard to understand this wrong behaviour, but
anyway ...
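
For scale, the 100ppm error mentioned earlier in the thread can be converted
into accumulated drift. This is plain arithmetic, not a measurement from this
machine; drift_seconds is a hypothetical helper name.

```python
def drift_seconds(ppm: float, interval_s: float) -> float:
    """Clock drift accumulated over interval_s at a constant error of ppm."""
    return ppm / 1_000_000 * interval_s


# Drift between two consecutive PPS pulses (1 s apart) and over a full day.
print(f"per second: {drift_seconds(100, 1) * 1e6:.0f} us")
print(f"per day:    {drift_seconds(100, 86400):.2f} s")
```

A constant 100ppm error is therefore roughly 100 microseconds per PPS interval
and several seconds per day if left uncorrected.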

I have just replaced the cable with a shielded one in which each pair has its
own shield, used a dedicated pair for PPS and ground, and grounded the shields.

Thank you Konstantin, thank you Ian!
Hrant

-- 
Hrant Dadivanyan (aka Ran d'Adi)		hrant(at)dadivanyan.net
/* "Feci quod potui, faciant meliora potentes." */       ran(at)psg.com
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: dmesg.v
URL: <http://lists.freebsd.org/pipermail/freebsd-hackers/attachments/20161213/f2f0ed27/attachment.ksh>

