host, bhyve vm and ntpd
Boris Samorodov
bsam at passap.ru
Fri Oct 20 17:20:15 UTC 2017
(CC to freebsd-virtualization@)
20.10.2017 19:32, Ian Lepore пишет:
> On Fri, 2017-10-20 at 18:36 +0300, Boris Samorodov wrote:
>> 20.10.2017 18:31, Boris Samorodov пишет:
>>>
>>> 20.10.2017 18:12, Ian Lepore пишет:
>>>>
>>>> On Fri, 2017-10-20 at 14:46 +0300, Boris Samorodov wrote:
>>>>>
>>>>> Hi All,
>>>>>
>>>>> I have got a host:
>>>>> ---
>>>>> bhyve-host% uname -a
>>>>> FreeBSD sm.bsnet 12.0-CURRENT FreeBSD 12.0-CURRENT #3 r322868: Fri Aug
>>>>> 25 05:25:26 MSK 2017
>>>>> bsam at builder.bsnet:/usr/obj/usr/src/sys/GENERIC-FAST amd64 amd64
>>>>> ---
>>>>>
>>>>> And a bhyve vm:
>>>>> ---
>>>>> bhyve-vm: uname -a
>>>>> FreeBSD builder.bsnet 12.0-CURRENT FreeBSD 12.0-CURRENT #58 r324782: Fri
>>>>> Oct 20 05:12:17 MSK 2017
>>>>> bsam at builder.bsnet:/usr/obj/usr/src/sys/PKG64X amd64 amd64
>>>>> ---
>>>>>
>>>>> The only difference at kernel configs is a colored console. :-)
>>>>>
>>>>> And here I get some weird (is it?) result at the VM (I expect ntpd to be
>>>>> more stable):
>>>>> ---
>>>>> bhyve-vm% for t in `jot 10`; do ntpq -pn; sleep 64; done
>>>>> remote refid st t when poll reach delay offset
>>>>> jitter
>>>>> ==============================================================================
>>>>> XX.XX.XX.1 XX.XX.XX.245 4 u 9 64 3 0.605 -1.202
>>>>> 316.407
>>>>> XX.XX.XX.1 XX.XX.XX.245 4 u 7 64 7 0.605 -1.202
>>>>> 358.395
>>>>> *XX.XX.XX.1 XX.XX.XX.245 4 u 5 64 17 0.615 -328.42
>>>>> 181.405
>>>>> *XX.XX.XX.1 XX.XX.XX.245 4 u 3 64 37 0.615 -328.42
>>>>> 214.868
>>>>> *XX.XX.XX.1 XX.XX.XX.245 4 u 67 64 37 0.615 -328.42
>>>>> 214.868
>>>>> *XX.XX.XX.1 XX.XX.XX.245 4 u 63 64 77 0.615 -328.42
>>>>> 268.618
>>>>> *XX.XX.XX.1 XX.XX.XX.245 4 u 60 64 177 0.615 -328.42
>>>>> 333.175
>>>>> XX.XX.XX.1 .STEP. 16 u 1910 64 0 0.000 0.000
>>>>> 0.000
>>>>> XX.XX.XX.1 XX.XX.XX.245 4 u 27 64 1 0.703 -262.63
>>>>> 0.004
>>>>> XX.XX.XX.1 XX.XX.XX.245 4 u 31 64 1 0.649 -331.43
>>>>> 68.800
>>>>> ---
>>>>>
>>>>> At the same time host's results are very stable:
>>>>> ---
>>>>> bhyve-host% for t in `jot 10`; do ntpq -pn; sleep 64; done
>>>>> remote refid st t when poll reach delay offset
>>>>> jitter
>>>>> ==============================================================================
>>>>>
>>>>>
>>>>>
>>>>> *XX.XX.XX.1 XX.XX.XX.245 4 u 1 64 1 0.401 0.176
>>>>> 0.106
>>>>> *XX.XX.XX.1 XX.XX.XX.245 4 u 6 64 3 0.401 0.176
>>>>> 0.459
>>>>> *XX.XX.XX.1 XX.XX.XX.245 4 u 3 64 7 0.401 0.176
>>>>> 0.940
>>>>> *XX.XX.XX.1 XX.XX.XX.245 4 u 67 64 7 0.401 0.176
>>>>> 0.940
>>>>> *XX.XX.XX.1 XX.XX.XX.245 4 u 64 64 17 0.401 0.176
>>>>> 1.566
>>>>> *XX.XX.XX.1 XX.XX.XX.245 4 u 60 64 37 0.448 1.275
>>>>> 1.739
>>>>> *XX.XX.XX.1 XX.XX.XX.245 4 u 55 64 77 0.448 1.275
>>>>> 2.365
>>>>> *XX.XX.XX.1 XX.XX.XX.245 4 u 53 64 177 0.448 1.275
>>>>> 3.110
>>>>> *XX.XX.XX.1 XX.XX.XX.245 4 u 50 64 377 0.448 1.275
>>>>> 3.929
>>>>> *XX.XX.XX.1 XX.XX.XX.245 4 u 45 64 377 0.443 8.750
>>>>> 4.722
>>>>> ---
>>>>>
>>>>> The network is organized via bridge -- host igb and vm tap interfaces
>>>>> are members of one bridge.
>>>>>
>>>>> Are those results expected? Does it smell like a bug? Should I dig
>>>>> furter?
>>>>>
>>>> So it is repeatedly stepping the clock in the VM? (Set
>>>> kern.timecounter.stepwarnings=1 to log steps).
>>> No kernel/ntpd messages for 20 minutes after setting this sysctl.
>>>
>>>>
>>>> That is usually a sign
>>>> that the chosen timecounter is running at a different frequency than it
>>>> claimed to be when it registered itself -- the host may not be
>>>> emulating the timer hardware properly in the guest. What is the output
>>>> of sysctl kern.timecounter in the vm?
>>> ---
>>> bhyve-vm% sysctl kern.timecounter
>>>
>>> kern.timecounter.tsc_shift: 1
>>> kern.timecounter.smp_tsc_adjust: 0
>>> kern.timecounter.smp_tsc: 0
>>> kern.timecounter.invariant_tsc: 1
>>> kern.timecounter.fast_gettime: 1
>>> kern.timecounter.tick: 1
>>> kern.timecounter.choice: ACPI-fast(900) HPET(950) i8254(0) TSC-low(-100)
>>> dummy(-1000000)
>>> kern.timecounter.hardware: HPET
>>> kern.timecounter.alloweddeviation: 5
>>> kern.timecounter.stepwarnings: 1
>>> kern.timecounter.tc.ACPI-fast.quality: 900
>>> kern.timecounter.tc.ACPI-fast.frequency: 3579545
>>> kern.timecounter.tc.ACPI-fast.counter: 4161213491
>>> kern.timecounter.tc.ACPI-fast.mask: 4294967295
>>> kern.timecounter.tc.HPET.quality: 950
>>> kern.timecounter.tc.HPET.frequency: 10000000
>>> kern.timecounter.tc.HPET.counter: 3518036865
>>> kern.timecounter.tc.HPET.mask: 4294967295
>>> kern.timecounter.tc.i8254.quality: 0
>>> kern.timecounter.tc.i8254.frequency: 1193182
>>> kern.timecounter.tc.i8254.counter: 47597
>>> kern.timecounter.tc.i8254.mask: 65535
>>> kern.timecounter.tc.TSC-low.quality: -100
>>> kern.timecounter.tc.TSC-low.frequency: 1199886114
>>> kern.timecounter.tc.TSC-low.counter: 1274338278
>>> kern.timecounter.tc.TSC-low.mask: 4294967295
>>> ---
>> BTW, here is the host kern.timecounter:
>> ---
>> kern.timecounter.tsc_shift: 1
>> kern.timecounter.smp_tsc_adjust: 0
>> kern.timecounter.smp_tsc: 1
>> kern.timecounter.invariant_tsc: 1
>> kern.timecounter.fast_gettime: 1
>> kern.timecounter.tick: 1
>> kern.timecounter.choice: ACPI-fast(900) HPET(950) i8254(0) TSC-low(1000)
>> dummy(-1000000)
>> kern.timecounter.hardware: TSC-low
>> kern.timecounter.alloweddeviation: 5
>> kern.timecounter.stepwarnings: 0
>> kern.timecounter.tc.ACPI-fast.quality: 900
>> kern.timecounter.tc.ACPI-fast.frequency: 3579545
>> kern.timecounter.tc.ACPI-fast.counter: 9047194
>> kern.timecounter.tc.ACPI-fast.mask: 16777215
>> kern.timecounter.tc.HPET.quality: 950
>> kern.timecounter.tc.HPET.frequency: 14318180
>> kern.timecounter.tc.HPET.counter: 2232552795
>> kern.timecounter.tc.HPET.mask: 4294967295
>> kern.timecounter.tc.i8254.quality: 0
>> kern.timecounter.tc.i8254.frequency: 1193182
>> kern.timecounter.tc.i8254.counter: 43410
>> kern.timecounter.tc.i8254.mask: 65535
>> kern.timecounter.tc.TSC-low.quality: 1000
>> kern.timecounter.tc.TSC-low.frequency: 1200067168
>> kern.timecounter.tc.TSC-low.counter: 2463146362
>> kern.timecounter.tc.TSC-low.mask: 4294967295
>> ---
>>
>>>
>>>>
>>>> Also, what is the output of ntptime(8) in the vm?
>>> ---
>>> bhyve-vm% ntptime
>>>
>>> ntp_gettime() returns code 0 (OK)
>>> time dd94930f.20ea2900 Fri, Oct 20 2017 18:21:51.128, (.128573699),
>>> maximum error 1309110 us, estimated error 3 us, TAI offset 37
>>> ntp_adjtime() returns code 0 (OK)
>>> modes 0x0 (),
>>> offset 0.000 us, frequency 0.000 ppm, interval 1 s,
>>> maximum error 1309110 us, estimated error 3 us,
>>> status 0x2001 (PLL,NANO),
>>> time constant 6, precision 0.001 us, tolerance 496 ppm,
>>> ---
>>>
>>> Ian, thank you for your help!
>>>
>>
>
> It seems odd to me that the frequency of the host HPET is 14.3mhz and
> that of the guest is 10.0mhz, but maybe that's a normal condition for
> bhyve. I did find some google hits[1] for bhyve guest timekeeping
> trouble with the HPET timer which was solved by switching to a
> different timecounter. Timecounter choices can't be controlled from
> loader.conf, so I guess a sysctl.conf entry of
> kern.timecounter.hardware="ACPI-fast" is the only way to fix that. You
> can also just do that command interactively first and see if it stops
> the time steps and ntp settles down.
The process seems to become more monotonic. But steps nevertheless:
---
*XX.1 XX.245 4 u 20 64 1 0.717 -12.771 4.193
*XX.1 XX.245 4 u 28 64 3 0.751 -41.970 32.342
*XX.1 XX.245 4 u 23 64 7 0.748 -59.505 46.624
*XX.1 XX.245 4 u 18 64 17 0.699 -75.164 56.482
*XX.1 XX.245 4 u 14 64 37 0.669 -90.112 63.767
*XX.1 XX.245 4 u 11 64 77 0.605 -10.567 60.914
*XX.1 XX.245 4 u 7 64 177 0.591 -169.54 116.762
*XX.1 XX.245 4 u 3 64 377 0.591 -169.54 102.107
*XX.1 XX.245 4 u 68 64 377 0.591 -169.54 102.107
*XX.1 XX.245 4 u 63 64 377 0.591 -169.54 88.424
*XX.1 XX.245 4 u 58 64 377 0.591 -169.54 92.949
*XX.1 XX.245 4 u 55 64 377 0.591 -169.54 111.512
*XX.1 XX.245 4 u 50 64 377 0.591 -169.54 140.827
*XX.1 XX.245 4 u 45 64 377 0.591 -169.54 177.360
*XX.1 XX.245 4 u 43 64 377 0.591 -169.54 219.057
XX.1 .STEP. 16 u 588 64 0 0.000 0.000 0.000
---
> This would be a workaround, not a fix per se. If the time steps go
> away, then something in bhyve's emulation of HPET (maybe only on some
> hardware?) must be buggy.
>
>
> [1] https://lists.freebsd.org/pipermail/freebsd-virtualization/2015-April/003492.html
Also thanks for the link. Unfortunately the problem seems to persist.
--
WBR, Boris Samorodov (bsam)
FreeBSD Committer, http://www.FreeBSD.org The Power To Serve
More information about the freebsd-current
mailing list