Re: ntpd fails on recent -current/arm64
- In reply to: Mark Millard : "RE: ntpd fails on recent -current/arm64"
Date: Sun, 23 Apr 2023 12:42:54 UTC
On Apr 23, 2023, at 05:29, Mark Millard <marklmi@yahoo.com> wrote:

> Peter Jeremy <peter_at_rulingia.com> wrote on
> Date: Sun, 23 Apr 2023 11:47:01 UTC :
>
>> Somewhere between c283016-g607bc91d90a3 and c283077-g7f658f99f7ed,
>
> I see the problem at main-n262371-72ef722b2a34-dirty (aarch64),
> in the middle of your range. So in the smaller range is:
>
> Thu, 20 Apr 2023
> . . .
> • git: 607bc91d90a3 - main - vmrun.sh: Fix a typo in usage() Mateusz Piotrowski
> • Re: git: 373b95976bce - main - netstat: document that PCB information can't be read from corefiles tuexen_at_freebsd.org
> • git: de1dde5dfea4 - main - network.subr: adjust regex for wlans_xxxxx rc.conf entries Bjoern A. Zeeb
> • Re: git: 8dcf3a82c54c - main - libc: Implement bsort(3) a bitonic type of sorting algorithm. Brooks Davis
> • git: 7db7bfe1a7b9 - main - iwlwifi: quieten more compiler warnings Bjoern A. Zeeb
> • git: 35f7fa4ac1ae - main - LinuxKPI: 802.11: improve assertion and tkip code Bjoern A. Zeeb
> • git: fdb987bebddf - main - inpcb: Split PCB hash tables Mark Johnston
> • git: 3e98dcb3d574 - main - inpcb: Move inpcb matching logic into separate functions Mark Johnston
> • git: 7b92493ab1d4 - main - inpcb: Avoid inp_cred dereferences in SMR-protected lookup Mark Johnston
> • git: 5fd1a67e885e - main - inpcb: Release the inpcb cred reference before freeing the structure Mark Johnston
> • git: dd9059b3e9a1 - main - makefs: set cd9660 Rock Ridge timestamps for . and .. Ed Maste
> • git: 0df4d8ad7a1b - main - Add jobs.mk to allow for target-jobs Simon J. Gerraty
> • git: d1f4c44aa8af - main - x86: Move i386 ppireg.h to x86 Dmitry Chagin
> • git: de4da6cd04bf - main - x86: Move i386 timerreg.h to x86 Dmitry Chagin
> • git: 8fe4f8f7a75f - main - Fix building host tools for host Simon J. Gerraty
> • git: bb8e8e230d94 - main - Revert "libc: Implement bsort(3) a bitonic type of sorting algorithm." Hans Petter Selasky
> • Re: git: bb8e8e230d94 - main - Revert "libc: Implement bsort(3) a bitonic type of sorting algorithm." Jessica Clarke
> • Re: git: bb8e8e230d94 - main - Revert "libc: Implement bsort(3) a bitonic type of sorting algorithm." Brooks Davis
> • git: 1a149d65baed - main - dtrace: get rid of uchar_t types Mark Johnston
> • git: 080e56a6c98c - main - dtrace: expose dtrace_instr_size() to userland and implement it for riscv Mark Johnston
> • git: 75081b9ed8e6 - main - dtrace: use dtrace_instr_size() in the riscv dtrace_subr.c Mark Johnston
> • git: 1fef7abdc76b - main - dtrace: add register bindings for RISC-V Mark Johnston
> • Re: git: bb8e8e230d94 - main - Revert "libc: Implement bsort(3) a bitonic type of sorting algorithm." Hans Petter Selasky
> • Re: git: bb8e8e230d94 - main - Revert "libc: Implement bsort(3) a bitonic type of sorting algorithm." Hans Petter Selasky
> • git: 47e888f8363d - main - Remove a few more references to riscv64sf. John Baldwin
> • git: 048606bec11f - main - perfmon(4): Use a C89 function definition for a SYSINIT. John Baldwin
> • git: bf043855213c - main - arm: Use C89 function declaration for db_read_bytes. John Baldwin
> • Re: git: c1e813d12309 - main - hwpmc: Correct selection of Intel fixed counters. Alexander Motin
> • git: 72ef722b2a34 - main - dpaa2: add console support for FDT based systems Bjoern A. Zeeb
> . . .
>
>> some change in the kernel has made ntpd stop working on my arm64 test
>> box. (My amd64 test box is a couple of days behind so I'm not sure if
>> it's arm-specific).
>>
>> What I've identified so far:
>> * The problem is in the kernel, not userland.
>
> See below for the truss output oddity of doing
> 2 sendto's (no recvfrom) instead of sendto then
> recvfrom. From a machine with an older context
> that works:
>
> . . .
> socket(PF_INET,SOCK_DGRAM,IPPROTO_UDP) = 3 (0x3)
> connect(3,{ AF_INET 127.0.0.1:123 },16) = 0 (0x0)
> sendto(3,"\^V\^A\0\^A\0\0\0\0\0\0\0\0",12,0,NULL,0) = 12 (0xc)
> select(4,{ 3 },0x0,0x0,{ 5.000000 }) = 1 (0x1)
> recvfrom(3,"\^V\M^A\0\^A\^F\^X\0\0\0\0\0\^X"...,516,0,NULL,0x0) = 36 (0x24)
> fstat(1,{ mode=crw--w---- ,inode=138,size=0,blksize=4096 }) = 0 (0x0)
> ioctl(1,TIOCGETA,0x735a71d98c58) = 0 (0x0)
>
>>
>> * The impact seems to be limited to ntpd (in particular, ntpdate works).
>> * ntpd appears to be correctly exchanging NTP packets with peers.
>> * ntpd is not responding to "ntpq -p" queries
>
> I noticed that "ntpq -c as" ends up doing:
>
> . . .
> open("/etc/services",O_RDONLY|O_CLOEXEC,0666) = 3 (0x3)
> fstat(3,{ mode=-rw-r--r-- ,inode=14579,size=72600,blksize=72704 }) = 0 (0x0)
> lseek(3,0x0,SEEK_CUR) = 0 (0x0)
> lseek(3,0x0,SEEK_SET) = 0 (0x0)
> read(3,"#\n# Network services, Internet "...,72704) = 72600 (0x11b98)
> close(3) = 0 (0x0)
> socket(PF_INET,SOCK_DGRAM,IPPROTO_UDP) = 3 (0x3)
> connect(3,{ AF_INET 127.0.0.1:123 },16) = 0 (0x0)
> sendto(3,"\^V\^A\0\^A\0\0\0\0\0\0\0\0",12,0,NULL,0) = 12 (0xc)
> select(4,{ 3 },0x0,0x0,{ 5.000000 }) = 0 (0x0)
> sendto(3,"\^V\^A\0\^A\0\0\0\0\0\0\0\0",12,0,NULL,0) = 12 (0xc)
> select(4,{ 3 },0x0,0x0,{ 5.000000 }) = 0 (0x0)
> localhost: timed out, nothing received
> write(2,"localhost: timed out, nothing re"...,39) = 39 (0x27)
> ***Request timed out
> write(2,"***Request timed out\n",21) = 21 (0x15)
> exit(0x0) process exit, rval = 0

I should not have written:

> Note the: socket(PF_INET,SOCK_DGRAM,IPPROTO_UDP) = 3 (0x3)
>
> I've no use of PF set up.

It just confuses things because the older context also makes
that call. It is really what follows (sendto->sendto vs.
sendto->recvfrom) that matters from what I can tell. Sorry
for the misdirection.
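
For anyone who wants to check the sendto/select/recvfrom pattern without ntpq itself, the following is a rough Python sketch that mimics the syscall sequence in the truss output. It is not ntpq's actual code. The 12 request bytes are copied from the sendto lines above (`\^V` is 0x16, `\^A` is 0x01); the names `probe` and `REQUEST`, the two-attempt default, and the 516-byte receive size (taken from the recvfrom line) are choices made only for this illustration.

```python
import select
import socket

# The 12-byte request seen in the truss output: "\^V\^A\0\^A" plus 8 zero bytes.
REQUEST = bytes([0x16, 0x01, 0x00, 0x01]) + bytes(8)


def probe(host="127.0.0.1", port=123, timeout=5.0, attempts=2, payload=REQUEST):
    """Send payload on a connected UDP socket; return the reply bytes,
    or None if every attempt times out (hypothetical helper)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP) as sock:
        sock.connect((host, port))
        for _attempt in range(attempts):
            try:
                sock.send(payload)
                # Wait up to `timeout` seconds for the socket to become readable.
                readable, _w, _x = select.select([sock], [], [], timeout)
                if readable:
                    return sock.recv(516)
            except OSError:
                # For example ECONNREFUSED when nothing listens on the port.
                continue
    return None
```

Calling `probe()` with the defaults on a working system should return a reply; on an affected system one would expect `None` after two 5-second waits, matching the sendto->sendto sequence.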

> Your "ntpq -p" also gets such:
>
> socket(PF_INET,SOCK_DGRAM,IPPROTO_UDP) = 3 (0x3)
> connect(3,{ AF_INET 127.0.0.1:123 },16) = 0 (0x0)
> sendto(3,"\^V\^A\0\^A\0\0\0\0\0\0\0\0",12,0,NULL,0) = 12 (0xc)
> select(4,{ 3 },0x0,0x0,{ 5.000000 }) = 0 (0x0)
> sendto(3,"\^V\^A\0\^A\0\0\0\0\0\0\0\0",12,0,NULL,0) = 12 (0xc)
> select(4,{ 3 },0x0,0x0,{ 5.000000 }) = 0 (0x0)
> localhost: timed out, nothing received
> write(2,"localhost: timed out, nothing re"...,39) = 39 (0x27)
> ***Request timed out
> write(2,"***Request timed out\n",21) = 21 (0x15)
> exit(0x0) process exit, rval = 0
>
>
>
>> * ntp_gettime and ntp_adjtime both return TIME_ERROR to ntptime
>>
>> I've looked through the commits and, beyond much of netinet being
>> roto-tilled, I can't see anything obvious.
>>
>> Is anyone else seeing anything similar?
>
> Yes. I noticed via systems without a RTC that I'd set up
> to have ntpd fix the times on. The times stopped being
> fixed.
>
>> Can anyone suggest where
>> to look next?
>
> See the truss output above? (I'm no expert in the area.
> I'm just noting the odd sendto sendto sequence without
> any recvfrom.)
>

===
Mark Millard
marklmi at yahoo.com