i386/115054: NTP errors out on startup but restart of NTP fixes problem

Bruce Evans brde at optusnet.com.au
Wed Aug 1 20:31:41 UTC 2007


On Wed, 1 Aug 2007, Dag-Erling Smørgrav wrote:

> Bruce Evans <brde at optusnet.com.au> writes:
>> Several versions of FreeBSD have annoying behaviour for network
>> startup, involving the network not actually being up when ifconfig
>> returns and subsequent different mishandling of this by various
>> utilities.  [...]
>> This problem seems to get worse with each release of FreeBSD and/or
>> with newer NICs.  I never noticed it with fxp or even ed or rl NICs.
>> Now it is barely noticeable with fxp and very noticeable with sk, bge
>> and em NICs.
>
> I have never seen this with any of the cards I've used (xl, fxp, rl, re,
> sis, bge, sk, msk and probably others, in no particular order).
>
> Perhaps there is a hardware issue involved?  Does the problem occur if
> you hardcode the link speed instead of relying on autonegotiation?

No difference.  I thought it might be the cheap switch, but going
direct makes no difference except to break hard-coding the link speed
for bge.  The following is with bge (1Gbps capable but reduced to
100baseTX full-duplex by autonegotiation) under -current, connected
to fxp (100baseTX full-duplex by autonegotiation or hard-coded) under
FreeBSD-~5.2:

%%%
ttyv0:root@besplex:~> ifconfig bge0 down; time ifconfig bge0 up; time ping -c1 delplex; time route get delplex; time route get delplex

         0.48 real         0.00 user         0.47 sys
PING delplex.bde.org (192.168.2.4): 56 data bytes
Aug  2 05:57:49 besplex kernel: bge0: link state changed to DOWN
Aug  2 05:57:51 besplex kernel: bge0: link state changed to UP

--- delplex.bde.org ping statistics ---
1 packets transmitted, 0 packets received, 100% packet loss
        11.01 real         0.00 user         0.00 sys
    route to: delplex
destination: delplex
   interface: bge0
       flags: <UP,HOST,DONE,LLINFO,WASCLONED>
  recvpipe  sendpipe  ssthresh  rtt,msec    rttvar  hopcount      mtu     expire
        0         0         0         0         0         0      1500      1191
         0.00 real         0.00 user         0.00 sys
    route to: delplex
destination: delplex
   interface: bge0
       flags: <UP,HOST,DONE,LLINFO,WASCLONED>
  recvpipe  sendpipe  ssthresh  rtt,msec    rttvar  hopcount      mtu     expire
        0         0         0         0         0         0      1500      1191
         0.00 real         0.00 user         0.00 sys
%%%

Under -current, the differences are:
o ifconfig returns after 0.48 seconds instead of after 2+ seconds.  The
   "link state changed to UP" message still takes 2+ seconds altogether.
o The message is now printed to a different unwanted place (using tprintf()
   instead of printf(), I think; I want it on stderr).  The above output
   was captured using vidcontrol.
o The timestamps on the messages made by syslogd are almost precise enough
   to show the 2 second delay.
o ping still returns after 11+ seconds, but now it starts about 1.5 seconds
   earlier relative to the UP message, so the 11 seconds may be just ping's
   timeout and not related to UPness.
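
The DOWN-to-UP gap can be read off the syslog timestamps mechanically.
The following is only a rough sketch, assuming lines in the format shown
in the transcript above (time of day in the third field, interface name
in the sixth); link_up_delay is a made-up name, not an existing tool.
Since syslogd timestamps have 1-second resolution, the result is only
good to about a second either way.

```shell
# Print the seconds between each "link state changed to DOWN" message and
# the next "link state changed to UP" message for the same interface.
# Assumes syslog-style input as in the transcript (HH:MM:SS in field 3)
# and no midnight rollover between the two events.
link_up_delay() {
    awk '
    /link state changed to (DOWN|UP)/ {
        split($3, t, ":")
        secs = t[1] * 3600 + t[2] * 60 + t[3]
        ifname = $6                 # e.g. "bge0:"
        sub(/:$/, "", ifname)       # strip the trailing colon
        if ($NF == "DOWN") {
            down[ifname] = secs     # remember when the link went down
        } else if (ifname in down) {
            print ifname, secs - down[ifname]
            delete down[ifname]
        }
    }'
}
```

Fed the two kernel lines from the first transcript on stdin, this
prints "bge0 2".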

%%%
ttyv0:root@besplex:~> ifconfig bge0 down; time ifconfig bge0 up; time route get delplex; time route get delplex
         0.48 real         0.00 user         0.47 sys
    route to: delplex
Aug  2 05:58:25 besplex kernel: bge0: link state changed to DOWN
Aug  2 05:58:27 besplex kernel: bge0: link state changed to UP
destination: 192.168.2.0
        mask: 255.255.255.0
   interface: bge0
       flags: <UP,DONE,CLONING>
  recvpipe  sendpipe  ssthresh  rtt,msec    rttvar  hopcount      mtu     expire
        0         0         0         0         0         0      1500        -7
         5.26 real         0.00 user         0.00 sys
    route to: delplex
destination: delplex
   interface: bge0
       flags: <UP,HOST,DONE,LLINFO,WASCLONED>
  recvpipe  sendpipe  ssthresh  rtt,msec    rttvar  hopcount      mtu     expire
        0         0         0         0         0         0      1500      1196
         0.00 real         0.00 user         0.00 sys
%%%

The first "route get" still returns after 5+ seconds, but now it starts
about 1.5 seconds earlier relative to the UP message, so the 5 seconds
may be just route's timeout and not related to UPness.
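
One way to separate the link-up latency from a utility's own timeout
would be to wait until the link is reported active before timing the
command.  This is a minimal sketch, not something used in the tests
above; wait_until and link_active are hypothetical helper names, and
link_active assumes that ifconfig prints a "status: active" line for
the interface once the link is up.

```shell
# Retry a command once per second until it succeeds or the timeout expires.
# usage: wait_until TIMEOUT_SECONDS COMMAND [ARGS...]
# Returns 0 as soon as COMMAND succeeds, 1 if it never does.
# Only the sleeps are counted, so a slow COMMAND stretches the real timeout.
wait_until() {
    timeout=$1
    shift
    elapsed=0
    while ! "$@"; do
        [ "$elapsed" -ge "$timeout" ] && return 1
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# Hypothetical helper: succeeds once ifconfig reports an active link.
link_active() {
    ifconfig "$1" 2>/dev/null | grep -q 'status: active'
}

# Example (commented out; needs root and a real interface):
#   ifconfig bge0 down; ifconfig bge0 up
#   wait_until 10 link_active bge0 && time route get delplex
```

If the first "route get" is then fast, the 5 seconds were spent waiting
for the link; if it still takes 5 seconds, the delay is elsewhere.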

The -current bge driver is acting identically to the ~5.2 bge driver.
Userland is ~5.2 in all tests.  One reason I didn't report this earlier is
that it might be due to the ~5.2 userland and I don't have time to test
with a full -current userland, but ifconfig and route(8) seem to be portable
enough to mostly work with both kernels.  route(8) has a known problem
concerning the base for the expire time (it was broken for a long time
in -current due to the change to mono-time, but this causes few problems).

Bruce

