dhclient conflict between /sbin/devd and /etc/rc.d/netif ?
Robert Jenssen
robertjenssen at ozemail.com.au
Sun Feb 10 23:52:02 PST 2008
Hi Brooks and all,
On Mon, 11 Feb 2008 12:06:26 pm you wrote:
> On Mon, Feb 11, 2008 at 11:37:21AM +1100, Robert Jenssen wrote:
> > Hi,
> > Every so often I have trouble connecting rt2560 based PCI wireless network
> > card to my wireless router/access point. Typically I get:
> >
> > # sudo /etc/rc.d/netif restart ral0
> > Starting wpa_supplicant.
> > ral0: no link .............. giving up
> > ral0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
> > ether 00:11:50:63:cd:47
> > media: IEEE 802.11 Wireless Ethernet autoselect (DS/1Mbps)
> > status: no carrier
> >
> > Even though there seems to be plenty of signal power:
> >
> > # sudo ifconfig ral0 list scan
> > SSID BSSID CHAN RATE S:N INT CAPS
> > xxxxxxx... 00:xx:xx:xx:xx:xx 10 54M -74:-95 100 EPS WPA
> >
> > Recently I noticed that sometimes, after the above "netif restart" fails,
> > the ral0 interface "automagically" comes up anyway. Then dhclient is owned
> > by /sbin/devd. The default devd.conf starts dhclient for both ethernet and
> > PCI-cardbus devices. Is it a good idea for both /sbin/devd
> > and /etc/rc.d/netif to start a dhclient on ral0 at about the same time?
In the "magical" case above what I think is happening is that the dhclient
startup from /etc/rc.d/netif called by rc fails. Later /etc/rc.d/netif is
called again from /etc/pccard_ether:pccard_ether_start() by /sbin/devd. That
call succeeds.
The rc system uses rcorder to determine the order in which to run the rc
scripts. On my system rcorder shows devd fairly early in the list. The
devd.conf file calls a number of rc scripts. So far as I can see /sbin/devd
doesn't check that these are called in the order listed by rcorder. Is this a
problem?
I have disabled devd (and set the moused port explicitly in rc.conf) and done
some simple tests on /usr/src/sbin/dhclient/dhclient.c. In particular, at
line 365, main() allows a hard-coded maximum of 10 seconds for the call to
interface_link_status() to succeed. I changed this to 20 seconds, added a
printout, and ran /etc/rc.d/netif restart a few times with rc_debug="YES".
The results were
15 15 5 5 5 5 5 15 15 5 5 5 5 5 21(timed out!) 5 5 and 5 seconds. Presumably
the (10n+5) seconds is a magic number inside my wireless card or router. I'm
going to set the hard-coded value to 25 seconds. Would it be possible for you
to commit a similar change? Here is a patch:
*** src/sbin/dhclient/dhclient.c 2007-02-10 04:50:26.000000000 +1100
--- /usr/src/sbin/dhclient/dhclient.c 2008-02-11 18:09:25.000000000 +1100
***************
*** 360,370 ****
  		fflush(stderr);
  		sleep(1);
  		while (!interface_link_status(ifi->name)) {
  			fprintf(stderr, ".");
  			fflush(stderr);
! 			if (++i > 10) {
  				fprintf(stderr, " giving up\n");
  				exit(1);
  			}
  			sleep(1);
  		}
--- 360,370 ----
  		fflush(stderr);
  		sleep(1);
  		while (!interface_link_status(ifi->name)) {
  			fprintf(stderr, ".");
  			fflush(stderr);
! 			if (++i > 25) {
  				fprintf(stderr, " giving up\n");
  				exit(1);
  			}
  			sleep(1);
  		}
(I used "diff -C 5" so that the sleep() calls show up in the context.) Rather
than dhclient.c waiting a fixed 10 seconds and then calling exit(), as shown
above, shouldn't the dhclient.conf "timeout" configuration item cover this
situation? I see that PR bin/98577 asks for this hard-coded timeout to be
reduced or made adjustable via dhclient.conf.
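To illustrate what "adjustable" could mean here, below is a minimal,
standalone sketch of the same polling loop with the limit passed in as a
parameter instead of being hard-coded. It is not taken from dhclient: the
names wait_for_link(), link_probe_fn, max_polls and fake_probe() are
hypothetical, and the probe callback merely stands in for
interface_link_status(). How such a limit would actually be read from
dhclient.conf is left open.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical callback type; stands in for interface_link_status(). */
typedef bool (*link_probe_fn)(const char *ifname);

/*
 * Poll `probe` until it reports link, sleeping `interval` seconds before
 * each retry, for at most `max_polls` retries after the initial check.
 * Returns 0 once link is seen, -1 if the limit is reached.
 */
int
wait_for_link(const char *ifname, link_probe_fn probe, int max_polls,
    unsigned int interval)
{
	int i;

	assert(probe != NULL);
	if (probe(ifname))
		return (0);

	fprintf(stderr, "%s: no link ...", ifname);
	fflush(stderr);
	for (i = 0; i < max_polls; i++) {
		sleep(interval);
		if (probe(ifname)) {
			fprintf(stderr, " got link\n");
			return (0);
		}
		fprintf(stderr, ".");
		fflush(stderr);
	}
	fprintf(stderr, " giving up\n");
	return (-1);
}

/* Test stand-in only: reports "no link" for the first N calls. */
int fake_polls_until_link = 0;

bool
fake_probe(const char *ifname)
{
	(void)ifname;
	return (fake_polls_until_link-- <= 0);
}
```

A caller would pass the configured limit, for example
wait_for_link("ral0", probe, 25, 1), and decide for itself whether a
return value of -1 should lead to exit(1) or to a retry later.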
Best regards,
Rob Jenssen