em device hangs on ifconfig alias ...
Pyun YongHyeon
pyunyh at gmail.com
Fri Jul 7 00:59:59 UTC 2006
On Thu, Jul 06, 2006 at 01:29:11PM -0700, Atanas wrote:
> Pyun YongHyeon said the following on 7/5/06 7:14 PM:
> >
> >Here is patch generated against RELENG_6.
> >
> OK, I just tested that, but it doesn't seem to make any difference.
>
> Here's what I did:
>
> I commented out the em device from my kernel (a 6-STABLE one from
> yesterday) and compiled three if_em kernel modules:
> - one taken from 6.1 release
> - the unpatched 6-STABLE one
> - the latter with the above patch applied
>
> So I was able to load and test each of these modules independently,
> without actually restarting the machine. I also changed the driver
> version string in if_em.c, just to ensure that I was really loading the
> right em module by checking dmesg:
>
> em1: <Intel(R) PRO/1000 Network Connection Version - 3.2.18 (patched)>
> port 0xdc80-0xdcbf mem 0xfcfe0000-0xfcffffff irq 55 at device 4.1 on pci3
> em1: Ethernet address: 00:04:23:b5:1b:ff
> em1: link state changed to UP
>
> I used 2 machines - one running 6.1-RELEASE and using fxp (I'll call it
> "FXP"), and the test one running 6-STABLE with em (I'll call it "EM"),
> and tried exchanging/moving an IP alias between them.
>
> FXP# ifconfig
> fxp0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
> options=b<RXCSUM,TXCSUM,VLAN_MTU>
> inet 10.10.64.30 netmask 0xffffff00 broadcast 10.10.64.255
> ether 00:e0:81:31:f4:1e
> media: Ethernet autoselect (100baseTX <full-duplex>)
> status: active
>
> EM# ifconfig
> em1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
> options=b<RXCSUM,TXCSUM,VLAN_MTU>
> inet 10.10.64.63 netmask 0xffffff00 broadcast 10.10.64.255
> ether 00:04:23:b5:1b:ff
> media: Ethernet autoselect (100baseTX <full-duplex>)
> status: active
>
> First I brought up an IP alias on the FXP machine:
>
> FXP# ifconfig fxp0 inet alias 10.10.64.40 netmask 255.255.255.255
>
> and checked whether it's accessible from anywhere - yes. Then I moved
> that to EM:
>
> FXP# ifconfig fxp0 inet -alias 10.10.64.40
> EM# ifconfig em1 inet alias 10.10.64.40 netmask 255.255.255.255
>
> and checked again - no. It was accessible only from its own subnet
> (10.10.64.x), but not from anywhere else.
>
> Moving that back to FXP works, but moving it back to EM doesn't. The
> only way I found to make it accessible was to arping something from the
> aliased IP address:
>
> EM# arping -S10.10.64.40 -c1 somehost
>
> So it seems that when an IP alias has been recently used on some other
> machine (on FXP in my case), the em driver is unable to initialize that
> IP alias properly.
>
> It might be that the fxp driver is not sending something when releasing
> an alias, who knows. But the fact is that fxp always initializes its
> aliases properly - I use it extensively and it has always worked.
>
> I tried setting another IP alias that had never been used on these
> machines. I brought that up first on EM and it worked. Then I moved it to
> FXP and it also worked! But moving it back to EM made it inaccessible.
>
Hmm, that's strange. I double-checked that stock em(4) did not
generate ARP packets when its addresses were changed, so the patch
makes em(4) generate them. Do you see a gratuitous ARP with tcpdump
when you change its address?
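
For reference, a gratuitous ARP is conventionally an ARP request sent to
the Ethernet broadcast address in which the sender and target protocol
addresses are both the newly configured address. On the wire it can be
watched with something like "tcpdump -n -e -i em1 arp". The Python sketch
below is illustrative only and is not part of the patch: it builds such a
frame for the alias and MAC address from the test above and checks the
sender == target property. The function names are made up for this
example.

```python
import socket
import struct

BROADCAST_MAC = b"\xff" * 6
ETHERTYPE_ARP = 0x0806


def mac_to_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' to 6 raw bytes."""
    return bytes(int(part, 16) for part in mac.split(":"))


def build_gratuitous_arp(mac: str, ip: str) -> bytes:
    """Build an Ethernet frame carrying a gratuitous ARP request for ip."""
    sha = mac_to_bytes(mac)
    spa = socket.inet_aton(ip)
    # Ethernet header: destination (broadcast), source, ethertype.
    eth_header = BROADCAST_MAC + sha + struct.pack("!H", ETHERTYPE_ARP)
    arp_payload = struct.pack(
        "!HHBBH6s4s6s4s",
        1,            # hardware type: Ethernet
        0x0800,       # protocol type: IPv4
        6,            # hardware address length
        4,            # protocol address length
        1,            # opcode: request
        sha,          # sender hardware address
        spa,          # sender protocol address
        b"\x00" * 6,  # target hardware address (left as zeros here)
        spa,          # target protocol address == sender address
    )
    return eth_header + arp_payload


def is_gratuitous_arp(frame: bytes) -> bool:
    """Return True if frame is ARP with sender IP equal to target IP."""
    if len(frame) < 42:
        return False
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype != ETHERTYPE_ARP:
        return False
    sender_ip = frame[28:32]
    target_ip = frame[38:42]
    return sender_ip == target_ip


if __name__ == "__main__":
    frame = build_gratuitous_arp("00:04:23:b5:1b:ff", "10.10.64.40")
    print(frame.hex())
    print(is_gratuitous_arp(frame))
```

In a tcpdump capture the thing to look for is the same: an ARP packet
from em1's MAC address whose "who-has" and "tell" addresses are both
10.10.64.40.
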
> It looks like there's something fishy with the alias initialization.
>
> Another related problem is that the card gets re-initialized (reset?) on
> each alias you add (this takes between 0.3 and 1 seconds, depending on how
> fast the hardware is), which for mass-aliased systems could be a serious
> hurdle after a crash or reboot.
>
This is another issue. em(4) performs two time-consuming operations
in its initialization routine: DMA tag/map creation and checksumming
of the EEPROM contents.
I have an experimental patch for it, but let's fix one thing at a time.
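
To put the 0.3 to 1 second per-alias figure quoted above in perspective,
here is a back-of-the-envelope calculation in Python. The alias count of
1000 is a hypothetical number chosen for illustration; it does not come
from this thread.

```python
def total_reinit_seconds(alias_count: int, per_alias_seconds: float) -> float:
    """Total time spent re-initializing the card, one reinit per alias."""
    return alias_count * per_alias_seconds


ALIAS_COUNT = 1000  # hypothetical; not a figure from the thread

low = total_reinit_seconds(ALIAS_COUNT, 0.3)   # fast hardware
high = total_reinit_seconds(ALIAS_COUNT, 1.0)  # slow hardware

print(f"{ALIAS_COUNT} aliases: {low / 60:.1f} to {high / 60:.1f} minutes")
```

Under that assumption the aliases alone would add several minutes to
boot, which is why the per-alias reinitialization matters on such systems.
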
--
Regards,
Pyun YongHyeon