multi-homed systems stop answering ARP on local addresses w/ifconfig
aliases
Chris Buechler
freebsd at chrisbuechler.com
Sun May 17 20:25:40 UTC 2009
There seems to be a regression between 6.x and 7.0/7.1 related to
ifconfig aliases on multi-homed hosts. I'm not sure about anything newer
than 7.1 (this is pfSense, and we're just starting to test 7.2 builds).
For periods of time, the system stops answering ARP for some of its own
addresses, and hence everything on that network completely stops
functioning. The same setup worked fine on 6.2.
The particular system illustrated here is a router on part of an ISP's
network. The IPs are all public; in the info provided here they've been
replaced with 10.x addresses. The subnets on the inside interfaces are
routed to the outside interface. When this problem occurs, the IPs
assigned locally on the system still respond from the Internet, but the
system itself loses all connectivity with the affected subnet, and
nothing on that subnet can communicate with the host due to the lack of
ARP. That makes some sense: I presume that when traffic for a locally
assigned address arrives via another interface, the system doesn't need
ARP on that address to respond. But while the address still responds
from the Internet, even the host itself can't initiate a ping to that
IP. It behaves the same whether pf is enabled or disabled.
I found two similar issues reported in the past. One has a PR:
http://www.freebsd.org/cgi/query-pr.cgi?pr=121437&cat=
That is exactly the same issue, except that it's not limited to VLANs;
any multi-homed host is affected.
Another:
http://thread.gmane.org/gmane.os.freebsd.stable/57125
fxp0 is the outside interface. It doesn't make any difference whether
the ifconfig aliases are on em0 or fxp1; both interfaces behave the same
way if they have any aliases assigned.
# ifconfig
fxp0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8<VLAN_MTU>
        ether 00:90:27:86:8b:9d
        inet6 fe80::290:27ff:fe86:8b9d%fxp0 prefixlen 64 scopeid 0x1
        inet 10.11.185.146 netmask 0xfffffff8 broadcast 10.11.185.151
        media: Ethernet 100baseTX <full-duplex>
        status: active
em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=9b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM>
        ether 00:11:43:2c:62:03
        inet 10.10.0.1 netmask 0xffffff00 broadcast 10.10.0.255
        inet6 fe80::211:43ff:fe2c:6203%em0 prefixlen 64 scopeid 0x2
        inet 10.13.40.1 netmask 0xffffff00 broadcast 10.13.40.255
        inet 10.13.41.1 netmask 0xffffff00 broadcast 10.13.41.255
        inet 10.13.42.1 netmask 0xffffff00 broadcast 10.13.42.255
        inet 10.13.43.1 netmask 0xffffff00 broadcast 10.13.43.255
        inet 10.13.44.1 netmask 0xffffff00 broadcast 10.13.44.255
        inet 10.13.45.1 netmask 0xffffff00 broadcast 10.13.45.255
        inet 10.13.46.1 netmask 0xffffff00 broadcast 10.13.46.255
        inet 10.13.47.1 netmask 0xffffff00 broadcast 10.13.47.255
        media: Ethernet autoselect (100baseTX <full-duplex>)
        status: active
fxp1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8<VLAN_MTU>
        ether 00:d0:b7:5d:25:9f
        inet 10.1.242.1 netmask 0xffffff00 broadcast 10.1.242.255
        inet6 fe80::2d0:b7ff:fe5d:259f%fxp1 prefixlen 64 scopeid 0x3
        inet 10.1.243.1 netmask 0xffffff00 broadcast 10.1.243.255
        media: Ethernet autoselect (100baseTX <full-duplex>)
        status: active
When the problem is occurring, you can't even ping the affected locally
assigned addresses from the box itself:
# ping 10.10.0.1
PING 10.10.0.1 (10.10.0.1): 56 data bytes
ping: sendto: Network is unreachable
ping: sendto: Network is unreachable
ping: sendto: Network is unreachable
And when trying to ping something on one of the affected attached
subnets, you get:
# ping 10.10.0.30
PING 10.10.0.30 (10.10.0.30): 56 data bytes
ping: sendto: Invalid argument
ping: sendto: Invalid argument
In the logs, you get a flood of these messages:
May 14 02:55:12 kernel: arpresolve: can't allocate route for 10.10.0.1
May 14 02:55:12 kernel: arplookup 10.10.0.1 failed: host is not on local network
May 14 02:55:12 kernel: arpresolve: can't allocate route for 10.10.0.1
May 14 02:55:12 kernel: arplookup 10.10.0.1 failed: host is not on local network
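For anyone trying to reproduce this, it may help to tally the arplookup
failures per address to see which addresses are affected during an
outage. What follows is only a rough sketch in Python, assuming each
kernel message occupies a single line in the log file. The function name
count_arplookup_failures and the regular expression are made up for
illustration and are based solely on the sample messages quoted here.

```python
import re
from collections import Counter

# Pattern derived from the sample kernel messages in this report only;
# other FreeBSD versions may word the message differently.
ARPLOOKUP_RE = re.compile(
    r"kernel: arplookup (\d+\.\d+\.\d+\.\d+) failed: "
    r"host is not on local network"
)


def count_arplookup_failures(lines):
    """Return a Counter mapping each address to its number of failures."""
    counts = Counter()
    for line in lines:
        match = ARPLOOKUP_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# The four messages quoted in this report, one per line.
sample = [
    "May 14 02:55:12 kernel: arpresolve: can't allocate route for 10.10.0.1",
    "May 14 02:55:12 kernel: arplookup 10.10.0.1 failed: "
    "host is not on local network",
    "May 14 02:55:12 kernel: arpresolve: can't allocate route for 10.10.0.1",
    "May 14 02:55:12 kernel: arplookup 10.10.0.1 failed: "
    "host is not on local network",
]

for address, failures in count_arplookup_failures(sample).most_common():
    print(address, failures)
```

In practice the lines would come from the system log file rather than a
hard-coded list, e.g. by passing an open file object to the function.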
It happens both with the primary IP assigned to an interface and with
its aliases, but not with all of them at once. Some of the addresses
continue to work while others are failing. Somehow the kernel thinks IPs
that are locally assigned are not on a local network. After a couple of
minutes, it just starts working again without any changes being made or
the system even being touched.
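As a sanity check that the failing addresses really are on directly
connected subnets, the addresses and hex netmasks from the ifconfig
output can be tested in userland. This is a minimal sketch in Python
using the standard ipaddress module. It is not how the kernel's
arplookup decides, and the names CONFIGURED, network_from_ifconfig and
connected_interface are hypothetical helpers introduced only for this
example.

```python
import ipaddress

# A subset of the addresses and hex netmasks from the ifconfig output.
CONFIGURED = {
    "fxp0": [("10.11.185.146", "0xfffffff8")],
    "em0": [("10.10.0.1", "0xffffff00"), ("10.13.40.1", "0xffffff00")],
    "fxp1": [("10.1.242.1", "0xffffff00"), ("10.1.243.1", "0xffffff00")],
}


def network_from_ifconfig(addr, hex_netmask):
    """Build the connected network from an address and a hex netmask."""
    mask = ipaddress.IPv4Address(int(hex_netmask, 16))
    return ipaddress.IPv4Interface(f"{addr}/{mask}").network


def connected_interface(target):
    """Return (ifname, network) for the first configured subnet that
    contains target, or None if no configured subnet contains it."""
    ip = ipaddress.IPv4Address(target)
    for ifname, addrs in CONFIGURED.items():
        for addr, hex_netmask in addrs:
            net = network_from_ifconfig(addr, hex_netmask)
            if ip in net:
                return ifname, net
    return None


# The two addresses from the failing ping examples.
for target in ("10.10.0.1", "10.10.0.30"):
    print(target, connected_interface(target))
```

Both addresses from the failing pings fall inside em0's 10.10.0.0/24
according to the configured netmask, which is consistent with the
kernel's "not on local network" verdict being wrong rather than the
addressing being misconfigured.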
If I can provide any additional information, please let me know.
thanks,
Chris
More information about the freebsd-net
mailing list