RELENG_6_3 ping and DUP packets
Mike Andrews
mandrews at fark.com
Fri Apr 11 19:59:16 UTC 2008
On Thu, 10 Apr 2008, Jeremy Chadwick wrote:
> On Fri, Apr 11, 2008 at 08:48:10AM +0200, Damian Weber wrote:
>>> From: Chuck Swiger <cswiger at mac.com>
>>> To: Damian Weber <dweber at htw-saarland.de>
>>> Cc: freebsd-stable at freebsd.org
>>> Subject: Re: RELENG_6_3 ping and DUP packets
>>>
>>> On Apr 10, 2008, at 1:58 PM, Damian Weber wrote:
>>>> But here is the problem, pinging the machine from remote gives
>>>>
>>>> A.B.C.X$ ping A.B.C.D
>>>> PING A.B.C.D (A.B.C.D): 56 data bytes
>>>> 64 bytes from A.B.C.D: icmp_seq=0 ttl=64 time=0.272 ms
>>>> 64 bytes from A.B.C.D: icmp_seq=0 ttl=255 time=0.391 ms (DUP!)
>>>
>>> Please run "tcpdump -e icmp" on this box and repeat your testing. It
>>> will be most interesting to know whether you're seeing the same MAC
>>> address....
>>
>> good point, but it's the same
>>
>> A.B.C.X# tcpdump -e icmp
>> tcpdump: listening on rl0, link-type EN10MB
>> 08:41:51.136023 0:20:ed:5f:3:3b 0:19:99:33:7c:9 ip 98: A.B.C.X > A.B.C.D: icmp: echo request
>> 08:41:51.136171 0:19:99:33:7c:9 0:20:ed:5f:3:3b ip 98: A.B.C.D: icmp: echo reply
>> 08:41:51.136343 0:19:99:33:7c:9 0:20:ed:5f:3:3b ip 98: A.B.C.D: icmp: echo reply
>> 08:41:52.138366 0:20:ed:5f:3:3b 0:19:99:33:7c:9 ip 98: A.B.C.X > A.B.C.D: icmp: echo request
>> 08:41:52.138447 0:19:99:33:7c:9 0:20:ed:5f:3:3b ip 98: A.B.C.D: icmp: echo reply
>> 08:41:52.138692 0:19:99:33:7c:9 0:20:ed:5f:3:3b ip 98: A.B.C.D: icmp: echo reply
>> ^C
>> 169 packets received by filter
>> 0 packets dropped by kernel
>
> Possibly an interrupt is being called twice on the same packet?
>
> Shot in the dark, but try disabling MSI/MSI-X and see if the problem
> recurs. Put this in /boot/loader.conf:
>
> hw.pci.enable_msi="0"
> hw.pci.enable_msix="0"
>
> Reboot, and see if the problem continues.

FYI, this did NOT solve it for me, even though it did solve it for the
original poster. But I did find the solution for my system...

While rebooting to try disabling MSI, I noticed that the machine was still
pingable during the reboot (returning just one reply per request) while
it was still running its POST routines -- which of course made me do a few
double-takes, given that the FreeBSD kernel wasn't even running :)
Weirder still, those replies all had the same bogus TTL of 255 that the
dupes had when the system was up. Once the system finished booting, the
dupes returned.

Turns out this Intel S3000AHV motherboard has a built-in management
thingie that's kind of IPMI-ish but apparently not quite actually IPMI (at
least ipmitool and freeipmi want nothing to do with it). Somehow it had
gotten itself enabled and was pulling an IP from the DHCP server and
bridging itself through the onboard LAN. So ping replies were coming from
both the management CPU and the main CPU when the system was up, and from
just the management CPU when the system was down. The other Intel S3000AH*
system I have didn't do this because it just happens to be the DHCP server
for its subnet -- and the other systems w/ the same chipset didn't do it
because they're all Supermicro boxes with different management CPUs.
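
For anyone chasing a similar symptom, here is a rough sketch (not a tool
I actually used) of how one might flag it from saved ping output: group
the replies by icmp_seq and look for sequence numbers answered with more
than one TTL, which hints at two separate IP stacks replying. The function
names are made up for illustration, and the sample lines are the ones
quoted earlier in this thread.

```python
import re
from collections import defaultdict

# Matches ping reply lines such as:
# "64 bytes from A.B.C.D: icmp_seq=0 ttl=255 time=0.391 ms (DUP!)"
REPLY_RE = re.compile(r"icmp_seq=(\d+)\s+ttl=(\d+)")


def ttls_by_seq(ping_output):
    """Map each icmp_seq to the list of TTLs seen in its replies."""
    seen = defaultdict(list)
    for line in ping_output.splitlines():
        m = REPLY_RE.search(line)
        if m:
            seen[int(m.group(1))].append(int(m.group(2)))
    return dict(seen)


def suspicious_seqs(ping_output):
    """Return sequence numbers that were answered with differing TTLs."""
    return sorted(
        seq
        for seq, ttls in ttls_by_seq(ping_output).items()
        if len(set(ttls)) > 1
    )


# Sample taken from the ping output quoted in this thread.
sample = """\
PING A.B.C.D (A.B.C.D): 56 data bytes
64 bytes from A.B.C.D: icmp_seq=0 ttl=64 time=0.272 ms
64 bytes from A.B.C.D: icmp_seq=0 ttl=255 time=0.391 ms (DUP!)
"""

if __name__ == "__main__":
    print(ttls_by_seq(sample))
    print(suspicious_seqs(sample))
```

A differing TTL is only a hint, not proof: a true duplicate from the same
stack would normally carry the same TTL, so this check would not catch
that case.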

So, yeah, short version: goofy pilot error, nothing wrong with FreeBSD, at
least in RELENG_7. Maybe there's an MSI issue in RELENG_6_3 that the
original poster was hitting, but in my case that wasn't it.