Help testing patch that may help diagnosing the PR 240106
Date: Wed, 29 Mar 2023 05:03:29 UTC
Hi,

I am writing here so that the original PR 240106 is not polluted.

Can you please test the attached patch with a bridge / lagg setup?

The long version:

In https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=240106#c28 you encountered a problem and I said:

> The IF_BRIDGE(4) seems to hide some thing to protect itself get confused.

Actually IF_BRIDGE(4) has a learning mode; see the `Bridge Interface Parameters` section of `man ifconfig`. By default learning is on for all bridge members, so the bridge inserts or updates an entry in its (internal) forwarding table for the source address of each packet it receives on a member.

When a unicast packet arrives on a bridge member, the bridge first checks whether the packet is for itself. If not, and a forwarding entry is found, the packet is forwarded to the corresponding bridge member. The catch is that if that member is the one the packet was received on, the packet is silently discarded.

That is perfectly fine, but it makes problems hard to diagnose when the network is set up wrongly (bridge loops, for example), when someone else has set a duplicate ether address on their NIC, or when bad guys / viruses / trojans send spoofed packets on the wire. Those are common, and I think it would be good if IF_BRIDGE(4) could emit logs so that the symptoms are obvious and easy to diagnose.

> If you can confirm, then please config you switch properly. The two ports cc0 and cc1 connected should be in same link aggregation group.

If two ports on the physical switch, say 1 and 2, are not in the same link aggregation group, then packets (typically broadcast ones) received on port 1 will be forwarded to port 2, so the lagg interface gets back (from port 2) the packets it sent (to port 1). If the lagg happens to be a member of IF_BRIDGE(4), the bridge will update its forwarding entry, as it learns the MAC address from the lagg interface.

Here is a simple diagram; the arrows show the flow of an ARP request from epair0a.

11:22:33:44:55:66     [1]                  -> cc0 -> port 1 ->
    epair0a -> epair0b -> bridge0 -> lagg0                     physical-switch <-> host0
                                  <-       <- cc1 <- port 2 <-
                                 [2]

At [1] bridge0 learns MAC 11:22:33:44:55:66 on member port epair0b and adds an entry; at [2] it learns the same MAC on member port lagg0 and updates the entry. A subsequent ARP reply sent from host0 (to 11:22:33:44:55:66, i.e. epair0a) reaches bridge0 via lagg0. bridge0 then drops the ARP reply, because it believes 11:22:33:44:55:66 (epair0a) is within the segment of lagg0.

> I'll see if I can teach IF_BRIDGE(4) to emit warnings in case it get ARP request packet sent from it self.

The attached patch enables IF_BRIDGE(4) to emit logs about MAC address port flapping. Various hardware vendors have similar facilities.

Best regards,
Zhenlei
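
For illustration only, the learning and discard behaviour described above can be modelled with a small toy sketch. This is not the if_bridge(4) code and not the attached patch: the `LearningBridge` class, the MAC address used for host0, and the wording of the log message are all made up for the example. It only assumes a bridge that learns source addresses per member and drops a unicast frame whose forwarding entry points back at the receiving member.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"


class LearningBridge:
    """Toy model of a learning bridge (hypothetical, not if_bridge(4))."""

    def __init__(self, members):
        self.members = set(members)
        self.table = {}  # forwarding table: MAC address -> member name
        self.log = []    # flap messages; the format is invented here

    def receive(self, member, src, dst):
        """Learn src on member; return the members the frame is sent to."""
        previous = self.table.get(src)
        if previous is not None and previous != member:
            # The same MAC was seen on a different member: port flapping.
            self.log.append(f"{src} moved from {previous} to {member}")
        self.table[src] = member

        out = self.table.get(dst)
        if out is None:
            # Unknown or broadcast destination: flood to the other members.
            return sorted(self.members - {member})
        if out == member:
            # The entry points back at the receiving member: silent discard.
            return []
        return [out]


EPAIR_MAC = "11:22:33:44:55:66"
HOST0_MAC = "aa:bb:cc:dd:ee:ff"  # hypothetical address for host0

bridge = LearningBridge(["epair0b", "lagg0"])

# [1] ARP request from epair0a enters the bridge via epair0b.
step1 = bridge.receive("epair0b", EPAIR_MAC, BROADCAST)
# [2] The misconfigured switch bounces the same frame back via lagg0.
step2 = bridge.receive("lagg0", EPAIR_MAC, BROADCAST)
# ARP reply from host0 arrives on lagg0 and is dropped.
step3 = bridge.receive("lagg0", HOST0_MAC, EPAIR_MAC)

print(step1, step2, step3)
print(bridge.log)
```

In this model the reply in the last step is dropped because the entry for 11:22:33:44:55:66 was overwritten at [2], and the only visible trace of the problem is the flap message recorded at that point.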