Re: Help testing patch that may help diagnosing the PR 240106

From: Zhenlei Huang <zlei_at_FreeBSD.org>
Date: Wed, 29 Mar 2023 05:05:08 UTC

> On Mar 29, 2023, at 1:03 PM, Zhenlei Huang <zlei@FreeBSD.org> wrote:
> 
> Hi,
> 
> I am writing here so that the original PR 240106 is not polluted.
> 
> Can you please test the attached patch with a bridge / lagg setup?
> 
> The long version:
> 
> In https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=240106#c28 you encountered
> a problem and I said:
> 
>> The IF_BRIDGE(4) seems to hide some thing to protect itself get confused.
> Actually IF_BRIDGE(4) has a learning mode. See `man ifconfig` and refer to the
> `Bridge Interface Parameters` section.
> 
> By default the learning mode of all bridge members is on, and for each frame it
> receives the bridge inserts or updates an entry in its (internal) forwarding table,
> mapping the source MAC address to the receiving member. When a unicast packet
> arrives on a bridge member, the bridge checks whether the packet is for itself. If not,
> the packet is forwarded to one bridge member, provided a forwarding entry is found.
> The magic is that if the member to forward to is the receiving one, the
> packet is silently discarded.
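
To make the decision described above concrete, here is a minimal sketch in Python. It is a toy model only, not the if_bridge(4) code: the class name ToyBridge, the returned strings, and the example MAC addresses are made up for illustration, and flooding on an unknown destination is my assumption about how the case without a forwarding entry is handled.

```python
# Toy model of a learning bridge's unicast decision; NOT the if_bridge(4) code.
# ToyBridge, the return strings and the MAC addresses are hypothetical.

class ToyBridge:
    def __init__(self, own_macs):
        self.own_macs = set(own_macs)  # addresses the bridge itself answers to
        self.table = {}                # learned MAC -> member port name

    def receive(self, in_port, src, dst):
        # Learning: insert or update the entry for the source address.
        self.table[src] = in_port
        if dst in self.own_macs:
            return "local"
        out_port = self.table.get(dst)
        if out_port is None:
            return "flood"             # assumption: no entry, send to all other members
        if out_port == in_port:
            return "discard"           # destination is on the receiving segment
        return "forward:" + out_port


bridge = ToyBridge(own_macs=["02:00:00:00:00:01"])
print(bridge.receive("epair0b", "aa:aa:aa:aa:aa:aa", "bb:bb:bb:bb:bb:bb"))
print(bridge.receive("lagg0", "bb:bb:bb:bb:bb:bb", "aa:aa:aa:aa:aa:aa"))
print(bridge.receive("lagg0", "cc:cc:cc:cc:cc:cc", "bb:bb:bb:bb:bb:bb"))
```

The third call is the silent discard: bb:bb:bb:bb:bb:bb was learned on lagg0, and the frame addressed to it also arrives on lagg0.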
> 
> That's perfectly fine, but it is hard to diagnose when a user has a wrong network setup
> (bridge loops, for example), when someone else sets a duplicate Ethernet address on their NIC,
> or when some bad guys / viruses / trojans send spoofed packets on the wire. Those are common,
> and I think it would be good if IF_BRIDGE(4) could emit logs so that the symptoms are
> obvious and easy to diagnose.
> 
>> If you can confirm, then please config you switch properly. The two ports cc0 and cc1 connected should be in same link aggregation group.
> 
> If two ports (on the physical switch), say 1 and 2, are not in the same link aggregation group,
> then packets (typically broadcast ones) received on port 1 will be forwarded to port 2, and
> the lagg interface will get back (from port 2) the packets it sent (to port 1).
> If the lagg happens to be a member of an IF_BRIDGE(4), then the bridge will update
> its forwarding entry as it learns the MAC address from the lagg interface.
> 
> Here is a simple diagram; the arrows show the flow of an ARP request from epair0a.
> 
> 11:22:33:44:55:66         [1]                  -> cc0 ->  port 1 -> 
>       epair0a -> epair0b -> bridge0 -> lagg0                        physical-switch <-> host0
>                                     <-        <- cc1 <-  port 2 <-  
>                                     [2]                          
> 
> At [1] bridge0 learns MAC 11:22:33:44:55:66 on member port epair0b and adds an entry;
> after [2] it learns the same MAC on member port lagg0 and updates the entry. Then the
> subsequent ARP reply (to 11:22:33:44:55:66, i.e. epair0a) sent from host0 reaches bridge0
> via lagg0.
> 
> bridge0 will then drop the ARP reply, as it believes 11:22:33:44:55:66 (epair0a) is
> within the segment of lagg0.
> 
>> I'll see if I can teach IF_BRIDGE(4) to emit warnings in case it get ARP request packet sent from it self.
> 
> The attached patch enables IF_BRIDGE(4) to emit logs about MAC address port flapping.
> Various hardware vendors have similar facilities.
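
For illustration, the scenario from the diagram can be replayed in a few lines of Python. This is only a sketch under assumptions and does not show what the patch does: the helper names learn and deliver and the wording of the flap message are made up, while the port names and the MAC address are taken from the diagram.

```python
# Toy replay of the diagram; NOT the patch. learn(), deliver() and the
# message format are hypothetical, port names and MAC are from the diagram.

def learn(table, flaps, mac, port):
    """Record mac on port; note a flap when the MAC moves between members."""
    old = table.get(mac)
    if old is not None and old != port:
        flaps.append(f"{mac} moved from {old} to {port}")
    table[mac] = port


def deliver(table, mac, in_port):
    """Decide the fate of a unicast frame for mac received on in_port."""
    out = table.get(mac)
    if out is None:
        return "flood"  # assumption: unknown destinations are flooded
    return "discard" if out == in_port else f"forward to {out}"


EPAIR0A = "11:22:33:44:55:66"
table, flaps = {}, []

learn(table, flaps, EPAIR0A, "epair0b")    # [1] ARP request enters via epair0b
learn(table, flaps, EPAIR0A, "lagg0")      # [2] same frame bounces back via lagg0
result = deliver(table, EPAIR0A, "lagg0")  # ARP reply from host0 arrives on lagg0

print(flaps)   # one recorded move: epair0b -> lagg0
print(result)  # the reply is discarded
```

Without the recorded move, the only visible symptom is the missing ARP reply, which is why a log line at the moment the entry changes member is useful.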
> 
> 
> Best regards,
> Zhenlei
> 

Sorry, I forgot the patch.