[Bug 221122] Attaching interface to a bridge stops all traffic on uplink NIC for few seconds

From: <bugzilla-noreply_at_freebsd.org>
Date: Thu, 31 Aug 2023 03:50:49 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=221122

spork@bway.net changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |spork@bway.net

--- Comment #31 from spork@bway.net ---
I burned a few hours on this last night, first thinking something was amiss
with iocage (fair assumption, as it seems to be another abandoned project).
Then while troubleshooting, I started running the bridge creation and interface
additions by hand and noticed my prompt was hanging for a few seconds. Then I
found the link flaps in the logs:

Aug 29 20:42:56 clweb5 kernel: ext0: link state changed to DOWN
Aug 29 20:43:01 clweb5 kernel: ext0: Link is up, 1 Gbps Full Duplex, Requested FEC: None, Negotiated FEC: None, Autoneg: True, Flow Control: None
Aug 29 20:43:01 clweb5 kernel: ext0: link state changed to UP
Aug 29 20:45:53 clweb5 kernel: ext0: link state changed to DOWN
Aug 29 20:45:57 clweb5 kernel: ext0: Link is up, 1 Gbps Full Duplex, Requested FEC: None, Negotiated FEC: None, Autoneg: True, Flow Control: None
Aug 29 20:45:57 clweb5 kernel: ext0: link state changed to UP
Aug 29 20:48:10 clweb5 kernel: ext0: link state changed to DOWN
Aug 29 20:48:15 clweb5 kernel: ext0: Link is up, 1 Gbps Full Duplex, Requested FEC: None, Negotiated FEC: None, Autoneg: True, Flow Control: None
Aug 29 20:48:15 clweb5 kernel: ext0: link state changed to UP

Seems to take about 5 seconds for it to recover, which is kind of rough on a
box that will be hosting multiple jails.
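
For anyone wanting to measure the same thing on their own box, here is a
minimal Python sketch that pairs each DOWN event with the next UP event and
reports the gap in seconds. The function name outage_seconds and the year
argument are my own choices for illustration (syslog timestamps carry no
year), and the sample input is just the "link state changed" lines quoted
above.

```python
import re
from datetime import datetime

# The "link state changed" lines from the log excerpt above.
LOG = """\
Aug 29 20:42:56 clweb5 kernel: ext0: link state changed to DOWN
Aug 29 20:43:01 clweb5 kernel: ext0: link state changed to UP
Aug 29 20:45:53 clweb5 kernel: ext0: link state changed to DOWN
Aug 29 20:45:57 clweb5 kernel: ext0: link state changed to UP
Aug 29 20:48:10 clweb5 kernel: ext0: link state changed to DOWN
Aug 29 20:48:15 clweb5 kernel: ext0: link state changed to UP
"""

PATTERN = re.compile(
    r"^(\w{3} +\d+ \d{2}:\d{2}:\d{2}) \S+ kernel: (\w+): "
    r"link state changed to (UP|DOWN)$"
)


def outage_seconds(log_text, ifname, year=2023):
    """Return the DOWN-to-UP gap, in whole seconds, for each flap of ifname.

    `year` is an assumption supplied by the caller, because syslog
    timestamps do not include one.
    """
    outages = []
    down_at = None
    for line in log_text.splitlines():
        m = PATTERN.match(line)
        if not m or m.group(2) != ifname:
            continue
        ts = datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
        if m.group(3) == "DOWN":
            down_at = ts
        elif down_at is not None:
            outages.append(int((ts - down_at).total_seconds()))
            down_at = None
    return outages


print(outage_seconds(LOG, "ext0"))  # prints [5, 4, 5]
```

The three flaps in the excerpt come out at 5, 4 and 5 seconds.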

I understand workarounds were posted, but I'm curious about the fix mentioned
here: under what conditions should this no longer happen?

NICs are ixl(4)
OS is: 13.2-RELEASE-p2 FreeBSD 13.2-RELEASE-p2 GENERIC amd64

I did dig through the manpage for if_bridge(4), and I'm sure I saw the note
about matching capabilities, but it didn't really jump out as a cause. Maybe a
note that specifically calls out the most common use case (bridging with
epair(4) for jails, bhyve or other virtualization methods) would be a good
idea? Or even something in epair(4)'s manpage?
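
To make the "matching capabilities" point a bit more concrete, here is a
rough Python sketch that compares the options=...<...> capability lists
that ifconfig prints for two interfaces and shows what the uplink advertises
that the other member does not. The two sample lines are made up for
illustration and are not output from this machine, and the helper names are
hypothetical. Which of these differences the bridge actually acts on is up
to if_bridge(4); this only lists the mismatch.

```python
import re


def parse_caps(ifconfig_line):
    """Extract capability names from an ifconfig 'options=...<A,B,C>' line."""
    m = re.search(r"options=[0-9a-fA-F]+<([^>]*)>", ifconfig_line)
    if not m or not m.group(1):
        return set()
    return set(m.group(1).split(","))


def caps_missing_on(member_line, uplink_line):
    """Capabilities enabled on the uplink but absent on the other member."""
    return sorted(parse_caps(uplink_line) - parse_caps(member_line))


# Hypothetical sample lines, for illustration only.
UPLINK = "options=50b<RXCSUM,TXCSUM,VLAN_MTU,TSO4,LRO>"
MEMBER = "options=8<VLAN_MTU>"

print(caps_missing_on(MEMBER, UPLINK))
```

An empty list would mean the uplink advertises nothing the other member
lacks, at least as far as the options line goes.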
