[Bug 240106] VNET issue with ARP and routing sockets in jails

From: <bugzilla-noreply_at_freebsd.org>
Date: Sat, 07 May 2022 08:13:44 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=240106

O. Hartmann <ohartmann@walstatt.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ohartmann@walstatt.org

--- Comment #19 from O. Hartmann <ohartmann@walstatt.org> ---
Hello.
We also see a similar issue, as described, on FreeBSD 12.3-RELEASE-p2
(XigmaNAS, stuck at -p2 for the moment). The boxes in question have two NICs:
one (em0) is intended for management access and the other (igb0) is bound to
the offered services. In addition, the second NIC (igb0) is reachable via its
own IP address AND serves as the physical member of a bridge for vnet jails,
which have epair interfaces (created in XigmaNAS via the FreeBSD in-tree tool
"jib").
Binding provided services such as Samba and NFS to the second NIC (igb0) works
as expected; ping and ssh are no problem either.

The base host's IP addresses (on both NICs) and those of the jails are within
the same network.

Trouble begins with the vnet jails on the bridge of which igb0 is a member.
We use several jails on those boxes. Pinging those jails from outside the
campus network works only sporadically and only for some IPs; it takes a long
time until a jail starts responding. The behaviour is the same within the LAN.

We have also already disabled pfil on the bridges, as suggested:

device  if_bridge
net.link.bridge.ipfw: 0
net.link.bridge.allow_llz_overlap: 0
net.link.bridge.inherit_mac: 0
net.link.bridge.log_stp: 0
net.link.bridge.pfil_local_phys: 0
net.link.bridge.pfil_member: 0
net.link.bridge.ipfw_arp: 0
net.link.bridge.pfil_bridge: 0
net.link.bridge.pfil_onlyip: 0
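
For anyone wanting to reproduce these settings, a minimal sketch follows. It
only prints name=value lines for the pfil- and ipfw-related knobs from the
listing above and changes nothing on the system. The function name
emit_bridge_pfil_conf is hypothetical and not part of FreeBSD or XigmaNAS.

```shell
#!/bin/sh
# Sketch: print name=value lines that set the bridge filtering knobs to 0.
# The knob names are copied from the sysctl listing above; the function
# name is an illustration only.
emit_bridge_pfil_conf() {
    for knob in pfil_bridge pfil_member pfil_local_phys pfil_onlyip ipfw ipfw_arp; do
        printf 'net.link.bridge.%s=0\n' "$knob"
    done
}

emit_bridge_pfil_conf
```

Each printed line is in the name=value form that sysctl(8) accepts on its
command line and that /etc/sysctl.conf uses.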

A curiosity: if one can ping one or two of the five jails on the host, then on
the next attempt one, at most two, different jails answer the ping, and the
ones that responded before no longer do. It is like gambling.
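
The changing set of responders could be recorded with a small loop such as the
sketch below. The addresses are placeholders from the documentation range, not
our real ones, and the helper names probe and sweep are hypothetical.

```shell
#!/bin/sh
# Sketch: probe each jail once per round and print which ones answered,
# so that changes between rounds become visible.
# JAILS holds placeholder addresses (RFC 5737 range), not real jail IPs.
JAILS="192.0.2.11 192.0.2.12 192.0.2.13 192.0.2.14 192.0.2.15"

# probe: send one echo request; exit status 0 means a reply was received.
probe() {
    ping -c 1 "$1" >/dev/null 2>&1
}

# sweep N: run N rounds and print one line per round listing the
# addresses that replied in that round.
sweep() {
    rounds=$1
    i=1
    while [ "$i" -le "$rounds" ]; do
        up=""
        for ip in $JAILS; do
            if probe "$ip"; then
                up="$up $ip"
            fi
        done
        printf 'round %s:%s\n' "$i" "$up"
        i=$((i + 1))
    done
}
```

Calling, for example, "sweep 5" would run five rounds; comparing the lines
shows whether the set of responding jails changes from one round to the next.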

We also run another host with the very same XigmaNAS version; in that case, the
second NIC is configured to be part of another network and is attached to
another switch. There is no problem there!

In the problematic cases described above, we do not have direct access to the
department's backend switches, so I can't tell whether I'm the culprit
(misconfiguration, misunderstanding of network technology, et cetera).

I hope the problem can still be solved within FreeBSD 12.3.
