ix driver vlanhwfilter issue - how to catch a lion

From: György Pásztor <coruscant0+freebsd_at_gmail.com>
Date: Sun, 03 Mar 2024 23:29:38 UTC
Hi everyone!

I'd like to report an issue with the ix driver's vlanhwfilter feature.

To be more specific, I'm not sure whether this is a driver issue, a
hardware issue, or a firmware issue.
I'm just happy that I could catch it.

If someone more experienced is interested, I'm happy to help dig deeper
to find the root cause.

For the rest, I'd rather tell the story of how I caught this lion.
Maybe my recollection is not 100% correct, but whoever learned some
elevated level of mathematics in high school learned how to catch a lion
in the desert:
Step 1: Cut the desert (or what remains of it, explained below) in half.
Step 2: Check which half the lion is in.
Step 3: If your half is no bigger than a cage, you have caught the lion.
If it is bigger, then repeat from Step 1.
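
For anyone who prefers it written down, the three steps above are plain
bisection. Here is a minimal sh sketch of it. The names catch_lion and
lion_is_in are made up for illustration, and the probe is a toy stand-in
for whatever real test tells you which half the fault is in:

```shell
#!/bin/sh
# catch_lion LOW HIGH: halve the range [LOW, HIGH) until one cell is left.
# It relies on lion_is_in LOW MID, a hypothetical probe that must succeed
# (exit status 0) when the lion is somewhere in [LOW, MID).
catch_lion() {
    low=$1
    high=$2
    while [ $((high - low)) -gt 1 ]; do       # Step 3: still bigger than a cage?
        mid=$(( (low + high) / 2 ))           # Step 1: cut the remainder in half
        if lion_is_in "$low" "$mid"; then     # Step 2: which half is the lion in?
            high=$mid
        else
            low=$mid
        fi
    done
    echo "$low"
}

# Toy probe for the demo: the lion hides in cell 5 of an 8-cell desert.
lion_is_in() { [ 5 -ge "$1" ] && [ 5 -lt "$2" ]; }

catch_lion 0 8
```

In my case a "cell" was not a number but a suspect: the VMs, the bridge,
the lagg, the NIC, and finally a single NIC feature.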

Now I had a similar problem in my homelab network.
I recently bought one of those cheap mini PCs from AliExpress.
It has four i226-V ports, two X520-DA2-style SFP ports, two DDR5 slots, an
NVMe slot, two SATA ports, and a Pentium Gold 8505 CPU.

First I started to experiment with the network configuration. I run several
VMs on this box. The VMs already existed: they were running on the
predecessor, which only had an i3 CPU and was maxed out with an 8 GiB RAM
module.

What I wanted to achieve was a redundant connection to two switches that
"speaks RSTP" and lets the VMs talk to their respective VLANs.
I remembered that on FreeBSD, if you bridge two ports, then VLANs won't
work on them, since even the tagged frames are handled by the bridge code.
I then thought I could add another VM, and that VM could create the vlan
interfaces on a port that is also connected to the bridge interface.
Then I thought a little further: I don't even need a VM if I just create
an epair interface, put the "a" half of the epair into the bridge, and
create the vlan interfaces on the "b" half of the epair.
The config worked:
ifconfig_igc0="mtu 9004"
ifconfig_igc1="mtu 1504"
ifconfig_ix0="mtu 9004"
ifconfig_epair0a="mtu 9004"
ifconfig_epair0b="mtu 9004"
cloned_interfaces="bridge0 epair0 vlan1 vlan7 vxlan30 bridge1"
ifconfig_bridge0="addm ix0 stp ix0 addm igc0 stp igc0 addm epair0a"
create_args_vlan1="vlan 1 vlandev epair0b mtu 1500"
ifconfig_vlan1="up"
create_args_vlan7="vlan 7 vlandev epair0b mtu 9000"
ifconfig_vlan7="inet 172.16.7.5 netmask 255.255.255.0 up"
ifconfig_bridge1="inet 172.16.33.5 netmask 255.255.255.0 addm vlan1"
create_args_vxlan30="vxlanid 30 vxlanlocal 172.16.7.5 vxlangroup 225.0.0.1 vxlandev vlan7 mtu 1500"
ifconfig_vxlan30="up"

The rest of the vlan interfaces and their respective switches were created
during boot by vm-bhyve's init script, and managed there.
The bridge for vlan1 was only necessary because my default route needed
it to be configured earlier in the boot process.
As I said, this was working just fine, until I started to use it.
As soon as I started all 8 VMs, the system crashed within 2 minutes. It
didn't even respond on the serial port, nothing. Only a long press of the
power button to turn it off, then turning it back on a second later, let me
regain control over the mini-PC router host.

At this point, I spent several hours trying to catch the lion, assuming
that one of the VMs was the culprit.
Long story short: they weren't.
Starting more and bigger VMs just made the issue happen sooner, but the
culprit was none of my VMs. Though I wasn't 100% sure at this point. There
is a really lightweight one, running only nsd and having only one
interface. For that one alone to crash the system would take much more
time. There was another one which had 2 interfaces + a CARP IP + bird for
running OSPF. The one which has the master CARP interface advertises the
stub network from the 2nd interface to the OSPF routers on the 1st
interface. If the roles switch from master to backup, bird is restarted
automatically to use the appropriate config file.
It is not as big and complex a VM as the ones running pfSense. This VM only
needs 256 MiB of RAM, but could almost predictably crash the host within
~2 hours.
If I started all 8 VMs, the host crashed within 2 minutes.

Anyway, as I said, my educated guess at this point was that the culprit
was not the VMs but something in the network.

I consulted Zahy, my old friend, who is a more seasoned BSD user than
myself and has a few decades more experience.
He asked why I was building such a complex setup with the bridge.
Well, I had my answer: if the connection between the two switches this host
is connected to fails, the BSD box can still connect the two of them. A
well-designed loop in the network, with the proper configuration, just
makes it more redundant. This way the connection between the two switches
won't be a SPOF.
Anyway, I listened to him, tried to simplify the config, and replaced
bridge0 with a lagg interface:
####### Fallback network config with failover
#cloned_interfaces="lagg0 vlan1 vlan7 vxlan30 bridge0"
#create_args_lagg0="laggproto failover laggport ix0 laggport igc0 mtu 9004"
#ifconfig_lagg0="up"
#create_args_vlan1="vlan 1 vlandev lagg0 mtu 1500"
#ifconfig_vlan1="up"
#create_args_vlan7="vlan 7 vlandev lagg0 mtu 9000"
#ifconfig_vlan7="inet 172.28.7.5 netmask 255.255.255.0"
#ifconfig_bridge0="inet 172.28.33.5 netmask 255.255.255.0 addm vlan1"
#create_args_vxlan30="vxlanid 30 vxlanlocal 172.28.7.5 vxlangroup 225.0.0.1 vxlandev vlan7 mtu 1500"
#ifconfig_vxlan30="up"

Using lagg instead of bridge didn't solve my problems. I even tried not
using bridge0, configuring that IP address directly on vlan1, and not
running the VMs that want to interact with vlan1, but this didn't solve my
problems either.
The pfSense VMs could crash the system quite fast.
Then I thought of simplifying the system further: throwing out the
redundancy and configuring everything directly on the ix0 interface.

Long story short: The issue remains.

And that was the point where I was 100% sure that neither the bridge, nor
the lagg, nor the VMs were the culprit. One time the system even froze at
random during the boot process.
The last message I saw on the serial console was the initialization of
vlan1.

I even took the mini PC apart. I reseated the 10G card and put the DDR5
module in the other slot. Neither of those helped.

So, let's check the other half of the desert:
I configured everything on the igc0 interface instead of ix0.

Surprise, surprise: everything worked.
After starting all 8 VMs, the system was still up and running 40 minutes
later. That was 100% proof that the issue is related to the ix0 interface.
I even considered configuring everything on ix1. Maybe the port is the
culprit.
But what are the odds that one port has a hardware failure and the other
does not?
Pretty slim, so instead I went with my educated guess (one might just call
it a gut feeling).
I checked the difference in features between the ix0 interface and the
igc0 interface:
root@vjun:~ # ifconfig ix1 | grep options | head -1 ; ifconfig igc0 | grep options | head -1
options=4e53fbb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,WOL_UCAST,WOL_MCAST,WOL_MAGIC,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6,HWSTATS,MEXTPG>
options=4a420b9<RXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,WOL_MAGIC,VLAN_HWTSO,RXCSUM_IPV6,HWSTATS,MEXTPG>
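
The two option lists are hard to compare by eye, so here is a small sh
sketch that prints the flags present in one list and missing from the
other. The two strings are copied from the output above; the function name
only_in_first is just something made up for this example:

```shell
#!/bin/sh
# Capability lists copied from the two "options=" lines above.
ix_opts='RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,WOL_UCAST,WOL_MCAST,WOL_MAGIC,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6,HWSTATS,MEXTPG'
igc_opts='RXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,WOL_MAGIC,VLAN_HWTSO,RXCSUM_IPV6,HWSTATS,MEXTPG'

# only_in_first LIST1 LIST2: print every flag of the comma-separated LIST1
# that does not appear in the comma-separated LIST2, one per line.
only_in_first() {
    old_ifs=$IFS
    IFS=,
    for flag in $1; do
        case ",$2," in
            *",$flag,"*) ;;                  # also present in LIST2, skip it
            *) printf '%s\n' "$flag" ;;      # only present in LIST1
        esac
    done
    IFS=$old_ifs
}

echo "enabled on ix1 but not on igc0:"
only_in_first "$ix_opts" "$igc_opts"
```

This lists eight flags, VLAN_HWFILTER among them, and nothing the other way
around. So the remaining desert is eight features wide, and each of them
could in principle be switched off one at a time to halve it further.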
And now a definitive gut feeling moment: let's turn off vlanhwfilter.
No crash.
I restored the "big" redundant rstp speaking configuration, but with one
tiny difference:
ifconfig_ix0="-vlanhwfilter mtu 9004"
I even put the vm_list setting back into rc.conf, so the VMs can start
automatically on boot.
root@vjun:~ # uptime
11:20PM  up  7:06, 5 users, load averages: 0.08, 0.19, 0.20

So far so good!

As I said, the lion is now in a cage-sized part of the desert.
I just don't know whether this is a driver issue, a firmware issue, or
something with the hardware.
Can someone help me find out?

Since I have a workaround, I can sleep well now.
But if we could find and fix the root cause of the problem, it would be
even better.

PS.: A picture and a few seconds of video of the frozen system on my monitor:
https://drive.google.com/drive/folders/1b0TRcd-W0XnHG_Uia5oXgXxk8XH_Jpv5?usp=sharing

PS2: Zahy already told me I should use the vlanmtu config parameter
instead of configuring the MTU 4 bytes bigger on the main interface. But
this works, and I don't want to mess with it further. Also, I presume
that would only work if the vlan interfaces were created directly on the
ix0 and igc0 devices, and not on the epair0b device. Anyway: this IS
working.

TYA!
gyu