Re: Performance issues with vnet jails + epair + bridge

From: Zhenlei Huang <zlei_at_FreeBSD.org>
Date: Tue, 17 Sep 2024 03:25:07 UTC

> On Sep 16, 2024, at 10:47 PM, Aleksandr Fedorov <wigneddoom@yandex.ru> wrote:
> 
> If we are talking about local traffic between jails and/or the host, then in terms of TCP throughput we have room to improve, for example:

Without the RSS option enabled, if_epair will only use one thread to move packets between the pair of interfaces. I reviewed the code
and I think it can be improved even without RSS.
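
Roughly, the queue selection in if_epair.c looks like the sketch below
(paraphrased from memory, not a verbatim copy); the flowid-based fallback
in the comment is a hypothetical improvement, not existing code:

    /*
     * Paraphrased sketch of if_epair.c queue selection: without the
     * RSS kernel option every packet is assigned to bucket 0, so one
     * taskqueue thread ends up moving all traffic for the pair.
     */
    static struct epair_queue *
    epair_select_queue(struct epair_softc *sc, struct mbuf *m)
    {
            uint32_t bucket;

    #ifdef RSS
            /* RSS hashes the flow onto one of the configured buckets. */
            if (rss_m2bucket(m, &bucket) != 0)
                    bucket = 0;
            bucket %= sc->num_queues;
    #else
            bucket = 0;     /* everything is serialized on one queue */
    #endif
            /*
             * Hypothetical improvement (not existing code): when the
             * stack has already computed a flow id, spread flows over
             * the queues even without RSS, e.g.
             *
             *      if (M_HASHTYPE_GET(m) != M_HASHTYPE_NONE)
             *              bucket = m->m_pkthdr.flowid % sc->num_queues;
             */
            return (&sc->queues[bucket]);
    }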

>  
> 1. Stop calculating checksums for packets between VNET jails and the host.

I have a local WIP for this, inspired by the introduction of IFCAP_VLAN_MTU. It should yield a bigger improvement, especially on low-frequency CPUs.
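
For reference, this piggybacks on the existing if_hwassist mechanism:
when the interface claims checksum offload, the stack only tags the
mbuf instead of computing the sum. A simplified sketch of the output
path logic (not the exact tcp_output()/ip_output() code):

    /*
     * Simplified sketch, not the exact FreeBSD output path code.  If
     * the interface advertises CSUM_TCP in if_hwassist, the stack
     * defers the checksum by tagging the mbuf; for if_epair there is
     * no hardware at all, so the receive side (see the first hunk of
     * the patch below) simply marks the checksum as already verified
     * and nobody ever computes it.
     */
    if (ifp->if_hwassist & CSUM_TCP) {
            m->m_pkthdr.csum_flags |= CSUM_TCP;
            m->m_pkthdr.csum_data = offsetof(struct tcphdr, th_sum);
    } else {
            /* No offload: checksum the segment in software. */
            th->th_sum = in_cksum_skip(m, ip_len, ip_hlen);
    }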

>  
> 2. Use large packets (TSO) up to 64k in size.
>  
> Just for example, a simple patch increases the throughput of if_epair(4) between the two ends from 10 Gbps to 30 Gbps.

That is impressive!

>  
> diff --git a/sys/net/if_epair.c b/sys/net/if_epair.c
> index aeed993249f5..79c2dfcfc445 100644
> --- a/sys/net/if_epair.c
> +++ b/sys/net/if_epair.c
> @@ -164,6 +164,10 @@ epair_tx_start_deferred(void *arg, int pending)
>         while (m != NULL) {
>                 n = STAILQ_NEXT(m, m_stailqpkt);
>                 m->m_nextpkt = NULL;
> +
> +               m->m_pkthdr.csum_flags = CSUM_IP_CHECKED | CSUM_IP_VALID | CSUM_DATA_VALID | CSUM_PSEUDO_HDR;
> +               m->m_pkthdr.csum_data = 0xFFFF;
> +
>                 if_input(ifp, m);
>                 m = n;
>         }
> @@ -538,8 +542,9 @@ epair_setup_ifp(struct epair_softc *sc, char *name, int unit)
>         ifp->if_dunit = unit;
>         ifp->if_flags = IFF_BROADCAST | IFF_SIMPLEX | IFF_MULTICAST;
>         ifp->if_flags |= IFF_KNOWSEPOCH;
> -       ifp->if_capabilities = IFCAP_VLAN_MTU;
> -       ifp->if_capenable = IFCAP_VLAN_MTU;
> +       ifp->if_capabilities = IFCAP_VLAN_MTU | IFCAP_HWCSUM | IFCAP_HWCSUM_IPV6 | IFCAP_TSO;
> +       ifp->if_capenable = ifp->if_capabilities;
> +       ifp->if_hwassist = (CSUM_IP | CSUM_TCP | CSUM_UDP | CSUM_IP_TSO);

I've not tried TSO on if_epair yet. TSO needs special treatment, so I guess the above is not sufficient.
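
At minimum I'd expect the ifnet to also announce its TSO limits,
something like the following in epair_setup_ifp() (untested, the
values are illustrative):

    /*
     * Untested sketch: a TSO-capable ifnet is expected to advertise
     * how large a burst the stack may hand down.
     */
    ifp->if_hw_tsomax = IP_MAXPACKET;       /* max bytes per TSO burst */
    ifp->if_hw_tsomaxsegcount = 35;         /* max mbufs per burst */
    ifp->if_hw_tsomaxsegsize = 65535;       /* max bytes per mbuf */

And since nothing ever segments the packet, the peer receives the 64k
frame as-is. That is fine for a local epair hop, but it would break once
the bridge forwards such a frame to a physical interface without TSO.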

>         ifp->if_transmit = epair_transmit;
>         ifp->if_qflush = epair_qflush;
>         ifp->if_start = epair_start;
>  
> 14.09.2024, 05:45, "Zhenlei Huang" <zlei@freebsd.org>:
>  
>  
> 
>  On Sep 13, 2024, at 10:54 PM, Sad Clouds <cryintothebluesky@gmail.com> wrote:
>  
>  On Fri, 13 Sep 2024 08:08:02 -0400
>  Mark Saad <nonesuch@longcount.org> wrote:
>  
>  Sad
>    Can you go back a bit? You mentioned there is an RPi in the mix. Some of the Raspberry Pis have their NIC attached via USB under the covers, which will kill the total speed of things.
>  
>  Can you cobble together a diagram of what you have on either end?
>  Hello, I'm not sending data across the network, only between the host
>  and the jails. I'm trying to evaluate how FreeBSD handles TCP data
>  locally within a single host.
> 
> When you take vnet into account, the **local** traffic should stay within
> a single vnet jail. If you want traffic across vnet jails, if_epair or netgraph
> hooks have to be employed, and that of course introduces some overhead.
> 
>  
>  I understand that vnet jails will have more overhead, compared to a
>  shared TCP/IP stack via localhost. So I'm trying to measure it and see
>  where the bottlenecks are.
> 
> The overhead of a vnet jail should be negligible compared to a legacy jail
> or no jail. Bear in mind that when the VIMAGE option is enabled, there is a default
> vnet 0. It is not visible via jls and cannot be destroyed. So when you see
> bottlenecks, as in this case, they are mostly caused by other components
> such as if_epair, not by the vnet jail itself.
> 
>  
>  The Raspberry Pi 4 host has a single vnet jail, exchanging data with
>  the host via epair(4) and if_bridge(4) interfaces. I don't really know
>  what topology FreeBSD is using to represent all this, so I can't draw any
>  diagrams, but I think all data flows through the kernel internally and
>  never leaves the physical network interface.
> 
> For vnet jails, when you try to describe the network topology, you can
> treat them as VMs / physical boxes.
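> 
> For the epair + if_bridge setup you described, the wiring is typically
> created like this (interface and jail names here are illustrative):
> 
>     ifconfig epair create             # yields epair0a + epair0b
>     ifconfig bridge0 create
>     ifconfig bridge0 addm epair0a up
>     ifconfig epair0a up
>     ifconfig epair0b vnet myjail      # move the b end into the jail's vnet
>     jexec myjail ifconfig epair0b 192.0.2.2/24 up
> 
> All traffic then stays inside the kernel: it crosses the epair and the
> bridge, but never touches the physical NIC.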
> 
> I have one box with dozens of vnet jails. Each of them has a single
> responsibility, e.g. DHCP, LDAP, pf firewall, OOB access. The topology looks quite
> clear and it is easy to maintain. The only overhead is too many
> hops between the vnet jail instances. For my use case the performance
> is not critical and it has worked great for years.
> 
>  
> 
> Best regards,
> Zhenlei