Re: IPv6 inflight fragmentation

From: Peter <pmc_at_citylink.dinoex.sub.org>
Date: Mon, 01 Nov 2021 20:56:58 UTC

Hi Andrey!

On Mon, Nov 01, 2021 at 02:44:50PM +0300, Andrey V. Elsukov wrote:
! 31.10.2021 05:24, Peter wrote:
! > From what I understood, inflight fragmentation (on an intermediate router)
! > is not practical with IPv6. But it happens:
! > And it doesn't seem like these packets would be answered at all.
! > 
! > This happens when there is a dummynet pipe/queue rule (or a divert
! > rule) in the outbound rules to an interface that must reduce the MTU.
! > As soon as we skip over that dummynet (or divert), we get these ICMPv6
! > messages at the other end, and the fragmentation ceases:

! 
! divert rule does implicit IP fragments reassembling before passing a
! packet to application. I don't think dummynet is affected by this.

No, we're not going to an application, we are routing to the
Internet. And the uplink iface (tun0) has mtu=1492. And we have a rule
in ipfw, like:

> queue 21 proto all <whatever> xmit tun0 out

And we have sysctl net.inet.ip.fw.one_pass=0

So, at the time when we go through the queue, we do not yet know the
actual interface to use for xmit (because there might still be a
"forward" rule following), so we do not yet know the mtu.

Only when we finally hand the packet out for sending, *after* passing
the queue, do we learn our actual mtu. And that is where the
difference shows:

 * if we did *not* go through the queue, the packet is (probably)
   dropped and an ICMPv6 type 2 ("too big") is sent back to the
   originator. This is how I understand that it should work, and
   that works.

 * if we *did* go through the queue, the packet is split into
   fragments although it is IPv6. And that does not work; such a packet
   does not get answered by YouTube, and playback hangs. From a quick
   glance the fragments do look technically correct - and I have no
   idea why YT would receive a full-sized packet from the player,
   anyway (and I won't analyze their stuff).
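
To make the second case concrete, here is a rough sketch of the
arithmetic for splitting one oversized IPv6 packet to fit mtu=1492.
It is not taken from the kernel code. It assumes a 1500-byte packet
with no extension headers (the size is a guess, it was not captured
above), and the function name is made up for illustration.

```python
# Sketch: how one IPv6 packet would be split to fit a smaller MTU.
# Assumption: no extension headers, so the unfragmentable part is
# just the 40-byte fixed IPv6 header. Names are illustrative only.

IPV6_HEADER = 40      # fixed IPv6 header, bytes
FRAGMENT_HEADER = 8   # IPv6 Fragment extension header, bytes


def ipv6_fragments(packet_len, mtu):
    """Return (payload_offset, on_wire_length, more_fragments) tuples."""
    if packet_len <= mtu:
        return [(0, packet_len, False)]  # fits as-is, nothing to split
    payload = packet_len - IPV6_HEADER
    # Every fragment but the last must carry a multiple of 8 bytes,
    # because the fragment offset field counts in 8-byte units.
    chunk = (mtu - IPV6_HEADER - FRAGMENT_HEADER) // 8 * 8
    if chunk <= 0:
        raise ValueError("mtu too small to carry any fragment data")
    frags = []
    offset = 0
    while offset < payload:
        size = min(chunk, payload - offset)
        more = offset + size < payload
        frags.append((offset, IPV6_HEADER + FRAGMENT_HEADER + size, more))
        offset += size
    return frags


# Assumed 1500-byte packet going out via tun0 (mtu=1492).
for frag in ipv6_fragments(1500, 1492):
    print(frag)
```

Under these assumptions the result is one fragment of 1488 bytes and a
tiny second one of 68 bytes, where the first case would instead send
back a single ICMPv6 "too big" carrying the mtu of 1492.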

The behaviour is the same if there is either a "queue" action or
a "divert" action or both.
With "divert" we know that the mbuf flags are lost - with dummynet
I have not yet looked into the code. I had a hard time finding the cause
in bulky video data, and then I simply reduced the mtu one hop earlier
within my intranet, to work around the issue for now.

Later on I will start to move my nested VPN tunnels (VPN within VPN
within ...) to IPv6, and look into the mtu handling with these, and
I will probably need NPTv6 and divert for these, and I expect there
will be more fun then, so I will have to set up a test environment
anyway.

So much for now,
cheerio

PMc