Re: panic: syncache: mbuf too small

From: Bjoern A. Zeeb <bzeeb-lists_at_lists.zabbadoz.net>
Date: Tue, 08 Feb 2022 22:14:12 UTC
On Tue, 8 Feb 2022, Drew Gallatin wrote:

> I suspect that it's ic->ic_headroom, which seems to be driver dependent.
> And that it's going kaboom because of the combo of IPv6 plus some driver
> with a large ic_headroom..

Yeah, one of the Realtek drivers I was looking at sets it to 40/48
depending on chipset.

Other vendor drivers are on the order of 26/28-ish max, which would be
an exact fit (without UDP tunneling)...

> It would be really unfortunate if we had to expand mbufs because of some
> wifi driver.   Perhaps they could be taught to chain headers..

Realtek is doing a few "funny" things there; a lot of it being single-
segment DMAs of up to 12k-ish .. not helpful at all.

I'll go and see if I can figure it out for this one specifically
then *sigh*.  For as long as no other drivers do similar things
I am happy to work around it.


Hmm, bwi(4) is probably not much used anymore, as from a quick glance
that one is also going big (82 by manual counting), and bwn(4) even more?

So either the space available in our mbufs shrank massively, or that
problem was already there a decade ago ... and we didn't notice?
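
For what it's worth, the budget can be re-checked with a few lines of
Python.  This is only a sketch under assumptions: the values are the ones
from the ddb output quoted below (ddb prints hex), MHLEN is recovered as
max_hdr + max_datalen, the check behind the panic is assumed to have the
form max_linkhdr + IP header + TCP header + TCP_MAXOLEN <= MHLEN, and the
helper names are made up for illustration.  It uses 20 bytes for the fixed
TCP header and 40 for TCP_MAXOLEN rather than the 24/36 split below; the
total of 60 is the same either way.

```python
# Values as printed by ddb in the quoted mail (hexadecimal).
MAX_LINKHDR = 0x58    # max_linkhdr  -> 88
MAX_HDR = 0x94        # max_hdr      -> 148
MAX_DATALEN = 0x14    # max_datalen  -> 20
MAX_PROTOHDR = 0x3C   # max_protohdr -> 60

# Assumption: max_datalen = MHLEN - max_hdr, so MHLEN can be recovered.
MHLEN = MAX_HDR + MAX_DATALEN

# Fixed protocol header sizes in bytes.
IP4_HDR = 20
IP6_HDR = 40
TCP_HDR = 20          # TCP header without options
TCP_MAXOLEN = 40      # maximum TCP header length (60) minus TCP_HDR


def syncache_needed(linkhdr, iphdr):
    """Bytes a header mbuf must hold under the assumed check (hypothetical helper)."""
    return linkhdr + iphdr + TCP_HDR + TCP_MAXOLEN


def fits(linkhdr, iphdr, mhlen=MHLEN):
    """True if the assumed check 'needed <= MHLEN' holds (hypothetical helper)."""
    return syncache_needed(linkhdr, iphdr) <= mhlen


def max_linkhdr_budget(iphdr, mhlen=MHLEN):
    """Largest link-layer reservation that still passes the assumed check."""
    return mhlen - iphdr - TCP_HDR - TCP_MAXOLEN


if __name__ == "__main__":
    print(f"MHLEN = {MHLEN}")
    for name, iphdr in (("IPv4", IP4_HDR), ("IPv6", IP6_HDR)):
        need = syncache_needed(MAX_LINKHDR, iphdr)
        print(f"{name}: need {need}, fits: {fits(MAX_LINKHDR, iphdr)}, "
              f"max link header: {max_linkhdr_budget(iphdr)}")
```

With these numbers IPv6 needs 188 bytes against 168 available, while IPv4
fits exactly at 168, and the largest link header that would still pass for
IPv6 is 68.  That would at least be consistent with it only blowing up on
the IPv6 + large-headroom combination.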


/bz


> On Tue, Feb 8, 2022 at 2:45 PM Bjoern A. Zeeb <
> bzeeb-lists@lists.zabbadoz.net> wrote:
>
>> On Tue, 8 Feb 2022, Bjoern A. Zeeb wrote:
>>
>>> On Tue, 8 Feb 2022, Drew Gallatin wrote:
>>>
>>>> Can you examine max_linkhdr?
>>>
>>> Yes, was still sitting in ddb (thankfully watchdog got disabled):
>>>
>>> db> x max_linkhdr
>>> max_linkhdr:    58
>>>
>>> And for consistency checks:
>>>
>>> db> x max_hdr
>>> max_hdr:        94
>>> db> x max_datalen
>>> max_datalen:    14
>>> db> x max_protohdr
>>> max_protohdr:   3c
>>
>> If I do the maths correctly:
>>
>> MHLEN = 168             (max_hdr 0x94 + max_datalen 0x14)
>>
>> TCP_MAXOLEN = TCP_MAXHLEN - 24 = 60 - 24 = 36
>>
>> max_linkhdr =           88  (0x58)
>>
>> 168 - 88 - 36 = 44
>>
>> ipv6_hdr size = 40
>>
>> Leaves us with 4 for the tcp_header again?  Which would be 24?
>>
>>
>> Why would this not go kaboom all the time?
>>
>> Hmm I assume it's ieee80211_proto.c .. it changes max_linkhdr ..
>>
>>
>>
>>
>>
>>> db> show reg
>>> cs                        0x20
>>> ds                        0x3b
>>> es                        0x3b
>>> fs                        0x13
>>> gs                        0x1b
>>> ss                        0x28
>>> rax                       0x12
>>> rcx                        0x1
>>> rdx         0xffffffff811f6d0a
>>> rbx         0xffffffff812e614c
>>> rsp         0xfffffe0007fa15a0
>>> rbp         0xfffffe0007fa15b0
>>> rsi                       0x80
>>> rdi         0xffffffff81e8cec0  cnputs_mtx
>>> r8                        0x10
>>> r9                       0x1d0
>>> r10         0xffffffff81cfa820  vga_conssoftc
>>> r11                       0x10
>>> r12         0xffffffff812961ab
>>> r13                       0x28
>>> r14                      0x100
>>> r15         0xfffffe000937a740
>>> rip         0xffffffff80c545a7  kdb_enter+0x37
>>> rflags                    0x86
>>> kdb_enter+0x37: movq    $0,0x1283a5e(%rip)
>>>
>>> Found a console log;  the system was idle, right after a boot for a few
>>> minutes.
>>> It's a lab machine having booted off IPv4 (grml) but also having IPv6 on
>>> the network.
>>>
>>> According to terminal backlogs it was an incoming IPv6 ssh session likely
>>> to have triggered this.  Always great if things are "idle" and only few
>>> people to ask.
>>>
>>> it is amd64;  main @ 773e3a71b2f11d422694495aca988d4c7143601b from Jan 31st.
>>>
>>> /bz
>>>
>>>
>>>> Drew
>>>>
>>>> On Tue, Feb 8, 2022 at 1:58 PM Bjoern A. Zeeb <
>>>> bzeeb-lists@lists.zabbadoz.net> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I just came to a console finding this.  The tree is from a few days ago;
>>>>> is this known or should I investigate if it happens again?   I sadly
>>>>> cannot dump on this machine.
>>>>>
>>>>> /bz
>>>>>
>>>>> db> show panic
>>>>> panic: syncache: mbuf too small
>>>>> db> where
>>>>> Tracing pid 0 tid 100014 td 0xfffffe000937a740
>>>>> kdb_enter() at kdb_enter+0x37/frame 0xfffffe0007fa15b0
>>>>> vpanic() at vpanic+0x1b0/frame 0xfffffe0007fa1600
>>>>> panic() at panic+0x43/frame 0xfffffe0007fa1660
>>>>> syncache_respond() at syncache_respond+0x777/frame 0xfffffe0007fa1730
>>>>> syncache_add() at syncache_add+0xa71/frame 0xfffffe0007fa18c0
>>>>> tcp_input_with_port() at tcp_input_with_port+0x14f5/frame 0xfffffe0007fa1a20
>>>>> tcp6_input_with_port() at tcp6_input_with_port+0x69/frame 0xfffffe0007fa1a50
>>>>> tcp6_input() at tcp6_input+0xb/frame 0xfffffe0007fa1a60
>>>>> ip6_input() at ip6_input+0xc2f/frame 0xfffffe0007fa1b40
>>>>> netisr_dispatch_src() at netisr_dispatch_src+0xaf/frame 0xfffffe0007fa1ba0
>>>>> ether_demux() at ether_demux+0x16e/frame 0xfffffe0007fa1bd0
>>>>> ether_nh_input() at ether_nh_input+0x3fc/frame 0xfffffe0007fa1c30
>>>>> netisr_dispatch_src() at netisr_dispatch_src+0xaf/frame 0xfffffe0007fa1c90
>>>>> ether_input() at ether_input+0x99/frame 0xfffffe0007fa1cf0
>>>>> iflib_rxeof() at iflib_rxeof+0xcb3/frame 0xfffffe0007fa1e00
>>>>> _task_fn_rx() at _task_fn_rx+0x7a/frame 0xfffffe0007fa1e40
>>>>> gtaskqueue_run_locked() at gtaskqueue_run_locked+0xa7/frame 0xfffffe0007fa1ec0
>>>>> gtaskqueue_thread_loop() at gtaskqueue_thread_loop+0xc2/frame 0xfffffe0007fa1ef0
>>>>> fork_exit() at fork_exit+0x80/frame 0xfffffe0007fa1f30
>>>>> fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0007fa1f30
>>>>> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---

-- 
Bjoern A. Zeeb                                                     r15:7