[freebsd-current] Who should reset the M_PKTHDR flag in an mbuf when IP packets are fragmented? m_unshare panic thrown when IPsec is enabled
Navdeep Parhar
np at FreeBSD.org
Wed Dec 27 20:09:42 UTC 2017
On 12/26/2017 03:33, Andrey V. Elsukov wrote:
> On 26.12.2017 13:22, Harsh Jain wrote:
>>>> panic: m_unshare: m0 0xfffff80020f82600, m 0xfffff8005d054100 has M_PKTHDR
>>>> cpuid = 15
>>>> time = 1495578455
>>>> KDB: stack backtrace:
>>>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2c/frame 0xfffffe044e9bb890
>>>> kdb_backtrace() at kdb_backtrace+0x53/frame 0xfffffe044e9bb960
>>>> vpanic() at vpanic+0x269/frame 0xfffffe044e9bba30
>>>> kassert_panic() at kassert_panic+0xc7/frame 0xfffffe044e9bbac0
>>>> m_unshare() at m_unshare+0x578/frame 0xfffffe044e9bbbc0
>>>> esp_output() at esp_output+0x44c/frame 0xfffffe044e9bbe40
>>>> ipsec4_perform_request() at ipsec4_perform_request+0x5df/frame 0xfffffe044e9bbff0
>>> Hi,
>>>
>>> it seems unusual that IP reassembly happens on the outbound path.
>> It can be reproduced with a single ping packet on a Chelsio (cxgbe) NIC. I also tried with an Intel NIC. It seems Intel NICs produce M_WRITABLE() buffers (which follow a different path in m_unshare), which is not the case for cxgbe.
>
> In my view, IP fragmentation should occur in ip_output after IPsec
> encryption. Something like:
>
> 1. rip_output() has an mbuf chain where only the first mbuf has the M_PKTHDR flag
> 2. ip_output() -> IPSEC_OUTPUT() -> esp_output() -> m_unshare(). We
> should still have only one mbuf with the M_PKTHDR flag here.
> 3. esp_output_cb() -> ipsec_process_done() -> ip_output()
> 4. Now IP fragmentation should occur: ip_fragment() creates a chain of
> mbufs to send, where the M_PKTHDR flag will be set for each fragment.
>
>>> Do you have some packet normalization using firewall?
>> Default FreeBSD-CURRENT installation. No explicit firewall.
>> Do you think the above patch makes sense?
>
> It is not clear to me why it helps. The panic happens on the outbound path,
> where the mbuf should be allocated by the network stack and should be writeable.
> ip_reass() is usually used on the inbound path. I think the patch just hides
> the problem in another place.
> Do you mean that cxgbe can produce a !WRITEABLE mbuf for a received packet
> and then pass it to the network stack?
>
Yes, cxgbe does that. But I think the real bug here is in ip_reass
because it doesn't properly get rid of the pkthdr of the fragments while
creating the reassembled datagram. cxgbe happens to trip on this easily
because it often creates !WRITEABLE mbufs.
This should fix it:
https://people.freebsd.org/~np/ip_reass_demotehdr.diff
It will also fix leaks in configurations where mbuf tags are in use by
default (for example with MAC), ip_reass is involved during rx, and the
mbuf chain never gets m_demote'd elsewhere (meaning ip_reass should have
freed the tags itself).
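
To illustrate the idea, here is a rough userspace sketch. It is a toy model only: it is not kernel code and it does not reproduce the contents of the diff linked above. The struct toy_mbuf type, the TOY_M_PKTHDR flag, and all toy_* functions are hypothetical stand-ins invented for this example. The point it tries to show is that when a fragment is linked into the reassembled chain, its packet header state (the flag and any tags) should be dropped first, so that only the head of the chain still carries a pkthdr.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's M_PKTHDR flag. */
#define TOY_M_PKTHDR 0x1

/* Toy mbuf: just enough state to model the pkthdr flag and attached tags. */
struct toy_mbuf {
	struct toy_mbuf *next;	/* next buffer in the same datagram */
	int flags;		/* TOY_M_PKTHDR or 0 */
	int ntags;		/* number of tags attached to the pkthdr */
};

/*
 * Drop the packet header state of a single buffer: release its tags and
 * clear the flag. Returns the number of tags released.
 */
static int
toy_demote_pkthdr(struct toy_mbuf *m)
{
	int freed = m->ntags;

	m->ntags = 0;
	m->flags &= ~TOY_M_PKTHDR;
	return (freed);
}

/*
 * Link fragment chain 'frag' onto the tail of 'head'. Every buffer in
 * 'frag' is demoted first, so the result has exactly one pkthdr (on head).
 * Returns the total number of tags released from the fragment.
 */
static int
toy_append_fragment(struct toy_mbuf *head, struct toy_mbuf *frag)
{
	struct toy_mbuf *m, *tail;
	int freed = 0;

	assert(head != NULL && frag != NULL);
	for (m = frag; m != NULL; m = m->next) {
		if (m->flags & TOY_M_PKTHDR)
			freed += toy_demote_pkthdr(m);
	}
	for (tail = head; tail->next != NULL; tail = tail->next)
		;
	tail->next = frag;
	return (freed);
}

/* Count how many buffers in a chain still carry the pkthdr flag. */
static int
toy_count_pkthdr(const struct toy_mbuf *m)
{
	int n = 0;

	for (; m != NULL; m = m->next) {
		if (m->flags & TOY_M_PKTHDR)
			n++;
	}
	return (n);
}
```

In this model, skipping the demote step in toy_append_fragment() would leave a second buffer with the flag set in the middle of the chain (the condition the m_unshare assertion complains about) and would leave its tags attached with nothing responsible for freeing them.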
Regards,
Navdeep