Re: git: b279e84a47dd - main - sctp: improve consistency

From: <tuexen_at_freebsd.org>
Date: Fri, 04 Aug 2023 21:12:35 UTC

> On 4. Aug 2023, at 17:16, Mark Johnston <markj@FreeBSD.org> wrote:
> 
> On Fri, Aug 04, 2023 at 03:53:31PM +0200, tuexen@freebsd.org wrote:
>>> On 4. Aug 2023, at 15:03, Kristof Provost <kp@FreeBSD.org> wrote:
>>> 
>>> On 29 Jul 2023, at 0:03, Michael Tuexen wrote:
>>> The branch main has been updated by tuexen:
>>>
>>> URL: https://cgit.FreeBSD.org/src/commit/?id=b279e84a47ddb59e55b5a3cec31c51cd41bf0dc3
>>>
>>> commit b279e84a47ddb59e55b5a3cec31c51cd41bf0dc3
>>> Author: Michael Tuexen <tuexen@FreeBSD.org>
>>> AuthorDate: 2023-07-28 12:36:11 +0000
>>> Commit: Michael Tuexen <tuexen@FreeBSD.org>
>>> CommitDate: 2023-07-28 12:36:11 +0000
>>>
>>> sctp: improve consistency
>>>
>>> This is simplifying a patch to address PR 260116.
>>>
>>> PR: 260116
>>> MFC after: 1 week
>>>
>>> It looks like this commit (or maybe the next one, c620788150d274c09a070ab486602c98407d73b0) causes a panic in the SCTP code during the sys/netpfil/pf/sctp:basic_v4 test:
>>> panic: Counter goes negative
>>> cpuid = 7
>>> time = 1691145655
>>> KDB: stack backtrace:
>>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe026ea9a3a0
>>> vpanic() at vpanic+0x132/frame 0xfffffe026ea9a4d0
>>> panic() at panic+0x43/frame 0xfffffe026ea9a530
>>> sctp_abort_notification() at sctp_abort_notification/frame 0xfffffe026ea9a540
>>> sctp_express_handle_sack() at sctp_express_handle_sack+0x647/frame 0xfffffe026ea9a640
>>> sctp_process_control() at sctp_process_control+0xf62/frame 0xfffffe026ea9a990
>>> sctp_common_input_processing() at sctp_common_input_processing+0x561/frame 0xfffffe026ea9ab10
>>> sctp_input_with_port() at sctp_input_with_port+0x1fa/frame 0xfffffe026ea9abe0
>>> sctp_input() at sctp_input+0x10/frame 0xfffffe026ea9abf0
>>> ip_input() at ip_input+0x2ab/frame 0xfffffe026ea9ac50
>>> netisr_dispatch_src() at netisr_dispatch_src+0xad/frame 0xfffffe026ea9acb0
>>> ether_demux() at ether_demux+0x17a/frame 0xfffffe026ea9ace0
>>> ether_nh_input() at ether_nh_input+0x39f/frame 0xfffffe026ea9ad30
>>> netisr_dispatch_src() at netisr_dispatch_src+0xad/frame 0xfffffe026ea9ad90
>>> ether_input() at ether_input+0xd9/frame 0xfffffe026ea9adf0
>>> epair_tx_start_deferred() at epair_tx_start_deferred+0xd7/frame 0xfffffe026ea9ae40
>>> taskqueue_run_locked() at taskqueue_run_locked+0xab/frame 0xfffffe026ea9aec0
>>> taskqueue_thread_loop() at taskqueue_thread_loop+0xd3/frame 0xfffffe026ea9aef0
>>> fork_exit() at fork_exit+0x82/frame 0xfffffe026ea9af30
>>> fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe026ea9af30
>>> --- trap 0, rip = 0x35c39ec8bda0, rsp = 0, rbp = 0x35c39ec8ec90 ---
>>> KDB: enter: panic
>>> 
>>> That seems to be panicking during SCTP packet handling, so I do not believe this to be a pf bug.
>>> Reverting both commits avoids the problem.
>>> It’s also failing during the CI tests: https://ci.freebsd.org/view/Test/job/FreeBSD-main-amd64-test/23957/console
>>> (To reproduce, kldload pf sctp ; cd /usr/tests/sys/netpfil/pf ; sudo kyua test sctp:basic_v4).
>> Thank you very much for that line!!
>> 
>> I think I know what the problem is. It comes down to an inconsistent handling of shutdown() calls.
>> This results in inconsistent sb_ccc / sb_acc values which are now caught.
>> 
>> I'm working on a patch which also fixes several syzkaller issues. Using the above line, I'll also make
>> sure it fixes the issue you are observing. Should be fixed by Monday.
> 
> In the meantime, it's impossible to run through the test suite.  Would
> it be possible to revert the offending commit(s) until a bug fix is
> ready?
I committed a fix for the particular problem that triggers the panic when
running the SCTP tests:
https://cgit.FreeBSD.org/src/commit/?id=efb04fb404b240a99c618e49174cd6260217edaa

Thanks to kp@ for the hint on how to run these particular tests. The SCTP
pf tests should now pass again.
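
For readers wondering how inconsistent accounting ends up as "Counter goes
negative": below is a minimal user-space sketch of that kind of check. It is
not the kernel code. The struct and function names are hypothetical, and it
assumes the panic comes from a guarded decrement on a byte counter such as
the sb_ccc / sb_acc values mentioned above.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a socket-buffer byte counter.
 * This is an illustration only, not the kernel's struct sockbuf.
 */
struct demo_counter {
	uint32_t bytes;		/* bytes currently accounted for */
};

/* Account for len bytes being queued. */
static void
demo_counter_add(struct demo_counter *c, uint32_t len)
{
	c->bytes += len;
}

/*
 * Release len bytes, but only if that many were accounted for.
 * Returns false in the situation where a debug kernel would
 * panic instead of letting the counter wrap around.
 */
static bool
demo_counter_sub(struct demo_counter *c, uint32_t len)
{
	if (c->bytes < len)
		return (false);	/* counter would go negative */
	c->bytes -= len;
	return (true);
}
```

If one code path queues 100 bytes with demo_counter_add() and two code paths
each release those same 100 bytes, the first demo_counter_sub() call succeeds
and the second returns false. That second call corresponds to the point where
the inconsistency is caught.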

Best regards
Michael