From: "Alexander V. Chernikov"
To: Andriy Gapon
Cc: freebsd-net
Subject: Re: network crash in nhop_free
Date: Mon, 30 Aug 2021 09:22:03 +0100
List-Archive: https://lists.freebsd.org/archives/freebsd-net

> On 30 Aug 2021, at 08:32, Andriy Gapon wrote:
>
> On 30/08/2021 10:28, Andriy Gapon wrote:
>> On 01/08/2021 16:36, Alexander V. Chernikov wrote:
>>>
>>>> On 10 Jul 2021, at 10:07, Andriy Gapon wrote:
>>>>
>>>> On 09/07/2021 00:02, Alexander V. Chernikov wrote:
>>>>> Hi Andriy,
>>>>> Could you by any chance provide a bit more info on the system networking configuration and the steps leading to panic?
>>>>> No chance for a coredump?
>>>>> destroy_nhgrp() suggests that there was a multipath route (default?) that was deleted.
>>>>> nhops are created with UMA_ALIGN_PTR, so I suspect there is garbage inside the nhgrp pointer.
>>>>
>>>> I've just reproduced the problem and got a crash dump.
>>>> The new panic is a little bit different, but I think that it confirms your analysis.
>>>> Also, you are right about the multipath route, although its creation was not intentional.
>>>
>>> Should be fixed by https://cgit.freebsd.org/src/commit/?id=054948bd81bb9e4e32449cf351b62e501b8831ff .
>> I have to report that, unfortunately, as of main bb958dcf3d8af3a033dacbf8133681c9b0c73b2f I can still reproduce the same panic using the same steps.

Thanks for reporting!

>> To be clear, as I reported two similar but still distinct panics, it's the first panic, "Misaligned access from kernel space!".
>> I should also add that the commit message does not really match my scenario.
>> In my case routes do not change that fast. I have generous pauses between starting and stopping ppp.
>> I have a feeling that there must be something more deterministic that leads to the crash.

I tried to reproduce it with delays, then removed the delays and got a similar panic. That’s how I got to the previous bug.
I’ll set up an arm64 machine, hopefully this week, and try to reproduce it.

>
> Some more details from today's crash:

Is there any chance you can share kernel/core?

>
> panic: Misaligned access from kernel space!
> cpuid = 0
> time = 1630308311
> KDB: stack backtrace:
> db_trace_self() at db_trace_self
> db_trace_self_wrapper() at db_trace_self_wrapper+0x30
> vpanic() at vpanic+0x184
> panic() at panic+0x44
> align_abort() at align_abort+0xb8
> handle_el1h_sync() at handle_el1h_sync+0x78
> --- exception, esr 0x96000021
> nhop_free() at nhop_free+0x100
> destroy_nhgrp() at destroy_nhgrp+0x38
> epoch_call_task() at epoch_call_task+0x158
> gtaskqueue_run_locked() at gtaskqueue_run_locked+0x178
> gtaskqueue_thread_loop() at gtaskqueue_thread_loop+0xc8
> fork_exit() at fork_exit+0x74
> fork_trampoline() at fork_trampoline+0x14
> Uptime: 11m17s
> Dumping 150 out of 998 MB:..3%..11%..22%..32%..43%..51%..62%..72%..83%
>
> get_curthread () at /usr/devel/git/rock/sys/arm64/include/pcpu.h:68
> 68 /usr/devel/git/rock/sys/arm64/include/pcpu.h: No such file or directory.
> (kgdb) bt
> #0 get_curthread () at /usr/devel/git/rock/sys/arm64/include/pcpu.h:68
> #1 doadump (textdump=textdump@entry=1) at /usr/devel/git/rock/sys/kern/kern_shutdown.c:417
> #2 0xffff0000003bebf0 in kern_reboot (howto=260) at /usr/devel/git/rock/sys/kern/kern_shutdown.c:504
> #3 0xffff0000003bf10c in vpanic (fmt=, ap=...) at /usr/devel/git/rock/sys/kern/kern_shutdown.c:947
> #4 0xffff0000003bee3c in panic (fmt=0x0) at /usr/devel/git/rock/sys/kern/kern_shutdown.c:871
> #5 0xffff0000006c2054 in align_abort (td=, frame=, esr=2516582433, far=16045693110842147062, lower=) at /usr/devel/git/rock/sys/arm64/arm64/trap.c:212
> #6
> #7 atomic_fetchadd_32_llsc (p=0xdeadc0dedeadc0f6, val=4294967295) at /usr/devel/git/rock/sys/arm64/include/atomic.h:316
> #8 atomic_fetchadd_32 (p=, val=4294967295) at /usr/devel/git/rock/sys/arm64/include/atomic.h:316
> #9 refcount_releasen (count=0xdeadc0dedeadc0f6, n=1) at /usr/devel/git/rock/sys/sys/refcount.h:152
> #10 refcount_release (count=0xdeadc0dedeadc0f6) at /usr/devel/git/rock/sys/sys/refcount.h:174
> #11 nhop_free (nh=) at /usr/devel/git/rock/sys/net/route/nhop_ctl.c:669
> #12 0xffff000000506268 in free_nhgrp_nhops (nhg_priv=0xffffa00000d31b98) at /usr/devel/git/rock/sys/net/route/nhgrp_ctl.c:423
> #13 destroy_nhgrp (nhg_priv=0xffffa00000d31b98) at /usr/devel/git/rock/sys/net/route/nhgrp_ctl.c:380
> #14 0xffff000000405434 in epoch_call_task (arg=) at /usr/devel/git/rock/sys/kern/subr_epoch.c:819
> #15 0xffff000000408ee4 in gtaskqueue_run_locked (queue=queue@entry=0xffffa00000c03c00) at /usr/devel/git/rock/sys/kern/subr_gtaskqueue.c:371
> #16 0xffff000000408c38 in gtaskqueue_thread_loop (arg=arg@entry=0xffff000089b69008) at /usr/devel/git/rock/sys/kern/subr_gtaskqueue.c:547
> #17 0xffff00000037701c in fork_exit (callout=0xffff000000408b6c , arg=0xffff000089b69008, frame=0xffff000087934990) at /usr/devel/git/rock/sys/kern/kern_fork.c:1087
>
> --
> Andriy Gapon
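
The numbers in the align_abort() frame above can be decoded by hand; the short Python sketch below does that arithmetic as a reading aid. It is not part of the original thread. The ESR and FAR values are copied from the backtrace. The helper names (decode_esr, describe) are made up for illustration. Treating 0xdeadc0dedeadc0de as the fill pattern a debug kernel writes over freed memory is an assumption, not something the thread states.

```python
# Values copied from the kgdb frame for align_abort() in the backtrace.
ESR = 2516582433               # esr argument, decimal as printed by kgdb
FAR = 16045693110842147062     # far argument: the faulting virtual address

# Assumption: this is the pattern a debug kernel writes over freed memory.
POISON = 0xdeadc0dedeadc0de


def decode_esr(esr):
    """Split an AArch64 ESR_ELx value into exception class and fault status."""
    ec = (esr >> 26) & 0x3F    # bits [31:26]: exception class
    dfsc = esr & 0x3F          # bits [5:0]: data fault status code
    return ec, dfsc


def describe(esr, far):
    """Summarise what the ESR/FAR pair says about the fault."""
    ec, dfsc = decode_esr(esr)
    return {
        "esr_hex": hex(esr),
        "far_hex": hex(far),
        "data_abort_same_el": ec == 0x25,   # data abort taken without an EL change
        "alignment_fault": dfsc == 0x21,    # DFSC 0b100001
        "aligned_for_u32": far % 4 == 0,    # the 32-bit atomic needs 4-byte alignment
        "offset_from_poison": far - POISON,
    }


if __name__ == "__main__":
    for key, value in describe(ESR, FAR).items():
        print(f"{key}: {value}")
```

Read this way, the fault address is the same 0xdeadc0dedeadc0f6 that kgdb shows as the refcount pointer, it is not 4-byte aligned, and it sits 24 bytes past the assumed poison value. That would be consistent with nhop_free() following a pointer loaded from already-freed memory, in line with the "garbage inside the pointer" suspicion earlier in the thread, though only the core dump can confirm it.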