From: Trev
To: FreeBSD-STABLE Mailing List
Subject: ZFS high CPU use after backup and panic on shutdown
Date: Sat, 25 Nov 2023 14:39:13 +1100 (AEDT)
Message-ID: <6ccc544a-7919-57ab-5572-db67fa09ae76@sentry.org>

I recently updated from source from FreeBSD 12-STABLE to FreeBSD 13-STABLE (stable/13-221a60a42: Tue Nov 14 15:36:40 AEDT 2023). Ever since, after my ZFS backup to an external USB drive completes, the system continues to consume 100% of one core of the 2011 Mac mini (i7, 16 GB).

Example backup command from my shell script:

zfs send data/www@${snapshot} | bzip2 > /mnt/zfs-data-www-${snapshot}.bz2

Neither top, vmstat nor iostat gives any clue as to which system process is using 100% of one core. To stop this I have to shut down and reboot the system, which I do with "shutdown -r now". This always results in a kernel panic:

Waiting (max 60 seconds) for system process `vnlru' to stop... done
Waiting (max 60 seconds) for system process `syncer' to stop...
Syncing disks, vnodes remaining... 4 6 4 2 1 2 1 0 0 0 done
All buffers synced.
Uptime: 1d23h34m0s

Fatal trap 12: page fault while in kernel mode
cpuid = 3; apic id = 03
fault virtual address   = 0x440
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80584d3c
stack pointer           = 0x28:0xfffffe00c7352d80
frame pointer           = 0x28:0xfffffe00c7352df0
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 0 (arc_prune)
trap number             = 12
panic: page fault
cpuid = 3
time = 1700828648
KDB: stack backtrace:
#0  0xffffffff805bf675 at kdb_backtrace+0x65
#1  0xffffffff8057b0d2 at vpanic+0x152
#2  0xffffffff8057af73 at panic+0x43
#3  0xffffffff80845519 at trap_fatal+0x389
#4  0xffffffff8084556f at trap_pfault+0x4f
#5  0xffffffff8081f30e at calltrap+0x8
#6  0xffffffff81a7274a at arc_prune_task+0x7a
#7  0xffffffff81a2865f at taskq_run+0x1f
#8  0xffffffff805d2e52 at taskqueue_run_locked+0x162
#9  0xffffffff805d3d72 at taskqueue_thread_loop+0xb2
#10 0xffffffff8053ed81 at fork_exit+0x71
#11 0xffffffff8082038e at fork_trampoline+0xe
Uptime: 1d23h34m0s
Dumping 1306 out of 16346 MB:..2%..12%..21%..31%..41%..51%..61%..72%..81%..91%

The current process is always "arc_prune". Where do I go from here to resolve this?
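P.S. For completeness, the backup step boils down to something like the sketch below. The dataset list and the /mnt mount point are illustrative, not my exact script, and the zfs commands themselves are commented out so the sketch is safe to dry-run:

```shell
#!/bin/sh
# Sketch of the send-to-file backup loop. Dataset names (data/www,
# data/home) and the /mnt mount point are placeholders.
snapshot=$(date +%Y-%m-%d)

for dataset in data/www data/home; do
    # Flatten "data/www" -> "data-www" for the output file name.
    flat=$(echo "$dataset" | tr '/' '-')
    outfile="/mnt/zfs-${flat}-${snapshot}.bz2"

    # The real script runs these; commented out here for a dry run:
    # zfs snapshot "${dataset}@${snapshot}"
    # zfs send "${dataset}@${snapshot}" | bzip2 > "$outfile"
    echo "would write $outfile"
done
```

The send completes and the .bz2 files on the USB drive are intact; the 100% CPU use only shows up afterwards.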