From: Trev
To: FreeBSD-STABLE Mailing List
Subject: ZFS high CPU use after backup and panic on shutdown
Date: Sat, 25 Nov 2023 14:39:13 +1100 (AEDT)
Message-ID: <6ccc544a-7919-57ab-5572-db67fa09ae76@sentry.org>

I recently updated from source from FreeBSD 12-STABLE to FreeBSD 13-STABLE (stable/13-221a60a42: Tue Nov 14 15:36:40 AEDT 2023). Ever since, after my ZFS backup to an external USB drive completes, the system continues to consume 100% of one core of the 2011 Mac mini (i7, 16 GB).

Example backup command from my shell script:

zfs send data/www@${snapshot} | bzip2 > /mnt/zfs-data-www-${snapshot}.bz2

Neither top, vmstat nor iostat gives any clue as to which system process is using 100% of one core. To stop this I have to shut down and reboot the system, which I do with "shutdown -r now". This always results in a kernel panic:

Waiting (max 60 seconds) for system process `vnlru' to stop... done
Waiting (max 60 seconds) for system process `syncer' to stop...
Syncing disks, vnodes remaining... 4 6 4 2 1 2 1 0 0 0 done
All buffers synced.
Uptime: 1d23h34m0s

Fatal trap 12: page fault while in kernel mode
cpuid = 3; apic id = 03
fault virtual address   = 0x440
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80584d3c
stack pointer           = 0x28:0xfffffe00c7352d80
frame pointer           = 0x28:0xfffffe00c7352df0
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 0 (arc_prune)
trap number             = 12
panic: page fault
cpuid = 3
time = 1700828648
KDB: stack backtrace:
#0  0xffffffff805bf675 at kdb_backtrace+0x65
#1  0xffffffff8057b0d2 at vpanic+0x152
#2  0xffffffff8057af73 at panic+0x43
#3  0xffffffff80845519 at trap_fatal+0x389
#4  0xffffffff8084556f at trap_pfault+0x4f
#5  0xffffffff8081f30e at calltrap+0x8
#6  0xffffffff81a7274a at arc_prune_task+0x7a
#7  0xffffffff81a2865f at taskq_run+0x1f
#8  0xffffffff805d2e52 at taskqueue_run_locked+0x162
#9  0xffffffff805d3d72 at taskqueue_thread_loop+0xb2
#10 0xffffffff8053ed81 at fork_exit+0x71
#11 0xffffffff8082038e at fork_trampoline+0xe
Uptime: 1d23h34m0s
Dumping 1306 out of 16346 MB:..2%..12%..21%..31%..41%..51%..61%..72%..81%..91%

The current process is always "arc_prune". Where do I go from here to resolve this?
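P.S. For completeness, the backup step boils down to something like the sketch below. The dataset list and the /mnt mount point are illustrative, not my exact script, and the zfs commands themselves are commented out so the sketch is safe to dry-run:

```shell
#!/bin/sh
# Sketch of the send-to-file backup loop. Dataset names (data/www,
# data/home) and the /mnt mount point are placeholders.
snapshot=$(date +%Y-%m-%d)

for dataset in data/www data/home; do
    # Flatten "data/www" -> "data-www" for the output file name.
    flat=$(echo "$dataset" | tr '/' '-')
    outfile="/mnt/zfs-${flat}-${snapshot}.bz2"

    # The real script runs these; commented out here for a dry run:
    # zfs snapshot "${dataset}@${snapshot}"
    # zfs send "${dataset}@${snapshot}" | bzip2 > "$outfile"
    echo "would write $outfile"
done
```

The send completes and the .bz2 files on the USB drive are intact; the 100% CPU use only shows up afterwards.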