Date: Sun, 12 Nov 2023 20:47:36 +0200
From: Konstantin Belousov
To: Alexander Motin
Cc: Ronald Klop, current@freebsd.org
Subject: Re: crash zfs_clone_range()
List-Id: Discussions about the use of FreeBSD-current
List-Archive: https://lists.freebsd.org/archives/freebsd-current

On Sun, Nov 12, 2023 at 11:51:40AM -0500, Alexander Motin wrote:
> Hi Ronald,
>
> As far as I can see, the clone request to ZFS came through nullfs, and it
> crashed immediately on entry. I've never been a VFS layer expert, but to me
> it may be a nullfs problem, not zfs. Is there a chance you were
> (un-)mounting something when this happened?

It is not a nullfs issue, I believe, but the lack of a busy reference on
the upper mount. I think https://reviews.freebsd.org/D42554 should cover it.

>
> On 10.11.2023 05:12, Ronald Klop wrote:
> > Hi,
> >
> > Had this crash today on RPI4/15-CURRENT.
> >
> > FreeBSD rpi4 15.0-CURRENT FreeBSD 15.0-CURRENT #19
> > main-b0203aaa46-dirty: Sat Nov  4 11:48:33 CET 2023 ronald@rpi4:/home/ronald/dev/freebsd/obj/home/ronald/dev/freebsd/src/arm64.aarch64/sys/GENERIC-NODEBUG
> > arm64
> >
> > $ sysctl -a | grep bclon
> > vfs.zfs.bclone_enabled: 1
> >
> > I started a jail with poudriere to build a package. The jail uses null
> > mounts over ZFS.
> >
> > [root]# cu -s 115200 -l /dev/cuaU0
> > Connected
> >
> > db> bt
> > Tracing pid 95213 tid 100438 td 0xffff0000e1e97900
> > db_trace_self() at db_trace_self
> > db_stack_trace() at db_stack_trace+0x120
> > db_command() at db_command+0x2e4
> > db_command_loop() at db_command_loop+0x58
> > db_trap() at db_trap+0x100
> > kdb_trap() at kdb_trap+0x334
> > handle_el1h_sync() at handle_el1h_sync+0x18
> > --- exception, esr 0xf2000000
> > kdb_enter() at kdb_enter+0x48
> > vpanic() at vpanic+0x1dc
> > panic() at panic+0x48
> > data_abort() at data_abort+0x2fc
> > handle_el1h_sync() at handle_el1h_sync+0x18
> > --- exception, esr 0x96000004
> > rms_rlock() at rms_rlock+0x1c
> > zfs_clone_range() at zfs_clone_range+0x68
> > zfs_freebsd_copy_file_range() at zfs_freebsd_copy_file_range+0x19c
> > null_bypass() at null_bypass+0x118
> > vn_copy_file_range() at vn_copy_file_range+0x18c
> > kern_copy_file_range() at kern_copy_file_range+0x36c
> > sys_copy_file_range() at sys_copy_file_range+0x8c
> > do_el0_sync() at do_el0_sync+0x634
> > handle_el0_sync() at handle_el0_sync+0x48
> > --- exception, esr 0x56000000
> >
> >
> > Oh.. While typing this I rebooted the machine and it happened again. I
> > didn't start anything in particular although the machine runs some
> > jails.
> >
> >   x0: 0x00000000000000e0
> >   x1: 0xffffa00090317a48
> >   x2: 0xffffa000f79d4f00
> >   x3: 0xffffa000c61a44a8
> >   x4: 0xffff0000deefe460 ($d.2 + 0xdd776560)
> >   x5: 0xffffa001250e4c00
> >   x6: 0xffff0000e54025b5 ($d.5 + 0xc)
> >   x7: 0x000000000000030a
> >   x8: 0xffff0000e1559000 ($d.2 + 0xdfdd1100)
> >   x9: 0x0000000000000001
> >  x10: 0x0000000000000000
> >  x11: 0x0000000000000001
> >  x12: 0x0000000000000002
> >  x13: 0x0000000000000000
> >  x14: 0x0000000000000001
> >  x15: 0x0000000000000000
> >  x16: 0xffff0000016dce88 (__stop_set_modmetadata_set + 0x1310)
> >  x17: 0xffff0000004e0d44 (rms_rlock + 0x0)
> >  x18: 0xffff0000deefe280 ($d.2 + 0xdd776380)
> >  x19: 0x0000000000000000
> >  x20: 0xffff0000deefe460 ($d.2 + 0xdd776560)
> >  x21: 0x7fffffffffffffff
> >  x22: 0xffffa00090317a48
> >  x23: 0xffffa000f79d4f00
> >  x24: 0xffffa001067ef910
> >  x25: 0x00000000000000e0
> >  x26: 0xffffa000158a8000
> >  x27: 0x0000000000000000
> >  x28: 0xffffa000158a8000
> >  x29: 0xffff0000deefe280 ($d.2 + 0xdd776380)
> >   sp: 0xffff0000deefe280
> >   lr: 0xffff000001623564 (zfs_clone_range + 0x6c)
> >  elr: 0xffff0000004e0d60 (rms_rlock + 0x1c)
> > spsr: 0x00000000a0000045
> >  far: 0x0000000000000108
> >  esr: 0x0000000096000004
> > panic: data abort in critical section or under mutex
> > cpuid = 1
> > time = 1699610885
> > KDB: stack backtrace:
> > db_trace_self() at db_trace_self
> > db_trace_self_wrapper() at db_trace_self_wrapper+0x38
> > vpanic() at vpanic+0x1a0
> > panic() at panic+0x48
> > data_abort() at data_abort+0x2fc
> > handle_el1h_sync() at handle_el1h_sync+0x18
> > --- exception, esr 0x96000004
> > rms_rlock() at rms_rlock+0x1c
> > zfs_clone_range() at zfs_clone_range+0x68
> > zfs_freebsd_copy_file_range() at zfs_freebsd_copy_file_range+0x19c
> > null_bypass() at null_bypass+0x118
> > vn_copy_file_range() at vn_copy_file_range+0x18c
> > kern_copy_file_range() at kern_copy_file_range+0x36c
> > sys_copy_file_range() at sys_copy_file_range+0x8c
> > do_el0_sync() at do_el0_sync+0x634
> > handle_el0_sync() at handle_el0_sync+0x48
> > --- exception, esr 0x56000000
> > KDB: enter: panic
> > [ thread pid 3792 tid 100394 ]
> > Stopped at      kdb_enter+0x48: str     xzr, [x19, #768]
> > db>
> >
> > I'll keep the debugger open for a while. Can I type something for
> > additional info?
> >
> > Regards,
> > Ronald.
>
> --
> Alexander Motin