From nobody Fri Sep 01 02:18:30 2023 X-Original-To: dev-commits-src-main@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4RcMCD3w7Bz4rvLP; Fri, 1 Sep 2023 02:18:32 +0000 (UTC) (envelope-from kevans@FreeBSD.org) Received: from smtp.freebsd.org (smtp.freebsd.org [96.47.72.83]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "smtp.freebsd.org", Issuer "R3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4RcMCD3N65z4HS0; Fri, 1 Sep 2023 02:18:32 +0000 (UTC) (envelope-from kevans@FreeBSD.org) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1693534712; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=M0FvUJz3gWCxdpouV8eacwaakOXBGK/PpbOlCwZbyBs=; b=k2XbLOvzRaiqSVzhsmUlcUsmxSrc/Sc5XUlY3iGH/hOw7KgG85zWNqEGwX2ioXBAnvnKJu bWrGXEMn2s/M+WLFZGvcFPqpTgWDNXWYepI98cMqI7hgNM8lX7OFkG4a27jqH6rXMs/C5A Rkh6T61tRgpOQyVtfs/udf2cJKYrQUdV8tJ26dq/6HI7wJZouaKT1dfs39E1wluYxwraJo cOg3OPFAqtR3nPnB1QIw3Q2G+6D0MSx7ZuC34qFf6LCDYaMbyOcde1L2KoWhCgqU5np/i2 xVX3HGTaw0IjAj0lppxk/+x0L72J8rb2MEqT7MEpEvg4JXzlUshcgRKMpR65RQ== ARC-Seal: i=1; s=dkim; d=freebsd.org; t=1693534712; a=rsa-sha256; cv=none; b=j/gUBHMv6ovfhGfE5+FqMCAFx5ynDJgV+xpRZCDJGxRaAKBP8KqK3UnHGUKlEr6dov8EkJ zDgAKEgdcc2aQDKgHepyp6Kkr9WmssE5jFXjEOB24g8/FhsM/XhebrNQNBG2vvmJV5zlMH thakvcuGMbZ5qDxau3O+2ZaQjb/RR8vnGh1CqTAVifVX6HJbF1pERh0mes9vROjdI8UyPm Bt4qXLEj3WTJ7wPVyif0+B04GWTmWUQs8pr00ZDJp4RTMc2LWJIEqFxuYtveKM50hmSpV/ t4DRdnbxwjGxHRRxZZpvL/AhqmRkJZ2ouKeLlf9v2n3iEuMJRKc9+xi1v73omg== ARC-Authentication-Results: i=1; mx1.freebsd.org; none ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1693534712; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=M0FvUJz3gWCxdpouV8eacwaakOXBGK/PpbOlCwZbyBs=; b=jYtqo7Uee7X9OjB3rjVVGbjmq0hMUPp20RIVLmrfNd7Nc0peI+xh7ZwoknYscVjRccRIKe I3FkOpbLiKUW3wSL/6/tGaZYdhKo/DGtgjmeSqt7FjrZyjx17Hr08uBiFX+nNXvN3sgGwY uTD/fbTc1u7gm8Wwb2tGgdC+pr1apbL/Llc9L5ML36Wqu8hpwVPG3jwaOSJAYyeZyULWEl HYmLHDE9Fdli4qNBvoUMaBdPBdbwdPCA9bJIcG21jUurLlshD08y6KIr5II6GtlJV6y8R9 RUljQkeSpin7+Te82r12CsBUn1cxRdpCl66qPWvCjTjPGrdRuxZWGR3oKQ2w1A== Received: from [10.9.4.95] (unknown [209.182.120.176]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (Client did not present a certificate) (Authenticated sender: kevans/mail) by smtp.freebsd.org (Postfix) with ESMTPSA id 4RcMCC5bk7z1Csc; Fri, 1 Sep 2023 02:18:31 +0000 (UTC) (envelope-from kevans@FreeBSD.org) Message-ID: Date: Thu, 31 Aug 2023 21:18:30 -0500 List-Id: Commit messages for the main branch of the src repository List-Archive: https://lists.freebsd.org/archives/dev-commits-src-main List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-dev-commits-src-main@freebsd.org X-BeenThere: dev-commits-src-main@freebsd.org MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:102.0) Gecko/20100101 Thunderbird/102.13.0 Subject: Re: git: 315ee00fa961 - main - zfs: merge openzfs/zfs@804414aad Content-Language: en-US To: Alexander Motin , Gleb Smirnoff , Martin Matuska Cc: src-committers@freebsd.org, dev-commits-src-all@freebsd.org, dev-commits-src-main@freebsd.org References: <202308270509.37R596B5048298@gitrepo.freebsd.org> From: Kyle Evans In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit On 8/30/23 20:01, Alexander Motin wrote: > On 30.08.2023 20:09, Gleb Smirnoff wrote: >>    Hi Martin, >> >> we hit this panic with this update: >> >> panic: VERIFY(avl_is_empty(&lwb->lwb_vdev_tree)) failed >> >> cpuid = 8 >> time = 1693432004 >> KDB: stack backtrace: >> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame >> 0xfffffe09519aa800 >> vpanic() at vpanic+0x132/frame 0xfffffe09519aa930 >> spl_panic() at spl_panic+0x3a/frame 0xfffffe09519aa990 >> zil_free_lwb() at zil_free_lwb+0x1de/frame 0xfffffe09519aa9c0 >> zil_sync() at zil_sync+0x3cb/frame 0xfffffe09519aaaa0 >> [Wed Aug 30 21:46:44 2023]dmu_objset_sync() at >> dmu_objset_sync+0x51b/frame 0xfffffe09519aab70 >> dsl_pool_sync() at dsl_pool_sync+0x11d/frame 0xfffffe09519aabf0 >> spa_sync() at spa_sync+0xc68/frame 0xfffffe09519aae20 >> txg_sync_thread() at txg_sync_thread+0x2eb/frame 0xfffffe09519aaef0 >> fork_exit() at fork_exit+0x82/frame 0xfffffe09519aaf30 >> fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe09519aaf30 >> --- trap 0x26d90a2f, rip = 0xbba54c81b25fd8d4, rsp = >> 0xff60c2f6f9874bcd, rbp = 0x37940e3e9793bd40 -- >> >> I have two cores dumped for this panic. >> >> Our previous version is based on >> 9c0a2d522f3a18fc795a7dd6266d2be2aff4b084, which >> I guess has the most recent ZFS merge openzfs/zfs@009d3288d. >> > Hi Gleb, > > It is the first time I see a panic like this. I'll think about it > tomorrow. But I'd appreciate any information on what is your workload > and what are you doing related to ZIL (O_SYNC, fsync(), sync=always, > etc) to trigger it? What is your pool configuration? > Hello! Probably not related, but I seem to have hit a use-after-free in ZIL bits; is this something you're interested in? Backtrace is: #17 zio_nowait (zio=0xdeadc0dedeadc0de) at /usr/src.kbsd/sys/contrib/openzfs/module/zfs/zio.c:2367 #18 0xffff00000160720c in zil_lwb_write_issue (zilog=zilog@entry=0xffffa00021631800, lwb=0xffffa00933d34c40) at /usr/src.kbsd/sys/contrib/openzfs/module/zfs/zil.c:1935 --Type for more, q to quit, c to continue without paging-- #19 0xffff0000016031c0 in zil_commit_writer (zilog=0xffffa00021631800, zcw=0xffffa000fb01e140) at /usr/src.kbsd/sys/contrib/openzfs/module/zfs/zil.c:2985 #20 zil_commit_impl (zilog=0xffffa00021631800, foid=) at /usr/src.kbsd/sys/contrib/openzfs/module/zfs/zil.c:3489 #21 0xffff000001602f44 in zil_commit (zilog=, foid=) at /usr/src.kbsd/sys/contrib/openzfs/module/zfs/zil.c:3452 #22 0xffff0000015fdac8 in zfs_fsync (zp=0xffffa007e27b23a0, syncflag=, cr=) at /usr/src.kbsd/sys/contrib/openzfs/module/zfs/zfs_vnops.c:75 #23 0xffff00000056a864 in VOP_FSYNC (vp=0xffffa001d2003000, waitfor=1, td=0xffff00015aa94620) at ./vnode_if.h:763 #24 kern_fsync (td=0xffff00015aa94620, fd=, fullsync=true) at /usr/src.kbsd/sys/kern/vfs_syscalls.c:3556 #25 0xffff000000800630 in syscallenter (td=0xffff00015aa94620) at /usr/src.kbsd/sys/arm64/arm64/../../kern/subr_syscall.c:187 #26 svc_handler (td=0xffff00015aa94620, frame=) at /usr/src.kbsd/sys/arm64/arm64/trap.c:198 #27 do_el0_sync (td=0xffff00015aa94620, frame=) at /usr/src.kbsd/sys/arm64/arm64/trap.c:625 If I pop up to the zil_lwb_write_issue frame: (kgdb) print *lwb $4 = {lwb_zilog = 0xdeadc0dedeadc0de, lwb_blk = {blk_dva = {{dva_word = {16045693110842147038, 16045693110842147038}}, {dva_word = {16045693110842147038, 16045693110842147038}}, {dva_word = {16045693110842147038, 16045693110842147038}}}, blk_prop = 16045693110842147038, blk_pad = {16045693110842147038, 16045693110842147038}, blk_phys_birth = 16045693110842147038, blk_birth = 16045693110842147038, blk_fill = 16045693110842147038, blk_cksum = {zc_word = {16045693110842147038, 16045693110842147038, 16045693110842147038, 16045693110842147038}}}, lwb_slim = 3735929054, lwb_slog = 3735929054, lwb_error = -559038242, lwb_nmax = -559038242, lwb_nused = -559038242, lwb_nfilled = -559038242, lwb_sz = -559038242, lwb_state = 3735929054, lwb_buf = 0xdeadc0dedeadc0de , lwb_child_zio = 0xdeadc0dedeadc0de, lwb_write_zio = 0xdeadc0dedeadc0de, lwb_root_zio = 0xdeadc0dedeadc0de, lwb_issued_timestamp = -2401050962867404578, lwb_issued_txg = 16045693110842147038, lwb_alloc_txg = 16045693110842147038, lwb_max_txg = 16045693110842147038, lwb_node = {list_next = 0xdeadc0dedeadc0de, list_prev = 0xdeadc0dedeadc0de}, lwb_issue_node = {list_next = 0xdeadc0dedeadc0de, list_prev = 0xdeadc0dedeadc0de}, lwb_itxs = {list_size = 16045693110842147038, list_offset = 16045693110842147038, list_head = {list_next = 0xdeadc0dedeadc0de, list_prev = 0xdeadc0dedeadc0de}}, lwb_waiters = { list_size = 16045693110842147038, list_offset = 16045693110842147038, list_head = {list_next = 0xdeadc0dedeadc0de, list_prev = 0xdeadc0dedeadc0de}}, lwb_vdev_tree = {avl_root = 0xdeadc0dedeadc0de, avl_compar = 0xdeadc0dedeadc0de, avl_offset = 16045693110842147038, avl_numnodes = 16045693110842147038}, lwb_vdev_lock = {lock_object = { lo_name = 0xdeadc0dedeadc0de , lo_flags = 3735929054, lo_data = 3735929054, lo_witness = 0xdeadc0dedeadc0de}, sx_lock = 16045693110842147038}} It seems to have clearly been stomped on by uma trashing. Encountered while running a pkgbase build, I think while it was in the packaging phase. I note in particular in that frame: (kgdb) p/x lwb->lwb_issued_timestamp $4 = 0xdeadc0dedeadc0de So I guess it was freed sometime during one of the previous two zio_nowait() calls. Thanks, Kyle Evans