From nobody Tue Oct 31 14:16:00 2023 X-Original-To: freebsd-fs@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4SKXHQ3D2Cz4yQtj for ; Tue, 31 Oct 2023 14:16:02 +0000 (UTC) (envelope-from mjguzik@gmail.com) Received: from mail-oi1-x22c.google.com (mail-oi1-x22c.google.com [IPv6:2607:f8b0:4864:20::22c]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "smtp.gmail.com", Issuer "GTS CA 1D4" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4SKXHQ0rGGz3Ngw for ; Tue, 31 Oct 2023 14:16:02 +0000 (UTC) (envelope-from mjguzik@gmail.com) Authentication-Results: mx1.freebsd.org; none Received: by mail-oi1-x22c.google.com with SMTP id 5614622812f47-3add37de892so2874898b6e.1 for ; Tue, 31 Oct 2023 07:16:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1698761761; x=1699366561; darn=freebsd.org; h=cc:to:subject:message-id:date:from:references:in-reply-to :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=0ZfA8SyXKR32Bbnq/28ddJC/pC65dUPe0rsyyVHuIC4=; b=fgIVyVF8AEcVNMIL42AXTz4sxPxsNUxRHfKe7weFU7+NqYAqTNp1raDzP9Oz55BxBQ qJA0sJVBwztJ6EF2APnYV9gbOKQqqRyp0Qvok6LYipdqy6RZYx7sn+im+HvbhyFJWrAw HNEmVPJrAEcvJuxJTCtbOh3k1XXqJx71ntci5ptcdpRncxxNzfKEi4/ONFFRyruFK8bm IzYTgfefxnILiPplZ8fTgNq2HStIxUUpD9JDecojIPri8SDuo7k4DDqQOorZ75aiMGfc wdy2udrbF1fE1+avZH7/UPRpwEpLQ/QP2IBsOD3i++9MhMgVmwGBNszl0F8DcjKP4o0U hZIA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698761761; x=1699366561; h=cc:to:subject:message-id:date:from:references:in-reply-to :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=0ZfA8SyXKR32Bbnq/28ddJC/pC65dUPe0rsyyVHuIC4=; b=rUZsC5K7EpB8Cs4v6FjpbMFD3mjPzIXJcrB1xjuGI+CajMYkR1asphpFVHMfU6UprO FpUDHXhkuGSihkyo4ibrTZJAxpfKIhOUoIkTRUqrm2JmTCxVUY4GLFahwPv8qpyqDlmH IUS6dqgeS9BKqsVNg365mJfCFb0PXZSsQUxYcQcoiQU7rT9eUs58BEPT+Na1bo5QicQH HBwOtrcihWKNdJHaFctuJzU4pwGmpd37igqgPAwJzw9gaMoFOSYkpQVIQHryIXzRu51v +Han4qbM4NR4XdxsGldVQ4QJfn4hfbIJ3APnPgooG8YZuYYgfjS3t3BsjOWM2zY71ACC gpGA== X-Gm-Message-State: AOJu0YwU/JAckt6NEFRkeLjJ8G3Dxz/tFQtpSDbvAILnFFUDqUxN6uge 8rL3skb1aXe826WSGdDPGEZE/XYDEmjx6c50scFC1g42 X-Google-Smtp-Source: AGHT+IHHXrvKRRa5cPzdNMJmfSQBpQmktEpIm7GRPKMCuOjE3KPpwt1vdsAWYiLBkX1ufod1t46ivXxPy1b1ca+LvAg= X-Received: by 2002:aca:1816:0:b0:3a6:febf:fb with SMTP id h22-20020aca1816000000b003a6febf00fbmr10615213oih.22.1698761760670; Tue, 31 Oct 2023 07:16:00 -0700 (PDT) List-Id: Filesystems List-Archive: https://lists.freebsd.org/archives/freebsd-fs List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-freebsd-fs@freebsd.org MIME-Version: 1.0 Received: by 2002:a8a:d1:0:b0:4f0:1250:dd51 with HTTP; Tue, 31 Oct 2023 07:16:00 -0700 (PDT) In-Reply-To: <67FBC292-E83D-4171-BE79-D28D5578F031@mit.edu> References: <67FBC292-E83D-4171-BE79-D28D5578F031@mit.edu> From: Mateusz Guzik Date: Tue, 31 Oct 2023 15:16:00 +0100 Message-ID: Subject: Re: vnode_init took over 6 minutes To: John F Carr Cc: Freebsd fs Content-Type: text/plain; charset="UTF-8" X-Spamd-Bar: ---- X-Rspamd-Pre-Result: action=no action; module=replies; Message is reply to one we originated X-Spamd-Result: default: False [-4.00 / 15.00]; REPLY(-4.00)[]; ASN(0.00)[asn:15169, ipnet:2607:f8b0::/32, country:US] X-Rspamd-Queue-Id: 4SKXHQ0rGGz3Ngw On 10/31/23, John F Carr wrote: > During a poudriere build control-T showed an awk process was stuck > for over 6 minutes trying to acquire a lock in vnode_init. > > load: 8.58 cmd: awk 51238 [running] 392.88r 0.00u 392.88s 100% 2604k > __mtx_lock_sleep+0xf8 __mtx_lock_flags+0xa4 vnode_init+0xc3 > keg_alloc_slab+0x277 zone_import+0x143 cache_alloc+0x3ed > cache_alloc_retry+0x23 getnewvnode_reserve+0x20 zfs_zget+0x1f > zfs_dirent_lookup+0x16d zfs_dirlook+0x7f zfs_lookup+0x3e0 > zfs_freebsd_cachedlookup+0x74 vfs_cache_lookup+0xa7 > cache_fplookup_noentry+0x241 cache_fplookup+0x575 namei+0x1ea > vn_open_cred+0x48d > > The stack trace was the same for several minutes. System CPU time was > increasing. > > Address vnode_init+0xc3 corresponds to the mtx_lock call here. > > vp->v_holdcnt = VHOLD_NO_SMR; > vp->v_type = VNON; > mtx_lock(&vnode_list_mtx); > TAILQ_INSERT_BEFORE(vnode_list_free_marker, vp, v_vnodelist); > mtx_unlock(&vnode_list_mtx); > return (0); > > Address __mtx_lock_sleep+0xf8 is the instruction after a call to > lock_delay. > > ps says the command line was > > /usr/bin/awk -f /usr/bin/awk > old-default/2023-10-31_08h21m03s/.poudriere.builders > old-default/2023-10-31_08h21m03s/.poudriere.buildname ... > > with the full list of input files exceeding the ~2KB command line length > limit of ps. > "/usr/bin/awk" is probably not the real second argument. It would cause an > immediate syntax error. > > The hang resolved within a few more minutes and poudriere is continuing > happily. > > I have never seen such behavior before. Code in vfs_subr.c tries not to > hold the > vnode_list_mtx lock for too long. > > Any thoughts? > > > FreeBSD 13.2-STABLE up through commit b180f0040f95, 24 core 48 thread Zen > 2, > zfs pool on SSD connected via 12 Gbps SAS with cache on NVME, 160 GB RAM. > what does "sysctl vfs.vnode" say -- Mateusz Guzik