From nobody Tue Oct 31 15:42:10 2023 X-Original-To: freebsd-fs@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4SKZBr3NqBz4yX2H for ; Tue, 31 Oct 2023 15:42:12 +0000 (UTC) (envelope-from mjguzik@gmail.com) Received: from mail-oo1-xc31.google.com (mail-oo1-xc31.google.com [IPv6:2607:f8b0:4864:20::c31]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "smtp.gmail.com", Issuer "GTS CA 1D4" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4SKZBr0clyz3XCB for ; Tue, 31 Oct 2023 15:42:12 +0000 (UTC) (envelope-from mjguzik@gmail.com) Authentication-Results: mx1.freebsd.org; none Received: by mail-oo1-xc31.google.com with SMTP id 006d021491bc7-5872c6529e1so437128eaf.0 for ; Tue, 31 Oct 2023 08:42:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1698766931; x=1699371731; darn=freebsd.org; h=cc:to:subject:message-id:date:from:references:in-reply-to :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=XlsqFRRu8jGVXKMFLZE2HUhuAJpkE/TTAurOJG4znNM=; b=YaNGlKLfh8U6XGftIxAWL2UFvT2t66CbY35EHtqIFcwczk3WDdUGuW9VzAvb5xzlBr R/6F6U2R0Rt4dzfl8/9bb57v3qTqBJBQ62SiRIpy5h6vIAYluxxd9twrrFQSz6bZuaTq IulzRzQ+jo6K4HxHSH141ZRSxAkHTsdEW1AIyXWb5F2nXRAEODW4ZjuEBixiWFd5Fq9r NsVhIjatjVOq1ukKoObCyVFcNcrk8kjO0n2XD2chpMGpc+0jPlj1lhVxvHxCLyh00EC+ VNIeuzN/MjzhN4PbncHREUWEBXISeLCw3K0KuJfYdNaa90FLGiSC5mUQldSUuVziT/hG i/gA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698766931; x=1699371731; h=cc:to:subject:message-id:date:from:references:in-reply-to :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=XlsqFRRu8jGVXKMFLZE2HUhuAJpkE/TTAurOJG4znNM=; b=eRVFpcb3VKmGBLTQLxqrg4VozakrmdwV4zOQnNYdEsffXLRmuobwVj0e5r2uu9uPaq sDgjGrcEWGT+RLxkoAzL+9AiMxi24/1LdCNlJGi9SxNbXSpDOe+bDj8AlAp6ToDzh3zn EXpzDxYTfT2EkC3MKWmwFaHT/IFV8xRC/E4QcQM2UnwTSVzv5lgQl5EFASZM25FBPH7B Vgp+dtejx7w/3SMltA9EYwhzLBX1MArWmT4+cmDQnCaGLGZQ7bQ2vZgut1MW1h9HMNIz qV970A1OKIW9GYY7dgB8E2eQAeAwDqyv4lSl5LxnhMjbs+byHS5WkEbiWHyLeG8MqB4v 9krA== X-Gm-Message-State: AOJu0YxWhFFU+npUWnF4pkkKAaSSVzyopslZ4ssxBBHK1zBiTrtakXrK 7dey1j/Ds01O7nq8hzT/ZGq78idCJqZPbmyZe1c= X-Google-Smtp-Source: AGHT+IEQI7D50K3CJ/E9uxZfBUrWSZSH+/918txpCj+t9raEMZu3sT/zgNng6AxcCuXX6TL3oZV4jGfFA3hX1BWkjPo= X-Received: by 2002:a05:6820:3d3:b0:586:a887:4633 with SMTP id s19-20020a05682003d300b00586a8874633mr10967917ooj.5.1698766930797; Tue, 31 Oct 2023 08:42:10 -0700 (PDT) List-Id: Filesystems List-Archive: https://lists.freebsd.org/archives/freebsd-fs List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-freebsd-fs@freebsd.org MIME-Version: 1.0 Received: by 2002:a8a:d1:0:b0:4f0:1250:dd51 with HTTP; Tue, 31 Oct 2023 08:42:10 -0700 (PDT) In-Reply-To: <8F17FEC3-C3CB-410D-ACEC-ADD12B3A747A@mit.edu> References: <67FBC292-E83D-4171-BE79-D28D5578F031@mit.edu> <8F17FEC3-C3CB-410D-ACEC-ADD12B3A747A@mit.edu> From: Mateusz Guzik Date: Tue, 31 Oct 2023 16:42:10 +0100 Message-ID: Subject: Re: vnode_init took over 6 minutes To: John F Carr Cc: Freebsd fs Content-Type: text/plain; charset="UTF-8" X-Spamd-Bar: ---- X-Rspamd-Pre-Result: action=no action; module=replies; Message is reply to one we originated X-Spamd-Result: default: False [-4.00 / 15.00]; REPLY(-4.00)[]; ASN(0.00)[asn:15169, ipnet:2607:f8b0::/32, country:US] X-Rspamd-Queue-Id: 4SKZBr0clyz3XCB On 10/31/23, John F Carr wrote: > > >> On Oct 31, 2023, at 10:16, Mateusz Guzik wrote: >> >> On 10/31/23, John F Carr wrote: >>> During a poudriere build control-T showed an awk process was stuck >>> for over 6 minutes trying to acquire a lock in vnode_init. >>> >>> load: 8.58 cmd: awk 51238 [running] 392.88r 0.00u 392.88s 100% 2604k >>> __mtx_lock_sleep+0xf8 __mtx_lock_flags+0xa4 vnode_init+0xc3 >>> keg_alloc_slab+0x277 zone_import+0x143 cache_alloc+0x3ed >>> cache_alloc_retry+0x23 getnewvnode_reserve+0x20 zfs_zget+0x1f >>> zfs_dirent_lookup+0x16d zfs_dirlook+0x7f zfs_lookup+0x3e0 >>> zfs_freebsd_cachedlookup+0x74 vfs_cache_lookup+0xa7 >>> cache_fplookup_noentry+0x241 cache_fplookup+0x575 namei+0x1ea >>> vn_open_cred+0x48d >>> >>> The stack trace was the same for several minutes. System CPU time was >>> increasing. >>> >>> Address vnode_init+0xc3 corresponds to the mtx_lock call here. >>> >>> vp->v_holdcnt = VHOLD_NO_SMR; >>> vp->v_type = VNON; >>> mtx_lock(&vnode_list_mtx); >>> TAILQ_INSERT_BEFORE(vnode_list_free_marker, vp, v_vnodelist); >>> mtx_unlock(&vnode_list_mtx); >>> return (0); >>> >>> Address __mtx_lock_sleep+0xf8 is the instruction after a call to >>> lock_delay. >>> >>> ps says the command line was >>> >>> /usr/bin/awk -f /usr/bin/awk >>> old-default/2023-10-31_08h21m03s/.poudriere.builders >>> old-default/2023-10-31_08h21m03s/.poudriere.buildname ... >>> >>> with the full list of input files exceeding the ~2KB command line length >>> limit of ps. >>> "/usr/bin/awk" is probably not the real second argument. It would cause >>> an >>> immediate syntax error. >>> >>> The hang resolved within a few more minutes and poudriere is continuing >>> happily. >>> >>> I have never seen such behavior before. Code in vfs_subr.c tries not to >>> hold the >>> vnode_list_mtx lock for too long. >>> >>> Any thoughts? >>> >>> >>> FreeBSD 13.2-STABLE up through commit b180f0040f95, 24 core 48 thread >>> Zen >>> 2, >>> zfs pool on SSD connected via 12 Gbps SAS with cache on NVME, 160 GB >>> RAM. >>> >> >> what does "sysctl vfs.vnode" say >> >> >> -- >> Mateusz Guzik > > With the builds now complete, > > vfs.vnode.vnlru.uma_reclaim_calls: 0 > vfs.vnode.vnlru.kicks: 0 > vfs.vnode.vnlru.max_free_per_call: 10000 > vfs.vnode.vnlru.failed_runs: 0 > vfs.vnode.vnlru.direct_recycles_free: 152871194 > vfs.vnode.vnlru.recycles_free: 77190881 > vfs.vnode.vnlru.recycles: 0 > vfs.vnode.stats.alloc_sleeps: 0 > vfs.vnode.stats.free: 1269702 > vfs.vnode.stats.skipped_requeues: 9375158 > vfs.vnode.stats.created: 235594976 > vfs.vnode.stats.count: 1407640 > vfs.vnode.param.wantfree: 681059 > vfs.vnode.param.limit: 2724237 > > Uptime is 11 days, mostly with low I/O rates. > > please add the uma stats: "sysctl vm.uma.VNODE" -- Mateusz Guzik