From nobody Tue Oct 31 15:25:27 2023 X-Original-To: freebsd-fs@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4SKYrm3BYpz4yVyP for ; Tue, 31 Oct 2023 15:26:32 +0000 (UTC) (envelope-from jfc@mit.edu) Received: from outgoing-exchange-5.mit.edu (outgoing-exchange-5.mit.edu [18.9.28.59]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "*.outgoing-exchange.mit.edu", Issuer "InCommon RSA Server CA" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4SKYrm1MrZz3W2p for ; Tue, 31 Oct 2023 15:26:32 +0000 (UTC) (envelope-from jfc@mit.edu) Authentication-Results: mx1.freebsd.org; none Received: from w92exedge4.exchange.mit.edu (W92EXEDGE4.EXCHANGE.MIT.EDU [18.7.73.16]) by outgoing-exchange-5.mit.edu (8.14.7/8.12.4) with ESMTP id 39VFQS2k012067; Tue, 31 Oct 2023 11:26:29 -0400 Received: from w92exhyb3.exchange.mit.edu (18.7.71.73) by w92exedge4.exchange.mit.edu (18.7.73.16) with Microsoft SMTP Server (TLS) id 15.0.1497.48; Tue, 31 Oct 2023 11:24:59 -0400 Received: from oc11exhyb6.exchange.mit.edu (18.9.1.111) by w92exhyb3.exchange.mit.edu (18.7.71.73) with Microsoft SMTP Server (TLS) id 15.0.1497.48; Tue, 31 Oct 2023 11:25:29 -0400 Received: from NAM12-BN8-obe.outbound.protection.outlook.com (104.47.55.168) by oc11exhyb6.exchange.mit.edu (18.9.1.111) with Microsoft SMTP Server (TLS) id 15.0.1497.48 via Frontend Transport; Tue, 31 Oct 2023 11:25:29 -0400 Received: from SA3PR01MB8450.prod.exchangelabs.com (2603:10b6:806:382::17) by BN0PR01MB6831.prod.exchangelabs.com (2603:10b6:408:14b::14) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6907.31; Tue, 31 Oct 2023 15:25:28 +0000 Received: from SA3PR01MB8450.prod.exchangelabs.com ([fe80::2e94:6383:c13:23b1]) by SA3PR01MB8450.prod.exchangelabs.com ([fe80::2e94:6383:c13:23b1%4]) with mapi id 15.20.6907.032; Tue, 31 Oct 2023 15:25:27 +0000 From: "John F Carr" To: Mateusz Guzik CC: Freebsd fs Subject: Re: vnode_init took over 6 minutes Thread-Topic: vnode_init took over 6 minutes Thread-Index: AQHaDAIzJL6ZDeRtpkqkNz6JuK6CTLBj8YcAgAATW4A= Date: Tue, 31 Oct 2023 15:25:27 +0000 Message-ID: <8F17FEC3-C3CB-410D-ACEC-ADD12B3A747A@mit.edu> References: <67FBC292-E83D-4171-BE79-D28D5578F031@mit.edu> In-Reply-To: Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-ms-publictraffictype: Email x-ms-traffictypediagnostic: SA3PR01MB8450:EE_|BN0PR01MB6831:EE_ x-ms-office365-filtering-correlation-id: 5043ea77-ddb7-42b7-69ee-08dbda259c67 x-ms-exchange-senderadcheck: 1 x-ms-exchange-antispam-relay: 0 x-microsoft-antispam: BCL:0; x-microsoft-antispam-message-info: P4evgtrnDYXO4+K0ETSvIozTZssnvOyOdXdqWcCuMJsGgJtpHMHr6aXraGrojsWU0OqohbJ0vBAL7wbKLjw3YdJIhUqmKKIdWRACnCWrFseGeWHp6gGSBdV1CcqTMo8AoGX6lLr2SzKNF2HTZLXPzmCJ7SL3QCtuQAhwXSsGk0bTvD5406Ad7FyrlM9zQQRGLyQ99RlhQicY3fLA90qKCWb4/cK3rplnxdFUxtM6Xeh9MAQ7Jd/qxlUhSymnHSmrd+yqv0GSg8CD5eULbOF7Ngo+WFFYAd2eo/VFMlG9kpJNxSOGWK7sRyHmvMU3sBMNsl3rjjvybu0ObcEflGXzt12N8WJrm4PT0dWkx1KDxxUOmzm1SCWJs6+IS8GOwsWICTyqjJcqNTeqfyjGd0yJv7PkuSkLr5w6+N/VTWbT+Z1fiAEM9u7gxNAnGM1sW0LpeeAfWSqfS5noHTZ1NeXC24f7J12EMQ/cBaRj2FdsYQ6y/oSMFs3su/umJ8ei+15oUrYuQRrkUeZPh5iEWyWteOcSBsmPk0QfB7wRaY7zHNeUxzRoHQ3tfqbHif7xKdjPqgUO6JmsuEPaRbGNXuykZVzyHmEDFApmQlR0r+k9savLV7ZxaYhPuwsWi34gzt/I x-forefront-antispam-report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:SA3PR01MB8450.prod.exchangelabs.com;PTR:;CAT:NONE;SFS:(13230031)(376002)(39860400002)(366004)(346002)(136003)(396003)(230922051799003)(451199024)(1800799009)(64100799003)(186009)(38070700009)(38100700002)(2906002)(36756003)(4326008)(33656002)(66946007)(8676002)(8936002)(66556008)(75432002)(5660300002)(122000001)(76116006)(478600001)(71200400001)(6916009)(316002)(66446008)(64756008)(83380400001)(6506007)(91956017)(66476007)(786003)(86362001)(2616005)(53546011)(41300700001)(6486002)(6512007);DIR:OUT;SFP:1102; x-ms-exchange-antispam-messagedata-chunkcount: 1 x-ms-exchange-antispam-messagedata-0: =?us-ascii?Q?qNrGwNwEUIgCkIDNMdb/eRaBnnyr+/+k64N4ALQRk6VWEBeMyojI1Ihtr1RW?= =?us-ascii?Q?sxyGu+Wlq/fy8Hj1Aar5RBEM9h9o3Mo0c/GcE5C0VKIfcnhPz/fzdkRcT9PQ?= =?us-ascii?Q?a/k+sKpp0ly0Av8w+miQEQbGBpGhFV/pRQu5JW7Bn9VpLeXko82YpI2KjPCG?= =?us-ascii?Q?MCyW9je78ef/2NwQ5OecplaFpwMqMoiFRGcFJJP9Z18XwB2XfEDXDnudXnKm?= =?us-ascii?Q?lldcYTXijTEktMykHtY4mGmMDXQu7T84P1c8a0s21yxnx4DuwE6LgcenisKB?= =?us-ascii?Q?+fwLrrBPF5BddOEaXduTI5URhfziEuDqtJLvQCA/2r7ti5XQqqijEpJFImh+?= =?us-ascii?Q?tQi9+WuX6uZfPReZoG03OEOZeJwoh2M15UjHgFv9l43u+8dxsfzruTiUpIFl?= =?us-ascii?Q?x6yTy+j/snxzYAZELTF9Xg7DxH9/NNC0+r3exlwYsKjCi40LqxooHGPOfcOm?= =?us-ascii?Q?TJUWyMYmb1NJMlOYBlpSa0TqQNbppq+ZBp69GZ+cgd0iv2kXckqg9ScMeoZc?= =?us-ascii?Q?9c8upuScAZSjpno7tCoMgoDbUWRJ0Jndxf8rZn5JtLtfg3l7+KgkSyO71Mvx?= =?us-ascii?Q?Df4i22WojVCVBScoYuKDdBPaYM9bHIff8Wf+8hMnporRQeR9KiAnp9PUgXpy?= =?us-ascii?Q?nNZQ/0HB2PkoMKaKW1AX+XH4CUz/HNy8A7yxnDSw7ISSzRNxVVJOYpyJpbSb?= =?us-ascii?Q?5Zpy80ELB0suUeWv79D2RhLqRF0moipAgpdoGeJu6bvKvrtaAkr7HNuIOZRr?= =?us-ascii?Q?AKVPpPT87W6blSGK1Im2isi0LKj2LEWQ69iXTQ2MntZLp3aywuF/jJuRx1hZ?= =?us-ascii?Q?Ph/2bzE8bVHmRNMenV7/BYrkI07rS1qOd9muPQljUT6jsCzo+Rm2fRv9hguA?= =?us-ascii?Q?+2hHvtLD9XL71RucAdwBXxTcPEGQ7u58JJWO/+/jx6yx4b/c+6sXavYJnWii?= =?us-ascii?Q?CqypkTW3pSizODxTTY/JyP6stvx1JjKORSSQ2MSaPLLHOQtO9v5eY2bhjSqc?= =?us-ascii?Q?NUynaXnJRP7WqTw3yWv4ql2Q5u4Y+kbcNoC7EitT8pw1e5bPMwR3CvIWdrXh?= =?us-ascii?Q?FuM4IcBYevLRbhsMMqvqH0SvabilGPGxwSUN+N0H6DUf7AwJsP1DtZN7iYa2?= =?us-ascii?Q?7HXlf5ShSi9/1yCg6KLNVWez2H2A3hezC3xdsN/iSED1cWu/1Ty+lzQhRA3G?= =?us-ascii?Q?RlGa0v9YGi0k1/0aQdGgTfXtyL5ej9pvjrtAZVJYox5RamaD0++c0h2cRHXg?= =?us-ascii?Q?I3ApS6YZ//uvtxGNkOGJMbpSzJ1luA6jvG7wa6QyxNtmHahngi8Uh1JPkO1L?= =?us-ascii?Q?o0P61M7HbNP+VUaofFmwEvvFV4yFx4YgcKtszO3mAaSFIWTDq7JEnpgu52am?= =?us-ascii?Q?PJQVe/qQ2qVt/lzWdB2siZr5Ep3WIimFdF0DkHETbgK2kPTDo2qfdsfu2A+j?= =?us-ascii?Q?XxZ0iyBFP+HeBp6WHHt2eI1nyApgRtLzDTJDij/Iglb0UvNdNGpLq/Vm4PPe?= =?us-ascii?Q?Fy/gW1DyKd+ZN/blycgq/5R1N1LDGSWXmjvOjjD2tdFtBMVh3spkM5ptR05Q?= =?us-ascii?Q?v1M4iDv2H9W0JqKKFEzv0JszdfstcAz/T1gMEXsPbBVKV4zZDPGP82rXNfQn?= =?us-ascii?Q?h6HM6bghADgPVnvL2UEgzA/KOegr/BbH35TN3ATFiJhv?= Content-Type: text/plain; charset="us-ascii" Content-ID: <843CC50C8A236746B1C81FD30FA2A044@prod.exchangelabs.com> Content-Transfer-Encoding: quoted-printable List-Id: Filesystems List-Archive: https://lists.freebsd.org/archives/freebsd-fs List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-freebsd-fs@freebsd.org MIME-Version: 1.0 X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-AuthSource: SA3PR01MB8450.prod.exchangelabs.com X-MS-Exchange-CrossTenant-Network-Message-Id: 5043ea77-ddb7-42b7-69ee-08dbda259c67 X-MS-Exchange-CrossTenant-originalarrivaltime: 31 Oct 2023 15:25:27.5539 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: 64afd9ba-0ecf-4acf-bc36-935f6235ba8b X-MS-Exchange-CrossTenant-mailboxtype: HOSTED X-MS-Exchange-CrossTenant-userprincipalname: EVmHVbnKhtHaZ7tXfi0WiO33qLEujsRDGjPI9g3z34ItYj7brkEaYvtyRtXVWcCj X-MS-Exchange-Transport-CrossTenantHeadersStamped: BN0PR01MB6831 X-OriginatorOrg: mit.edu X-Spamd-Bar: ---- X-Rspamd-Pre-Result: action=no action; module=replies; Message is reply to one we originated X-Spamd-Result: default: False [-4.00 / 15.00]; REPLY(-4.00)[]; ASN(0.00)[asn:3, ipnet:18.9.0.0/16, country:US] X-Rspamd-Queue-Id: 4SKYrm1MrZz3W2p > On Oct 31, 2023, at 10:16, Mateusz Guzik wrote: >=20 > On 10/31/23, John F Carr wrote: >> During a poudriere build control-T showed an awk process was stuck >> for over 6 minutes trying to acquire a lock in vnode_init. >>=20 >> load: 8.58 cmd: awk 51238 [running] 392.88r 0.00u 392.88s 100% 2604k >> __mtx_lock_sleep+0xf8 __mtx_lock_flags+0xa4 vnode_init+0xc3 >> keg_alloc_slab+0x277 zone_import+0x143 cache_alloc+0x3ed >> cache_alloc_retry+0x23 getnewvnode_reserve+0x20 zfs_zget+0x1f >> zfs_dirent_lookup+0x16d zfs_dirlook+0x7f zfs_lookup+0x3e0 >> zfs_freebsd_cachedlookup+0x74 vfs_cache_lookup+0xa7 >> cache_fplookup_noentry+0x241 cache_fplookup+0x575 namei+0x1ea >> vn_open_cred+0x48d >>=20 >> The stack trace was the same for several minutes. System CPU time was >> increasing. >>=20 >> Address vnode_init+0xc3 corresponds to the mtx_lock call here. >>=20 >> vp->v_holdcnt =3D VHOLD_NO_SMR; >> vp->v_type =3D VNON; >> mtx_lock(&vnode_list_mtx); >> TAILQ_INSERT_BEFORE(vnode_list_free_marker, vp, v_vnodelist); >> mtx_unlock(&vnode_list_mtx); >> return (0); >>=20 >> Address __mtx_lock_sleep+0xf8 is the instruction after a call to >> lock_delay. >>=20 >> ps says the command line was >>=20 >> /usr/bin/awk -f /usr/bin/awk >> old-default/2023-10-31_08h21m03s/.poudriere.builders >> old-default/2023-10-31_08h21m03s/.poudriere.buildname ... >>=20 >> with the full list of input files exceeding the ~2KB command line length >> limit of ps. >> "/usr/bin/awk" is probably not the real second argument. It would cause= an >> immediate syntax error. >>=20 >> The hang resolved within a few more minutes and poudriere is continuing >> happily. >>=20 >> I have never seen such behavior before. Code in vfs_subr.c tries not to >> hold the >> vnode_list_mtx lock for too long. >>=20 >> Any thoughts? >>=20 >>=20 >> FreeBSD 13.2-STABLE up through commit b180f0040f95, 24 core 48 thread Ze= n >> 2, >> zfs pool on SSD connected via 12 Gbps SAS with cache on NVME, 160 GB RAM= . >>=20 >=20 > what does "sysctl vfs.vnode" say >=20 >=20 > --=20 > Mateusz Guzik With the builds now complete, vfs.vnode.vnlru.uma_reclaim_calls: 0 vfs.vnode.vnlru.kicks: 0 vfs.vnode.vnlru.max_free_per_call: 10000 vfs.vnode.vnlru.failed_runs: 0 vfs.vnode.vnlru.direct_recycles_free: 152871194 vfs.vnode.vnlru.recycles_free: 77190881 vfs.vnode.vnlru.recycles: 0 vfs.vnode.stats.alloc_sleeps: 0 vfs.vnode.stats.free: 1269702 vfs.vnode.stats.skipped_requeues: 9375158 vfs.vnode.stats.created: 235594976 vfs.vnode.stats.count: 1407640 vfs.vnode.param.wantfree: 681059 vfs.vnode.param.limit: 2724237 Uptime is 11 days, mostly with low I/O rates.