From: Mateusz Guzik
Date: Tue, 22 Aug 2023 18:59:36 +0200
Subject: Re: Speed improvements in ZFS
To: Alexander Leidinger
Cc: Konstantin Belousov, current@freebsd.org
List-Id: Discussions about the use of FreeBSD-current
List-Archive: https://lists.freebsd.org/archives/freebsd-current

On 8/22/23, Alexander Leidinger wrote:
> On 2023-08-21 10:53, Konstantin Belousov wrote:
>> On Mon, Aug 21, 2023 at 08:19:28AM
+0200, Alexander Leidinger wrote:
>>> On 2023-08-20 23:17, Konstantin Belousov wrote:
>>> > On Sun, Aug 20, 2023 at 11:07:08PM +0200, Mateusz Guzik wrote:
>>> > > On 8/20/23, Alexander Leidinger wrote:
>>> > > > On 2023-08-20 22:02, Mateusz Guzik wrote:
>>> > > >> On 8/20/23, Alexander Leidinger wrote:
>>> > > >>> On 2023-08-20 19:10, Mateusz Guzik wrote:
>>> > > >>>> On 8/18/23, Alexander Leidinger wrote:
>>> > > >>>
>>> > > >>>>> I have a 51MB text file, compressed to about 1MB. Are you
>>> > > >>>>> interested in getting it?
>>> > > >>>>>
>>> > > >>>>
>>> > > >>>> Your problem is not the vnode limit, but nullfs.
>>> > > >>>>
>>> > > >>>> https://people.freebsd.org/~mjg/netchild-periodic-find.svg
>>> > > >>>
>>> > > >>> 122 nullfs mounts on this system. And every jail I set up has
>>> > > >>> several null mounts. One basesystem mounted into every jail,
>>> > > >>> and then shared ports (packages/distfiles/ccache) across all
>>> > > >>> of them.
>>> > > >>>
>>> > > >>>> First, some of the contention is the notorious VI_LOCK, which
>>> > > >>>> is needed in order to do anything.
>>> > > >>>>
>>> > > >>>> But more importantly the mind-boggling off-cpu time comes from
>>> > > >>>> exclusive locking which should not be there to begin with -- as
>>> > > >>>> in that xlock in stat should be a slock.
>>> > > >>>>
>>> > > >>>> Maybe I'm going to look into it later.
>>> > > >>>
>>> > > >>> That would be fantastic.
>>> > > >>>
>>> > > >>
>>> > > >> I did a quick test, things are shared locked as expected.
>>> > > >>
>>> > > >> However, I found the following:
>>> > > >>         if ((xmp->nullm_flags & NULLM_CACHE) != 0) {
>>> > > >>                 mp->mnt_kern_flag |=
>>> > > >>                     lowerrootvp->v_mount->mnt_kern_flag &
>>> > > >>                     (MNTK_SHARED_WRITES | MNTK_LOOKUP_SHARED |
>>> > > >>                     MNTK_EXTENDED_SHARED);
>>> > > >>         }
>>> > > >>
>>> > > >> Are you using the "nocache" option? It has a side effect of
>>> > > >> xlocking.
>>> > > >
>>> > > > I use noatime, noexec, nosuid, nfsv4acls. I do NOT use nocache.
>>> > > >
>>> > >
>>> > > If you don't have "nocache" on null mounts, then I don't see how
>>> > > this could happen.
>>> >
>>> > There is also MNTK_NULL_NOCACHE on lower fs, which is currently set
>>> > for fuse and nfs at least.
>>>
>>> 11 of those 122 nullfs mounts are ZFS datasets which are also NFS
>>> exported. 6 of those nullfs mounts are also exported via Samba. The
>>> NFS exports shouldn't be needed anymore, I will remove them.
>> By nfs I meant nfs client, not nfs exports.
>
> No NFS client mounts anywhere on this system. So where is this exclusive
> lock coming from then...
> This is a ZFS system. 2 pools: one for the root, one for anything I need
> space for. Both pools reside on the same disks. The root pool is a 3-way
> mirror, the "space-pool" is a 5-disk raidz2. All jails are on the
> space-pool. The jails are all basejail-style jails.
>

While I don't see why xlocking happens, you should be able to dtrace or
printf your way into finding out.

-- 
Mateusz Guzik
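
For reference, the flag inheritance discussed in the thread can be modeled in userland. This is only a sketch under stated assumptions, not the kernel code: the bit values are placeholders, the `model_*` helpers are hypothetical names, and the rule that "nocache" or MNTK_NULL_NOCACHE on the lower filesystem turns off NULLM_CACHE is inferred from the messages above. Only the inner `if` block follows the snippet quoted in the thread.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Placeholder bit values chosen for this sketch only; the real
 * definitions live in the FreeBSD kernel headers and may differ.
 */
#define MNTK_EXTENDED_SHARED	0x01u
#define MNTK_SHARED_WRITES	0x02u
#define MNTK_LOOKUP_SHARED	0x04u
#define MNTK_NULL_NOCACHE	0x08u
#define NULLM_CACHE		0x01u

/*
 * Hypothetical helper (assumption drawn from the thread): caching is on
 * by default and is turned off either by the "nocache" mount option or
 * by MNTK_NULL_NOCACHE being set on the lower filesystem.
 */
static uint32_t
model_nullm_flags(bool nocache_opt, uint32_t lower_kern_flag)
{
	uint32_t flags = NULLM_CACHE;

	if (nocache_opt || (lower_kern_flag & MNTK_NULL_NOCACHE) != 0)
		flags &= ~NULLM_CACHE;
	return (flags);
}

/*
 * Mirrors the snippet quoted in the thread: the shared-locking flags are
 * copied from the lower mount only when NULLM_CACHE is set.
 */
static uint32_t
model_upper_kern_flag(uint32_t nullm_flags, uint32_t lower_kern_flag)
{
	uint32_t upper = 0;

	if ((nullm_flags & NULLM_CACHE) != 0) {
		upper |= lower_kern_flag &
		    (MNTK_SHARED_WRITES | MNTK_LOOKUP_SHARED |
		    MNTK_EXTENDED_SHARED);
	}
	return (upper);
}

/* True when, in this model, the null mount ends up with shared lookups. */
static bool
model_lookup_is_shared(bool nocache_opt, uint32_t lower_kern_flag)
{
	uint32_t nullm = model_nullm_flags(nocache_opt, lower_kern_flag);
	uint32_t upper = model_upper_kern_flag(nullm, lower_kern_flag);

	return ((upper & MNTK_LOOKUP_SHARED) != 0);
}
```

In this model a null mount over a lower filesystem that advertises MNTK_LOOKUP_SHARED keeps shared lookups unless "nocache" is given or the lower filesystem carries MNTK_NULL_NOCACHE, which is why the question about "nocache" and the remark about fuse and nfs matter for the xlock seen in stat.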