From: Rick Macklem
Date: Sun, 29 Oct 2023 13:30:21 -0700
Subject: Re: optimising nfs and nfsd
To: freebsd-fs@freebsd.org
List-Id: Filesystems
List-Archive: https://lists.freebsd.org/archives/freebsd-fs

Oh, and since I only get to test in my little at-home, nowhere near
server-grade environment, I'd appreciate comments from others
related to NFS performance, too.

rick

On Sun, Oct 29, 2023 at 1:28 PM Rick Macklem wrote:
>
> On Sun, Oct 29, 2023 at 4:41 AM void wrote:
> >
> > Hello list,
> >
> > The nfs instructions in the handbook are rather terse.
> >
> > I know there's been lots of new development with nfs.
> > The zfs property of sharenfs, version4, KTLS etc.
> > The "readahead" property for clients.
> >
> > Would anyone here please point me to up-to-date
> > resources? My context is nfs server exporting
> > via sharenfs with freebsd14, -current and debian-based
> > linux clients on a gigabit LAN.
> I have a few primitive docs, but they do not cover what you
> are interested in (all found at https://people.freebsd.org/~rmacklem):
> nfs-krb5-setup.txt
> nfs-over-tls-setup.txt
> nfsd-vnet-prison-setup.txt
> pnfs-planb-setup.txt
>
> However, here are a few comments that might be useful...
> - If you do "nfsstat -m" on the client(s), you will find out what
>   they are using. If the mounts are NFSv4.0, consider switching
>   to NFSv4.1/4.2. (I consider NFSv4.0 a deprecated protocol.
>   NFSv4.1/4.2 does assorted things better, including something
>   called "sessions" which replaces use of the DRC.)
>   If the mounts are NFSv3 and work ok for you, that's fine. NFSv3
>   will actually perform better than NFSv4, but will lack things like
>   good byte range locking support.
> - If you do something like:
>   dd if=/nfsmount/bigfile of=/dev/null bs=1M
>   and get wire speed (100+ Mbytes/sec for 1Gbps), then
>   there probably is not much more you can do performance wise.
>   Mount options like "readahead/rsize/wsize/nconnect" can improve
>   performance if the mount is not already running at close to wire speed.
>   (They are all in the "try it and see if it helps/your mileage may vary"
>   category.)
>   The Linux client folk do try and make defaults work well.
>
> Interrupt moderation...
> - Most NICs do not generate an interrupt for every packet sent/received
>   to avoid an interrupt flood. Unfortunately, this can delay RPC message
>   handling and have a negative impact on NFS performance, since it
>   primarily depends on RPC RTT and not bandwidth. Most NIC drivers
>   do have tunable(s) for this. Again, your mileage may vary.
>
> If you are using NFSv3 or NFSv4.0 mounts, performance can be
> improved by tuning or disabling the duplicate request cache (DRC).
> The DRC is an oddball, in that it improves correctness (avoiding
> non-idempotent RPCs being performed multiple times), but slows
> performance. For a good LAN, TCP may not need this. (For TCP
> mounts, RPCs are only retried after the RPC layer gives up on a
> TCP connection and creates a new one. With a good LAN, this
> should be a rare occurrence.)
> # sysctl vfs.nfsd.cachetcp=0
> is the extreme tuning case that turns the DRC off for TCP.
> (Again, this is irrelevant for NFSv4.1/4.2 mounts, since they do
> not use the DRC.)
>
> NFS-over-TLS (called RPC-over-TLS by the Linux folk) is
> discussed in one of the primitive docs mentioned above.
> It uses KTLS to run all the NFS traffic within a TLS1.3
> session (not the same as an NFSv4.1/4.2 session).
> This has obvious security advantages, but can result in
> about 1/3rd of a CPU core being used per NFS connection
> for encryption/decryption when the NFS mount is busy.
> I am not sure quite where the Linux client patches are
> at this point. I know they have been testing them, but I
> suspect you need a very recent Linux kernel to get the
> support, so I doubt they are in most distros yet?
>
> In summary, if you are getting near wire speed and you
> are comfortable with your security situation, then there
> isn't much else to do.
>
> rick
>
> >
> >
> > Some workloads are very interactive. Some are not,
> > such as backups. It works, but it maybe can work
> > better.
> >
> > Thanks!
> > --
> >
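
For anyone who wants to script the "are we near wire speed?" check from the dd test above, here is a rough sketch. The helper names (mbytes_per_sec, near_wire_speed) are made up for illustration and are not part of FreeBSD or any NFS tool. The 80% cutoff is an assumption, chosen only because it lines up with the "100+ Mbytes/sec for 1Gbps" rule of thumb, and the byte count and elapsed time are illustrative inputs, not measurements.

```shell
#!/bin/sh
# Hypothetical helpers for judging a dd read test against the link rate.

# mbytes_per_sec BYTES SECONDS
# Prints whole Mbytes/sec (10^6 bytes per second), rounded down.
mbytes_per_sec() {
    echo $(( $1 / $2 / 1000000 ))
}

# near_wire_speed MBYTES_PER_SEC LINK_MBITS
# Succeeds when the measured rate is at least 80% of the raw link rate.
# The 80% cutoff is an assumption, not a documented threshold.
near_wire_speed() {
    [ $(( $1 * 8 * 100 )) -ge $(( $2 * 80 )) ]
}

# Illustrative input only: 1048576000 bytes read in 10 seconds over 1Gbps.
rate=$(mbytes_per_sec 1048576000 10)
if near_wire_speed "$rate" 1000; then
    echo "${rate} Mbytes/sec: close to wire speed, not much left to tune"
else
    echo "${rate} Mbytes/sec: readahead/rsize/wsize/nconnect may be worth a try"
fi
```

To use it, take the byte count and elapsed seconds that dd reports for the read of /nfsmount/bigfile and pass them to mbytes_per_sec in place of the sample values. Read a file larger than the client's RAM (or a freshly mounted one), so the number reflects the network and not the client's cache.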