From: Rick Macklem
Date: Sun, 26 Jan 2025 13:44:51 -0800
Subject: Re: HEADS UP: NFS changes coming into CURRENT early February
To: Gleb Smirnoff
Cc: current@freebsd.org, rmacklem@freebsd.org
List-Id: Discussions about the use of FreeBSD-current
List-Archive: https://lists.freebsd.org/archives/freebsd-current

On Tue, Jan 21, 2025 at 10:27 PM Gleb Smirnoff wrote:
>
> Hi,
>
> TLDR version:
> users of NFS with Kerberos (e.g. running gssd(8)) as well as users of NFS with
> TLS (e.g. running rpc.tlsclntd(8) or rpc.tlsservd(8)) as well as users of
> network lock manager (e.g. having 'options NFSLOCKD' and running rpcbind(8))
> are affected.  You would need to recompile & reinstall both the world and the
> kernel together.  Of course this is what you'd normally do when you track
> FreeBSD CURRENT, but better be warned.  I will post hashes of the specific
> revisions that break API/ABI when they are pushed.
>
> Longer version:
> last year I tried to check in a new implementation of unix(4) SOCK_STREAM and
> SOCK_SEQPACKET in d80a97def9a1, but was forced to back it out due to several
> kernel side abusers of a unix(4) socket.  The most difficult ones are the NFS
> related RPC services, which act as RPC clients talking to RPC servers in
> userland.  Since it is impossible to fully emulate a userland process
> connection to a unix(4) socket they need to work with the socket internal
> structures, bypassing all the normal KPIs and conventions.  Of course they
> didn't tolerate the new implementation that totally eliminated the
> intermediate buffer on the sending side.
>
> While the original motivation for the upcoming changes is the fact that I want
> to go forward with the new unix/stream and unix/seqpacket, I also tried to make
> kernel to userland RPC better.  You judge if I succeeded or not :)  Here are
> some highlights:
>
> - Code footprint both in kernel clients and in userland daemons is reduced.
>   Example: gssd:    1 file changed, 5 insertions(+), 64 deletions(-)
>            kgssapi: 1 file changed, 26 insertions(+), 78 deletions(-)
>                     4 files changed, 1 insertion(+), 11 deletions(-)
> - You can easily see all RPC calls from kernel to userland with genl(1):
>   # genl monitor rpcnl
> - The new transport is multithreaded in kernel by default, so kernel clients
>   can send a bunch of RPCs without any serialization and if the userland
>   figures out how to parallelize their execution, such parallelization would
>   happen.  Note: new rpc.tlsservd(8) will use threads.
> - One ad-hoc single program syscall is removed - gssd_syscall.  Note:
>   rpctls syscall remains, but I have some ideas on how to improve that, too.
>   Not at this step though.
> - All sleeps of kernel RPC calls are now in a single place, and they all have
>   timeouts.  I believe NFS services are now much more resilient to hangs.
>   A deadlock when NFS kernel thread is blocked on unix socket buffer, and
>   the socket can't go away because its application is blocked in some other
>   syscall is no longer possible.
>
> The code is posted on phabricator, reviews D48547 through D48552.
> Reviewers are very welcome!
>
> I share my branch on Github.  It is usually rebased on today's CURRENT:
>
> https://github.com/glebius/FreeBSD/commits/gss-netlink/
>
> Early testers are very welcome!
Ok, I can now do minimal testing and I crashed it...

I did a mount with option "tls" and then partitioned it from the NFS server
by doing "ifconfig bridge0 down". I waited until the TCP connection closed
and then did "ifconfig bridge0 up".

The crash is a NULL pointer at rpctls_impl.c:255 (in rpctls_connect(), called
from nfscl_renewthread()).

The problem is that you made rpctls_connect_handle a vnet'd variable.
The client side (aka an NFS mount) does not happen inside a jail and
cannot use any vnet'd variables.
Why?
Well, any number of threads enter the NFS client via VOP_xxx() calls etc.
Any one of them might end up doing a TCP reconnect when the underlying
TCP connection is broken and then heals.

I don't know why you made rpctls_connect_handle a vnet'd variable, but
it cannot be that way.
(I once looked at making NFS mounts work inside a vnet prison and gave up
when I realized any old thread ends up in the code and it would have taken
many, many CURVNET_SET() calls to make it work.)

In summary, no global variable on the client side can be vnet'd and no
global variable on the server side that is vnet'd can be shared with the
client side code.

I realize you are enthusiastic about this, but I'd suggest you back off to
the minimal changes required to make this stuff work with netlink instead
of unix domain sockets and stick with that, at least for the initial
commit cycle.

One thing to note is that few (if any) people who run main test this stuff.
It may be 1-2 years before it sees third party testing and I can only do
minimal testing until at least April.

Anyhow, thanks for all the good work you are doing with this, rick

>
> --
> Gleb Smirnoff
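
For readers who have not run into the vnet problem described above, here is a
rough user-space model of it in Python. It is only a sketch under loud
assumptions: Vnet, vnet_connect_handle(), curvnet_set, reconnect() and
run_in_thread() are hypothetical names made up for illustration, not FreeBSD
kernel APIs, and a Python exception stands in for the NULL pointer crash. The
point it tries to show is that a per-vnet variable is only reachable from a
thread that has a vnet context set, while a reconnect can be triggered from
any thread.

```python
import threading

# Toy model only: none of these names exist in the FreeBSD kernel.
_ctx = threading.local()  # stands in for each thread's "current vnet"


class Vnet:
    """One network stack instance holding its own copy of a vnet'd variable."""

    def __init__(self, connect_handle):
        self.connect_handle = connect_handle


def vnet_connect_handle():
    """Model of reading a vnet'd variable: it needs a current vnet."""
    vnet = getattr(_ctx, "curvnet", None)
    if vnet is None:
        # Stand-in for the NULL pointer crash reported in the mail.
        raise RuntimeError("no vnet context set on this thread")
    return vnet.connect_handle


class curvnet_set:
    """Context manager modelling a set/restore pair around vnet'd accesses."""

    def __init__(self, vnet):
        self.vnet = vnet

    def __enter__(self):
        self.saved = getattr(_ctx, "curvnet", None)
        _ctx.curvnet = self.vnet

    def __exit__(self, *exc):
        _ctx.curvnet = self.saved


GLOBAL_CONNECT_HANDLE = "global-handle"  # plain global, visible to any thread


def reconnect(use_vnet):
    """Model of a TCP reconnect attempted from an arbitrary thread."""
    try:
        return vnet_connect_handle() if use_vnet else GLOBAL_CONNECT_HANDLE
    except RuntimeError as err:
        return "crash: " + str(err)


def run_in_thread(fn, *args):
    """Run fn in a fresh thread (no vnet context) and return its result."""
    result = []
    t = threading.Thread(target=lambda: result.append(fn(*args)))
    t.start()
    t.join()
    return result[0]
```

In this model, run_in_thread(reconnect, True) reports a crash because the new
thread never set a context, while run_in_thread(reconnect, False) works from
any thread. Making the vnet'd variant work would need a curvnet_set block at
every entry point, which mirrors the "many, many CURVNET_SET() calls" concern.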