Re: git: ef6fcc5e2b07 - main - nfsd: Add VNET_SYSUNINIT() macros for vnet cleanup

From: Rick Macklem <rick.macklem_at_gmail.com>
Date: Thu, 23 Feb 2023 13:56:06 UTC
On Wed, Feb 22, 2023 at 10:28 PM Gleb Smirnoff <glebius@freebsd.org> wrote:
>
>   Rick,
>
> On Tue, Feb 21, 2023 at 07:29:30PM -0800, Rick Macklem wrote:
> R> I did that for one of the ones related to nfsd (sys/fs/nfsserver/nfs_fha_new.c).
>
> This one actually doesn't look correct to me. What happens here is that the sysctls
> will affect only the default VNET.
>
Yes, but the sysctls are mostly useless anyhow. I don't know how to make them
work in a prison. (I know how to use SYSCTL_FLAG_VNET, but that does not
work in this case.)

> I think the VNET-itezation of this file went a bit wrong. You don't need to convert
> static structures to malloced when you VNET-ize a module. The infrastructure should
> take care of memory management.
>
Not if you want to keep the vnet footprint small. This was necessary
so that nfsd.ko
(and friends) will load dynamically. Without the conversion to
mallocs, it would complain
the vnet was out of memory when nfsd.ko tried to load.
(I'm sure I didn't need to do all of them, but it made sense to keep
the vnet footprint
as small as possible.)

> The dynamic sysctl context seems to be unneeded from the very beginning. It was
> always attached to the static softc. Suggested patch:
>
> https://reviews.freebsd.org/D38742
>
I'll look at it, but if it stops malloc'ng softc, then I will be
worried w.r.t. vnet footprint and
dynamic loading. (Note that the structure has an array of hash list
pointers in it, so it
is rather large.

Another reason (along with dynamic loading of the modules) for keeping
the foot print
small is to try and keep the jails that do not use nfsd(8) and friends
lightweight.
When I originally coded it, I put it under a kernel option called
VNET_NFSD (still in
kern_jail.c at this time), but others felt that a new kernel option
wasn't desirable,
(There is a lot of discussion under D37519. The downside of discussions that
happen in phabricator is that not as many people see them. I started an email
thread on freebsd-current@, but it quickly migrated to D37519. Just
the way things
currently happen.)

> It gets rid of one case of IS_DEFAULT_VNET.
>
> Tested very little: run two jails and observed difference in sysctl.
>
> R> It so happens it doesn't work for this case.  If you look at the patch, you'll
> R> see that the variable is vnet'd. It happens that it is only malloc'd for prisons
> R> other than prison0. (For prison0, it points to a global structure that is not
> R> malloc'd and is shared with the NFS client, which is not vnet'd.
>
> I think all of this instances can be fixed.
>
> For example nfsstatsv1, which is now malloced for non-default vnets and static for
> the default. Lots of modules still incorrectly update the global one.
>
Nope. "struct nfsstatsv1" is a structure that is shared with the NFS
client, which is
not vnet'd. As such, a static "struct nfsstatsv1" needs to exist for prison0.

My original solution was to create a separate structure for the server
side stuff
(vnet it), but then the result was a messy copying exercise for the system call,
which returns the entire structure (with client and server info).

Once the client side is vnet'd (I've already had a couple of emails asking me
to do it, so I plan on starting to work on it) then, yes, the
structure can become
a malloc'd vnet'd one and the IS_DEFAULT_VNET can go away.

I am not sure why you think IS_DEFAULT_VNET() is such a big deal in
the initialization code? It work. I can see an argument for using both
SYSINIT() and VNET_SYSINIT() for cases where you have both non-vnet;d
and vnet'd data to initialize. However, I don't see any problem with using
IS_DEFAULT_VNET() when needed.

> This patch makes it right:
>
> https://reviews.freebsd.org/D38743
>
> Tested very little: run two jails, use some NFS, observe difference in nfsstat.
>
> Removes another IS_DEFAULT_VNET.
>
> Note that D38743 is still far from perfect. NFS would benefit a lot if used macros
> for per-VNET per-CPU (raceless!) stats, that are used in TCP/IP. Sorry, they aren't
> documented :( You can find them in vnet.h starting from VNET_PCPUSTAT_DECLARE()
> and look for example usage in TCP/IP.
>
Ok, I was not aware of these. Since the things that are accessed
frequently by threads
are malloc'd (various lists off of hash tables that are also malloc'd
so that they can be
sized by tunables), I don't think they will be useful. The data is
global to all nfsd threads
and not thread specific. W.r.t. stats, it is one structure used for
all threads (client and server)
and is only updated once per RPC, so I don't see how these are useful
there either.

But I will take a look.

> Please fill free to commandeer those reviews from me and finalize them if you
> agree with general idea.
>
> --
> Gleb Smirnoff