`sysctl vm.pmap.kernel_maps' spins on 12.2-RELEASE-p3 w/ nvdimm.ko
Pokala, Ravi
rpokala at panasas.com
Thu Feb 25 02:06:09 UTC 2021
-----Original Message-----
From: Alan Somers <asomers at freebsd.org>
Date: 2021-02-24, Wednesday at 16:27
To: Konstantin Belousov <kostikbel at gmail.com>
Cc: Ravi Pokala <rpokala at freebsd.org>, "freebsd-hackers at freebsd.org" <freebsd-hackers at freebsd.org>
Subject: Re: `sysctl vm.pmap.kernel_maps' spins on 12.2-RELEASE-p3 w/ nvdimm.ko
On Wed, Feb 24, 2021 at 5:26 PM Konstantin Belousov <kostikbel at gmail.com> wrote:
On Wed, Feb 24, 2021 at 04:55:46PM -0700, Alan Somers wrote:
> On Wed, Feb 24, 2021 at 4:49 PM Konstantin Belousov <kostikbel at gmail.com>
> wrote:
>
> > On Wed, Feb 24, 2021 at 03:37:12PM -0800, Ravi Pokala wrote:
> > > Hi folks,
> > >
> > > A colleague and I both independently observed `sysctl -a' appear to hang
> > on nodes running FreeBSD 12.2-RELEASE-p3; it didn't emit any output, and ^C
> > didn't kill it. We could still establish a new terminal session to the
> > node, via SSH or serial console, and we were able to see that it was
> > actually spinning, not hung, and was consuming an entire CPU.
> > >
> > > We eventually determined that it was specifically `sysctl
> > vm.pmap.kernel_maps' which was spinning, and subsequently that it only
> > spun if nvdimm.ko was loaded. It was not necessary to access the device
> > node associated with the NVDIMM; merely having the module loaded was
> > sufficient.
> > >
> > > I know nvdimm(4) isn't terribly widely used, but hopefully someone who
> > uses it can at least confirm my findings on this. Help in debugging would
> > be even more appreciated.
> > >
> >
> > How large are your NVDIMMs? Their SPAs are mapped into KVA fully and this
> > could be quite large. It could be busy dumping page tables.
On these nodes, 16GB.
> > Try to skip large map in pmap.c:sysctl_kmaps() (just increment i over it).
Thanks! This worked for me:
| case LMSPML4I:
| - sbuf_printf(sb, "\nLarge map:\n");
| - break;
| + sbuf_printf(sb, "\nLarge map: SKIPPING\n");
| + continue;
| }
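For anyone reading along, the reason a one-word change is enough: inside a switch nested in a for loop, `break` only leaves the switch, so the walk of that slot still happens, while `continue` jumps to the next loop iteration and skips the walk. Below is a minimal userland sketch of that control flow. It is not the pmap.c code; the slot numbers, `walk_slots()` and the `walked[]` array are hypothetical stand-ins.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical slot numbers; the real PML4 indices in pmap.c differ. */
enum { NSLOTS = 8, DIRECT_SLOT = 2, LARGE_SLOT = 5 };

/*
 * Visit every top-level slot and return how many of them had their
 * (stand-in) page tables walked. walked[i] records the result per slot.
 */
static int
walk_slots(int skip_large, int walked[NSLOTS])
{
	int i, count = 0;

	assert(walked != NULL);
	for (i = 0; i < NSLOTS; i++) {
		walked[i] = 0;
		switch (i) {
		case DIRECT_SLOT:
			printf("\nDirect map:\n");
			break;		/* leaves the switch only */
		case LARGE_SLOT:
			if (skip_large) {
				printf("\nLarge map: SKIPPING\n");
				continue;	/* next iteration of the for loop */
			}
			printf("\nLarge map:\n");
			break;
		}
		/* Stand-in for the expensive page-table descent. */
		walked[i] = 1;
		count++;
	}
	return (count);
}
```

Calling `walk_slots(1, walked)` leaves `walked[LARGE_SLOT]` at 0 while every other slot is still visited, which is the behaviour the patch above relies on.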
> Speaking of vm.pmap.kernel_maps, that thing is huge. It easily dwarfs all
> other sysctls combined, and tends to grow with time. Would it be possible
> to hide it from sysctl -a's output? I think there are other sysctls like
> that, that are treated as opaque binary values. I once had to fix a bug in
> py37-salt that caused a common operation to take _6_hours_ as opposed to <
> 1 minute because of a huge vm.pmap.kernel_maps value, coupled with some
> O(n^2) string processing.
With nvdimm.ko unloaded on these nodes, it's around 3000 lines! O_O
It occurred to me that it might eventually finish, so I started it last night, and it hadn't completed overnight.
jhb already marked this sysctl as CTLFLAG_SKIP several months ago.
The change was not merged back to 12.
Nice!
Found it: r368768 (1dce7d9e7eef)
That's probably a cleaner fix than the patch above; I'll see about applying that in my tree.
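As I understand CTLFLAG_SKIP, a flagged node is passed over when the tree is listed (as `sysctl -a' does) but can still be read when asked for by name. Here is a rough userland model of that behaviour. It is only a sketch: the struct, the flag value, the table contents and both function names are made up for illustration, and none of it is the kernel's sysctl code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_FLAG_SKIP	0x1	/* hypothetical stand-in for CTLFLAG_SKIP */

struct model_oid {
	const char *name;
	unsigned flags;
};

/* Hypothetical table of nodes; only the names are borrowed from real OIDs. */
static const struct model_oid model_tree[] = {
	{ "kern.ostype", 0 },
	{ "vm.pmap.kernel_maps", MODEL_FLAG_SKIP },
	{ "hw.ncpu", 0 },
};
#define MODEL_NOIDS	(sizeof(model_tree) / sizeof(model_tree[0]))

/* Listing analogue: collect node names, passing over skip-flagged ones. */
static size_t
model_list_all(const char *visited[], size_t max)
{
	size_t i, n = 0;

	assert(visited != NULL);
	for (i = 0; i < MODEL_NOIDS && n < max; i++) {
		if (model_tree[i].flags & MODEL_FLAG_SKIP)
			continue;
		visited[n++] = model_tree[i].name;
	}
	return (n);
}

/* Lookup-by-name analogue: an explicit request ignores the skip flag. */
static const struct model_oid *
model_lookup(const char *name)
{
	size_t i;

	for (i = 0; i < MODEL_NOIDS; i++)
		if (strcmp(model_tree[i].name, name) == 0)
			return (&model_tree[i]);
	return (NULL);
}
```

In this model the listing returns only the two unflagged names, while `model_lookup("vm.pmap.kernel_maps")` still finds the node. Note that this only hides the expensive handler from a full listing; an explicit request would still run it.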
Thanks, all!
-Ravi (rpokala@)
Ahh, good.