Re: git: 36f1526a598c - main - Add experimental 16k page support on arm64

From: John Baldwin <jhb_at_FreeBSD.org>
Date: Wed, 10 Aug 2022 22:59:32 UTC
On 7/19/22 2:57 AM, Andrew Turner wrote:
> The branch main has been updated by andrew:
> 
> URL: https://cgit.FreeBSD.org/src/commit/?id=36f1526a598c373ca660910c9772d28a61383c3b
> 
> commit 36f1526a598c373ca660910c9772d28a61383c3b
> Author:     Andrew Turner <andrew@FreeBSD.org>
> AuthorDate: 2022-03-23 17:39:58 +0000
> Commit:     Andrew Turner <andrew@FreeBSD.org>
> CommitDate: 2022-07-19 09:57:03 +0000
> 
>      Add experimental 16k page support on arm64
>      
>      Add initial 16k page support on arm64. It is considered experimental,
>      with no guarantee of compatibility with a userspace or kernel modules
>      built with the current 4k page size, as code will likely try to pass
>      in too small a size when working with APIs that take a multiple of a
>      page, e.g. mmap.
>      
>      As this is experimental, and because userspace and the kernel need to
>      have the PAGE_SIZE macro kept in sync, there is no kernel option to
>      enable this. To test, a new image should be built with the
>      PAGE_{SIZE,SHIFT,MASK} macros changed to the 16k versions.
>      
>      There are currently known issues with loading modules from an old
>      loader as it can misalign them to load on a non-16k boundary.
>      
>      Testing has shown good results in kernel workloads that allocate and
>      free large amounts of memory as only a quarter of the number of calls
>      into the VM subsystem are needed in the best case.
>      
>      Reviewed by:    markj
>      Tested by:      gallatin
>      Sponsored by:   The FreeBSD Foundation
>      Differential Revision: https://reviews.freebsd.org/D34793
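
(As an aside, for anyone wanting to try the 16k configuration described
above: my reading is that it amounts to flipping the defaults in
sys/arm64/include/param.h roughly as follows -- untested, and modulo the
exact shape of that header:

#define	PAGE_SHIFT	PAGE_SHIFT_16K	/* stock builds use PAGE_SHIFT_4K */
#define	PAGE_SIZE	(1 << PAGE_SHIFT)
#define	PAGE_MASK	(PAGE_SIZE - 1)

and then rebuilding world and kernel together so both agree on PAGE_SIZE.)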

While merging this into CheriBSD (which has a somewhat similar but conflicting
set of PV entry changes due to a different set of constants for PV entries),
it seems you did not correctly update the places that check whether a PV entry
chunk is full.  For example, in get_pv_entry:

		if (field < _NPCM) {
			pv = &pc->pc_pventry[field * 64 + bit];
			pc->pc_map[field] &= ~(1ul << bit);
			/* If this was the last item, move it to tail */
=>			if (pc->pc_map[0] == 0 && pc->pc_map[1] == 0 &&
=>			    pc->pc_map[2] == 0) {
				TAILQ_REMOVE(&pmap->pm_pvchunk, pc, pc_list);
				TAILQ_INSERT_TAIL(&pmap->pm_pvchunk, pc,
				    pc_list);
			}
			PV_STAT(atomic_add_long(&pv_entry_count, 1));
			PV_STAT(atomic_subtract_int(&pv_entry_spare, 1));
			return (pv);

That should presumably be comparing the entire pc_map[] to zero?  (There
are two similar checks in pmap_pv_demote_l2 as well, one of them in a KASSERT.)

I would suggest perhaps adding a separate pc_fullmask[] array and
a PC_IS_FULL() wrapper that uses memcmp.
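
Untested, and assuming pc_map stays uint64_t[_NPCM], but roughly:

/* Chunk is full when every bit in pc_map is clear (no free entries left). */
static const uint64_t pc_fullmask[_NPCM];	/* implicitly all zeroes */

#define	PC_IS_FULL(pc)						\
	(memcmp((pc)->pc_map, pc_fullmask, sizeof(pc_fullmask)) == 0)

The check above (and the two in pmap_pv_demote_l2) then become a plain
'if (PC_IS_FULL(pc))' regardless of _NPCM.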

Btw, given that the compiler can inline memcmp() against fixed-size const
arrays, it seems like you could just use the memcmp() version of PC_IS_FREE
unconditionally and avoid the #ifdef there.
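
That is (untested, and assuming pc_freemask[] is defined with the appropriate
per-page-size values in both configurations):

#define	PC_IS_FREE(pc)						\
	(memcmp((pc)->pc_map, pc_freemask, sizeof(pc_freemask)) == 0)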

-- 
John Baldwin