dynamically calculating NKPT [was: Re: huge ktr buffer]
Neel Natu
neelnatu at gmail.com
Tue Feb 5 17:12:43 UTC 2013
Hi Konstantin,
On Tue, Feb 5, 2013 at 7:14 AM, Konstantin Belousov <kostikbel at gmail.com> wrote:
> On Mon, Feb 04, 2013 at 03:05:15PM -0800, Neel Natu wrote:
>> Hi,
>>
>> I have a patch to dynamically calculate NKPT for amd64 kernels. This
>> should fix the various issues that people pointed out in the email
>> thread.
>>
>> Please review and let me know if there are any objections to committing this.
>>
>> Also, thanks to Alan (alc@) for reviewing and providing feedback on
>> the initial version of the patch.
>>
>> Patch (also available at http://people.freebsd.org/~neel/patches/nkpt_diff.txt):
>>
>> Index: sys/amd64/include/pmap.h
>> ===================================================================
>> --- sys/amd64/include/pmap.h (revision 246277)
>> +++ sys/amd64/include/pmap.h (working copy)
>> @@ -113,13 +113,7 @@
>> ((unsigned long)(l2) << PDRSHIFT) | \
>> ((unsigned long)(l1) << PAGE_SHIFT))
>>
>> -/* Initial number of kernel page tables. */
>> -#ifndef NKPT
>> -#define NKPT 32
>> -#endif
>> -
>> #define NKPML4E 1 /* number of kernel PML4 slots */
>> -#define NKPDPE howmany(NKPT, NPDEPG)/* number of kernel PDP slots */
>>
>> #define NUPML4E (NPML4EPG/2) /* number of userland PML4 pages */
>> #define NUPDPE (NUPML4E*NPDPEPG)/* number of userland PDP pages */
>> @@ -181,6 +175,7 @@
>> #define PML4map ((pd_entry_t *)(addr_PML4map))
>> #define PML4pml4e ((pd_entry_t *)(addr_PML4pml4e))
>>
>> +extern int nkpt; /* Initial number of kernel page tables */
>> extern u_int64_t KPDPphys; /* physical address of kernel level 3 */
>> extern u_int64_t KPML4phys; /* physical address of kernel level 4 */
>>
>> Index: sys/amd64/amd64/minidump_machdep.c
>> ===================================================================
>> --- sys/amd64/amd64/minidump_machdep.c (revision 246277)
>> +++ sys/amd64/amd64/minidump_machdep.c (working copy)
>> @@ -232,7 +232,7 @@
>> /* Walk page table pages, set bits in vm_page_dump */
>> pmapsize = 0;
>> pdp = (uint64_t *)PHYS_TO_DMAP(KPDPphys);
>> - for (va = VM_MIN_KERNEL_ADDRESS; va < MAX(KERNBASE + NKPT * NBPDR,
>> + for (va = VM_MIN_KERNEL_ADDRESS; va < MAX(KERNBASE + nkpt * NBPDR,
>> kernel_vm_end); ) {
>> /*
>> * We always write a page, even if it is zero. Each
>> @@ -364,7 +364,7 @@
>> /* Dump kernel page directory pages */
>> bzero(fakepd, sizeof(fakepd));
>> pdp = (uint64_t *)PHYS_TO_DMAP(KPDPphys);
>> - for (va = VM_MIN_KERNEL_ADDRESS; va < MAX(KERNBASE + NKPT * NBPDR,
>> + for (va = VM_MIN_KERNEL_ADDRESS; va < MAX(KERNBASE + nkpt * NBPDR,
>> kernel_vm_end); va += NBPDP) {
>> i = (va >> PDPSHIFT) & ((1ul << NPDPEPGSHIFT) - 1);
>>
>> Index: sys/amd64/amd64/pmap.c
>> ===================================================================
>> --- sys/amd64/amd64/pmap.c (revision 246277)
>> +++ sys/amd64/amd64/pmap.c (working copy)
>> @@ -202,6 +202,10 @@
>> vm_offset_t virtual_avail; /* VA of first avail page (after kernel bss) */
>> vm_offset_t virtual_end; /* VA of last avail page (end of kernel AS) */
>>
>> +int nkpt;
>> +SYSCTL_INT(_machdep, OID_AUTO, nkpt, CTLFLAG_RD, &nkpt, 0,
>> + "Number of kernel page table pages allocated on bootup");
>> +
>> static int ndmpdp;
>> static vm_paddr_t dmaplimit;
>> vm_offset_t kernel_vm_end = VM_MIN_KERNEL_ADDRESS;
>> @@ -495,17 +499,42 @@
>>
>> CTASSERT(powerof2(NDMPML4E));
>>
>> +/* number of kernel PDP slots */
>> +#define NKPDPE(ptpgs) howmany((ptpgs), NPDEPG)
>> +
>> static void
>> +nkpt_init(vm_paddr_t addr)
>> +{
>> + int pt_pages;
>> +
>> +#ifdef NKPT
>> + pt_pages = NKPT;
>> +#else
>> + pt_pages = howmany(addr, 1 << PDRSHIFT);
>> + pt_pages += NKPDPE(pt_pages);
>> +
>> + /*
>> + * Add some slop beyond the bare minimum required for bootstrapping
>> + * the kernel.
>> + *
>> + * This is quite important when allocating KVA for kernel modules.
>> + * The modules are required to be linked in the negative 2GB of
>> + * the address space. If we run out of KVA in this region then
>> + * pmap_growkernel() will need to allocate page table pages to map
>> + * the entire 512GB of KVA space which is an unnecessary tax on
>> + * physical memory.
>> + */
>> + pt_pages += 4; /* 8MB additional slop for kernel modules */
> 8MB might be too low. I just checked one of my machines with a fully
> modularized kernel, and it takes slightly more than 6MB to load 50 modules.
> I think that 16MB would be safer, but it probably needs to be scaled
> down based on the available physical memory. An amd64 kernel can still
> be booted on a 128MB machine.
Sounds fine. I can bump it up to 8 pages.
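For concreteness, here is a rough userland sketch of the arithmetic
nkpt_init() would do with the slop bumped to 8 pages; the 64MB figure for
the first free physical address past the kernel is purely illustrative,
and the constants below are stand-ins for the amd64 pmap definitions:

#include <stdio.h>

/* Stand-ins for the amd64 pmap constants: 2MB PDEs, 512 PDEs per PT page. */
#define PDRSHIFT	21
#define NPDEPG		512
#ifndef howmany
#define howmany(x, y)	(((x) + ((y) - 1)) / (y))
#endif

int
main(void)
{
	/* Illustrative first free physical address past the kernel: 64MB. */
	unsigned long addr = 64UL * 1024 * 1024;
	int pt_pages;

	/* One page table page per 2MB needed to map up to 'addr'. */
	pt_pages = howmany(addr, 1UL << PDRSHIFT);	/* 32 */
	/* Plus the PDP slots needed to map those page table pages. */
	pt_pages += howmany(pt_pages, NPDEPG);		/* + 1 */
	/*
	 * Plus 8 pages of slop: 8 * 2MB = 16MB of KVA for kernel modules,
	 * at a physical cost of only 8 * 4KB = 32KB.
	 */
	pt_pages += 8;

	printf("nkpt = %d page table pages\n", pt_pages);
	return (0);
}

With a 64MB end-of-kernel address that comes to 41 page table pages,
compared with the old static NKPT of 32.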
Also, wrt your comment about scaling this number based on available
memory: each page table page only costs 4KB of physical memory, so I
wonder if it really makes sense to optimize for 16KB of additional
space.
I would much rather work with you and Alan to fix pmap_growkernel() so
we don't need to care about this slack in the first place :-)
best
Neel