Date: Wed, 20 Oct 2021 01:23:22 GMT
Message-Id: <202110200123.19K1NMpV095619@gitrepo.freebsd.org>
To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-main@FreeBSD.org
From: Mark Johnston
Subject: git: 6d3c78d97028 - main - Rewrite the vm_page_alloc manual page
List-Id: Commit messages for the main branch of the src repository
List-Archive: https://lists.freebsd.org/archives/dev-commits-src-main
Sender: owner-dev-commits-src-main@freebsd.org
X-BeenThere: dev-commits-src-main@freebsd.org
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Git-Committer: markj
X-Git-Repository: src
X-Git-Refname: refs/heads/main
X-Git-Reftype: branch
X-Git-Commit: 6d3c78d970283dc5e64eaf10a4ae40b54613d608
Auto-Submitted: auto-generated
X-ThisMailContainsUnwantedMimeParts: N

The branch main has been updated by markj:

URL: https://cgit.FreeBSD.org/src/commit/?id=6d3c78d970283dc5e64eaf10a4ae40b54613d608

commit 6d3c78d970283dc5e64eaf10a4ae40b54613d608
Author:     Mark Johnston
AuthorDate: 2021-10-20 00:26:30 +0000
Commit:     Mark Johnston
CommitDate: 2021-10-20 01:22:56 +0000

    Rewrite the vm_page_alloc manual page

    Document the new allocator variants and flesh out the description of
    some details of the page allocator interface.

    Reviewed by:	kib, alc
    Sponsored by:	The FreeBSD Foundation
    Differential Revision:	https://reviews.freebsd.org/D32035
---
 share/man/man9/Makefile        |  11 ++
 share/man/man9/vm_page_alloc.9 | 339 ++++++++++++++++++++++++++++++++++-------
 2 files changed, 292 insertions(+), 58 deletions(-)

diff --git a/share/man/man9/Makefile b/share/man/man9/Makefile
index fdaa14bc93e4..eb4465259226 100644
--- a/share/man/man9/Makefile
+++ b/share/man/man9/Makefile
@@ -2330,6 +2330,17 @@ MLINKS+=vm_map_stack.9 vm_map_growstack.9
 MLINKS+=vm_map_wire.9 vm_map_wire_mapped.9 \
 	vm_page_wire.9 vm_page_unwire.9 \
 	vm_page_wire.9 vm_page_unwire_noq.9
+MLINKS+=vm_page_alloc.9 vm_page_alloc_after.9 \
+	vm_page_alloc.9 vm_page_alloc_contig.9 \
+	vm_page_alloc.9 vm_page_alloc_contig_domain.9 \
+	vm_page_alloc.9 vm_page_alloc_domain.9 \
+	vm_page_alloc.9 vm_page_alloc_domain_after.9 \
+	vm_page_alloc.9 vm_page_alloc_freelist.9 \
+	vm_page_alloc.9 vm_page_alloc_freelist_domain.9 \
+	vm_page_alloc.9 vm_page_alloc_noobj.9 \
+	vm_page_alloc.9 vm_page_alloc_noobj_contig.9 \
+	vm_page_alloc.9 vm_page_alloc_noobj_contig_domain.9 \
+	vm_page_alloc.9 vm_page_alloc_noobj_domain.9
 MLINKS+=vm_page_bits.9 vm_page_clear_dirty.9 \
 	vm_page_bits.9 vm_page_dirty.9 \
 	vm_page_bits.9 vm_page_is_valid.9 \
diff --git a/share/man/man9/vm_page_alloc.9 b/share/man/man9/vm_page_alloc.9
index aa3854b47aea..1b587339b0cd 100644
--- a/share/man/man9/vm_page_alloc.9
+++ b/share/man/man9/vm_page_alloc.9
@@ -1,5 +1,9 @@
 .\"
 .\" Copyright (C) 2001 Chad David . All rights reserved.
+.\" Copyright (c) 2021 The FreeBSD Foundation
+.\"
+.\" Portions of this documentation were written by Mark Johnston under
+.\" sponsorship from the FreeBSD Foundation.
 .\"
 .\" Redistribution and use in source and binary forms, with or without
 .\" modification, are permitted provided that the following conditions
@@ -26,102 +30,321 @@
 .\"
 .\" $FreeBSD$
 .\"
-.Dd November 16, 2016
+.Dd October 17, 2021
 .Dt VM_PAGE_ALLOC 9
 .Os
 .Sh NAME
 .Nm vm_page_alloc
-.Nd "allocate a page for a"
-.Vt vm_object
+.Nd "allocate a page of memory"
 .Sh SYNOPSIS
 .In sys/param.h
 .In vm/vm.h
 .In vm/vm_page.h
 .Ft vm_page_t
 .Fn vm_page_alloc "vm_object_t object" "vm_pindex_t pindex" "int req"
+.Ft vm_page_t
+.Fo vm_page_alloc_after
+.Fa "vm_object_t object"
+.Fa "vm_pindex_t pindex"
+.Fa "int req"
+.Fa "vm_page_t mpred"
+.Fc
+.Ft vm_page_t
+.Fo vm_page_alloc_contig
+.Fa "vm_object_t object"
+.Fa "vm_pindex_t pindex"
+.Fa "int req"
+.Fa "u_long npages"
+.Fa "vm_paddr_t low"
+.Fa "vm_paddr_t high"
+.Fa "u_long alignment"
+.Fa "vm_paddr_t boundary"
+.Fa "vm_memattr_t memattr"
+.Fc
+.Ft vm_page_t
+.Fo vm_page_alloc_contig_domain
+.Fa "vm_object_t object"
+.Fa "vm_pindex_t pindex"
+.Fa "int req"
+.Fa "u_long npages"
+.Fa "vm_paddr_t low"
+.Fa "vm_paddr_t high"
+.Fa "u_long alignment"
+.Fa "vm_paddr_t boundary"
+.Fa "vm_memattr_t memattr"
+.Fc
+.Ft vm_page_t
+.Fo vm_page_alloc_domain
+.Fa "vm_object_t object"
+.Fa "vm_pindex_t pindex"
+.Fa "int domain"
+.Fa "int req"
+.Fc
+.Ft vm_page_t
+.Fo vm_page_alloc_domain_after
+.Fa "vm_object_t object"
+.Fa "vm_pindex_t pindex"
+.Fa "int domain"
+.Fa "int req"
+.Fa "vm_page_t mpred"
+.Fc
+.Ft vm_page_t
+.Fo vm_page_alloc_freelist
+.Fa "int freelist"
+.Fa "int req"
+.Fc
+.Ft vm_page_t
+.Fo vm_page_alloc_freelist_domain
+.Fa "int domain"
+.Fa "int freelist"
+.Fa "int req"
+.Fc
+.Ft vm_page_t
+.Fo vm_page_alloc_noobj
+.Fa "int req"
+.Fc
+.Ft vm_page_t
+.Fo vm_page_alloc_noobj_contig
+.Fa "int req"
+.Fa "u_long npages"
+.Fa "vm_paddr_t low"
+.Fa "vm_paddr_t high"
+.Fa "u_long alignment"
+.Fa "vm_paddr_t boundary"
+.Fa "vm_memattr_t memattr"
+.Fc
+.Ft vm_page_t
+.Fo vm_page_alloc_noobj_contig_domain
+.Fa "int domain"
+.Fa "int req"
+.Fa "u_long npages"
+.Fa "vm_paddr_t low"
+.Fa "vm_paddr_t high"
+.Fa "u_long alignment"
+.Fa "vm_paddr_t boundary"
+.Fa "vm_memattr_t memattr"
+.Fc
+.Ft vm_page_t
+.Fo vm_page_alloc_noobj_domain
+.Fa "int domain"
+.Fa "int req"
+.Fc
 .Sh DESCRIPTION
 The
 .Fn vm_page_alloc
-function allocates a page at
+family of functions allocate one or more pages of physical memory.
+Most kernel code should not call these functions directly but should instead
+use a kernel memory allocator such as
+.Xr malloc 9
+or
+.Xr uma 9 ,
+or should use a higher-level interface to the page cache, such as
+.Xr vm_page_grab 9 .
+.Pp
+All of the functions take a
+.Fa req
+parameter which encodes the allocation priority and optional modifier flags,
+described below.
+The functions whose names do not include
+.Dq noobj
+additionally insert the pages starting at index
 .Fa pindex
-within
+in the
+VM object
 .Fa object .
-It is assumed that a page has not already been allocated at
-.Fa pindex .
-The page returned is inserted into the object, unless
-.Dv VM_ALLOC_NOOBJ
-is specified in the
-.Fa req .
+The object must be write-locked and not have a page already resident at the
+specified index.
+The functions whose names include
+.Dq domain
+support NUMA-aware allocation by returning pages from the
+.Xr numa 4
+domain specified by
+.Fa domain .
 .Pp
+The
+.Fn vm_page_alloc_after
+and
+.Fn vm_page_alloc_domain_after
+functions behave identically to
 .Fn vm_page_alloc
-will not sleep.
+and
+.Fn vm_page_alloc_domain ,
+respectively, except that they take an additional parameter
+.Fa mpred
+which must be the page resident in
+.Fa object
+with largest index smaller than
+.Fa pindex ,
+or
+.Dv NULL
+if no such page exists.
+These functions exist to optimize the common case of loops that allocate
+multiple pages at successive indices within an object.
 .Pp
-Its arguments are:
-.Bl -tag -width ".Fa object"
-.It Fa object
-The VM object to allocate the page for.
 The
-.Fa object
-must be locked if
-.Dv VM_ALLOC_NOOBJ
-is not specified.
-.It Fa pindex
-The index into the object at which the page should be inserted.
-.It Fa req
-The bitwise-inclusive OR of a class and any optional flags indicating
-how the page should be allocated.
+.Fn vm_page_alloc_contig
+and
+.Fn vm_page_alloc_noobj_contig
+functions and their NUMA-aware variants allocate a physically contiguous run of
+.Fa npages
+pages which satisfies the specified constraints.
+The
+.Fa low
+and
+.Fa high
+parameters specify a physical address range from which the run is to
+be allocated.
+The
+.Fa alignment
+parameter specifies the requested alignment of the first page in the run
+and must be a power of two.
+If the
+.Fa boundary
+parameter is non-zero, the pages constituting the run will not cross a
+physical address that is a multiple of the parameter value, which must be a
+power of two.
+If
+.Fa memattr
+is not equal to
+.Dv VM_MEMATTR_DEFAULT ,
+then mappings of the returned pages created by, e.g.,
+.Xr pmap_enter 9
+or
+.Xr pmap_qenter 9 ,
+will carry the machine-dependent encoding of the memory attribute.
+Additionally, the direct mapping of the page, if any, will be updated to
+reflect the requested memory attribute.
+.Pp
+The
+.Fn vm_page_alloc_freelist
+and
+.Fn vm_page_alloc_freelist_domain
+functions behave identically to
+.Fn vm_page_alloc_noobj
+and
+.Fn vm_page_alloc_noobj_domain ,
+respectively, except that a successful allocation will return a page from the
+specified physical memory freelist.
+These functions are not intended for use outside of the virtual memory
+subsystem and exist only to support the requirements of certain platforms.
+.Sh REQUEST FLAGS
+All page allocator functions accept a
+.Fa req
+parameter that governs certain aspects of the function's behavior.
+.Pp
+The
+.Dv VM_ALLOC_WAITOK ,
+.Dv VM_ALLOC_WAITFAIL ,
+and
+.Dv VM_ALLOC_NOWAIT
+flags specify the behavior of the allocator if free pages could not be
+immediately allocated.
+The
+.Dv VM_ALLOC_WAITOK
+flag can only be used with the
+.Dq noobj
+variants.
+If
+.Dv VM_ALLOC_NOWAIT
+is specified, then the allocator gives up and returns
+.Dv NULL .
+.Dv VM_ALLOC_NOWAIT
+is specified implicitly if none of the flags are present in the request.
+If either
+.Dv VM_ALLOC_WAITOK
+or
+.Dv VM_ALLOC_WAITFAIL
+is specified, the allocator will put the calling thread to sleep until
+sufficient free pages become available.
+At this point, if
+.Dv VM_ALLOC_WAITFAIL
+is specified the allocator will return
+.Dv NULL ,
+and if
+.Dv VM_ALLOC_WAITOK
+is specified the allocator will retry the allocation.
+After a failed
+.Dv VM_ALLOC_WAITFAIL
+allocation returns, the VM object, if any, will have been unlocked while the
+thread was sleeping.
+In this case the VM object write lock will be re-acquired before the function
+call returns.
 .Pp
-Exactly one of the following classes must be specified:
+.Fa req
+also encodes the allocation request priority.
+By default the page(s) are allocated with no special treatment.
+If the number of available free pages is below a certain watermark, the
+allocation will fail or the allocating thread will sleep, depending on
+the specified wait flag.
+The watermark is computed at boot time and corresponds to a small (less than
+one percent) fraction of the system's total physical memory.
+To allocate memory more aggressively, one of following flags may be specified.
 .Bl -tag -width ".Dv VM_ALLOC_INTERRUPT"
-.It Dv VM_ALLOC_NORMAL
-The page should be allocated with no special treatment.
 .It Dv VM_ALLOC_SYSTEM
-The page can be allocated if the cache is empty and the free
-page count is above the interrupt reserved water mark.
+The page can be allocated if the free page count is above the interrupt
+reserved water mark.
 This flag should be used only when the system really needs the page.
 .It Dv VM_ALLOC_INTERRUPT
-.Fn vm_page_alloc
-is being called during an interrupt.
-A page will be returned successfully if the free page count is greater
-than zero.
+The allocation will fail only if zero free pages are available.
+This flag should be used only if the consequences of an allocation failure
+are worse than leaving the system without free memory.
+For example, this flag is used when allocating kernel page table pages, where
+allocation failures trigger a kernel panic.
 .El
 .Pp
-The optional flags are:
+The following optional flags can further modify allocator behavior:
 .Bl -tag -width ".Dv VM_ALLOC_NOBUSY"
+.It Dv VM_ALLOC_SBUSY
+The returned page will be shared-busy.
+This flag may only be specified when allocating pages in a VM object.
 .It Dv VM_ALLOC_NOBUSY
-The returned page will not be exclusive busy.
+The returned page will not be busy.
+This flag is implicit when allocating pages without a VM object.
+When allocating pages in a VM object, and neither
+.Dv VM_ALLOC_SBUSY
+nor
+.Dv VM_ALLOC_NOBUSY
+are specified, the returned pages will be exclusively busied.
 .It Dv VM_ALLOC_NODUMP
 The returned page will not be included in any kernel core dumps
 regardless of whether or not it is mapped in to KVA.
-.It Dv VM_ALLOC_NOOBJ
-Do not associate the allocated page with a vm object.
-The
-.Fa object
-argument is ignored.
-.It Dv VM_ALLOC_SBUSY
-The returned page will be shared busy.
 .It Dv VM_ALLOC_WIRED
 The returned page will be wired.
 .It Dv VM_ALLOC_ZERO
-Indicate a preference for a pre-zeroed page.
-There is no guarantee that the returned page will be zeroed, but it
-will have the
-.Dv PG_ZERO
-flag set if it is zeroed.
-.El
+If this flag is specified, the
+.Dq noobj
+variants will return zeroed pages.
+The other allocator interfaces ignore this flag.
+.It Dv VM_ALLOC_COUNT(n)
+Hint that at least
+.Fa n
+pages will be allocated by the caller in the near future.
+.Fa n
+must be no larger than 65535.
+If the system is short of free pages, this hint may cause the kernel
+to reclaim memory more aggressively than it would otherwise.
 .El
 .Sh RETURN VALUES
-The
-.Vt vm_page_t
-that was allocated is returned if successful; otherwise,
+If the allocation was successful, a pointer to the
+.Vt struct vm_page
+corresponding to the allocated page is returned.
+If the allocation request specified multiple pages, the returned
+pointer points to an array of
+.Vt struct vm_page
+constituting the run.
+Upon failure,
 .Dv NULL
 is returned.
-.Sh NOTES
-The pager process is always upgraded to
-.Dv VM_ALLOC_SYSTEM
-unless
-.Dv VM_ALLOC_INTERRUPT
-is set.
+Regardless of whether the allocation succeeds or fails, the VM
+object
+.Fa object
+will be write-locked upon return.
+.Sh SEE ALSO
+.Xr numa 4 ,
+.Xr malloc 9 ,
+.Xr uma 9 ,
+.Xr vm_page_grab 9 ,
+.Xr vm_page_sbusy 9
 .Sh AUTHORS
 This manual page was written by
 .An Chad David Aq Mt davidc@acns.ab.ca .
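
Reader's note, not part of the commit: the constraints the new text gives for the contiguous allocators (low, high, alignment, boundary) can be illustrated with a small userland sketch. This is not kernel code and makes no claim about how the allocator searches for a run; it only checks whether a candidate run would satisfy the documented rules. The names run_ok and SKETCH_PAGE_SIZE are hypothetical, the 4 KiB page size is an assumption, and treating high as an exclusive upper bound is an interpretation, since the manual page only says the two parameters "specify a physical address range".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption for this sketch: 4 KiB pages. The real size is machine-dependent. */
#define SKETCH_PAGE_SIZE 4096ULL

static bool
is_pow2(uint64_t x)
{
	return (x != 0 && (x & (x - 1)) == 0);
}

/*
 * Hypothetical helper: report whether a run of npages pages starting at
 * physical address pa satisfies the documented constraints.
 */
static bool
run_ok(uint64_t pa, uint64_t npages, uint64_t low, uint64_t high,
    uint64_t alignment, uint64_t boundary)
{
	uint64_t size;

	/* Preconditions stated by the manual page. */
	assert(npages > 0);
	assert(is_pow2(alignment));
	assert(boundary == 0 || is_pow2(boundary));

	size = npages * SKETCH_PAGE_SIZE;

	/* The whole run must lie within [low, high) (assumed half-open). */
	if (pa < low || pa > high || size > high - pa)
		return (false);

	/* The first page must have the requested alignment. */
	if ((pa & (alignment - 1)) != 0)
		return (false);

	/*
	 * With a non-zero boundary, the first and last byte of the run must
	 * fall in the same boundary-sized window, so the run never crosses
	 * a multiple of boundary.
	 */
	if (boundary != 0 &&
	    ((pa ^ (pa + size - 1)) & ~(boundary - 1)) != 0)
		return (false);

	return (true);
}
```

Under these assumptions, a four-page run at 0xE000 with a 64 KiB boundary is rejected because it straddles 0x10000, while the same run starting exactly at 0x10000 is accepted: starting on a boundary multiple is fine, crossing one is not.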