git: c98367641991 - main - vm_fault: Defer marking COW pages valid
Date: Sun, 13 Apr 2025 16:12:35 UTC
The branch main has been updated by markj:

URL: https://cgit.FreeBSD.org/src/commit/?id=c98367641991019bac0e8cd55b70682171820534

commit c98367641991019bac0e8cd55b70682171820534
Author:     Mark Johnston <markj@FreeBSD.org>
AuthorDate: 2025-04-13 16:09:31 +0000
Commit:     Mark Johnston <markj@FreeBSD.org>
CommitDate: 2025-04-13 16:09:31 +0000

    vm_fault: Defer marking COW pages valid

    Suppose an object O has two shadow objects S1, S2 mapped into
    processes P1, P2.  Suppose a page resident in O is mapped read-only
    into P1.  Now suppose that P1 writes to the page, triggering a COW
    fault: it allocates a new page in S1 and copies the page, then marks
    it valid.

    If the page in O was busy when initially looked up, P1 would have to
    release the map lock and sleep first.  Then, after handling COW, P1
    must re-check the map lookup because locks were dropped.  Suppose the
    map indeed changed, so P1 has to retry the fault.  At this point, the
    mapped page in O is shadowed by a valid page in S1.

    If P2 exits, S2 will be deallocated, resulting in a collapse of O
    into S1.  In this case, because the mapped page is shadowed, P2 will
    free it, but that is illegal; this triggers a "freeing mapped page"
    assertion in INVARIANTS kernels.

    Fix the problem by deferring the vm_page_valid() call which marks the
    COW copy valid: only mark it once we know that the fault handler will
    succeed.  It's okay to leave an invalid page in the top-level object;
    it will be freed when the fault is retried, and
    vm_object_collapse_scan() will similarly free invalid pages in the
    shadow object.

    Reviewed by:	kib
    MFC after:	1 month
    Sponsored by:	Innovate UK
    Differential Revision:	https://reviews.freebsd.org/D49758
---
 sys/vm/vm_fault.c | 19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/sys/vm/vm_fault.c b/sys/vm/vm_fault.c
index 81631b672040..0bd3a8207c4a 100644
--- a/sys/vm/vm_fault.c
+++ b/sys/vm/vm_fault.c
@@ -1064,14 +1064,14 @@ vm_fault_cow(struct faultstate *fs)
 	 * Oh, well, lets copy it.
 	 */
 	pmap_copy_page(fs->m, fs->first_m);
-	vm_page_valid(fs->first_m);
 	if (fs->wired && (fs->fault_flags & VM_FAULT_WIRE) == 0) {
 		vm_page_wire(fs->first_m);
 		vm_page_unwire(fs->m, PQ_INACTIVE);
 	}
 	/*
-	 * Save the cow page to be released after
-	 * pmap_enter is complete.
+	 * Save the COW page to be released after pmap_enter is
+	 * complete.  The new copy will be marked valid when we're ready
+	 * to map it.
 	 */
 	fs->m_cow = fs->m;
 	fs->m = NULL;
@@ -1759,6 +1759,19 @@ found:
 	if (hardfault)
 		fs.entry->next_read = vaddr + ptoa(ahead) + PAGE_SIZE;

+	/*
+	 * If the page to be mapped was copied from a backing object, we defer
+	 * marking it valid until here, where the fault handler is guaranteed to
+	 * succeed.  Otherwise we can end up with a shadowed, mapped page in the
+	 * backing object, which violates an invariant of vm_object_collapse()
+	 * that shadowed pages are not mapped.
+	 */
+	if (fs.m_cow != NULL) {
+		KASSERT(vm_page_none_valid(fs.m),
+		    ("vm_fault: page %p is already valid", fs.m_cow));
+		vm_page_valid(fs.m);
+	}
+
 	/*
 	 * Page must be completely valid or it is not fit to
 	 * map into user space.  vm_pager_get_pages() ensures this.
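For readers unfamiliar with how the O/S1/S2 arrangement in the commit
message arises in practice, here is a hypothetical userland sketch (not
part of the commit, and not a deterministic reproducer) that builds the
same object graph: a MAP_PRIVATE anonymous mapping becomes the backing
object O, fork() marks it copy-on-write, each child's first write
allocates a shadow object over O, and the parent's munmap() leaves O
referenced only by the shadows, so one child's exit can trigger the
collapse.  All variable names here are local to the example.

/*
 * Hypothetical illustration of the object graph from the commit
 * message.  Hitting the actual race would additionally require the
 * page in O to be busy at the initial lookup and the map to change
 * underneath the fault, neither of which this sketch controls.
 */
#include <sys/types.h>
#include <sys/mman.h>
#include <sys/wait.h>

#include <err.h>
#include <unistd.h>

int
main(void)
{
	char *p;
	pid_t p1, p2;

	/* Create the backing object O and make one page resident in it. */
	p = mmap(NULL, getpagesize(), PROT_READ | PROT_WRITE,
	    MAP_ANON | MAP_PRIVATE, -1, 0);
	if (p == MAP_FAILED)
		err(1, "mmap");
	p[0] = 1;

	/* P1: the first write COW-faults, creating shadow object S1. */
	if ((p1 = fork()) == -1)
		err(1, "fork");
	if (p1 == 0) {
		p[0] = 2;
		_exit(0);
	}

	/*
	 * P2: likewise creates S2; its exit deallocates S2, after which
	 * O may collapse into S1.
	 */
	if ((p2 = fork()) == -1)
		err(1, "fork");
	if (p2 == 0) {
		p[0] = 3;
		_exit(0);
	}

	/* Drop the parent's reference so only S1 and S2 back onto O. */
	if (munmap(p, getpagesize()) == -1)
		err(1, "munmap");

	if (waitpid(p1, NULL, 0) == -1 || waitpid(p2, NULL, 0) == -1)
		err(1, "waitpid");
	return (0);
}

A practical reproducer would likely need to loop this pattern under
memory pressure so that the page in O is busy (e.g. mid-pageout) when
the COW fault looks it up, forcing the fault retry described above.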