Reducing vm page queue mutex contention
Suleiman Souhlal
ssouhlal at FreeBSD.org
Thu Feb 1 06:42:24 UTC 2007
Hello Alan,
Profiling shows that the vm page queue mutex is the most contended lock in
the kernel, maybe apart from sched_lock. It seems that this is in part
because this lock protects a lot of things: the page queues, pv entries, page
flags, the page hold count, the page wired count, and so on.
I came up with a possible plan to reduce contention on this lock,
concentrating on the amd64 pmap (although these changes should be applicable
to the other architectures as well):
- Make vm_page_flag_set/clear() just use atomic operations, to get rid of
the page queues lock dependency.
I'm still not entirely convinced this is safe; a sketch is below.
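A minimal sketch, assuming the flags field stays a u_short and that no
flag update relies on the queues lock for ordering:

	static __inline void
	vm_page_flag_set(vm_page_t m, unsigned short bits)
	{

		/* Atomic RMW replaces the page queues lock assertion. */
		atomic_set_short(&m->flags, bits);
	}

	static __inline void
	vm_page_flag_clear(vm_page_t m, unsigned short bits)
	{

		atomic_clear_short(&m->flags, bits);
	}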
- vm_page_hold() and vm_page_unhold() can be made to avoid acquiring the
queues lock in the common case.
I already have a patch for this, although it increases the size of
vm_page_t (I have some other ideas to reduce the size of vm_page_t, but
that's for another time):
http://people.freebsd.org/~ssouhlal/testing/vm_page_hold-20070131.diff
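The idea is roughly the following (a sketch only; it assumes hold_count is
widened to a u_int so that atomic_fetchadd_int() can be used, which would
account for part of the vm_page_t growth):

	void
	vm_page_hold(vm_page_t m)
	{

		atomic_add_int(&m->hold_count, 1);
	}

	void
	vm_page_unhold(vm_page_t m)
	{

		/* Only the 1 -> 0 transition needs the queues lock. */
		if (atomic_fetchadd_int(&m->hold_count, -1) == 1) {
			vm_page_lock_queues();
			/* Recheck: someone may have re-held the page. */
			if (m->hold_count == 0 && m->queue == PQ_HOLD)
				vm_page_free_toq(m);
			vm_page_unlock_queues();
		}
	}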
- Add a mutex pool for vm pages, to protect the pv entry lists.
I'm currently working on this; a sketch is at the end of this item.
My current approach makes struct pv_entry larger because it needs to store
a pointer to the pte in each pv_entry.
Another way that might be better is to move to per-object pv entries,
which is what Linux does. It would greatly reduce memory usage when
mapping large objects in a lot of processes, although it might be slower
for sparsely faulted objects mapped in a large number of processes.
This approach would be a lot of work, which is why I'm leaning towards
keeping per-page pv entries.
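To make the mutex pool idea concrete, something along these lines, using
the existing mtx_pool(9) facility (the pool and macro names here are made
up, and pv_ptep is the extra pte pointer mentioned above):

	static struct mtx_pool *vm_page_mtxpool;

	#define	VM_PAGE_PV_LOCK(m)	mtx_pool_lock(vm_page_mtxpool, (m))
	#define	VM_PAGE_PV_UNLOCK(m)	mtx_pool_unlock(vm_page_mtxpool, (m))

	typedef struct pv_entry {
		pmap_t		pv_pmap;
		vm_offset_t	pv_va;
		TAILQ_ENTRY(pv_entry)	pv_list;
		TAILQ_ENTRY(pv_entry)	pv_plist;
		pt_entry_t	*pv_ptep;	/* new: the pte for this mapping */
	} *pv_entry_t;

The pool itself could be created at boot with something like
mtx_pool_create("vm page pv", 128, MTX_DEF).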
- It should be possible to make vm_page->wired_count use atomic operations
instead of needing a lock, similar to what I did for hold_count.
This might be a bit tricky, but hopefully possible; a sketch is below.
Alternatively, we could use the mutex pool described above to protect it.
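A rough sketch of the wiring side (again assuming an int-sized count; the
0 <-> 1 transitions still need the queues lock to move the page on or off
the queues, and racing transitions in that window are the tricky part):

	void
	vm_page_wire(vm_page_t m)
	{

		if (atomic_fetchadd_int(&m->wired_count, 1) == 0) {
			/* First wiring: dequeueing still needs the lock. */
			vm_page_lock_queues();
			vm_pageq_remove(m);
			vm_page_unlock_queues();
		}
	}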
- We can change pmap_unuse_pt() and free_pv_entry() to just record the pages
they want to free in an array allocated by the caller.
The caller then frees those pages after it drops the pmap lock.
For example:
	struct pages_to_free {
		vm_page_t page[MAX_PAGES];	/* vm_page_t is already a pointer */
		int num_pages;
	};

	void
	pmap_remove(...)
	{
		struct pages_to_free pages;
		int i;

		pages.num_pages = 0;
		PMAP_LOCK(pmap);
		...
		pmap_unuse_pt(..., &pages);	/* records pages instead of freeing */
		...
		PMAP_UNLOCK(pmap);
		/* Now free the deferred pages, outside the pmap lock. */
		vm_page_lock_queues();
		for (i = 0; i < pages.num_pages; i++)
			vm_page_free(pages.page[i]);
		vm_page_unlock_queues();
	}
This way, pmap_remove() can run mostly without the page queues lock held.
- Once the above are done, it should be possible to make pmap_enter() run
mostly free of the page queues lock (see the sketch after this list) by:
- Pre-allocating a pv chunk early in pmap_enter, if there are no free
ones, so that we never have to allocate new chunks in pmap_insert_entry.
- Dropping the page queues lock immediately after the pmap_allocpte in
pmap_enter.
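Roughly, the shape would become something like this (a sketch only;
pv_chunk_prealloc() is a hypothetical helper, and error handling is
omitted):

	void
	pmap_enter(pmap_t pmap, vm_offset_t va, vm_page_t m,
	    vm_prot_t prot, boolean_t wired)
	{
		vm_page_t mpte;

		PMAP_LOCK(pmap);
		/*
		 * Reserve a pv chunk up front so that pmap_insert_entry()
		 * never has to allocate one while the queues lock is held.
		 */
		if (TAILQ_EMPTY(&pmap->pm_pvchunk))
			pv_chunk_prealloc(pmap);
		vm_page_lock_queues();
		mpte = pmap_allocpte(pmap, va, M_WAITOK);
		vm_page_unlock_queues();	/* dropped right after allocpte */
		...
		PMAP_UNLOCK(pmap);
	}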
Any thoughts/comments?
-- Suleiman