optimizing TLB invalidations
Alan Cox
alc at rice.edu
Mon Oct 1 16:57:05 UTC 2012
On 10/01/2012 11:16, Jayachandran C. wrote:
> On Sat, Sep 22, 2012 at 10:09 PM, Alan Cox<alc at rice.edu> wrote:
>> Can you please test the attached patch? It introduces a new TLB
>> invalidation function for efficiently invalidating address ranges and uses
>> this function in pmap_remove().
>>
>> Basically, the function looks at the size of the address range in order to
>> decide how best to perform the invalidation. If the range is small compared
>> to the TLB size, it probes the TLB for pages in the range. That said, the
>> function understands that pages come in pairs, and so it won't probe for odd
>> page numbers. In contrast, the current code in pmap_remove() will probe for
>> both the even and odd page. On the other hand, if the range is large, then
>> the function changes its approach. It iterates over the TLB entries
>> checking each to see if it falls within the range. This can eliminate an
>> enormous number of TLB probes when a large virtual address range is
>> unmapped. Finally, on a multiprocessor, this change will reduce the number
>> of IPIs to invalidate TLB entries. There will be one IPI per range rather
>> than one per page.
>>
>> Ultimately, this new function could be applied elsewhere, like
>> pmap_protect(), but that's a patch for another day.
> Tested this on my XLP 64-bit SMP config, and did not see any issues. The
> compilation test did not show much change in performance, but I think
> I need to run a multi-threaded benchmark to see the performance
> improvement.
>
Yes, I agree. Under a compilation test, the FreeBSD malloc(3)/free(3)
implementation will occasionally release a large chunk of memory (4MB)
back to the kernel. If all of that chunk was used, then we'll save
roughly 900 TLB probes. But this doesn't happen very often. Under
a compilation workload, most of the bulk destruction of mappings happens
in pmap_remove_pages(), not pmap_remove().
Probably the place where you'll see an easily discernible effect is when
pmap_qremove() is modified to use the ranged TLB invalidation.
pmap_qremove() gets used when we unmap data from the buffer cache and
must shoot down the TLB entries on every CPU.
Alan
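
For readers who want to see the size-based decision from the quoted
description in concrete form, here is a minimal user-space sketch in C.
It is not the attached patch and not FreeBSD's pmap code. Every name
(sim_tlb, sim_tlb_probe, sim_tlb_invalidate_range), the 4 KB page size,
the 64-entry TLB, and the exact small-versus-large threshold are
assumptions made only for illustration. The sketch models a TLB whose
entries each map an even/odd page pair, probes once per pair when the
range is small, and walks the TLB entries when the range is large.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* All constants below are assumptions for this sketch, not real hardware values. */
#define SIM_PAGE_SHIFT  12                    /* assumed 4 KB pages */
#define SIM_PAIR_SHIFT  (SIM_PAGE_SHIFT + 1)  /* one entry maps an even/odd page pair */
#define SIM_TLB_ENTRIES 64                    /* assumed TLB size */

/* Hypothetical model of one TLB entry. */
struct sim_tlb_entry {
    bool     valid;
    uint64_t vpn2;  /* virtual address >> SIM_PAIR_SHIFT */
};

static struct sim_tlb_entry sim_tlb[SIM_TLB_ENTRIES];
static unsigned long sim_probe_count;  /* number of simulated TLB probes */
static unsigned long sim_read_count;   /* number of simulated indexed entry reads */

/* Simulated probe: return the index of the entry mapping vpn2, or -1. */
static int sim_tlb_probe(uint64_t vpn2)
{
    sim_probe_count++;
    for (int i = 0; i < SIM_TLB_ENTRIES; i++) {
        if (sim_tlb[i].valid && sim_tlb[i].vpn2 == vpn2)
            return i;
    }
    return -1;
}

/* Invalidate every simulated entry that maps a page in [start, end). */
static void sim_tlb_invalidate_range(uint64_t start, uint64_t end)
{
    assert(start < end);

    uint64_t first  = start >> SIM_PAIR_SHIFT;
    uint64_t last   = (end - 1) >> SIM_PAIR_SHIFT;
    uint64_t npairs = last - first + 1;

    if (npairs < SIM_TLB_ENTRIES) {
        /* Small range: one probe per page pair, never one per page. */
        for (uint64_t v = first; v <= last; v++) {
            int i = sim_tlb_probe(v);
            if (i >= 0)
                sim_tlb[i].valid = false;
        }
    } else {
        /* Large range: read each entry once and test it against the range. */
        for (int i = 0; i < SIM_TLB_ENTRIES; i++) {
            sim_read_count++;
            if (sim_tlb[i].valid &&
                sim_tlb[i].vpn2 >= first && sim_tlb[i].vpn2 <= last)
                sim_tlb[i].valid = false;
        }
    }
}
```

Under these assumed parameters, a 4-page range costs two probes, while
a 4 MB range (1024 pages, 512 pairs) costs 64 entry reads and no probes
at all. On a real multiprocessor the whole call would be issued once
per range on each CPU, which is where the single IPI per range comes
from; the sketch does not model that part.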