mips pmap patch
Alan Cox
alc at rice.edu
Wed Aug 29 05:25:17 UTC 2012
On 08/27/2012 10:24, Jayachandran C. wrote:
> On Mon, Aug 20, 2012 at 9:24 PM, Alan Cox <alc at rice.edu> wrote:
>> On 08/20/2012 05:36, Jayachandran C. wrote:
>>> On Thu, Aug 16, 2012 at 10:10 PM, Alan Cox <alc at rice.edu> wrote:
>>>> On 08/15/2012 17:21, Jayachandran C. wrote:
>>>>> On Tue, Aug 14, 2012 at 1:58 AM, Alan Cox <alc at rice.edu> wrote:
>>>>>> On 08/13/2012 11:37, Jayachandran C. wrote:
>>> [...]
>>>>>>> I could not test for more than an hour on 32-bit due to another
>>>>>>> problem (freelist 1 containing direct-mapped pages runs out of pages
>>>>>>> after about an hour of compile testing). This issue has been there for a
>>>>>>> long time; I am planning to look at it when I get a chance.
>>>>>>>
>>>>>> What exactly happens? panic? deadlock?
>>>>> The build slows down to a crawl and hangs when it runs out of pages in
>>>>> the freelist.
>>>>
>>>> I'd like to see the output of "sysctl vm.phys_segs" and "sysctl
>>>> vm.phys_free" from this machine.  Even better would be running "sysctl
>>>> vm.phys_free" every 60 seconds during the buildworld.  Finally, I'd
>>>> like to know whether or not either "ps" or "top" shows any threads
>>>> blocked on the "swwrt" wait channel once things slow to a crawl.
>>> I spent some time looking at this issue. I use a very large kernel
>>> image with built-in root filesystem, and this takes about 120 MB out
>>> of the direct mapped area. The remaining pages (~64 MB) are not enough
>>> for the build process. If I increase free memory in this area either
>>> by reducing the rootfs size or by adding a few more memory segments to
>>> this area, the build goes through fine.
>>
>> I'm still curious to see what "sysctl vm.phys_segs" says. It sounds like
>> roughly half of the direct map region is going to DRAM and half to
>> memory-mapped I/O devices. Is that correct?
> Yes, about half of the direct mapped region in 32-bit is taken by
> flash, PCIe and other memory mapped IO. I also made the problem even
> worse by not reclaiming some bootloader areas in the direct mapped
> region, which reduced the available direct mapped memory.
>
> Here's the output of sysctls:
>
> root at testboard:/root # sysctl vm.phys_segs
> vm.phys_segs:
> SEGMENT 0:
>
> start: 0x887e000
> end: 0xc000000
> domain: 0
> free list: 0x887a407c
>
> SEGMENT 1:
>
> start: 0x1d000000
> end: 0x1fc00000
> domain: 0
> free list: 0x887a407c
>
> SEGMENT 2:
>
> start: 0x20000000
> end: 0xbc0b3000
> domain: 0
> free list: 0x887a3f38
>
> SEGMENT 3:
>
> start: 0xe0000000
> end: 0xfffff000
> domain: 0
> free list: 0x887a3f38
>
> root at testboard:/root # sysctl vm.phys_free
> vm.phys_free:
> FREE LIST 0:
>
>   ORDER (SIZE)  |  NUMBER
>                 |  POOL 0  |  POOL 1  |  POOL 2
> ------------------------------------------------
>    8 ( 1024K)   |    2877  |       0  |       0
>    7 (  512K)   |       0  |       1  |       0
>    6 (  256K)   |       1  |       0  |       0
>    5 (  128K)   |       0  |       1  |       0
>    4 (   64K)   |       0  |       1  |       0
>    3 (   32K)   |       0  |       1  |       0
>    2 (   16K)   |       0  |       1  |       0
>    1 (    8K)   |       0  |       0  |       0
>    0 (    4K)   |       0  |       0  |       0
>
> FREE LIST 1:
>
>   ORDER (SIZE)  |  NUMBER
>                 |  POOL 0  |  POOL 1  |  POOL 2
> ------------------------------------------------
>    8 ( 1024K)   |      66  |       0  |       0
>    7 (  512K)   |       1  |       1  |       0
>    6 (  256K)   |       0  |       0  |       0
>    5 (  128K)   |       0  |       0  |       0
>    4 (   64K)   |       0  |       1  |       0
>    3 (   32K)   |       0  |       0  |       0
>    2 (   16K)   |       0  |       0  |       0
>    1 (    8K)   |       1  |       1  |       0
>    0 (    4K)   |       0  |       1  |       0
>
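For reference, SEGMENT 0 and SEGMENT 1 above work out to roughly 55.5 MB
(0xc000000 - 0x887e000) and 44 MB (0x1fc00000 - 0x1d000000), about 99.5 MB
in total.  Assuming those two low segments are the ones backing freelist 1
(the direct-mapped pages), that is consistent with the ~67 MB still free in
FREE LIST 1 versus roughly 2.8 GB free in FREE LIST 0.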
>>> I also found that when the build slows down, most of the pages taken
>>> from freelist 1 are allocated by the UMA subsystem, which seems to
>>> keep quite a few pages allocated.
>>
>> At worst, it may be necessary to disable the use of uma_small_alloc() for
>> this machine configuration. At best, uma_small_alloc() could be revised to
>> opportunistically use pages in the direct map region, while keeping the
>> ability to fall back to pages that have to be mapped.
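To make that concrete, here is a rough sketch of what an opportunistic
uma_small_alloc() with a fallback could look like on 32-bit MIPS.  This is
not code from the tree: the flag translation, the kmem_malloc() fallback,
and the matching changes that uma_small_free() would need are assumptions
made for illustration only.

/*
 * Rough sketch only, not code from the tree: an opportunistic
 * uma_small_alloc() that prefers the direct-mapped freelist but falls
 * back to an ordinary kernel-mapped allocation when that freelist is
 * exhausted.  The flag translation and the kmem_malloc() fallback are
 * assumptions; uma_small_free() would have to check the slab flags and
 * free through the matching path.
 */
#include <sys/param.h>
#include <sys/systm.h>
#include <sys/malloc.h>

#include <vm/vm.h>
#include <vm/vm_param.h>
#include <vm/vm_kern.h>
#include <vm/vm_page.h>
#include <vm/uma.h>
#include <vm/uma_int.h>

#include <machine/cpuregs.h>    /* MIPS_PHYS_TO_KSEG0() */

void *
uma_small_alloc(uma_zone_t zone, int bytes, u_int8_t *flags, int wait)
{
        vm_page_t m;
        int pflags;

        pflags = VM_ALLOC_WIRED |
            ((wait & M_NOWAIT) ? VM_ALLOC_INTERRUPT : VM_ALLOC_SYSTEM);

        /*
         * Opportunistic path: take a page from the direct-mapped
         * freelist and return its KSEG0 address, no mapping required.
         */
        m = vm_page_alloc_freelist(VM_FREELIST_DIRECT, pflags);
        if (m != NULL) {
                *flags = UMA_SLAB_PRIV;
                return ((void *)MIPS_PHYS_TO_KSEG0(VM_PAGE_TO_PHYS(m)));
        }

        /*
         * Fallback: allocate a kernel-mapped page from the kmem map,
         * the way the generic UMA page allocator does.  (M_ZERO
         * handling omitted for brevity.)
         */
        *flags = UMA_SLAB_KMEM;
        return ((void *)kmem_malloc(kmem_map, bytes, wait));
}

With such a fallback, exhausting freelist 1 would degrade into ordinary
kernel-map allocations instead of leaving the UMA backend with nothing to
hand out.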
> I think this probably is not a bug, but a configuration problem (we
> cannot have such a huge built-in root filesystem when the direct
> mapped area is this small). Anyway, I have checked in code to
> recover more areas from the bootloader, and this mostly solves the
> issue for me. The above output is taken before the check-in.
I'm afraid that exhaustion of freelist 1 is still highly likely to occur
under some workloads that require the allocation of a lot of small
objects in the kernel's heap.
Alan