mips pmap patch

Alan Cox alc at rice.edu
Wed Aug 29 05:25:17 UTC 2012


On 08/27/2012 10:24, Jayachandran C. wrote:
> On Mon, Aug 20, 2012 at 9:24 PM, Alan Cox <alc at rice.edu> wrote:
>> On 08/20/2012 05:36, Jayachandran C. wrote:
>>> On Thu, Aug 16, 2012 at 10:10 PM, Alan Cox <alc at rice.edu> wrote:
>>>> On 08/15/2012 17:21, Jayachandran C. wrote:
>>>>> On Tue, Aug 14, 2012 at 1:58 AM, Alan Cox <alc at rice.edu> wrote:
>>>>>> On 08/13/2012 11:37, Jayachandran C. wrote:
>>> [...]
>>>>>>> I could not test for more than an hour on 32-bit due to another
>>>>>>> problem (freelist 1, which holds the direct-mapped pages, runs out
>>>>>>> of pages after about an hour of compile testing).  This issue has
>>>>>>> been there for a long time; I am planning to look at it when I get
>>>>>>> a chance.
>>>>>>>
>>>>>> What exactly happens?  panic?  deadlock?
>>>>> The build slows down to a crawl and hangs when it runs out of pages in
>>>>> the freelist.
>>>>
>>>> I'd like to see the output of "sysctl vm.phys_segs" and "sysctl
>>>> vm.phys_free" from this machine.  Even better would be running "sysctl
>>>> vm.phys_free" every 60 seconds during the buildworld.  Finally, I'd like
>>>> to
>>>> know whether or not either "ps" or "top" shows any threads blocked on the
>>>> "swwrt" wait channel once things slow to a crawl.
>>> I spent some time looking at this issue.  I use a very large kernel
>>> image with a built-in root filesystem, and this takes about 120 MB out
>>> of the direct mapped area.  The remaining pages (~64 MB) are not enough
>>> for the build process.  If I increase free memory in this area, either
>>> by reducing the rootfs size or by adding a few more memory segments to
>>> this area, the build goes through fine.
>>
>> I'm still curious to see what "sysctl vm.phys_segs" says.  It sounds like
>> roughly half of the direct map region is going to DRAM and half to
>> memory-mapped I/O devices.  Is that correct?
> Yes, about half of the direct mapped region in 32-bit is taken by
> flash, PCIe and other memory-mapped I/O.  I also made the problem even
> worse by not reclaiming some bootloader areas in the direct mapped
> region, which reduced the available direct mapped memory.
>
> Here's the output of sysctls:
>
> root@testboard:/root # sysctl vm.phys_segs
> vm.phys_segs:
> SEGMENT 0:
>
> start:     0x887e000
> end:       0xc000000
> domain:    0
> free list: 0x887a407c
>
> SEGMENT 1:
>
> start:     0x1d000000
> end:       0x1fc00000
> domain:    0
> free list: 0x887a407c
>
> SEGMENT 2:
>
> start:     0x20000000
> end:       0xbc0b3000
> domain:    0
> free list: 0x887a3f38
>
> SEGMENT 3:
>
> start:     0xe0000000
> end:       0xfffff000
> domain:    0
> free list: 0x887a3f38
>
> root@testboard:/root # sysctl vm.phys_free
> vm.phys_free:
> FREE LIST 0:
>
>    ORDER (SIZE)  |  NUMBER
>                  |  POOL 0  |  POOL 1  |  POOL 2
> --            -- --      -- --      -- --      --
>     8 (  1024K)  |    2877  |       0  |       0
>     7 (   512K)  |       0  |       1  |       0
>     6 (   256K)  |       1  |       0  |       0
>     5 (   128K)  |       0  |       1  |       0
>     4 (    64K)  |       0  |       1  |       0
>     3 (    32K)  |       0  |       1  |       0
>     2 (    16K)  |       0  |       1  |       0
>     1 (     8K)  |       0  |       0  |       0
>     0 (     4K)  |       0  |       0  |       0
>
> FREE LIST 1:
>
>    ORDER (SIZE)  |  NUMBER
>                  |  POOL 0  |  POOL 1  |  POOL 2
> --            -- --      -- --      -- --      --
>     8 (  1024K)  |      66  |       0  |       0
>     7 (   512K)  |       1  |       1  |       0
>     6 (   256K)  |       0  |       0  |       0
>     5 (   128K)  |       0  |       0  |       0
>     4 (    64K)  |       0  |       1  |       0
>     3 (    32K)  |       0  |       0  |       0
>     2 (    16K)  |       0  |       0  |       0
>     1 (     8K)  |       1  |       1  |       0
>     0 (     4K)  |       0  |       1  |       0
>
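
One note on the segment list above: on 32-bit MIPS, only physical
addresses below 512 MB fall inside KSEG0, the fixed cached window that
provides the direct map.  That is why segments 0 and 1 (both below
0x20000000) feed the direct-mapped freelist 1, while segments 2 and 3
always have to be mapped through the TLB.  A sketch of the arithmetic,
with standalone definitions standing in for the real macros in the
mips headers:

#include <stdint.h>

#define MIPS_KSEG0_START        0x80000000UL    /* cached direct-map window */
#define MIPS_KSEG0_LARGEST_PHYS 0x20000000UL    /* 512 MB */

/* Only physical addresses below 512 MB have a KSEG0 (direct) mapping. */
static inline int
pa_is_direct_mapped(uint32_t pa)
{

        return (pa < MIPS_KSEG0_LARGEST_PHYS);
}

/* The mapping itself is pure arithmetic; no TLB entry is consumed. */
static inline uintptr_t
phys_to_kseg0(uint32_t pa)
{

        return ((uintptr_t)pa | MIPS_KSEG0_START);
}
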
>>> I also found that when the build slows down, most of the pages taken
>>> from freelist 1 have been allocated by the UMA subsystem, which seems
>>> to hold on to quite a few of them.
>>
>> At worst, it may be necessary to disable the use of uma_small_alloc() for
>> this machine configuration.  At best, uma_small_alloc() could be revised
>> to opportunistically use pages in the direct map region, but with the
>> ability to fall back to pages that have to be mapped.
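
To make the "at best" option concrete, here is a sketch of what such an
opportunistic uma_small_alloc() could look like.  This is not committed
code: the names follow the mips pmap (VM_FREELIST_DIRECT,
MIPS_PHYS_TO_DIRECT), and kmem_malloc() is only one plausible fallback
path.  The free side has to honor the slab flag so that fallback pages
go back to the right place:

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/malloc.h>
#include <sys/vmmeter.h>
#include <vm/vm.h>
#include <vm/vm_kern.h>
#include <vm/vm_page.h>
#include <vm/vm_phys.h>
#include <vm/uma.h>
#include <vm/uma_int.h>

void *
uma_small_alloc(uma_zone_t zone, int bytes, u_int8_t *flags, int wait)
{
        vm_page_t m;
        int pflags;

        pflags = ((wait & M_NOWAIT) ? VM_ALLOC_INTERRUPT : VM_ALLOC_SYSTEM) |
            VM_ALLOC_WIRED;

        /* Opportunistic case: a page below 512 MB is usable via KSEG0. */
        m = vm_page_alloc_freelist(VM_FREELIST_DIRECT, pflags);
        if (m != NULL) {
                if ((wait & M_ZERO) && (m->flags & PG_ZERO) == 0)
                        pmap_zero_page(m);
                *flags = UMA_SLAB_PRIV;
                return ((void *)MIPS_PHYS_TO_DIRECT(VM_PAGE_TO_PHYS(m)));
        }

        /* Fallback: take a mapped allocation from the kernel map. */
        *flags = UMA_SLAB_KMEM;
        return ((void *)kmem_malloc(kmem_map, bytes, wait));
}

void
uma_small_free(void *mem, int size, u_int8_t flags)
{
        vm_page_t m;

        /* Pages from the fallback path go back through the kernel map. */
        if (flags & UMA_SLAB_KMEM) {
                kmem_free(kmem_map, (vm_offset_t)mem, size);
                return;
        }
        m = PHYS_TO_VM_PAGE(MIPS_DIRECT_TO_PHYS((vm_offset_t)mem));
        m->wire_count--;
        vm_page_free(m);
        atomic_subtract_int(&cnt.v_wire_count, 1);
}
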
> I think this probably is not a bug, but a configuration problem (we
> cannot have such a huge built-in root filesystem when the direct
> mapped area is this small).  Anyway, I have checked in code to
> recover more areas from the bootloader, and this mostly solves the
> issue for me.  The above output was taken before the check-in.

I'm afraid that exhaustion of freelist 1 is still highly likely to occur 
under some workloads that require the allocation of a lot of small 
objects in the kernel's heap.

Alan


