Tracking down what has inact pages locked up

Karl Denninger karl at denninger.net
Wed Mar 19 20:05:26 UTC 2014


On 3/19/2014 2:29 PM, John Baldwin wrote:
> On Tuesday, March 18, 2014 11:36:49 pm Karl Denninger wrote:
>> On 3/18/2014 4:30 PM, John Baldwin wrote:
>>> On Tuesday, March 18, 2014 3:36:04 pm Karl Denninger wrote:
>>>> On 3/18/2014 2:05 PM, John Baldwin wrote:
>>>>> On Sunday, March 16, 2014 4:36:06 pm Karl Denninger wrote:
>>>>>> Is there a reasonable way to determine who or what has that memory
>>>>>> locked up -- and thus why the VM system is not demoting that space into
>>>>>> the cache bucket so it can be freed (which, if my understanding is
>>>>>> correct, should have happened long before now)?
>>>>> I have a hackish thing (for 8.x, might work on 10.x) to let you figure out
>>>>> what is using up RAM.  This should perhaps go into the base system at some
>>>>> point.
>>>>>
>>>>> Grab the bits at http://people.freebsd.org/~jhb/vm_objects/
>>>>>
>>>>> You will want to build the kld first and use 'make load' to load it.  It adds
>>>>> a new sysctl that dumps info about all the VM objects in the system.  You can
>>>>> then build the 'vm_objects' tool and run it.  It can take a while to run if
>>>>> you have NFS mounts, so I typically save its output to a file first and then
>>>>> use sort on the results.  sort -n will show you the largest consumer of RAM,
>>>>> sort -n -k 3 will show you the largest consumer of inactive pages.  Note
>>>>> that 'df' and 'ph' objects are anonymous, and that filename paths aren't
>>>>> always reliable, but this can still be useful.
>>>>>
>>>> Thanks.
>>>>
>>>> I suspect the cause of the huge inact consumption is a RAM leak in the
>>>> NAT code in IPFW.  It was not occurring in 9.2-STABLE, but is on
>>>> 10.0-STABLE, and reverting to natd in userland stops it -- which
>>>> pretty-well isolates where it's coming from.
>>> Memory for in-kernel NAT should be wired pages, not inactive.
>> Yeah, should be. :-)
>>
>> But..... it managed to lock up 19GB of the 24GB the system has in inact
>> pages over 12 hours, and dropping the system to single user and
>> unloading the modules did not release the RAM -- hence the question
>> of how to track down what the hell is going on.
>>
>> Changing the config back to natd as opposed to in-kernel NAT, however,
>> made the problem disappear.
> It would be useful to run the program I posted above to see what is tying
> up inactive pages.  It is not going to be wired kernel memory.  You may
> simply be seeing a side effect: natd causes additional memory pressure,
> so pagedaemon purges inactive pages more often.  If you aren't actively
> using the memory, then having a lot of inactive pages isn't a problem; it
> means the system will be able to satisfy potential future reads without
> needing to go to disk.
>
> What I have done in places where I want to limit inactive memory is to
> write a simple program that invokes posix_fadvise(POSIX_FADV_DONTNEED)
> on each file to flush it from inactive to cache.  You may also need an
> fsync() on each file to flush any dirty pages before the fadvise.
What caught my attention originally was that swap consumption was rising 
at the same time inact was basically pegged (and performance, as 
expected, was in the trashcan.)

I'll see what I can track down.

-- 
-- Karl
karl at denninger.net



