All the memory eaten away by ZFS 'solaris' malloc - on 11.1-R amd64
Mark Martinec
Mark.Martinec+freebsd at ijs.si
Wed Aug 1 07:12:15 UTC 2018
> On Tue, Jul 31, 2018 at 11:54:29PM +0200, Mark Martinec wrote:
>> I have now upgraded this host from 11.1-RELEASE-p11 to 11.2-RELEASE
>> and the situation has not improved. Also turned off all services.
>> ZFS is still leaking memory about 30 MB per hour, until the host
>> runs out of memory and swap space and crashes, unless I reboot it
>> first every four days.
>>
>> Any advice before I try to get rid of that faulted disk with a pool
>> (or downgrade to 10.3, which was stable)?
2018-08-01 00:09, Mark Johnston wrote:
> If you're able to use dtrace, it would be useful to try tracking
> allocations with the solaris tag:
>
> # dtrace -n 'dtmalloc::solaris:malloc {@allocs[stack(), args[3]] =
> count()} dtmalloc::solaris:free {@frees[stack(), args[3]] =
> count();}'
>
> Try letting that run for one minute, then kill it and paste the output.
> Ideally the host will be as close to idle as possible while still
> demonstrating the leak.
Good and bad news:
The suggested dtrace command bails out:
# dtrace -n 'dtmalloc::solaris:malloc {@allocs[stack(), args[3]] =
count()} dtmalloc::solaris:free {@frees[stack(), args[3]] = count();}'
dtrace: description 'dtmalloc::solaris:malloc ' matched 2 probes
Assertion failed: (buf->dtbd_timestamp >= first_timestamp), file
/usr/src/cddl/contrib/opensolaris/lib/libdtrace/common/dt_consume.c,
line 3330.
Abort trap
But I did get one step further in localizing the culprit.
I realized that the "solaris" malloc count goes up in sync with
polls by the 'telegraf' monitoring service, whose ZFS plugin
monitors the ZFS pools and ARC. This plugin runs 'zpool list -Hp'
periodically.
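So after stopping telegraf (and the other remaining services),
'vmstat -m' shows that the InUse count for "solaris" goes up by about
552 every time I run "zpool list -Hp". In the output below the first
number is the absolute starting count; the rest are the increases per
iteration: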
# (while true; do zpool list -Hp >/dev/null; vmstat -m | \
fgrep solaris; sleep 1; done) | awk '{print $2-a; a=$2}'
6664427
541
552
552
552
552
552
552
552
552
556
548
552
552
552
552
552
552
552
552
552
# zpool list -Hp
floki 68719476736 37354102272 31365374464 - - 49% 54 1.00x ONLINE -
stuff - - - - - - - - UNAVAIL -
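
For anyone who wants to repeat the measurement, here is a minimal sketch
of the same delta computation as a small sh function. The name
inuse_deltas is made up for illustration, and the usage shown in the
comment assumes that "solaris" appears as the type name in column 1 of
'vmstat -m' with InUse in column 2, as in the loop above. Unlike the
one-liner, it treats the first sample as a baseline and does not print it.

```shell
#!/bin/sh
# inuse_deltas: hypothetical helper, not part of any FreeBSD tool.
# Reads one counter value per line on stdin and prints the change
# between consecutive samples. The first sample is only a baseline.
inuse_deltas() {
    awk 'NR > 1 { print $1 - prev } { prev = $1 }'
}

# Possible usage on the affected host (assumption: the malloc type is
# named "solaris" in column 1 and InUse is column 2 of vmstat -m):
#
#   while true; do
#       zpool list -Hp >/dev/null
#       vmstat -m | awk '$1 == "solaris" { print $2 }'
#       sleep 1
#   done | inuse_deltas
```

Matching on the exact type name instead of using fgrep avoids picking up
any other line that merely contains the string "solaris".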
Mark