collecting statistics / metrics
Alfred Perlstein
alfred at ixsystems.com
Fri Apr 5 15:00:03 UTC 2013
On Apr 5, 2013, at 1:42 AM, Andriy Gapon <avg at FreeBSD.org> wrote:
> on 03/04/2013 20:36 Alfred Perlstein said the following:
>> Hey folks, sorry for the top post here, but I just came into this thread.
>>
>> Here at iXsystems we've just developed a set of scripts to scrape the various
>> FreeBSD user land utilities (sysctl, netstat, nfsstat, vmstat, etc, etc) and put
>> them into graphs based on time.
>>
>> The goal is to be able to line up all these metrics with whatever benchmark we
>> are currently running and be able to see what may be causing issues.
>>
>> Potentially you should be able to scroll through the graphs and see things like
>> "ran out of mbufs @time", "vm system began paging @time", "buffer daemon
>> went nuts @time"
>>
>> Then we can take the information back and leverage it to make tuning decisions,
>> or potentially change kernel algorithms.
>
> This is very very useful!
>
>> The only problem we have is that every user land tool has its own format, so
>> my team and I have written some shell scripts to coerce the output from the
>> various programs into pseudo-CSV (key/value pairs). That can then be post
>> processed by tools into CSV, which can be loaded into something like
>> OpenOffice or put through an R program to graph it.
>>
>> I'm hoping to have something shortly.
>>
>> What I was hoping to do over the next few days was discuss with people how we
>> can (or whether we even should) fix the user land statistics tools to emit
>> machine readable output that can be easily parsed.
>>
>> Example: 'netstat -m' (hard to parse) versus 'vmstat -z | grep mbuf' (easy to parse).
>>
>> The idea of outputting XML is good. CSV is OK, but it is problematic in cases
>> like sysctl: if new nodes appear after the columns are fixed, we can't begin to
>> emit them, so we must either ignore them, abort, or log them to auxiliary
>> files. Anything that makes life easier is good.
>>
>> I should be able to share our scripts within the next couple of days.
>
> Just an alternative idea...
> I think gathering all this information via plugins to e.g. collectd could be
> more flexible and less processing / parsing intensive. That would avoid
> unnecessary formatting and re-parsing and let the data be stored in a
> convenient format. Ideally it would be great to have an umbrella library on top
> of sysctl, devstat, etc that would expose various stats in a convenient form.
> Another convenience would be the ability to know which sysctls are
> actually stats. I think that you have already done work towards this goal.
> There are certain heuristics that may help to distinguish stats from knobs,
> constants, etc, but an explicit "this is a metric" marker should be used. Of course,
> it would take a lot of work to properly mark all the sysctls.
>
> Just thinking out loud.
I'm going to bring these suggestions to my team and I think we can incorporate some of these ideas for sure.
-Alfred
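
A minimal sketch of the CSV problem discussed above, for illustration only. It
is not the iXsystems scripts, which are not shown in this thread. It assumes
each sample is a block of "key=value" lines, which is only a guess at what the
pseudo-CSV looks like; parse_sample, samples_to_csv and the key names in the
usage note are all hypothetical. The point is that key/value samples can carry
new nodes at any time, and the CSV header is built afterwards as the union of
every key seen.

```python
import csv
import io


def parse_sample(text):
    """Parse one sample of 'key=value' lines into a dict (assumed format)."""
    sample = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        key, _, value = line.partition("=")
        sample[key.strip()] = value.strip()
    return sample


def samples_to_csv(samples):
    """Return CSV text whose header is the union of all keys, in first-seen order."""
    columns = []
    for sample in samples:
        for key in sample:
            if key not in columns:
                columns.append(key)

    out = io.StringIO()
    # restval="" leaves a cell empty when a sample predates a newly seen key.
    writer = csv.DictWriter(out, fieldnames=columns, restval="",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(samples)
    return out.getvalue()
```

For example, given a first sample "time=0, mbufs.used=1023" and a second sample
"time=1, mbufs.used=1100, new.node=7" (one pair per line), the header becomes
time,mbufs.used,new.node and the first row simply has an empty last cell. The
cost is that every sample must be collected before the header can be written,
which is why this works as a post-processing step and not as streaming output.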