Profiling C application
Scott Bennett
bennett at sdf.org
Mon Nov 30 12:32:09 UTC 2020
Jan Beich <jbeich at FreeBSD.org> wrote:
> Ihor Antonov <ihor at antonovs.family> writes:
>
> > Hello folks,
> >
> > What can I use on FreeBSD to profile a slow application?
> >
> > I have a source code of the application and I can rebuild it if necessary.
> >
> > I want to see where (in what functions) application spends most of its
> > time - amount of function calls and duration of each call (per call)
> >
> > What tools can you recommend to do this?
>
> hwpmc (pmcstat, pmcannotate) and dtrace, see examples at the end.
> There're probably other ways, so better ask on hackers@ list.
>
> # Load kernel module and check available counters. Aliases like
> # "instructions" or "unhalted-cycles" may not be available.
> $ kldload hwpmc
> $ pmccontrol -L | grep \\<inst
> inst_retired.any
> inst_retired.any_p
> inst_retired.prec_dist
> inst_retired.total_cycles_ps
>
> # Interactive use
> $ pmcstat -t firefox -P inst_retired.any -T
> PMC: [inst_retired.any] Samples: 287408 (100.0%) , 96792 unresolved
> Key: => pause => press space again to continue
> %SAMP IMAGE FUNCTION CALLERS
> 3.0 libxul.so _ZN2sh13ResourcesHLSL14unifor _ZN2sh10OutputHLSL9visitLoopE:1.9 ...
> 1.4 libxul.so _ZN7mozilla19TrackBuffersMana _ZN7mozilla10OggDemuxer13Seek
> 1.4 libavcodec encode_thread mp_set_zero_yuv
> 1.4 libxul.so _ZN2sh10OutputHLSL9visitLoopE _ZN2sh10OutputHLSL18visitCons
> 1.0 libxul.so _ZN7mozilla3dom12FetchRequest _ZN7mozilla26RemoteLazyInputS
> 1.0 libavcodec epzs_motion_search2 epzs_motion_search2
> 0.9 kernel cpu_search_highest cpu_search_highest
> 0.7 libxul.so _ZN2sh10OutputHLSL22visitFunc _ZN2sh13TParseContext16checkC
> 0.7 libxul.so _ZN7mozilla3dom16HTMLMediaEle ...
> 0.6 libc.so.7 regexec
> [...]
>
> # Capture on disk
> $ pmcstat -t firefox -P inst_retired.any -O firefox.pmc
>
> # Flamegraph
> $ git clone https://github.com/brendangregg/FlameGraph
> $ pmcstat -z16 -R firefox.pmc -G firefox.graph
> $ FlameGraph/stackcollapse-pmc.pl firefox.graph > firefox.stack
> $ FlameGraph/flamegraph.pl firefox.stack > firefox.svg
> $ firefox firefox.svg
>
> # gprof-like (no need to build with -pg, regular non-debug symbols will do)
> $ pmcstat -R firefox.pmc -g
> $ gprof $(which firefox) inst_retired.any/firefox.gmon
> Flat profile:
>
> Each sample counts as 0.00787402 seconds.
> % cumulative self self total
> time seconds seconds calls ms/call ms/call name
> 28.42 14.67 14.67 (anonymous namespace)::LoadELF(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, (anonymous namespace)::MmapWrapper*, void**)
> 17.21 23.55 8.88 mozilla::PrintfTarget::vprint(char const*, __va_list_tag*)
> 11.03 29.24 5.69 643 8.85 8.85 mozilla::JSONWriter::Scalar(mozilla::Span<char const, 18446744073709551615ul> const&, mozilla::Span<char const, 18446744073709551615ul> const&)
> 9.55 34.17 4.93 lul::ReadSymbolDataInternal(unsigned char const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&, lul::SecMap*, void*, unsigned long, lul::UniqueStringUniverse*, void (*)(char const*))
> [...]
>
> $ gprof /usr/local/lib/libavcodec.so inst_retired.any/libavcodec.so.*.gmon
> Flat profile:
>
> Each sample counts as 0.00787402 seconds.
> % cumulative self self total
> time seconds seconds calls ms/call ms/call name
> 29.63 2516.15 2516.15 341629 7.37 7.69 encode_thread
> 13.33 3647.98 1131.83 498281 2.27 3.57 epzs_motion_search2
> 8.02 4329.25 681.28 decode_chunks
> 6.87 4912.72 583.46 mpc8_decode_frame
> [...]
>
> $ gprof /boot/modules/i915kms.ko inst_retired.any/i915kms.ko.gmon
> Flat profile:
>
> Each sample counts as 0.00787402 seconds.
> % cumulative self self total
> time seconds seconds calls ms/call ms/call name
> 13.56 1.73 1.73 eb_lookup_vmas
> 11.78 3.24 1.50 57 26.38 26.38 i915_active_ref
> 9.68 4.47 1.24 26 47.55 47.85 i915_request_await_object
> 6.29 5.28 0.80 i915_gem_do_execbuffer
> 4.62 5.87 0.59 __i915_vma_active
> [...]
Nice examples! Thank you for posting them. I will definitely save
these for future reference.
Scott Bennett, Comm. ASMELG, CFIAG
**********************************************************************
* Internet: bennett at sdf.org *xor* bennett at freeshell.org *
*--------------------------------------------------------------------*
* "A well regulated and disciplined militia, is at all times a good *
* objection to the introduction of that bane of all free governments *
* -- a standing army." *
* -- Gov. John Hancock, New York Journal, 28 January 1790 *
**********************************************************************
More information about the freebsd-questions
mailing list