cvs commit: src/gnu/usr.bin/groff/tmac mdoc.local src/lib Makefile
src/lib/libpmc Makefile libpmc.c pmc.3 pmc.h src/share/doc/papers
Makefile src/share/doc/papers/hwpmc Makefile hwpmc.ms
src/share/examples/hwpmc README src/share/man/man4 Makefile ...
Andre Oppermann
andre at freebsd.org
Thu Apr 21 08:03:20 PDT 2005
Joseph Koshy wrote:
>
> al> I assume this is like a portable version of the measurement backend in
> al> Intel's VTune... at least I assume VTune does something like this
> al> itself.
>
> I have not actually used Intel's VTune or AMD's CodeAnalyst so
> please take my words with a pinch of salt.
>
> From reading the publicly available documentation, VTune's backend
> appears to do 'system-wide sampling'.
>
> Our backend can do system-wide measurements as well as per-process
> measurements (i.e., the counter hardware can be 'virtualized').
> Another difference is that we support 'counting' as well as 'sampling'.
>
> So four PMC usage styles are currently supported by our
> infrastructure:
>
> - process-private, counting
>
> o We could have a profiling runtime library that augments its
> data collection with data from the PMCs at function entry/exit.
>
> o Scientific applications could use this mode to measure hardware
> counts between two points of code. I believe the scientific
> community uses an API named "PAPI" for performance measurements.
> We should be able to support PAPI in -current now.
>
> - system-wide, counting
>
> o You could allocate system-wide, counting PMCs and read these
> once a minute. This operation would have near-zero overhead
> and could be used for collecting long-term data, say for making
> machine sizing decisions.
>
> - process-private, sampling
>
> o The standard 'profiling' function, with a couple of twists:
> you would not need to specially compile executables for
> profiling, and you could profile any process you could
> PMC_ATTACH a PMC to.
>
> - system-wide, sampling
>
> o This 'profiles' the whole system: applications, kernel and
> interrupt handlers.
>
> The current snapshot in -current has sampling modes turned off as
> they haven't been fully implemented.
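
[Illustrative sketch, not part of the original message.] The four usage
styles quoted above are the product of two independent choices: scope
(process-private or system-wide) and style (counting or sampling). The
small Python model below only enumerates that matrix and filters out the
sampling modes, which the message says are turned off in the current
snapshot. It does not call libpmc or the Python wrapper; the names Scope,
Style, usage_styles and enabled_styles are hypothetical and exist only
for this example.

```python
from enum import Enum
from itertools import product


class Scope(Enum):
    """Who the counter is bound to (hypothetical names for this sketch)."""
    PROCESS = "process-private"
    SYSTEM = "system-wide"


class Style(Enum):
    """How the counter is used (hypothetical names for this sketch)."""
    COUNTING = "counting"
    SAMPLING = "sampling"


def usage_styles():
    """Return all scope/style combinations as (Scope, Style) pairs."""
    return list(product(Scope, Style))


def enabled_styles(sampling_implemented=False):
    """Return the combinations usable in a given snapshot.

    The default mirrors the message: sampling modes are switched off
    until they are fully implemented.
    """
    return [
        (scope, style)
        for scope, style in usage_styles()
        if sampling_implemented or style is Style.COUNTING
    ]


def describe(pair):
    """Format a (Scope, Style) pair the way the message lists them."""
    scope, style = pair
    return f"{scope.value}, {style.value}"


if __name__ == "__main__":
    for pair in enabled_styles():
        print(describe(pair))
```

Run as a script, this prints the two counting styles; calling
enabled_styles(sampling_implemented=True) returns all four.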
How can I restrict measurements to a single kernel subsystem? I'd like
to profile the IP and TCP processing in the kernel.
--
Andre
> obrien> Every modern CPU has event counters. Some CPUs have as few as 2
> obrien> (Pentium Pro), others have 4 (Athlon64 and Opteron), I think IA-64 has
>
> The P4 has had the most complexity so far: 18 counters, 45 event-select
> registers and many, many restrictions about what works with what.
> Further, logical (HTT) cpus share PMC resources and some events
> change semantics if HTT is enabled (TS/TI events) :(.
>
> The userland library pmc(3) and the driver hwpmc(4) handle these
> issues for you.
>
> obrien> This PMC facility is much more similar to Linux's Oprofile than VTune or
> obrien> AMD's CodeAnalyst. It allows one to set and access the event counters.
>
> Linux has Oprofile (for system-wide sampling) and many separate
> 'counting' mode implementations (Perfctr, Rabbit, Lperfex, etc.).
>
> obrien> You will need to find the applicable CPU docs so you know what [public]
> obrien> events exist, and any "options" those events have.
>
> The PMC specific sections of pmc(3) list the events and allowed
> modifiers that our library understands.
>
> You would still need to read the CPU docs: some of the events
> measured by hardware only make sense in the context of a given CPU
> architecture.
>
> For folks who like Python, there is a Python wrapper around libpmc
> that makes it easy to play around with this functionality. You can
> pick it up at:
>
> http://people.freebsd.org/~jkoshy/projects/perf-measurement/pypmc.html
>
> Regards,
> Koshy
> <jkoshy at freebsd.org>