cvs commit: src/gnu/usr.bin/groff/tmac mdoc.local src/lib Makefile src/lib/libpmc Makefile libpmc.c pmc.3 pmc.h src/share/doc/papers Makefile src/share/doc/papers/hwpmc Makefile hwpmc.ms src/share/examples/hwpmc README src/share/man/man4 Makefile ...

Andre Oppermann andre at freebsd.org
Thu Apr 21 08:03:20 PDT 2005


Joseph Koshy wrote:
> 
> al> I assume this is like a portable version of the measurement backend in
> al> Intel's VTune... at least I assume VTune does something like this
> al> itself.
> 
> I have not actually used Intel's VTune or AMD's CodeAnalyst so
> please take my words with a pinch of salt.
> 
> From reading the publicly available documentation, VTune's backend
> appears to do 'system-wide sampling'.
> 
> Our backend can do system-wide measurements as well as per-process
> measurements (i.e., the counter hardware can be 'virtualized').
> Another difference is that we support 'counting' as well as 'sampling'.
> 
> So four kinds of PMC usage styles are currently supported by our
> infrastructure:
> 
>   - process-private, counting
> 
>     o We could have a profiling runtime library that augments its
>       data collection with data from the PMCs at function entry/exit.
> 
>     o Scientific applications could use this mode to measure hardware
>       counts between two points of code.  I believe the scientific
>       community uses an API named "PAPI" for performance measurements.
>       We should be able to support PAPI in -current now.
> 
>   - system-wide, counting
> 
>     o You could allocate system-wide, counting PMCs and read these
>       once a minute.  This operation would have near-zero overhead
>       and could be used for collecting long-term data, say for making
>       machine sizing decisions.
> 
>   - process-private, sampling
> 
>     o The standard 'profiling' function, with a couple of twists:
>       you would not need to specially compile executables for
>       profiling, and you could profile any process you could
>       PMC_ATTACH a PMC to.
> 
>   - system-wide, sampling
> 
>     o This 'profiles' the whole system: applications, kernel and
>       interrupt handlers.
> 
> The current snapshot in -current has sampling modes turned off as
> they haven't been fully implemented.
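
A minimal sketch of the taxonomy above, plus the "count between two
points of code" idea from the process-private, counting case.  It is
written in Python only because a Python wrapper is mentioned in this
thread; it does not use libpmc or pypmc.  The names Scope, Mode,
usage_styles, CountBetween and read_counter are hypothetical, and the
counter is whatever callable you pass in.

```python
from enum import Enum
from itertools import product


class Scope(Enum):
    PROCESS = "process-private"
    SYSTEM = "system-wide"


class Mode(Enum):
    COUNTING = "counting"
    SAMPLING = "sampling"


def usage_styles():
    """Return the four (scope, mode) combinations listed in the mail."""
    return [f"{s.value}, {m.value}" for s, m in product(Scope, Mode)]


class CountBetween:
    """Measure a counter delta between two points of code.

    `read_counter` is a hypothetical callable returning the current value
    of an increasing counter; it is NOT a libpmc or pypmc call.
    """

    def __init__(self, read_counter, width_bits=64):
        self.read_counter = read_counter
        self.modulus = 1 << width_bits  # assumed counter width
        self.delta = None

    def __enter__(self):
        self.start = self.read_counter()
        return self

    def __exit__(self, exc_type, exc, tb):
        end = self.read_counter()
        # Modular subtraction tolerates a single wraparound of the counter.
        self.delta = (end - self.start) % self.modulus
        return False  # do not swallow exceptions from the measured code
```

Usage would look like "with CountBetween(reader) as c: work()" followed
by reading c.delta.  The same delta arithmetic applies to the
system-wide, counting case, where the two reads are a minute apart
rather than at function entry and exit.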

How can I limit measurements to a single kernel subsystem?  I'd like to
profile the IP and TCP processing in the kernel.

-- 
Andre


> obrien> Every modern CPU has event counters.  Some CPUs have as few as 2
> obrien> (Pentium Pro), others have 4 (Athlon64 and Opteron), I think IA-64 has
> 
> The P4 has had the most complexity so far: 18 counters, 45 event-select
> registers and many restrictions on what works with what.
> Further, logical (HTT) CPUs share PMC resources and some events
> change semantics if HTT is enabled (TS/TI events) :(.
> 
> The userland library pmc(3) and the driver hwpmc(4) handle these
> issues for you.
> 
> obrien> This PMC facility is much more similar to Linux's Oprofile than VTune or
> obrien> AMD's CodeAnalyst.  It allows one to set and access the event counters.
> 
> Linux has Oprofile (for system-wide sampling) and many separate
> 'counting' mode implementations (Perfctr, Rabbit, Lperfex, etc.).
> 
> obrien> You will need to find the applicable CPU docs so you know what [public]
> obrien> events exist, and any "options" those events have.
> 
> The PMC-specific sections of pmc(3) list the events and allowed
> modifiers that our library understands.
> 
> You would still need to read the CPU docs: some of the events
> measured by hardware only make sense in the context of a given CPU
> architecture.
> 
> For folks who like Python, there is a Python wrapper around libpmc
> that makes it easy to play around with this functionality.  You can
> pick it up at:
> 
>   http://people.freebsd.org/~jkoshy/projects/perf-measurement/pypmc.html
> 
> Regards,
> Koshy
> <jkoshy at freebsd.org>

