Adding members to struct cpu_functions
M. Warner Losh
imp at bsdimp.com
Mon Oct 19 15:54:13 UTC 2009
In message: <05B19969-B238-4E3A-8326-624067F0362B at semihalf.com>
Rafal Jaworowski <raj at semihalf.com> writes:
: On 2009-10-18, at 17:49, Nathan Whitehorn wrote:
[[ trimmed ]]
: > I just did the measurements on a 1.8 GHz PowerPC G5. There were four
: > tests, each repeated 1 million times. "Load and store" involves
: > incrementing a volatile int from 0 to 1e6 inline. "Direct calls"
: > involves a branch to a function that returns 0 and does nothing
: > else. "Function ptr" calls the same function via a pointer stored in
: > a register, and "KOBJ calls" calls it via KOBJ. Here are the results
: > (errors are +/- 0.5 ns for the function call measurements due to
: > compiler optimization jitter, and 0 for load and store, since that
: > takes a deterministic number of clock cycles):
: >
: > 32-bit kernel:
: > Load and store: 26.1 ns
: > Direct calls: 7.2 ns
: > Function ptr: 8.4 ns
: > KOBJ calls: 17.8 ns
: >
: > 64-bit kernel:
: > Load and store: 9.2 ns
: > Direct calls: 6.1 ns
: > Function ptr: 8.3 ns
: > KOBJ calls: 40.5 ns
: >
: > ABI changes make a large difference, as you can see. The cost of
: > calling via KOBJ is non-negligible, but small, especially compared
: > to the cost of doing anything involving memory. I don't know how
: > this changes with ARM calling conventions.
:
: Very interesting, thanks! Could you elaborate on the testing details
: and share the diagnostic code so we could repeat this with other CPU
: variations like Book-E PowerPC, or ARM?
I'd love to see this on MIPS too...
KOBJ is a big win for device configuration, where one memory I/O can
take 60 times these call numbers...
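
For anyone who wants to repeat this on other CPUs, here is a minimal
userland sketch of the first three tests as described above. It is not
Nathan's actual diagnostic code, which has not been posted in this thread.
All names (empty_function, time_load_store, time_direct_call,
time_pointer_call, bench_report) are hypothetical, the noinline attribute
and the empty asm statements assume GCC or Clang, and each figure includes
loop overhead. The KOBJ case is left out because it needs a kernel build.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <time.h>

#define ITERATIONS 1000000L	/* 1 million repetitions, as in the quoted test */

/*
 * Returns 0 and does nothing else. noinline plus the empty asm statement
 * keep the compiler from inlining or deleting the call (GCC/Clang only).
 */
__attribute__((noinline)) int
empty_function(void)
{
	__asm__ volatile("");
	return (0);
}

/* Monotonic clock reading in nanoseconds. */
static double
now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (ts.tv_sec * 1e9 + ts.tv_nsec);
}

/* "Load and store": increment a volatile int n times, inline. */
double
time_load_store(long n)
{
	volatile int counter = 0;
	double start = now_ns();

	for (long i = 0; i < n; i++)
		counter++;
	return ((now_ns() - start) / n);
}

/* "Direct calls": plain call to the empty function. */
double
time_direct_call(long n)
{
	double start = now_ns();

	for (long i = 0; i < n; i++)
		(void)empty_function();
	return ((now_ns() - start) / n);
}

/* "Function ptr": the same function, called through a pointer. */
double
time_pointer_call(long n)
{
	int (*fp)(void) = empty_function;
	double start;

	/* Hide the pointer's value from the optimizer but let it stay in a register. */
	__asm__ volatile("" : "+r"(fp));
	start = now_ns();
	for (long i = 0; i < n; i++)
		(void)fp();
	return ((now_ns() - start) / n);
}

/* Print the average cost per iteration of each test. */
void
bench_report(void)
{
	printf("Load and store: %.1f ns\n", time_load_store(ITERATIONS));
	printf("Direct calls:   %.1f ns\n", time_direct_call(ITERATIONS));
	printf("Function ptr:   %.1f ns\n", time_pointer_call(ITERATIONS));
}
```

Call bench_report() from main() and build with optimization (for example
cc -O2). Userland timings will not match in-kernel ones, so treat the
output only as a rough comparison between the three cases.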
Warner