Performance of SheevaPlug on 8-stable

Mon Mar 8 00:27:13 UTC 2010

On Sun, Mar 07, 2010 at 03:25:41PM -0600, Mark Tinguely wrote:
> 
> FreeBSD-current has kernel and user witness turned on. Witness is for
> locks, so it should not change the performance of a tight arithmetic loop
> like this.

I have no kernel debugging enabled.
I have no malloc.conf on current, but I do have one on the 8.0-current
system, so malloc debugging is enabled on one machine. That shouldn't
hurt in this case, since the loop is not allocating anything.
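
For readers without the original test at hand: the benchmark itself is
not included in this message, so the following is only a minimal sketch
of the kind of loop under discussion, assuming plain integer arithmetic
with no allocation. The names arith_loop and time_arith_loop are
hypothetical, and clock() is used here simply because it is standard C.

```c
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical stand-in for the benchmark: integer arithmetic only,
 * no malloc and no system calls inside the loop. Unsigned wraparound
 * is well defined in C, so overflow is harmless here. */
uint32_t arith_loop(uint32_t iterations)
{
    uint32_t acc = 0;
    for (uint32_t i = 0; i < iterations; i++)
        acc = acc * 3u + i;
    return acc;
}

/* Returns the CPU seconds consumed by one run of the loop.
 * The result goes through a volatile variable and is printed,
 * so the compiler cannot discard the loop or move it past the
 * second clock() call. */
double time_arith_loop(uint32_t iterations)
{
    clock_t start = clock();
    volatile uint32_t sink = arith_loop(iterations);
    clock_t end = clock();

    printf("result=%u\n", (unsigned)sink);
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Calling time_arith_loop() with the same iteration count on both systems
and comparing the returned times would take malloc and kernel locking
out of the picture, since neither is touched inside the loop.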

> I don't know the Marvell internals, and from what I can tell, their
> technical docs require an NDA. That said, many of the ARM processors
> also have an internal instruction cache (instruction prefetch) in
> addition to the instruction cache. I don't think the prefetch has an
> enable/disable.
> 
> It looks from the CPU identification like branch prediction is
> turned on. Branch prediction compensates for the longer pipelines.
> I can't see how that could go astray in a tight loop.
> 
> Thus says the ARM ARM:
> 
> 	ARM implementations are free to choose how far ahead of the
> 	current point of execution they prefetch instructions; either
> 	a fixed or a dynamically varying number of instructions. As well
> 	as being free to choose how many instructions to prefetch, an ARM
> 	implementation can choose which possible future execution path to
> 	prefetch along. For example, after a branch instruction, it can
> 	choose to prefetch either the instruction following the branch
> 	or the instruction at the branch target. This is known as branch
> 	prediction.
> 
> There are a few data dangling allocations that I would like to see
> closed from the multiple kernel allocation fix. *IN THEORY, IF* a page
> is allocated via the arm_nocache (DMA COHERENT) or a sendfile, then
> it is never marked as unallocated. *IN THEORY*, if that page is used
> again, then we could falsely believe that page is being shared and
> we turn off the cache, even though it is not shared.
> 
> 	http://www.casselton.net/~tinguely/arm_pmap_unmanaged.diff
> 
> * Disclaimer: I am not sure whether DMA COHERENT or sendfile is used
> in the Sheeva implementation. This is a theoretical observation of a
> side effect of the multiple kernel mapping patch that we did just
> before FreeBSD 8-release.

-- 
B.Walter <bernd at bwct.de> http://www.bwct.de
Modbus/TCP Ethernet I/O modules, ARM-based FreeBSD computers, and much more.