cvs commit: src/sys/ia64/ia64 exception.S interrupt.c machdep.c
mp_machdep.c pmap.c trap.c vm_machdep.c src/sys/ia64/include
proc.h smp.h
Doug Rabson
dfr at nlsystems.com
Sun Aug 7 09:02:50 GMT 2005
Excellent! When trying to think about per-cpu VHPT in the past, I could
never quite see how to handle the collision chains sanely. The solution
described below seems ideal.
On Saturday 06 August 2005 21:28, Marcel Moolenaar wrote:
> marcel 2005-08-06 20:28:19 UTC
>
> FreeBSD src repository
>
> Modified files:
> sys/ia64/ia64 exception.S interrupt.c machdep.c
> mp_machdep.c pmap.c trap.c vm_machdep.c
> sys/ia64/include proc.h smp.h
> Log:
> Improve SMP support:
> o Allocate a VHPT per CPU. The VHPT is a hash table that the CPU
> uses to look up translations it can't find in the TLB. As such,
> the VHPT serves as a level 1 cache (the TLB being a level 0
> cache) and best results are obtained when it's not shared between
> CPUs. The collision chain (i.e. the hash bucket) is shared between
> CPUs, as all buckets together constitute our collection of PTEs. To
> achieve this, the collision chain does not point to the first PTE in
> the list anymore, but to a hash bucket head structure. The head
> structure contains the pointer to the first PTE in the list, as well
> as a mutex to lock the bucket. Thus, each bucket is locked
> independently of the others. With at least 1024 buckets in the VHPT,
> this provides for sufficiently fine-grained locking to make the
> solution scalable to large SMP machines.
> o Add synchronisation to the lazy FP context switching. We do this
> with a separate per-thread lock. On SMP machines the lazy high
> FP context switching without synchronisation caused inconsistent
> state, which resulted in a panic. Since the use of the high FP
> registers is not common, it's possible that races exist. The ia64
> package build has proven to be a good stress test, so this will get
> plenty of exercise in the near future.
> o Don't use the local ID of the processor we want to send the IPI
> to as the argument to ipi_send(). Use the struct pcpu pointer
> instead. The reason for this is that IPI delivery is unreliable. It
> has been observed that sending an IPI to a CPU causes it to receive a
> stray external interrupt. As such, we need a way to make the delivery
> reliable. The intended solution is to queue requests in the target
> CPU's per-CPU structure and use a single IPI to inform the CPU that
> there's a new entry in the queue. If that IPI gets lost, the CPU can
> check its queue at any convenient time (such as for each clock
> interrupt). This also allows us to send requests to a CPU without
> interrupting it, if such would be beneficial.
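
A rough user-space sketch of the bucket head idea from the first item, for illustration only. It is not the code from pmap.c: every name here (struct pte, struct bucket_head, bucket_insert and so on) is hypothetical, the hash is a plain modulo, and a C11 atomic_flag spin lock stands in for the kernel mutex mentioned in the log.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define NBUCKETS 1024           /* the log mentions at least 1024 buckets */

/* Hypothetical, heavily simplified PTE: a tag plus a chain link. */
struct pte {
    uint64_t    tag;
    uint64_t    value;
    struct pte *chain;          /* next PTE in the same bucket */
};

/* The collision chain points here instead of at the first PTE. */
struct bucket_head {
    struct pte *first;          /* first PTE in the list */
    atomic_flag lock;           /* stand-in for the per-bucket mutex */
};

static struct bucket_head buckets[NBUCKETS];

static void buckets_init(void)
{
    for (size_t i = 0; i < NBUCKETS; i++) {
        buckets[i].first = NULL;
        atomic_flag_clear(&buckets[i].lock);
    }
}

/* Assumed hash: modulo on the tag. The real hash is done by the CPU. */
static struct bucket_head *bucket_for(uint64_t tag)
{
    return &buckets[tag % NBUCKETS];
}

static void bucket_lock(struct bucket_head *b)
{
    while (atomic_flag_test_and_set_explicit(&b->lock, memory_order_acquire))
        ;                       /* spin until the flag was clear */
}

static void bucket_unlock(struct bucket_head *b)
{
    atomic_flag_clear_explicit(&b->lock, memory_order_release);
}

/* Link a PTE at the front of its bucket; only that bucket is locked. */
static void bucket_insert(struct pte *p)
{
    struct bucket_head *b = bucket_for(p->tag);

    bucket_lock(b);
    p->chain = b->first;
    b->first = p;
    bucket_unlock(b);
}

/* Walk one bucket's chain; returns NULL when the tag is not present. */
static struct pte *bucket_lookup(uint64_t tag)
{
    struct bucket_head *b = bucket_for(tag);
    struct pte *p;

    bucket_lock(b);
    for (p = b->first; p != NULL; p = p->chain)
        if (p->tag == tag)
            break;
    bucket_unlock(b);
    return p;
}
```

Two CPUs touching different buckets never contend on the same lock in this sketch, which is the property the log relies on for scalability.
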
>
> With these changes SMP is almost working. There are still some
> random process crashes, and the machine can hang when the IPI that
> deals with the high FP context switch is lost.
>
> The overhead of introducing the hash bucket head structure results
> in a performance degradation of about 1% for UP (extra pointer
> indirection). This is surprisingly small and is offset by gaining
> reasonably good, scalable SMP support.
>
> Revision Changes Path
> 1.57 +8 -0 src/sys/ia64/ia64/exception.S
> 1.50 +5 -0 src/sys/ia64/ia64/interrupt.c
> 1.201 +30 -13 src/sys/ia64/ia64/machdep.c
> 1.56 +29 -25 src/sys/ia64/ia64/mp_machdep.c
> 1.161 +227 -272 src/sys/ia64/ia64/pmap.c
> 1.114 +12 -7 src/sys/ia64/ia64/trap.c
> 1.91 +1 -0 src/sys/ia64/ia64/vm_machdep.c
> 1.15 +2 -1 src/sys/ia64/include/proc.h
> 1.10 +4 -2 src/sys/ia64/include/smp.h
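
The request queue that the log calls the intended solution for unreliable IPIs could look roughly like the following. This is a user-space sketch under stated assumptions, not committed code: struct pcpu_sketch, request_enqueue and request_drain are hypothetical names, requests are plain integers, the queue is a small fixed ring, and an atomic_flag spin lock stands in for whatever lock the kernel would use. Sending the IPI itself is left out; the point is that a lost IPI only delays the work until the next drain.

```c
#include <stdatomic.h>
#include <stddef.h>

#define QLEN 8                  /* assumed queue depth */

/* Hypothetical per-CPU structure holding pending requests. */
struct pcpu_sketch {
    atomic_flag lock;
    int         reqs[QLEN];
    size_t      head;           /* index of the next request to drain */
    size_t      tail;           /* index of the next free slot */
};

static void pcpu_init(struct pcpu_sketch *pc)
{
    atomic_flag_clear(&pc->lock);
    pc->head = 0;
    pc->tail = 0;
}

static void pcpu_lock(struct pcpu_sketch *pc)
{
    while (atomic_flag_test_and_set_explicit(&pc->lock, memory_order_acquire))
        ;                       /* spin until the flag was clear */
}

static void pcpu_unlock(struct pcpu_sketch *pc)
{
    atomic_flag_clear_explicit(&pc->lock, memory_order_release);
}

/*
 * Sender side: queue a request on the target CPU. The caller would then
 * send one IPI as a hint. Returns 0 on success, -1 if the queue is full.
 */
static int request_enqueue(struct pcpu_sketch *pc, int req)
{
    int rc = -1;

    pcpu_lock(pc);
    if (pc->tail - pc->head < QLEN) {
        pc->reqs[pc->tail % QLEN] = req;
        pc->tail++;
        rc = 0;
    }
    pcpu_unlock(pc);
    return rc;
}

/*
 * Target side: called from the IPI handler or from the clock interrupt.
 * Copies up to max pending requests into out and returns the count.
 */
static size_t request_drain(struct pcpu_sketch *pc, int *out, size_t max)
{
    size_t n = 0;

    pcpu_lock(pc);
    while (pc->head != pc->tail && n < max) {
        out[n++] = pc->reqs[pc->head % QLEN];
        pc->head++;
    }
    pcpu_unlock(pc);
    return n;
}
```

Because the target drains from the clock interrupt as well, a sender could also skip the IPI entirely when the request is not urgent, as the log suggests.
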