svn commit: r253877 - in projects/atomic64/sys: amd64/include i386/include
Bruce Evans
brde at optusnet.com.au
Fri Aug 2 13:52:13 UTC 2013
On Fri, 2 Aug 2013, Jung-uk Kim wrote:
> Log:
> Reimplement atomic operations on PDEs and PTEs in pmap.h. This change
> significantly reduces duplicate code. Also, it may improve and even correct
> some questionable implementations.
Do they all (or any) need to be atomic with respect to multiple CPUs?
It's hard to see how concurrent accesses to page tables can work worh
without higher-level locking than is provided by atomic ops.
> Modified: projects/atomic64/sys/amd64/include/pmap.h
> ==============================================================================
> --- projects/atomic64/sys/amd64/include/pmap.h Fri Aug 2 00:08:00 2013 (r253876)
> +++ projects/atomic64/sys/amd64/include/pmap.h Fri Aug 2 00:20:04 2013 (r253877)
> @@ -185,41 +185,13 @@ extern u_int64_t KPML4phys; /* physical
> pt_entry_t *vtopte(vm_offset_t);
> #define vtophys(va) pmap_kextract(((vm_offset_t) (va)))
>
> -static __inline pt_entry_t
> -pte_load(pt_entry_t *ptep)
> -{
> - pt_entry_t r;
> -
> - r = *ptep;
> - return (r);
> -}
This function wasn't atomic with respect to multiple CPUs. Except on
i386 with PAE, but then it changes a 64-bit object on a 32-bit CPU,
so it needs some locking just to be atomic with respect to a single CPU.
> -static __inline pt_entry_t
> -pte_load_store(pt_entry_t *ptep, pt_entry_t pte)
> -{
> - pt_entry_t r;
> -
> - __asm __volatile(
> - "xchgq %0,%1"
> - : "=m" (*ptep),
> - "=r" (r)
> - : "1" (pte),
> - "m" (*ptep));
> - return (r);
> -}
This was the main one that was atomic with respect to multiple CPUs on
both amd64 and i386. This seems to be accidental -- xchg to memory gives
a lock prefix and slowness whether you want it or not.
> -
> -#define pte_load_clear(pte) atomic_readandclear_long(pte)
> -
> -static __inline void
> -pte_store(pt_entry_t *ptep, pt_entry_t pte)
> -{
> +#define pte_load(ptep) atomic_load_acq_long(ptep)
> +#define pte_load_store(ptep, pte) atomic_swap_long(ptep, pte)
> +#define pte_load_clear(pte) atomic_swap_long(pte, 0)
> +#define pte_store(ptep, pte) atomic_store_rel_long(ptep, pte)
> +#define pte_clear(ptep) atomic_store_rel_long(ptep, 0)
>
> - *ptep = pte;
> -}
pte_store() was also not atomic with respect to multiple CPUs. So almost
everything was not atomic with respect to multiple CPUs, except for PAE
on i386.
Bruce
More information about the svn-src-projects
mailing list