powerpc64/GENERIC64 use of dcbst vs. dcbf: is the dcbst use really okay? Anyone know?

Mark Millard markmi at dsl-only.net
Tue Sep 23 00:18:28 UTC 2014


Anyone know why the following is true in FreeBSD (10.1-BETA2, for example) for kernel vs. openfirmware transitions (in both directions) for powerpc64/GENERIC64? (And some other places are noted.) The issue is dcbst vs. dcbf instruction usage.

(I later quote from pem_64_bit_v3.0.2005jul2005.pdf (from IBM).)

Some context first...

Apple's published BootX-81 always saves and restores the Exception Vectors when going between openfirmware and the kernel: it maintains separate vectors for the two contexts. In addition, it carefully uses dcbf and icbi whether it copies to that area at address 0 or to a save area, and it follows that with isync. (And more: sync and eieio. Apple seems paranoid.)

Apple used dcbf instead of dcbst.

IBM writes of dcbst vs. dcbf:

> Instruction caches, if they exist, are not required to be consistent with data caches, memory, or I/O data transfers. Software must use the appropriate cache management instructions to ensure that instruction caches are kept coherent when instructions are modified by the processor or by input data transfer. When a processor alters a memory location that may be contained in an instruction cache, software must ensure that updates to memory are visible to the instruction fetching mechanism. Although the instructions to enforce consistency vary among implementations, the following sequence for a uniprocessor system is typical:
> 1. dcbst (update memory)
> 2. sync (wait for update)
> 3. icbi (invalidate copy in instruction cache)
> 4. isync (perform context synchronization)
> Note: Most operating systems will provide a system service for this function. These operations are necessary because the memory may be designated as write-back. Since instruction fetching may bypass the data cache, changes made to items in the data cache may not otherwise be reflected in memory until after the instruction fetch completes.
> For implementations used in multiprocessor systems, variations on this sequence may be recommended. For example, in a multiprocessor system with a unified instruction/data cache (at any level), if instructions are fetched without coherency being enforced, the preceding instruction sequence is inadequate. Because the icbi instruction does not invalidate blocks in a unified cache, a dcbf instruction should be used instead of a dcbst instruction for this case.
> 
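
To make the quoted sequence concrete, here is a minimal C sketch of a cache-line loop with the dcbst vs. dcbf choice as a parameter. It is an illustration only, not FreeBSD's __syncicache and not BootX's code. The names (CACHELINE, cache_lines_spanned, sync_icache_sketch, lines_pushed) are hypothetical, and the 128-byte line size is an assumption; real code would use the line size reported for the actual processor. On non-PowerPC hosts the instructions compile to nothing, so only the line arithmetic is exercised.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ASSUMPTION: 128-byte cache lines. Real code must query the CPU. */
#define CACHELINE 128u

/* Counts data-cache operations issued; illustration only. */
static unsigned long lines_pushed;

/* Number of cache lines overlapped by [addr, addr + len). */
static size_t cache_lines_spanned(uintptr_t addr, size_t len)
{
    if (len == 0)
        return 0;
    return (size_t)((addr + len - 1) / CACHELINE - addr / CACHELINE + 1);
}

/* Step 1 of the quoted sequence: dcbst, or dcbf as the variant. */
static void dcache_push(uintptr_t p, int use_dcbf)
{
#if defined(__powerpc__) || defined(__powerpc64__)
    if (use_dcbf)
        __asm__ volatile("dcbf 0,%0" : : "r"(p) : "memory");
    else
        __asm__ volatile("dcbst 0,%0" : : "r"(p) : "memory");
#else
    (void)p; (void)use_dcbf;   /* no-op on other hosts */
#endif
    lines_pushed++;
}

/* Step 3 of the quoted sequence: icbi. */
static void icache_inval(uintptr_t p)
{
#if defined(__powerpc__) || defined(__powerpc64__)
    __asm__ volatile("icbi 0,%0" : : "r"(p) : "memory");
#else
    (void)p;
#endif
}

/* Step 2 (sync) and step 4 (isync). */
static void barrier_sync(void)
{
#if defined(__powerpc__) || defined(__powerpc64__)
    __asm__ volatile("sync" : : : "memory");
#endif
}

static void barrier_isync(void)
{
#if defined(__powerpc__) || defined(__powerpc64__)
    __asm__ volatile("isync" : : : "memory");
#endif
}

/* Make instructions written to [from, from + len) visible to fetch. */
static void sync_icache_sketch(const void *from, size_t len, int use_dcbf)
{
    uintptr_t start = (uintptr_t)from & ~(uintptr_t)(CACHELINE - 1);
    size_t n = cache_lines_spanned((uintptr_t)from, len);
    size_t i;

    assert((CACHELINE & (CACHELINE - 1)) == 0); /* power of two */

    for (i = 0; i < n; i++)
        dcache_push(start + i * CACHELINE, use_dcbf);
    barrier_sync();
    for (i = 0; i < n; i++)
        icache_inval(start + i * CACHELINE);
    barrier_sync();
    barrier_isync();
}
```

Calling sync_icache_sketch(buf, len, 0) corresponds to the uniprocessor sequence IBM lists; passing 1 instead gives the dcbf variant that the quoted text recommends when a unified cache may be involved and that BootX uses.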

Then the point given that background information...

FreeBSD's powerpc64/GENERIC64 seems to have a mix of dcbst and dcbf use. The following have dcbst (unless patched separately at run time):

000000000086c1e8 <.agp_apple_unbind_page+0x60> dcbst   r0,r0

000000000086c27c <.agp_apple_bind_page+0x64> dcbst   r0,r0

00000000008b1b78 <.elf_reloc_internal+0x12c> dcbst   r0,r30

00000000008bcd30 <.__syncicache+0x38> dcbst   r0,r0

That last is used during the openfirmware vs. kernel transitions. The above are from "objdump -d --prefix-address /boot/kernel/kernel".

Is the dcbst use risky because of any unified caches at any level on any of the processors that powerpc64/GENERIC64 is supposed to handle?


===
Mark Millard
markmi at dsl-only.net



More information about the freebsd-ppc mailing list