Partial cacheline flush problems on ARM and MIPS

Sun Aug 26 23:18:57 UTC 2012

On Aug 26, 2012, at 2:01 PM, Ian Lepore wrote:

> On Sun, 2012-08-26 at 11:53 -0700, Tim Kientzle wrote:
>> These rules sound reasonable.   Good documentation might
>> also give examples of what the PRE/POST operations might entail
>> (e.g., from the preceding discussion, it sounds like PREREAD
>> and PREWRITE require at least a partial cache flush on ARM).
>> That helps folks who are coming to the docs with some hardware
>> background.
>> 
> 
> I agree, I think it would be good to have something like a RATIONALE
> section in the manpage that summarizes the issues faced by various
> categories of platform (hardware coherency, software-assisted coherency,
> etc) and how they handle it.  You don't want people coding to the
> implementation (some of which is going on now, and I've been guilty of
> it myself); if there were more info in the docs there'd be less
> motivation to peek at the implementation.

More documentation is good.

>>>      * Read and write sync operators may be combined in a single call,
>>>       PRE and POST operators may not be.  E.G., PREREAD|PREWRITE is
>>>       allowed, PREREAD|POSTREAD is not.  We should note that while
>>>       read and write operations may be combined, on some platforms
>>>       PREREAD|PREWRITE is needlessly expensive when only a read is
>>>       being performed.
>> 
>> 
>> PREREAD|POSTREAD doesn't sound useful to me, but
>> why would it be explicitly forbidden?
>> 
>> Would you also forbid POSTREAD|PREWRITE?
>> (For a buffer that has just completed a DMA read
>> and is going to be immediately used for a DMA write?)
> 
> My thinking on forbidding PREREAD|POSTREAD is at least partly that it
> removes some temptation to do the wrong thing: treat the busdma API as
> if it were a general cpu cache manipulation library.  

I think it should be read as POSTREAD + PREREAD, not PREREAD + POSTREAD.

> With the new definition of the sequences, a PREREAD|POSTREAD operation
> is nonsensical because it leaves no window during which the DMA hardware
> has access to memory; it is in effect a no-op.  If you think in terms of
> implementation you might think "This would have to cause a cache
> invalidation."  If you think in terms of API you should be thinking
> something more like "This marks out a time window during which the DMA
> hardware has safe access to that memory which has a duration of zero."

Actually, if read the way I describe above, it would allow one to shift the DMA from one device to another.
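
One way to picture the POSTREAD + PREREAD reading is a small model of the access windows.  This is only a sketch under stated assumptions: the SYNC_* flags, struct dma_window and model_sync() are hypothetical stand-ins invented for illustration, not the real busdma definitions or any existing implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the busdma sync flags (illustrative values). */
enum {
	SYNC_PREREAD   = 1 << 0,	/* before the device updates memory */
	SYNC_POSTREAD  = 1 << 1,	/* after the device updated memory */
	SYNC_PREWRITE  = 1 << 2,	/* before the device reads memory */
	SYNC_POSTWRITE = 1 << 3,	/* after the device read memory */
};

/* Hypothetical model: which hardware access windows are open on a buffer. */
struct dma_window {
	bool read_open;
	bool write_open;
};

/*
 * Apply a combined op mask, handling POST operations before PRE ones.
 * Returns false on misuse: closing a window that is not open, or
 * opening one that is already open.
 */
static bool
model_sync(struct dma_window *w, int ops)
{
	assert(w != NULL);

	if (ops & SYNC_POSTREAD) {
		if (!w->read_open)
			return (false);
		w->read_open = false;
	}
	if (ops & SYNC_POSTWRITE) {
		if (!w->write_open)
			return (false);
		w->write_open = false;
	}
	if (ops & SYNC_PREREAD) {
		if (w->read_open)
			return (false);
		w->read_open = true;
	}
	if (ops & SYNC_PREWRITE) {
		if (w->write_open)
			return (false);
		w->write_open = true;
	}
	return (true);
}
```

In this model, model_sync(&w, SYNC_POSTREAD | SYNC_PREREAD) on a buffer whose read window is already open closes the first device's window and opens a fresh one for the second device.  If the PRE half were handled first, the same call would be rejected because the window is still open.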

> POSTREAD|PREWRITE is interesting.  It is not a no-op in terms of the
> API.  It closes the hardware access window that was opened by an earlier
> PREREAD, and it opens a new hardware access window with the PREWRITE.
> Whether it touches hardware or caches or anything in a given
> implementation isn't the point, in terms of the API it's the right thing
> to do for a pair of back to back DMA operations, without intervening CPU
> access, on the same memory.

Right.  This is useful in many cases.  Consider a log-structured device that reads data into memory, and then writes it back to the tail of the log as a defragging operation.
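
The sync sequence for that defragging case might look roughly like the following.  This is a sketch, not driver code: stub_sync() merely records the operations, where a real driver would call bus_dmamap_sync() with its tag and map, and the SYNC_* flags, struct trace and defrag_block() are hypothetical names made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the busdma sync flags (illustrative values). */
enum {
	SYNC_PREREAD   = 1 << 0,
	SYNC_POSTREAD  = 1 << 1,
	SYNC_PREWRITE  = 1 << 2,
	SYNC_POSTWRITE = 1 << 3,
};

#define	TRACE_MAX	8

/* Records the sync operations issued, in order. */
struct trace {
	int	ops[TRACE_MAX];
	size_t	n;
};

/* Stub that stands in for the real sync call; it only logs the op mask. */
static void
stub_sync(struct trace *t, int ops)
{
	assert(t != NULL && t->n < TRACE_MAX);
	t->ops[t->n++] = ops;
}

/* Hypothetical defrag step: read a block in, write it to the log tail. */
static void
defrag_block(struct trace *t)
{
	/* Open the window for the device to fill the buffer. */
	stub_sync(t, SYNC_PREREAD);
	/* ... start the DMA read of the old block and wait for it ... */

	/*
	 * No CPU access in between: close the read window and open the
	 * write window in a single call.
	 */
	stub_sync(t, SYNC_POSTREAD | SYNC_PREWRITE);
	/* ... start the DMA write to the tail of the log and wait ... */

	/* Close the write window; the CPU may touch the buffer again. */
	stub_sync(t, SYNC_POSTWRITE);
}
```

The middle call is the interesting one: the buffer goes straight from one DMA operation to the next, so a single combined call expresses the hand-off without implying anything about what a given platform does to its caches.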

> So PREREAD|POSTREAD and PREWRITE|POSTWRITE make no sense, but all other
> combos should be allowed.  Maybe instead of allowing and forbidding
> specific combos, we should just advise that these two are effectively
> no-ops.

I respectfully disagree.

Warner