Coherent bus_dma for ARMv7
Zbigniew Bodek
zbb at semihalf.com
Mon Apr 3 14:59:12 UTC 2017
2017-04-03 16:37 GMT+02:00 Andrew Turner <andrew at fubar.geek.nz>:
>
> > On 3 Apr 2017, at 15:14, Zbigniew Bodek <zbb at semihalf.com> wrote:
> >
> > 2017-04-03 15:37 GMT+02:00 Andrew Turner <andrew at fubar.geek.nz>:
> >
> > > On 3 Apr 2017, at 14:16, Marcin Wojtas <mw at semihalf.com> wrote:
> > >
> > > Hi Adrian,
> > >
> > > Frankly we are not such experts in armv6 bus_dma, which looks more
> > > complicated than one in arm64, so we thought it's much safer not to mix
> > > the two solutions and leave for the user a single switch to decide,
> > > which one to pick. Afaik Andrew Turner did the opposite for arm64
> > > (implement non-coherent solution on top of coherent bus_dma), however
> > > I'm not sure if it's possible here in an easy way - there's also
> > > pretty significant risk of regression for all platforms. Please let me
> > > know your opinion. Do you think some sort of update of armv6 is
> > > doable?
> >
> > I don’t see any reason to think it would be difficult to add support for
> > coherent hardware to the existing armv6 busdma code. It’s mostly skipping
> > cache operations based on a flag in the DMA tag.
> >
> > Andrew
> >
> > Hello Andrew,
> >
> > I don't think anyone uses flags related to DMA coherency in
> > bus_dma_tag_create.
>
> The generic PCI and ThunderX PCIe PEM drivers do. The former based on the
> FDT dma-coherent flag.
>
In this particular example this will work as almost all (not all) devices
on ThunderX are PCIe devices. For most ARMv7-based SoCs this is not true.
We would need to create a coherent DMA tag for the top level buses and
ensure that this is propagated correctly down to the subordinate buses and
devices.
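
As a rough illustration of that propagation, here is a toy model in plain C.
It is not kernel code: the toy_* names and the flag value are invented for
this sketch, and it only mimics the idea that a child tag inherits the
coherent flag from its parent so that a sync can skip cache maintenance.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag for this sketch; not the kernel's BUS_DMA_COHERENT. */
#define TOY_DMA_COHERENT 0x1u

/* Hypothetical, heavily simplified stand-in for a DMA tag. */
struct toy_dma_tag {
	const struct toy_dma_tag *parent;
	unsigned int flags;
};

/*
 * Initialise a tag. The coherent flag is inherited from the parent, so a
 * driver that passes its parent's tag does not need to set it explicitly.
 */
static void
toy_tag_init(struct toy_dma_tag *tag, const struct toy_dma_tag *parent,
    unsigned int flags)
{
	tag->parent = parent;
	tag->flags = flags;
	if (parent != NULL)
		tag->flags |= parent->flags & TOY_DMA_COHERENT;
}

/* Cache maintenance is needed only when the tag is not coherent. */
static bool
toy_sync_needs_cache_ops(const struct toy_dma_tag *tag)
{
	return ((tag->flags & TOY_DMA_COHERENT) == 0);
}
```

With a coherent top-level tag, every tag derived from it reports that no
cache operations are needed. A chain whose top-level tag lacks the flag still
requires them, which is why every intermediate bus has to pass its parent's
tag along.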
>
> >
> > Nevertheless, for coherent platforms we want bus_dma to always map DMA
> > memory as normal WBWA regardless of the flags passed to create a bus_dma
> > map.
> > For example, we don't want to perform any synchronization and we want to
> > have cacheable memory regardless of whether the BUS_DMA_COHERENT flag is
> > used.
>
> That’s already the case on arm64, the only synchronisation used when the
> tag is created with BUS_DMA_COHERENT is a memory barrier.
>
For PCI.
>
> > Otherwise the performance improvement will apply only to those drivers
> > that dare to use the BUS_DMA_COHERENT flag, and very few of them do that.
> > In other words, what is the point of having coherent DMA if you do cache
> > maintenance anyway?
>
> The drivers should be getting the parent DMA tag and passing this to
> bus_dma_tag_create. If this was created with BUS_DMA_COHERENT it will pass
> this to the child tag. This is how the above PCI drivers work.
>
>
This basically makes sense to me if we do the same for all buses or once
for every platform. The question is how much additional code would have to
be added to busdma_machdep-v6.c to make it work on all relevant platforms,
because it is quite different from the ARM64 implementation.
Still, we can go with the ARM64 approach and add new DMA handling, parallel
to the existing one, then improve it over time to handle non-coherent DMA
and replace the old one with the new one when it is proven to be correct for
all platforms.
Kind regards
zbb