CPU Cache and busdma usage in USB

Hans Petter Selasky hselasky at c2i.net
Sun Jul 5 10:36:48 UTC 2009


On Tuesday 30 June 2009 11:28:23 Hans Petter Selasky wrote:
> On Tuesday 30 June 2009 10:37:48 Piotr Zięcik wrote:
> > Monday 29 June 2009 15:16:56 Hans Petter Selasky napisał(a):
> > > You want to change the flags passed to bus_dmamap_sync() so that the
> > > flush/invalidate mapping gets right. I understand why your patch makes
> > > it work. That's not the problem.
> > >
> > > In "src/sys/arm/arm/busdma_machdep.c" there is a function called
> > > "_bus_dmamap_sync_bp()". If you look at that function you see that it
> > > only triggers on the "BUS_DMASYNC_PREWRITE" and "BUS_DMASYNC_POSTREAD"
> > > flags. After your patching only the PREXXXX flags are used, so if bouce
> > > pages are used on ARM and x86 and amd64 +++, then only
> > > BUS_DMASYNC_PREWRITE will do anything. This indicates that your patch
> > > is not fully correct.
>
> Hi,
>
> > That is true. I have missed "bounce page" case. I can change flags passed
> > to bus_dmamap_sync() to fix this on ARM, but this will break MIPS.
>
> Right, so there is inconsistency among the platforms in how the BUSDMA is
> implemented, which should be fixed.
>
> > This clearly shows the problem - using side effect of busdma to manage
> > CPU cache. Current USB implementation relies of this side effect assuming
> > that bus_dmamap_sync() with given flags will do cpu cache
> > flush/invalidation. This is not true even on i386 !
>
> i386 handles this differently.
>
> > This thread is not about my patch. It is only a fast and dirty hack
> > making USB working on platforms without hardware cache coherency
> > and showing the problem. I see two proper solutions:
>
> Ok.
>
> > 1. Implement usb_pc_cpu_invalidate() / usb_pc_cpu_flush() with
> > cpu_*() functions realizing requested operation.
> >
> > 2. Using busdma in proper way:
> >         [... prepare data to transfer ...]
> >         bus_dmamap_sync(..., PREREAD | PREWRITE);
> >         [... do transfer ...]
> >         bus_dmamap_sync(..., POSTREAD | POSTWRITE);
> >         [... check results ...]
> > as manpage says:
> > 	<cite>
> > 	If read and write operations are not preceded and followed by the
> > 	appropriate synchronization operations, behavior is undefined.
> > 	</cite>
> > Requirement cited above is currently not met in USB stack. Funtion
> > shown in http://fxr.watson.org/fxr/source/dev/usb/controller/ehci.c#L1294
> > is just an example of improper usage of bus_dmamap_sync(), which is
> > hidden under usb_pc_cpu_invalidate().
>
> I'm not violating any rules in busdma. The busdma manual is silent about
> doing multiple sync operations of the same kind in a row. If the sheeva
> hardware you are using works differently, you have to fix it.
>
> > This thread started from my question about general usage of
> > usb_pc_cpu_*() functions. So I am asking once again - why these functions
> > relies on side effect of busdma instead of simply doing cache
> > flush/invalidation through cpu_*() functions ?
>
> Because busdma is the interface for doing these kinds of operations and not
> cpu_xxx(). The DMA memory is allocated using busdma and loaded using
> busdma, and cpu_xxx() is not a part of the busdma interface.
>
> > > Grepping through the source code for ARM, I found a line like this:
> > > (...)
> > > You see that it only performs purge and no prior flush. This is what
> > > needs fixing! If semihalf could go through the "arm/arm/cpufunc.c" file
> > > and fix those flush and invalidate functions then many people would
> > > become happy!
> >
> > My developement platform is sheeva. CPU functions for this platform are
> > implemented correctly.
> >
> > > Again, it is not a problem in USB firstly, it is a problem in
> > > "arm/xxx".
> >
> > You are suggesting that the problem is in ARM busdma (and in MIPS busdma,
> > as on this platform USB also does not work). Could you give me example of
> > platform _without_ hardware coherency where new USB stack simply works ?
>
> If I find time later today I will fix the cache sync operations for
> AT91RM9200 which has a built in OHCI, which has been working fine with my
> new USB stack.
>
> Can you please add some prints to the "bus_dmamap_sync_buf()" code on ARM,
> and verify which cases the CPU is entering, without your patch and with
> your patch. From my analysis your patch will just change the execution
> path:
>
> With your patch it calls:
>
>                 cpu_dcache_wb_range((vm_offset_t)buf, len);
>                 cpu_l2cache_wb_range((vm_offset_t)buf, len);
>
> Without your patch it calls:
>
>                         cpu_dcache_wbinv_range((vm_offset_t)buf, len);
>                         cpu_l2cache_wbinv_range((vm_offset_t)buf, len);
>
> In my view these execution paths are equivalent with regard to clean or
> cache flush. They should both perform a cache flush, which is what the USB
> code wants to do, but obviously the latter case is not implemented
> correctly, most likely because it doesn't flush the buffer like expected!
>
> Try adding:
>
>                 cpu_dcache_wb_range((vm_offset_t)buf, len);
>                 cpu_l2cache_wb_range((vm_offset_t)buf, len);
>
> Before:
>
>                         cpu_dcache_wbinv_range((vm_offset_t)buf, len);
>                         cpu_l2cache_wbinv_range((vm_offset_t)buf, len);
>
> In: sys/arm/arm/busdma_machdep.c
>

Hi,

I did not find time yet to test on my AT91RM9200 board. I hope to do a test 
this week. Did you at semihalf find anything?

--HPS



More information about the freebsd-usb mailing list