SDHCI_QUIRK_BROKEN_SDMA_BOUNDARY, maxphys, SDHCI_BLKSZ_SDMA_BNDRY_512K, and alloc_bounce_zone behavior [RPi5 example]

From: Mark Millard <marklmi_at_yahoo.com>
Date: Mon, 22 Jan 2024 06:55:58 UTC
I had previously reported on the freebsd-arm list that
on the RPi5 used via the EDK2 draft I was seeing:

# sysctl hw.busdma
hw.busdma.zone0.total_deferred_time: 0 0
hw.busdma.zone0.domain: 0
hw.busdma.zone0.alignment: 524288
hw.busdma.zone0.lowaddr: 0xffffffff
hw.busdma.zone0.total_deferred: 0
hw.busdma.zone0.total_bounced: 12018773
hw.busdma.zone0.active_bpages: 12
hw.busdma.zone0.reserved_bpages: 0
hw.busdma.zone0.free_bpages: 1227
hw.busdma.zone0.total_bpages: 1239
hw.busdma.total_bpages: 1239

Note the large alignment.

It turns out that the first alloc_bounce_zone call
on the RPi5 is for:

        if (!(slot->quirks & SDHCI_QUIRK_BROKEN_SDMA_BOUNDARY)) {
                if (maxphys <= 1024 * 4)
                        slot->sdma_boundary = SDHCI_BLKSZ_SDMA_BNDRY_4K;
                else if (maxphys <= 1024 * 8)
                        slot->sdma_boundary = SDHCI_BLKSZ_SDMA_BNDRY_8K;
                else if (maxphys <= 1024 * 16)
                        slot->sdma_boundary = SDHCI_BLKSZ_SDMA_BNDRY_16K;
                else if (maxphys <= 1024 * 32)
                        slot->sdma_boundary = SDHCI_BLKSZ_SDMA_BNDRY_32K;
                else if (maxphys <= 1024 * 64)
                        slot->sdma_boundary = SDHCI_BLKSZ_SDMA_BNDRY_64K;
                else if (maxphys <= 1024 * 128)
                        slot->sdma_boundary = SDHCI_BLKSZ_SDMA_BNDRY_128K;
                else if (maxphys <= 1024 * 256)
                        slot->sdma_boundary = SDHCI_BLKSZ_SDMA_BNDRY_256K;
                else
                        slot->sdma_boundary = SDHCI_BLKSZ_SDMA_BNDRY_512K;
        }
        slot->sdma_bbufsz = SDHCI_SDMA_BNDRY_TO_BBUFSZ(slot->sdma_boundary);

        /*
         * Allocate the DMA tag for an SDMA bounce buffer.
         * Note that the SDHCI specification doesn't state any alignment
         * constraint for the SDMA system address.  However, controllers
         * typically ignore the SDMA boundary bits in SDHCI_DMA_ADDRESS when
         * forming the actual address of data, requiring the SDMA buffer to
         * be aligned to the SDMA boundary.
         */
        err = bus_dma_tag_create(bus_get_dma_tag(slot->bus), slot->sdma_bbufsz,
            0, BUS_SPACE_MAXADDR_32BIT, BUS_SPACE_MAXADDR, NULL, NULL,
            slot->sdma_bbufsz, 1, slot->sdma_bbufsz, BUS_DMA_ALLOCNOW,
            NULL, NULL, &slot->dmatag);

That gives the alignment 524288 and lowaddr 0xffffffff
on the RPi5 used via the EDK2 draft. Only 8192 positions
with that alignment fit in the 32-bit address space that
lowaddr spans.
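
For illustration, the arithmetic behind those two figures can be
sketched as below. The EX_ names are stand-ins invented for this note,
and the 4096 << code form of the boundary-to-size conversion is an
assumption about what SDHCI_SDMA_BNDRY_TO_BBUFSZ does, not a copy of
sdhci.h.

```c
#include <stdint.h>

/*
 * Hypothetical stand-ins, not the sdhci.h definitions: assume the
 * 512K boundary code is 7 and that the bounce buffer size is
 * 4096 << code.
 */
#define	EX_SDMA_BNDRY_512K		7
#define	EX_SDMA_BNDRY_TO_BBUFSZ(b)	(UINT64_C(4096) << (b))

/*
 * Count the alignment-sized positions in the address range
 * [0, lowaddr].
 */
static uint64_t
ex_aligned_positions(uint64_t lowaddr, uint64_t alignment)
{
	return ((lowaddr + 1) / alignment);
}
```

Under those assumptions EX_SDMA_BNDRY_TO_BBUFSZ(EX_SDMA_BNDRY_512K)
is 524288 and ex_aligned_positions(0xffffffff, 524288) is 8192.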

alloc_bounce_zone does:

	. . .
	/* Check to see if we already have a suitable zone */
	STAILQ_FOREACH(bz, &bounce_zone_list, links) {
		if ((dmat_alignment(dmat) <= bz->alignment) &&
#ifdef dmat_domain    
		    dmat_domain(dmat) == bz->domain &&
#endif
		    (dmat_lowaddr(dmat) >= bz->lowaddr)) {
			dmat->bounce_zone = bz;
			return (0);
		}     
	}
	. . .
	bz->alignment = MAX(dmat_alignment(dmat), PAGE_SIZE);
	. . .

so later calls end up with:

dmat_alignment(dmat) <= bz->alignment  // actually < 524288
and:
dmat_lowaddr(dmat) >= bz->lowaddr      // actually > 0xffffffffu

So everything ends up using:

hw.busdma.zone0.alignment: 524288
hw.busdma.zone0.lowaddr: 0xffffffff

and no other busdma zone is created for the RPi5
EDK2 context.

One could imagine a smaller lowaddr, a larger alignment,
or both showing up in the first alloc_bounce_zone call.
The code is sensitive to the order in which the value
pairs happen to occur: just a reversed order could give
very different results, for example.
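
The ordering sensitivity can be shown with a small user-space model
of the matching loop. This is only a sketch: the ex_ types and names
are hypothetical, the domain check is left out, and nothing here is
the kernel code itself. It replays a list of (alignment, lowaddr)
requests in order, reusing the first zone that passes the same two
comparisons and otherwise creating a new zone with the PAGE_SIZE
minimum applied.

```c
#include <stddef.h>
#include <stdint.h>

#define	EX_PAGE_SIZE	UINT64_C(4096)	/* assumed page size */
#define	EX_MAX_ZONES	8		/* arbitrary cap for the model */

struct ex_pair {
	uint64_t alignment;
	uint64_t lowaddr;
};

/*
 * Replay the requests in order against the zone list, using the same
 * two comparisons as the quoted loop (domain check omitted).  The
 * zones array must have room for EX_MAX_ZONES entries.  Returns the
 * number of zones created.
 */
static size_t
ex_replay(const struct ex_pair *reqs, size_t nreqs, struct ex_pair *zones)
{
	size_t i, j, nzones = 0;

	for (i = 0; i < nreqs; i++) {
		/* Look for an existing zone that is "suitable". */
		for (j = 0; j < nzones; j++) {
			if (reqs[i].alignment <= zones[j].alignment &&
			    reqs[i].lowaddr >= zones[j].lowaddr)
				break;
		}
		if (j < nzones)
			continue;	/* reuse zone j */
		if (nzones == EX_MAX_ZONES)
			break;		/* model limit reached */
		/* No match: create a zone, clamping the alignment. */
		zones[nzones].alignment =
		    reqs[i].alignment > EX_PAGE_SIZE ?
		    reqs[i].alignment : EX_PAGE_SIZE;
		zones[nzones].lowaddr = reqs[i].lowaddr;
		nzones++;
	}
	return (nzones);
}
```

With the made-up request pairs (524288, 0xffffffff) followed by
(4096, 0xffffffffff), the model creates a single zone that both
share. With the same two pairs in the reverse order, it creates two
zones.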

I would expect that avoiding everything sharing a small
lowaddr value or huge alignment value (or a combination
of both) would be appropriate.

In other words, I'm suggesting that alloc_bounce_zone
probably should be adjusted.

I'll warn that there is another oddity: many calls to
alloc_bounce_zone have dmat_alignment(dmat) < PAGE_SIZE,
tiny even. But bz->alignment was forced to at least
PAGE_SIZE in the first call. The comparison is not
based on a common standard for the values on its 2
sides.



===
Mark Millard
marklmi at yahoo.com