[Bug 277476] graphics/drm-515-kmod: amdgpu periodic hangs due to phys contig allocations

From: <bugzilla-noreply_at_freebsd.org>
Date: Sat, 06 Apr 2024 14:22:49 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=277476

--- Comment #2 from Josef 'Jeff' Sipek <jeffpc@josefsipek.net> ---
I dug a bit more into this.  It looks like the drm code has provisions for
allocating memory via the DMA APIs, but the FreeBSD port doesn't implement
them.

Specifically, looking at drm-kmod-drm_v5.15.25_5 source:

drivers/gpu/drm/amd/amdgpu/gmc_v*.c sets adev->need_swiotlb to
drm_need_swiotlb(...).  drm_need_swiotlb is implemented in
drivers/gpu/drm/drm_cache.c as a 'return false' on FreeBSD.

Later on, amdgpu_ttm_init calls ttm_device_init with the use_dma_alloc
argument equal to adev->need_swiotlb (IOW, false).

Much later on, ttm_pool_alloc is called to allocate a buffer.  That in turn
calls ttm_pool_alloc_page which amounts to:

        if (!use_dma_alloc)
                return alloc_pages(...);

        panic("ttm_pool.c: use_dma_alloc not implemented");

So, because of the 'return false' during initialization, use_dma_alloc is
always false and we always call alloc_pages (a.k.a. linux_alloc_pages), which
tries to allocate physically contiguous memory.
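
To make the chain above easier to follow, here is a minimal user-space sketch
of how the flag propagates.  It is only a model of the description above, not
the drm-kmod code: every model_* name, both structs, and the enum are
hypothetical, and the dma_bits argument is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Which allocator the page-allocation step ends up on. */
enum alloc_path {
	PATH_ALLOC_PAGES,	/* alloc_pages(): physically contiguous */
	PATH_DMA_ALLOC		/* DMA API path; the port panics here */
};

/* Hypothetical stand-in for adev: only the flag discussed above. */
struct model_adev {
	bool need_swiotlb;
};

/* Hypothetical stand-in for the TTM pool state. */
struct model_pool {
	bool use_dma_alloc;
};

/* Models drm_need_swiotlb() as described for FreeBSD: always false. */
static bool
model_drm_need_swiotlb(int dma_bits)
{
	(void)dma_bits;		/* ignored in this model */
	return false;
}

/* Models gmc_v*.c setting adev->need_swiotlb. */
static void
model_gmc_init(struct model_adev *adev, int dma_bits)
{
	assert(adev != NULL);
	adev->need_swiotlb = model_drm_need_swiotlb(dma_bits);
}

/* Models amdgpu_ttm_init passing need_swiotlb on as use_dma_alloc. */
static void
model_ttm_init(struct model_pool *pool, const struct model_adev *adev)
{
	assert(pool != NULL && adev != NULL);
	pool->use_dma_alloc = adev->need_swiotlb;
}

/* Models the branch in ttm_pool_alloc_page. */
static enum alloc_path
model_pool_alloc_page(const struct model_pool *pool)
{
	if (!pool->use_dma_alloc)
		return PATH_ALLOC_PAGES;
	return PATH_DMA_ALLOC;
}

/* Runs the whole chain and reports which path an allocation takes. */
static enum alloc_path
model_path_for(int dma_bits)
{
	struct model_adev adev;
	struct model_pool pool;

	model_gmc_init(&adev, dma_bits);
	model_ttm_init(&pool, &adev);
	return model_pool_alloc_page(&pool);
}
```

Whatever value is passed to model_path_for(), the result in this model is
PATH_ALLOC_PAGES, which matches the observation that the DMA branch is never
reached.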

As I said before, I don't know anything about the graphics stack, so it is
possible that this dma API is completely irrelevant.


Looking at ttm_pool_alloc some more, it immediately turns the physically
contiguous allocation into an array of struct page pointers (tt->pages).
So, depending on how the rest of the module uses the buffer & pages, it
may be relatively easy to switch to a virtually-contiguous allocation.
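
As a rough illustration of that last point, the following user-space sketch
fills a per-page pointer array in two ways: by splitting one large block, and
by allocating each page separately.  It is a sketch under assumptions, not TTM
code: struct model_tt, the model_* functions, and MODEL_PAGE_SIZE are all
hypothetical, and malloc only stands in for the kernel allocators (user space
cannot express physical contiguity).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MODEL_PAGE_SIZE 4096

/* Hypothetical stand-in for the per-page pointer array (tt->pages). */
struct model_tt {
	size_t num_pages;
	void **pages;		/* one pointer per page */
	void *contig_base;	/* non-NULL if one block backs every page */
};

/* One large block, immediately split into per-page pointers. */
static int
model_tt_fill_contig(struct model_tt *tt, size_t num_pages)
{
	size_t i;

	tt->num_pages = num_pages;
	tt->pages = calloc(num_pages, sizeof(*tt->pages));
	tt->contig_base = malloc(num_pages * MODEL_PAGE_SIZE);
	if (tt->pages == NULL || tt->contig_base == NULL) {
		free(tt->pages);
		free(tt->contig_base);
		return -1;
	}
	for (i = 0; i < num_pages; i++)
		tt->pages[i] = (char *)tt->contig_base + i * MODEL_PAGE_SIZE;
	return 0;
}

/* Each page allocated on its own; no adjacency between pages assumed. */
static int
model_tt_fill_scattered(struct model_tt *tt, size_t num_pages)
{
	size_t i;

	tt->num_pages = num_pages;
	tt->contig_base = NULL;
	tt->pages = calloc(num_pages, sizeof(*tt->pages));
	if (tt->pages == NULL)
		return -1;
	for (i = 0; i < num_pages; i++) {
		tt->pages[i] = malloc(MODEL_PAGE_SIZE);
		if (tt->pages[i] == NULL) {
			while (i > 0)
				free(tt->pages[--i]);
			free(tt->pages);
			return -1;
		}
	}
	return 0;
}

/* A consumer that only goes through the array, never through a base. */
static unsigned
model_tt_touch(const struct model_tt *tt)
{
	unsigned sum = 0;
	size_t i;

	assert(tt->pages != NULL);
	for (i = 0; i < tt->num_pages; i++) {
		unsigned char *p = tt->pages[i];

		p[0] = (unsigned char)(i + 1);
		sum += p[0];
	}
	return sum;
}

/* Releases whichever backing layout was used, then the array itself. */
static void
model_tt_free(struct model_tt *tt)
{
	size_t i;

	if (tt->contig_base != NULL)
		free(tt->contig_base);
	else
		for (i = 0; i < tt->num_pages; i++)
			free(tt->pages[i]);
	free(tt->pages);
}
```

A consumer like model_tt_touch() behaves the same with either layout because
it only dereferences the array entries.  Any code that instead assumes
adjacent entries are adjacent in memory would break with the scattered
layout, which is the part that still needs checking in the real module.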
