Re: Re: git: 718d1928f874 - main - LinuxKPI: make linux_alloc_pages() honor __GFP_NORETRY
Date: Tue, 25 Mar 2025 21:36:12 UTC
On Tue, 25 Mar 2025, Olivier Certner wrote: Hi Olivier, >> What is the drm code in question? ttm_pool_alloc -> ttm_pool_alloc_page()? >> As all other uses of __GFP_NORETRY in 6.1 (ignoring drm_printf.c) seem to be >> in i915. > > Yes, this is indeed targeted at ttm_pool_alloc_page() which tries to allocate contiguous pages to fill up the pool (but not in 5.10). TTM pools are used by the amdgpu driver to build its translation tables. Great; I think with two other work in progress bits the disabled fallback in there could be used again very soon but that'll be a discussion for another time and place (and likely people). > Calls to functions other than (linux_)alloc_pages*() are unaffected by the change, and if you dig through all the references of __GFP_NORETRY/GFP_RETRY (including those from files under 'selftests/'), you'll see the built GFP flags are never used with (linux_)alloc_pages*(), except for the only reference you mentioned. > >> Are you sure? >> >> i915_gem_object_get_pages_internal() in drm-6.1 at least seems to >> conditionally pass it down: >> >> 17 #define QUIET (__GFP_NORETRY | __GFP_NOWARN) >> ... >> 74 page = alloc_pages(gfp | (order ? QUIET : MAYFAIL), >> >> Seems it can deal with allocation failures, lowering order or returning >> -ENOMEM from the function so should be fine hopefully. > > Yes, I was aware of this piece of code, but obviously it cannot cause any problem. > > All calls to Linux's alloc_pages*() can fail *whatever* the passed GFP flags except for GFP_NOFAIL (and that's the only exception). Ah I see; that's where the M_WAITOK logic in there comes from. I was wondering about it given others like GFP_KERNEL also being mapped to that. Sometimes that makes me wonder if we should indeed pass the gfp_t all the way down to the actual function doing the alloc and only there filter and convert rather than doing that earlier in wrapper functions. > Callers always have to cope, and specifically when specifying __GFP_NORETRY it would be foolish not too (and that wouldn't be allowed in Linus' tree anyway). > > If it wasn't for that, i915_gem_object_get_pages_internal() does the same lowering that ttm_pool_alloc_page() does anyway, as you noticed. > > My sentence was indeed too strong, as I was still swapping in context for this work which was done months ago now. I know how that feels. I just started looking at something I've done a year+ ago completely outside my domain. > I reviewed all callers not only for GFP_NORETRY but also for most others GFP flags (I have tweaked grep files for all of them and over multiple Linux versions), as I started some work to document what the Linux guarantees/behaviors really are and then some other work to rationalize how we translate them in FreeBSD (there seems to be several possible improvements here). Unfortunately, I have stalled that last work for weeks now, and probably will for a significant while. > > Given Linux's contract on __GFP_NORETRY, it is arguably not reasonable to spend time compacting memory on such calls, that's a deviation from what drivers are supposed to expect. > > Oh, and the rest of the commit message also doesn't mention that I also tested this change on machines using the i915 driver, without observing any problem or change in behavior. Fantastic! Thanks you so much for the follow-up. It clarifies a lot and makes sense and is very helpful! Lots of joy, Bjoern -- Bjoern A. Zeeb r15:7