Re: hdaa: uma_zalloc_debug: zone "malloc-{32,64}" with the following non-sleepable locks held
Date: Sun, 29 Dec 2024 16:32:23 UTC
On Sat, Dec 28, 2024 at 9:27 AM Mark Johnston <markj@freebsd.org> wrote:
> On Fri, Dec 27, 2024 at 08:30:37PM +0700, Yuri Pankov wrote:
> > Getting the following debug notifications:
> >
> > hdacc0: <ATI R6xx HDA CODEC> at cad 0 on hdac0
> > hdaa0: <ATI R6xx Audio Function Group> at nid 1 on hdacc0
> > uma_zalloc_debug: zone "malloc-32" with the following non-sleepable
> > locks held:
> > exclusive sleep mutex hdac0 (HDA driver mutex) r = 0
> > (0xfffff80107cb7aa0) locked @ /usr/src/sys/dev/sound/pci/hda/hdaa.c:1571
> > stack backtrace:
> > #0 0xffffffff80bcbbac at witness_debugger+0x6c
> > #1 0xffffffff80bccdc0 at witness_warn+0x430
> > #2 0xffffffff80f00974 at uma_zalloc_debug+0x34
> > #3 0xffffffff80f004c7 at uma_zalloc_arg+0x27
> > #4 0xffffffff80b26a7d at malloc+0x7d
> > #5 0xffffffff80b2737d at realloc+0xed
> > #6 0xffffffff80b27432 at reallocf+0x12
> > #7 0xffffffff80b9238d at devclass_add_device+0x1cd
> > #8 0xffffffff80b9093b at make_device+0x10b
> > #9 0xffffffff80b9077d at device_add_child_ordered+0x2d
> > #10 0xffffffff808b2b2c at hdaa_configure+0x485c
> > #11 0xffffffff808ac5b4 at hdaa_attach+0x544
> > #12 0xffffffff80b92b9b at device_attach+0x45b
> > #13 0xffffffff80b93f0a at bus_attach_children+0x4a
> > #14 0xffffffff808c51c0 at hdacc_attach+0x2f0
> > #15 0xffffffff80b92b9b at device_attach+0x45b
> > #16 0xffffffff80b93f0a at bus_attach_children+0x4a
> > #17 0xffffffff808c3e9d at hdac_attach2+0x35d
>
> I see this as well on a new system. I think this is fallout from commit
> f3d3c63442fff.
>
> At a glance, the hdaa lock in question can't trivially be made
> sleepable, as it's also used to lock a callout handler,
> hdaa_jack_poll_callback(), and the lock itself is shared with the parent
> hdac device.
>
> Until that's fixed somehow, I suspect we should restore the M_NOWAIT
> usage.
>

I think that's right.
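For anyone following along, the warning and the interim fix can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions, not kernel code: every `sketch_*` name and `grow_under_lock()` is hypothetical, the two flags merely stand in for M_WAITOK and M_NOWAIT from malloc(9), and the warning counter imitates the kind of complaint uma_zalloc_debug prints above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the malloc(9) flags; not the kernel values. */
enum sketch_flag { SKETCH_NOWAIT, SKETCH_WAITOK };

/* Models a non-sleepable mutex such as the HDA driver mutex. */
struct sketch_mtx {
	bool held;
};

/* Counts complaints, imitating the WITNESS-style warning in the report. */
static int sketch_warnings;

static void
sketch_lock(struct sketch_mtx *m)
{
	assert(!m->held);
	m->held = true;
}

static void
sketch_unlock(struct sketch_mtx *m)
{
	assert(m->held);
	m->held = false;
}

/*
 * A request that is allowed to sleep while a non-sleepable lock is held
 * gets flagged.  A NOWAIT request is never flagged, but in the real
 * kernel it may return NULL, so callers must check the result.
 */
static void *
sketch_realloc(void *p, size_t n, enum sketch_flag f,
    const struct sketch_mtx *m)
{
	if (f == SKETCH_WAITOK && m->held)
		sketch_warnings++;
	return (realloc(p, n));
}

/*
 * Grow a table while holding the mutex, as the attach path does in the
 * backtrace, and return how many warnings that raised.
 */
int
grow_under_lock(enum sketch_flag f)
{
	struct sketch_mtx m = { false };
	int before = sketch_warnings;
	void *table;

	sketch_lock(&m);
	table = sketch_realloc(NULL, 32, f, &m);
	sketch_unlock(&m);
	free(table);
	return (sketch_warnings - before);
}
```

In this model, calling grow_under_lock() with the WAITOK stand-in raises one warning and the NOWAIT stand-in raises none, which is the behaviour that restoring M_NOWAIT is meant to bring back. The cost is that the caller has to cope with allocation failure.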
One issue is that the driver does its own locking in attach, but since
nothing is competing for those resources yet at that point, the locking
may be misplaced (I've not looked in detail, though). I agree that
reverting this small part of the change is warranted until we can sort
out the other issues with newbus.

While I'd like to transition to a topo lock for newbus, I know all the
difficulties that CAM has had with that route. CAM does exist in a more
hostile environment for things coming and going, but I think that jumping
to some kind of epoch or SMR approach for lifetime management may be
better. I've not thought through it in detail, though. Ideally we'd do it
for newbus first, then move CAM's lifetime management onto the same
mechanism and radically simplify the code there, which is a twisty maze
of hacks to ensure things don't go away too soon when its reference
counting fails to cover some weird edge case.

Warner

> > pcm0: <ATI R6xx (HDMI)> at nid 3 on hdaa0
> > pcm1: <ATI R6xx (HDMI)> at nid 5 on hdaa0
> > pcm2: <ATI R6xx (HDMI)> at nid 7 on hdaa0
> > pcm3: <ATI R6xx (HDMI)> at nid 9 on hdaa0
> > hdacc1: <Realtek ALC888 HDA CODEC> at cad 0 on hdac1
> > hdaa1: <Realtek ALC888 Audio Function Group> at nid 1 on hdacc1
> > uma_zalloc_debug: zone "malloc-64" with the following non-sleepable
> > locks held:
> > exclusive sleep mutex hdac1 (HDA driver mutex) r = 0
> > (0xfffff80107cb7a40) locked @ /usr/src/sys/dev/sound/pci/hda/hdaa.c:1571
> > stack backtrace:
> > #0 0xffffffff80bcbbac at witness_debugger+0x6c
> > #1 0xffffffff80bccdc0 at witness_warn+0x430
> > #2 0xffffffff80f00974 at uma_zalloc_debug+0x34
> > #3 0xffffffff80f004c7 at uma_zalloc_arg+0x27
> > #4 0xffffffff80b26a7d at malloc+0x7d
> > #5 0xffffffff80b2737d at realloc+0xed
> > #6 0xffffffff80b27432 at reallocf+0x12
> > #7 0xffffffff80b9238d at devclass_add_device+0x1cd
> > #8 0xffffffff80b9093b at make_device+0x10b
> > #9 0xffffffff80b9077d at device_add_child_ordered+0x2d
> > #10 0xffffffff808b2b2c at hdaa_configure+0x485c
> > #11 0xffffffff808ac5b4 at hdaa_attach+0x544
> > #12 0xffffffff80b92b9b at device_attach+0x45b
> > #13 0xffffffff80b93f0a at bus_attach_children+0x4a
> > #14 0xffffffff808c51c0 at hdacc_attach+0x2f0
> > #15 0xffffffff80b92b9b at device_attach+0x45b
> > #16 0xffffffff80b93f0a at bus_attach_children+0x4a
> > #17 0xffffffff808c3e9d at hdac_attach2+0x35d
> > pcm4: <Realtek ALC888 (Rear Analog 5.1/2.0)> at nid 20,22,21 and 24,26
> > on hdaa1
> > pcm5: <Realtek ALC888 (Front Analog)> at nid 27 and 25 on hdaa1
> > pcm6: <Realtek ALC888 (Internal Digital)> at nid 17 and 31 on hdaa1
> > pcm7: <Realtek ALC888 (Rear Digital)> at nid 30 on hdaa1
> >
> > Devices in question:
> >
> > hdac0@pci0:17:0:1: class=0x040300 rev=0x00 hdr=0x00 vendor=0x1002
> > device=0x1640 subvendor=0x15d9 subdevice=0x1c97
> > vendor = 'Advanced Micro Devices, Inc. [AMD/ATI]'
> > device = 'Rembrandt Radeon High Definition Audio Controller'
> > class = multimedia
> > subclass = HDA
> > hdac1@pci0:17:0:6: class=0x040300 rev=0x00 hdr=0x00 vendor=0x1022
> > device=0x15e3 subvendor=0x15d9 subdevice=0x1c97
> > vendor = 'Advanced Micro Devices, Inc. [AMD]'
> > device = 'Family 17h/19h HD Audio Controller'
> > class = multimedia
> > subclass = HDA
> >
>