[Bug 267028] kernel panics when booting with both (zfs.ko or vboxnetflt.ko or acpi_wmi.ko) and amdgpu.ko
Date: Sun, 29 Dec 2024 02:45:37 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=267028

--- Comment #336 from Mark Millard <marklmi26-fbsd@yahoo.com> ---
(In reply to George Mitchell from comment #335)

For reference, the following goes backwards through the found_modules list (also using my extra recorded data). It shows pairs: a modlist_newmod_hist entry, then the prior node's link.tqe_next value, which should agree with the modAddr just printed.

(kgdb) print modlist_newmod_hist[modlist_newmod_hist_pos]
$48 = {modAddr = 0xfffff80004718180, containerAddr = 0xfffff8000362f300, modnameAddr = 0xffffffff82ea6025 "amdgpu_raven_vcn_bin_fw", version = 1}
(kgdb) print modlist_newmod_hist[modlist_newmod_hist_pos-1].modAddr->link.tqe_next
$49 = (struct modlist *) 0xfffff80004718180
(kgdb) print modlist_newmod_hist[modlist_newmod_hist_pos-1]
$50 = {modAddr = 0xfffff800047182c0, containerAddr = 0xfffff8000362f480, modnameAddr = 0xffffffff82e62026 "amdgpu_raven_mec2_bin_fw", version = 1}
(kgdb) print modlist_newmod_hist[modlist_newmod_hist_pos-2].modAddr->link.tqe_next
$51 = (struct modlist *) 0xfffff800047182c0
(kgdb) print modlist_newmod_hist[modlist_newmod_hist_pos-2]
$52 = {modAddr = 0xfffff80003647d40, containerAddr = 0xfffff80003169180, modnameAddr = 0xffffffff82e1e010 "amdgpu_raven_mec_bin_fw", version = 1}
(kgdb) print modlist_newmod_hist[modlist_newmod_hist_pos-3].modAddr->link.tqe_next
$53 = (struct modlist *) 0xfffff80003647d40
(kgdb) print modlist_newmod_hist[modlist_newmod_hist_pos-3]
$54 = {modAddr = 0xfffff80004718240, containerAddr = 0xfffff80003169300, modnameAddr = 0xffffffff82e12009 "amdgpu_raven_rlc_bin_fw", version = 1}
(kgdb) print modlist_newmod_hist[modlist_newmod_hist_pos-4].modAddr->link.tqe_next
$55 = (struct modlist *) 0xfffff80004718240
(kgdb) print modlist_newmod_hist[modlist_newmod_hist_pos-4]
$56 = {modAddr = 0xfffff800035bb8c0, containerAddr = 0xfffff80003169600, modnameAddr = 0xffffffff829f6010 "amdgpu_raven_ce_bin_fw", version = 1}
(kgdb) print modlist_newmod_hist[modlist_newmod_hist_pos-5].modAddr->link.tqe_next
$57 = (struct modlist *) 0xfffff80000000007
(kgdb) print modlist_newmod_hist[modlist_newmod_hist_pos-5]
$58 = {modAddr = 0xfffff80004b90140, containerAddr = 0xfffff80004c42000, modnameAddr = 0xffffffff829ef000 "amdgpu_raven_me_bin_fw", version = 1}
(kgdb) print modlist_newmod_hist[modlist_newmod_hist_pos-6].modAddr->link.tqe_next
$59 = (struct modlist *) 0xfffff80004b90140
(kgdb) print modlist_newmod_hist[modlist_newmod_hist_pos-6]
$60 = {modAddr = 0xfffff80004b90180, containerAddr = 0xfffff80004c42300, modnameAddr = 0xffffffff829e7025 "amdgpu_raven_pfp_bin_fw", version = 1}
(kgdb) print modlist_newmod_hist[modlist_newmod_hist_pos-7].modAddr->link.tqe_next
$61 = (struct modlist *) 0xfffff80004b90180

$57's (the modlist_newmod_hist_pos-5 node's) link.tqe_next does not agree with $56's modlist_newmod_hist[modlist_newmod_hist_pos-4].modAddr, again by having the value 0xfffff80000000007. Unfortunately, the code did not stop when 0xfffff80000000007 was stored into that link.tqe_next instance.
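For anyone wanting to experiment with the same style of check outside the kernel, below is a minimal, self-contained C sketch of the backward consistency walk that the kgdb prints above perform by hand. The structures are simplified stand-ins (not the real struct modlist from kern_linker.c); only the names modlist_newmod_hist and modlist_newmod_hist_pos match my instrumentation, and everything else is illustrative:

#include <stdio.h>
#include <stdint.h>

/* Simplified stand-in for the kernel's struct modlist: just the
 * TAILQ-style forward link plus a name for reporting. */
struct modlist {
        struct { struct modlist *tqe_next; } link;
        const char *modname;
};

/* One recorded entry per newly added module, mirroring my extra
 * recorded data (fields trimmed to what the check needs). */
struct newmod_hist_entry {
        struct modlist *modAddr;
        const char     *modnameAddr;
};

#define HIST_MAX 8
static struct newmod_hist_entry modlist_newmod_hist[HIST_MAX];
static int modlist_newmod_hist_pos = -1;

/* Walk backwards: each prior node's link.tqe_next should agree with
 * the modAddr recorded just after it.  Stop at the first mismatch. */
static void
check_history(void)
{
        for (int pos = modlist_newmod_hist_pos; pos > 0; pos--) {
                struct modlist *expected = modlist_newmod_hist[pos].modAddr;
                struct modlist *actual =
                    modlist_newmod_hist[pos - 1].modAddr->link.tqe_next;
                if (actual != expected) {
                        printf("mismatch before \"%s\": tqe_next %p != %p\n",
                            modlist_newmod_hist[pos].modnameAddr,
                            (void *)actual, (void *)expected);
                        return;
                }
        }
        printf("history consistent back to entry 0\n");
}

int
main(void)
{
        static struct modlist nodes[4] = {
                { .modname = "fw_a" }, { .modname = "fw_b" },
                { .modname = "fw_c" }, { .modname = "fw_d" },
        };

        /* Link the nodes and record them, oldest first. */
        for (int i = 0; i < 4; i++) {
                if (i > 0)
                        nodes[i - 1].link.tqe_next = &nodes[i];
                modlist_newmod_hist[++modlist_newmod_hist_pos] =
                    (struct newmod_hist_entry){ &nodes[i], nodes[i].modname };
        }

        /* Simulate the stray store that corrupted one tqe_next. */
        nodes[1].link.tqe_next = (struct modlist *)(uintptr_t)0x7;

        check_history();
        return (0);
}

With the simulated corruption in place, this reports the first disagreeing pair, analogous to the $56/$57 pair above.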
There is something unusual just before that mismatch in the core.9.txt (or, as named here, core.txt.9): I think it is the first time I have seen any "WARNING !drm_modeset_is_locked . . ." messages BEFORE the first part of the first trap(/panic?) report. In this example, it looks like:

<6>[drm] Initialized amdgpu 3.40.0 20150101 for drmn0 on minor 0
WARNING !drm_modeset_is_locked(&crtc->mutex) failed at /usr/ports/graphics/drm-510-kmod/work/drm-kmod-drm_v5.10.163_7/drivers/gpu/drm/drm_atomic_helper.c:619
. . .
WARNING !drm_modeset_is_locked(&plane->mutex) failed at /usr/ports/graphics/drm-510-kmod/work/drm-kmod-drm_v5.10.163_7/drivers/gpu/drm/drm_atomic_helper.c:894
kernel trap 22 with interrupts disabled
kernel trap 22 with interrupts disabled
panic: modlist_lookup: a prior tqe_next changed!
. . .

I wonder if that is some sort of consequence of my attempt to have the hardware monitor three 8-byte address ranges for writes. As it stands, I do not see how the results provide any specific additional useful evidence that I can identify. The only thing I have thought of is to add printf reporting of the address argument passed to each attempted db_hwatchpoint_cmd use, to help validate that the code is doing what I intended.
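For clarity about what I have in mind, something like the following at the start of db_hwatchpoint_cmd in sys/ddb/db_watch.c. This is only a sketch, assuming the usual DB_COMMAND-style signature (which may differ slightly across FreeBSD versions); the elided body is the stock code:

void
db_hwatchpoint_cmd(db_expr_t addr, bool have_addr, db_expr_t count,
    char *modif)
{
        /*
         * Added: report what address (and count) each attempted
         * hwatch use actually passed in, before anything else runs.
         */
        db_printf("db_hwatchpoint_cmd: addr=%#jx count=%jd have_addr=%d\n",
            (uintmax_t)addr, (intmax_t)count, (int)have_addr);

        /* ... existing hardware-watchpoint setup code, unchanged ... */
}

That should at least confirm whether the three 8-byte ranges I intended to monitor are the ones actually being handed to the hardware.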