From: bugzilla-noreply@freebsd.org
To: bugs@FreeBSD.org
Subject: [Bug 267028] kernel panics when booting with both (zfs,ko or vboxnetflt,ko or acpi_wmi.ko) and amdgpu.ko
Date: Sat, 21 Dec 2024 16:28:27 +0000
X-Bugzilla-Reason: AssignedTo
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: Base System
X-Bugzilla-Component: kern
X-Bugzilla-Version: 13.1-RELEASE
X-Bugzilla-Keywords: crash, needs-qa
X-Bugzilla-Severity: Affects Only Me
X-Bugzilla-Who: marklmi26-fbsd@yahoo.com
X-Bugzilla-Status: Open
X-Bugzilla-Priority: ---
X-Bugzilla-Assigned-To: bugs@FreeBSD.org
X-Bugzilla-Flags: maintainer-feedback?
X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/
List-Archive: https://lists.freebsd.org/archives/freebsd-bugs
Sender: owner-freebsd-bugs@FreeBSD.org

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=267028

--- Comment #260 from Mark Millard ---
(In reply to satanist+freebsd from comment #259)

In:

        mod = malloc(sizeof(struct modlist), M_LINKER, M_NOWAIT | M_ZERO);
        if (mod == NULL)
                panic("no memory for module list");
        mod->container = container;

if something similar to mod == 0xfffff80000000007 resulted, it appears
to me that the dereference in mod->container or the like would have
gotten a general protection fault, given the later actual failure that
sometimes happens because of the 0xfffff80000000007 that sometimes
happens.

I'll note also that, for example, one of the historical crashes
involving 0xfffff80000000007 was in handling a different list:

/*
 * Remove the references to the thread from all of the objects we were
 * polling.
 */
static void
seltdclear(struct thread *td)
{
        struct seltd *stp;
        struct selfd *sfp;
        struct selfd *sfn;

        stp = td->td_sel;
        STAILQ_FOREACH_SAFE(sfp, &stp->st_selq, sf_link, sfn)
                selfdfree(stp, sfp);
        stp->st_flags = 0;
}

so the issue does not appear to be list specific, even if one list is
more common for failing than others for some reason.

I do not know if there is some relevant relationship with the likes of
code from:

drm-kmod/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c

for alternate failure points.

No simple reproduction test has ever been discovered.
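Regarding the 0xfffff80000000007 value discussed above: one thing that
makes it stand out as a pointer is that it is an odd address, while a
pointer to a structure that itself holds pointers would normally be at
least 8-byte aligned on amd64. Below is a small user-space sketch of
that observation only. It is not kernel code, it does not reproduce or
diagnose the panic, and addr_is_aligned() and addr_low_bits() are
hypothetical names made up for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a FreeBSD kernel function): report whether
 * an address value is a multiple of align. align must be a power of two.
 */
static bool
addr_is_aligned(uint64_t addr, uint64_t align)
{
        assert(align != 0 && (align & (align - 1)) == 0);
        return ((addr & (align - 1)) == 0);
}

/*
 * Hypothetical helper: return the low-order bits of the address that
 * fall below the given power-of-two alignment.
 */
static uint64_t
addr_low_bits(uint64_t addr, uint64_t align)
{
        assert(align != 0 && (align & (align - 1)) == 0);
        return (addr & (align - 1));
}
```

For example, addr_is_aligned(0xfffff80000000007ULL, 8) is false and
addr_low_bits(0xfffff80000000007ULL, 8) is 7, so the value cannot be an
8-byte-aligned structure pointer, whatever produced it.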

MALLOC_DEBUG is controlled in the kernel via sys/kern/kern_malloc.c
having the code:

#if defined(INVARIANTS) || defined(MALLOC_MAKE_FAILURES) || \
    defined(DEBUG_MEMGUARD) || defined(DEBUG_REDZONE)
#define MALLOC_DEBUG    1
#endif

It, in turn, leads to the definition and use of the kernel's
malloc_dbg() and free_dbg().

I certainly have no objection to such testing, say via using an
INVARIANTS based kernel build. But I'm not testing, as I have no
context in which to reproduce the problem. I'm just looking at
vmcore.* file(s) via kgdb.

But I'll also note that recently we appear to have learned that some
of the software in use was rather old and not being updated, so it was
not tracking kernel updates. Testing whether the modern software, built
to match the kernel in use, also produces the problems seems
appropriate, as that is what would be changed if there is still a bug
to be fixed. As I understand it, that testing is what is going on now.