Re: Debugging a (potentially?) ZFS-related panic, and discussion about large patchsets
- In reply to: Shawn Webb : "Re: Debugging a (potentially?) ZFS-related panic, and discussion about large patchsets"
Date: Tue, 11 Jan 2022 00:11:10 UTC
On Mon, Jan 10, 2022 at 07:10:23PM -0500, Shawn Webb wrote:
> On Tue, Jan 11, 2022 at 12:43:06AM +0100, Mateusz Guzik wrote:
> > On 1/11/22, Mark Johnston <markj@freebsd.org> wrote:
> > > On Mon, Jan 10, 2022 at 05:11:16PM -0500, Shawn Webb wrote:
> > >> Hey all,
> > >>
> > >> So I'm getting an interesting ZFS-related kernel panic. I've uploaded
> > >> the core.txt at [0]. I suspect it's related to FreeBSD commit
> > >> 681ce946f33e75c590e97c53076e86dff1fe8f4a (zfs: merge
> > >> openzfs/zfs@f291fa658 (master) into main).
> > >>
> > >> I'm able to reproduce it on a single system with some level of
> > >> determinism: I'm building the security appliance firmware at ${DAYJOB}
> > >> in a bhyve VM that's backed by a zvol. The host is a Dell Precision
> > >> 7540 laptop with a single NVMe drive in it. The VM is configured with
> > >> a single zvol, booting with UEFI.
> > >>
> > >> Looking at the commit email sent to dev-commits-src-all@, I see this:
> > >> 146 files changed, 4933 insertions(+), 1572 deletions(-)
> > >>
> > >> Strangely, when I run `git show
> > >> 681ce946f33e75c590e97c53076e86dff1fe8f4a`, I only see a small subset
> > >> of those changes.
> > >
> > > That is a merge commit. You need to specify that you want a diff
> > > against the first parent (the preceding FreeBSD), so something
> > > equivalent to "git diff --stat 681ce946f^ 681ce946f". Use
> > > "git log 681ce946f^2" to see the merged OpenZFS commits.
> > >
> > >> As a downstream consumer of 14-CURRENT, how am I supposed to even
> > >> start debugging such a large patchset in any manner that respects my
> > >> time?
> > >>
> > >> It seems to me that breaking up commits into smaller, bite-size chunks
> > >> would make life easier for those experiencing bugs, especially ones
> > >> that result in kernel panics.
> > >
> > > That's up to the upstream project, in this case OpenZFS.
> > >
> > >> ZFS in and of itself is a beast, and I've yet to study any of its
> > >> code, so when there's a commit that large, even thinking about
> > >> debugging it is a daunting task.
> > >>
> > >> Needless to say, I'm going to need some hand holding here for
> > >> debugging this. Anyone have any idea what's going on?
> > >
> > > To start, you'll need to look at the stack trace for the thread with tid
> > > 100061.
> > >
> >
> > imo the kernel should be patched to obtain the trace on its own. As
> > the target has interrupts disabled it will have to do it with NMI, but
> > support for that got scrapped in
> >
> > commit 1c29da02798d968eb874b86221333a56393a94c3
> > Author: Mark Johnston <markj@FreeBSD.org>
> > Date: Fri Jan 31 15:43:33 2020 +0000
> >
> > Reimplement stack capture of running threads on i386 and amd64.
>
> I guess it's especially problematic for laptop systems where dropping
> to the db> prompt isn't an option (nvidia driver on this laptop). I'd
> have to scrap the entire notion of a GUI, which kinda defeats the
> purpose of using a laptop.
>
> Plugging in a USB memstick and setting debug.trace_on_panic=0 is the
> route I usually take on such systems.

Sorry, wrong sysctl node. I meant to reference debug.debugger_on_panic.

--
Shawn Webb
Cofounder / Security Engineer
HardenedBSD

https://git.hardenedbsd.org/hardenedbsd/pubkeys/-/raw/master/Shawn_Webb/03A4CBEBB82EA5A67D9F3853FF2E67A277F8E1FA.pub.asc
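
For anyone following along, here is a minimal sketch of the merge-commit inspection suggested in the quoted reply, run against a throwaway repository instead of the FreeBSD tree. The repository, the branch name "vendor", the file names, and the commit messages are all made up for illustration. Only the "git diff --stat <merge>^ <merge>" and "git log <merge>^2" forms come from the thread. As far as I understand it, "git show" on a merge prints a combined diff by default, which leaves out files that match one of the parents, and that would be consistent with seeing only a small subset of the changes.

```shell
#!/bin/sh
# Throwaway repository with a single merge commit, purely for illustration.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "demo@example.invalid"   # placeholder identity
git config user.name "Demo"

echo base > base.txt
git add base.txt
git commit -q -m "base"
main_branch=$(git rev-parse --abbrev-ref HEAD)

# Hypothetical stand-in for the upstream history: two small commits.
git checkout -q -b vendor
echo one > one.txt; git add one.txt; git commit -q -m "upstream change 1"
echo two > two.txt; git add two.txt; git commit -q -m "upstream change 2"

# Bring them back with an explicit merge commit.
git checkout -q "$main_branch"
git merge -q --no-ff -m "merge vendor into main" vendor
merge=$(git rev-parse HEAD)

# Everything the merge brought in, relative to the first parent.
git diff --stat "$merge^" "$merge"

# The individual merged commits, reachable from the second parent.
git log --oneline "$merge^2"
```

On the real tree the same two commands would take 681ce946f in place of $merge, as written in the quoted reply.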