Re: Debugging a (potentially?) ZFS-related panic, and discussion about large patchsets

From: Mark Johnston <markj_at_freebsd.org>
Date: Mon, 10 Jan 2022 23:34:02 UTC

On Mon, Jan 10, 2022 at 05:11:16PM -0500, Shawn Webb wrote:
> Hey all,
> 
> So I'm getting an interesting ZFS-related kernel panic. I've uploaded
> the core.txt at [0]. I suspect it's related to FreeBSD commit
> 681ce946f33e75c590e97c53076e86dff1fe8f4a (zfs: merge
> openzfs/zfs@f291fa658 (master) into main).
> 
> I'm able to reproduce it on a single system with some level of
> determinism: I'm building the security appliance firmware at ${DAYJOB}
> in a bhyve VM that's backed by a zvol. The host is a Dell Precision
> 7540 laptop with a single NVMe drive in it. The VM is configured with
> a single zvol, booting with UEFI.
> 
> Looking at the commit email sent to dev-commits-src-all@, I see this:
> 146 files changed, 4933 insertions(+), 1572 deletions(-)
> 
> Strangely, when I run `git show
> 681ce946f33e75c590e97c53076e86dff1fe8f4a`, I only see a small subset
> of those changes.
 
That is a merge commit, so by default "git show" prints only a combined
diff rather than everything the merge brought in.  You need to ask for
a diff against the first parent (the preceding FreeBSD commit), so
something equivalent to "git diff --stat 681ce946f^ 681ce946f".  Use
"git log 681ce946f^2" to see the merged OpenZFS commits.
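
For illustration, here is a minimal sketch of the same two commands on
a throwaway repository.  The branch names ("main" standing in for
FreeBSD, "vendor" for OpenZFS), the file names and the commit messages
are all made up for the example; only the git invocations carry over.
It also uses the range form "M^1..M^2", which limits the log to commits
reachable from the second parent but not from the first.  That is an
optional narrowing, not something the commands above require.

```shell
#!/bin/sh
# Toy repository: "main" plays FreeBSD, "vendor" plays OpenZFS (hypothetical names).
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.email demo@example.org
git config user.name demo

echo base > base.txt
git add base.txt
git commit -q -m "base"

# Two commits on the side branch that will be merged.
git checkout -q -b vendor
echo one > a.txt
git add a.txt
git commit -q -m "vendor change 1"
echo two > b.txt
git add b.txt
git commit -q -m "vendor change 2"

# One unrelated commit on main, then the merge itself.
git checkout -q main
echo local > local.txt
git add local.txt
git commit -q -m "local change"
git merge -q --no-ff -m "merge vendor into main" vendor

merge=$(git rev-parse HEAD)

# Everything the merge brought in, relative to the first parent.
git diff --stat "$merge^" "$merge"

# Only the commits that arrived through the second parent.
git log --oneline "$merge^1..$merge^2"
```

Against the real tree the last command would read
"git log 681ce946f^1..681ce946f^2".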

> As a downstream consumer of 14-CURRENT, how am I supposed to even
> start debugging such a large patchset in any manner that respects my
> time?
>
> It seems to me that breaking up commits into smaller, bite-size chunks
> would make life easier for those experiencing bugs, especially ones
> that result in kernel panics.

That's up to the upstream project, in this case OpenZFS.

> ZFS in and of itself is a beast, and I've yet to study any of its
> code, so when there's a commit that large, even thinking about
> debugging it is a daunting task.
> 
> Needless to say, I'm going to need some hand holding here for
> debugging this. Anyone have any idea what's going on?

To start, you'll need to look at the stack trace for the thread with tid
100061.

> I guess this email is to serve three purposes:
> 
> 1. Report that a bug was introduced recently.
> 2. Ask for help in squashing the bug. I'm more than happy to test any
>    patches.
> 3. Start a dialogue on making life just a little easier for
>    downstreams.
> 
> [0]: https://hardenedbsd.org/~shawn/2022-01-10_zfs_core-r01.txt