Re: panic: nfsv4root ref cnt cpuid = 1

From: John F Carr <jfc_at_mit.edu>
Date: Tue, 24 Sep 2024 17:53:57 UTC

> On Sep 24, 2024, at 13:32, J David <j.david.lists@gmail.com> wrote:
> 
> On Mon, Sep 23, 2024 at 6:38 PM Rick Macklem <rick.macklem@gmail.com> wrote:
>> If you can easily get the source line# for nfsrpc_lookup+0x87f, that
>> could be helpful.
> 
> Sure. I did it via lldb and got nfs_clrpcops.c 1697. Your method gives
> the same result.
> 
> According to the github version of 14.1-RELEASE, that's the "if (ndp
> != NULL) {" after the call to nfscl_openrelease().
> 
> Per lldb, the actual instruction at that address is:
> 
> testq  %r13, %r13
> 
> My knowledge of amd64 assembler is nearly nil, but I *think* this
> corresponds to checking if ndp is null. And I think that %r13 is a
> register, so I'm not sure that could cause a page fault. Maybe the
> trace indicates that that's the line it would have come back to if
> something in nfscl_openrelease() hadn't gone wrong?
> 
> Thanks!
> 

The stack dump on the console tends to omit the frame that faulted.
The fault was probably in nfscl_openrelease or something it called.
The faulting instruction address 0xffffffff809da260 should be accurate.
The faulting data address 0x28 corresponds to the offset of field
nfsow_rwlock in struct nfsclowner.  Perhaps op->nfso_own is NULL in
nfscl_openrelease, so that &owp->nfsow_rwlock evaluates to 0x28 and
the fault is in one of the two function calls in this code block
around line 850 of nfs_clstate.c.

	owp = op->nfso_own;
	if (NFSHASONEOPENOWN(nmp))
		nfsv4_relref(&owp->nfsow_rwlock);
	else
		nfscl_lockunlock(&owp->nfsow_rwlock);
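
To illustrate why a small fault address like 0x28 points at a NULL
structure pointer, here is a minimal userland sketch.  The struct and
function names (demo_lock, demo_owner, demo_null_base_fault_addr) are
hypothetical stand-ins, and the layout is an assumption chosen so the
lock field lands at 0x28 on an LP64 platform such as amd64; it is not
copied from the kernel headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a small lock structure. */
struct demo_lock {
	uint32_t lockval;
	uint32_t usecnt;
};

/*
 * Hypothetical stand-in for an owner structure: four pointers and two
 * 32-bit counters precede the lock, so on LP64 the lock sits at
 * 4 * 8 + 2 * 4 = 40 = 0x28.
 */
struct demo_owner {
	void *list_next;
	void *list_prev;
	void *open_head;
	void *client;
	uint32_t seqid;
	uint32_t defunct;
	struct demo_lock rwlock;
};

/* Compile-time check that no padding precedes the lock field. */
static_assert(offsetof(struct demo_owner, rwlock) ==
    4 * sizeof(void *) + 2 * sizeof(uint32_t),
    "unexpected padding before rwlock");

/*
 * Address that a callee would dereference if it were handed
 * &owp->rwlock with owp == NULL: base address 0 plus the field offset.
 * Computed arithmetically, without dereferencing a NULL pointer.
 */
uintptr_t
demo_null_base_fault_addr(void)
{
	return ((uintptr_t)0 + offsetof(struct demo_owner, rwlock));
}
```

Taking the address of a field through a NULL pointer does not itself
trap; the page fault only happens when the callee reads or writes
through that address, which would be consistent with the fault landing
inside the called function rather than at the call site.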


John Carr