[Bug 283965] nfs: page fault during nfsrpc_readdir

From: <bugzilla-noreply_at_freebsd.org>
Date: Thu, 09 Jan 2025 20:15:09 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=283965

            Bug ID: 283965
           Summary: nfs: page fault during nfsrpc_readdir
           Product: Base System
           Version: 15.0-CURRENT
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: asomers@FreeBSD.org

PROBLEM
=======

The NFS client page faults when the underlying file system returns a bad
readdir response.

First it prints to dmesg:
Readdir reply file name had imbedded / or nul by

Then it panics with a page fault:

panic: page fault
cpuid = 5
time = 1736452691
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00dc52c200
vpanic() at vpanic+0x136/frame 0xfffffe00dc52c330
panic() at panic+0x43/frame 0xfffffe00dc52c390
trap_pfault() at trap_pfault+0x466/frame 0xfffffe00dc52c400
calltrap() at calltrap+0x8/frame 0xfffffe00dc52c400
--- trap 0xc, rip = 0xffffffff80a19014, rsp = 0xfffffe00dc52c4d0, rbp = 0xfffffe00dc52c7f0 ---
nfsrpc_readdir() at nfsrpc_readdir+0xbd4/frame 0xfffffe00dc52c7f0
ncl_readdirrpc() at ncl_readdirrpc+0xf0/frame 0xfffffe00dc52c940
ncl_doio() at ncl_doio+0x4bb/frame 0xfffffe00dc52c9e0
ncl_bioread() at ncl_bioread+0x5f4/frame 0xfffffe00dc52cb70
nfs_readdir() at nfs_readdir+0x1d8/frame 0xfffffe00dc52cc80
vop_sigdefer() at vop_sigdefer+0x30/frame 0xfffffe00dc52ccb0
VOP_READDIR_APV() at VOP_READDIR_APV+0x32/frame 0xfffffe00dc52ccd0
kern_getdirentries() at kern_getdirentries+0x1d7/frame 0xfffffe00dc52cdd0
sys_getdirentries() at sys_getdirentries+0x29/frame 0xfffffe00dc52ce00
amd64_syscall() at amd64_syscall+0x158/frame 0xfffffe00dc52cf30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe00dc52cf30

Examining the core file shows that the value of tl2 is 0x4 in
nfs_clrpcops.c:3768, at this line:

                            *tl2++ = cookiep->nfsuquad[0] = cookie.lval[0] =
                                ncookie.lval[0];


STEPS TO REPRODUCE
==================

I have a fusefs test case in development that can reliably trigger this panic
in nfsd.  The test creates a fuse file system and exports it over NFS, then
performs a readdir on the NFS mount.

ENVIRONMENT
===========

FreeBSD 15.0-CURRENT amd64 based on f415b2ef30f7bf0db753f09fbba7b0910475b0d2
(Jan 6 2025), with one small change to fusefs and with rmacklem's WIP patch
that adds vfs.nfsd.nfsd_disable_grace.

ANALYSIS
========

This looks like a straightforward uninitialized data access.  tl2 is
initialized to NULL.  It is then set to cp, but only if nfscl_invalidfname
reports that the fname is valid.  Finally, it is dereferenced at line 3768
regardless of whether it was ever pointed at valid memory.
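
The pattern described above can be illustrated with a minimal, standalone
sketch.  This is not the code from nfs_clrpcops.c and not a proposed patch:
name_is_invalid() and store_cookie() are hypothetical stand-ins, and "slot"
merely plays the role of tl2.  It only shows that a pointer which is set
conditionally has to be checked before it is written through.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for nfscl_invalidfname(): returns 1 if the name
 * contains a '/' or a NUL byte within its stated length, 0 otherwise.
 */
static int
name_is_invalid(const char *name, size_t len)
{
	for (size_t i = 0; i < len; i++) {
		if (name[i] == '/' || name[i] == '\0')
			return (1);
	}
	return (0);
}

/*
 * Hypothetical, heavily simplified model of storing one entry's cookie.
 * "slot" starts out NULL and is pointed at the output only when the
 * name is valid, mirroring how tl2 is described in the analysis.
 * Returns 1 if the cookie was stored, 0 if the entry was skipped.
 */
static int
store_cookie(const char *name, size_t len, uint32_t cookie, uint32_t *out)
{
	uint32_t *slot = NULL;

	assert(out != NULL);
	if (!name_is_invalid(name, len))
		slot = out;

	/*
	 * Guard (an assumption of this sketch): without this check, an
	 * invalid name would lead to a store through a NULL pointer.
	 */
	if (slot == NULL)
		return (0);

	*slot = cookie;
	return (1);
}
```

With this guard, a name such as "a/b" causes the entry to be skipped and the
output left untouched, while a valid name stores the cookie as before.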
