Re: Trying to implement BFS, page fault at vfs_domount_first, how to debug?

From: John F Carr <jfc_at_mit.edu>
Date: Fri, 30 Dec 2022 19:35:43 UTC

> On Dec 30, 2022, at 14:13, Hikmat Jafarli <jafarlihi@gmail.com> wrote:
> 
> I'm trying to implement the BeOS filesystem (BFS) for FreeBSD.
> The repository is here: https://github.com/jafarlihi/freebsd-bfs
> (Please don't mind bad styling and all the copy-paste work,
> I'll polish it later, I'm just trying to get to some PoC where it works)
> 
> Now when I try to mount a valid BFS partition (reported as BFS by `fstyp`)
> it executes all the way to printf that logs "Either not a BFS volume or
> corrupted" and then crashes with "page fault while in kernel mode" in
> vfs_domount_first+0x271. Here's the log:
> ```
> Either not a BFS volume or corrupted
> 
> Fatal trap 12: page fault while in kernel mode
> cpuid = 0; apic id = 00
> fault virtual address = 0x18
> fault code = supervisor read data, page not present
> instruction pointer = 0x20:0xffffffff82b2427b
> stack pointer        = 0x28:0xfffffe00df399ac0
> frame pointer        = 0x28:0xfffffe00df399ac0
> code segment = base 0x0, limit 0xfffff, type 0x1b
> = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags = interrupt enabled, resume, IOPL = 0
> current process = 1208 (mount)
> trap number = 12
> panic: page fault
> cpuid = 0
> time = 1672414952
> KDB: stack backtrace:
> #0 0xffffffff80c694a5 at kdb_backtrace+0x65
> #1 0xffffffff80c1bb5f at vpanic+0x17f
> #2 0xffffffff80c1b9d3 at panic+0x43
> #3 0xffffffff810afdf5 at trap_fatal+0x385
> #4 0xffffffff810afe4f at trap_pfault+0x4f
> #5 0xffffffff810875b8 at calltrap+0x8
> #6 0xffffffff80cf0651 at vfs_domount_first+0x271
> #7 0xffffffff80cece9d at vfs_domount+0x2ad
> #8 0xffffffff80cec2d8 at vfs_donmount+0x8f8
> #9 0xffffffff80ceb9a9 at sys_nmount+0x69
> #10 0xffffffff810b06ec at amd64_syscall+0x10c
> #11 0xffffffff81087ecb at fast_syscall_common+0xf8
> ```
> 
> Now I'm trying to understand what exactly goes wrong here
> and how to map 0x271 to the exact source line.
> 
> I'd appreciate it if someone could tell me how to debug this.
> 
> (Sorry for noob question, I already tried IRC and was directed here)

Your BFS module tried to dereference a null pointer to a structure.

It's a null pointer dereference because of "fault virtual address = 0x18".  On amd64, where a word is 8 bytes, that normally means you tried to access the fourth word of a structure (offset 0x18) but the pointer to the structure was null.  It could be something else, but play the odds.
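
To illustrate the offset arithmetic, here is a minimal sketch.  The structure and the helper fault_address() are hypothetical, made up for this example; they are not taken from the BFS module or from the kernel.  The fields are fixed at 8 bytes each to match amd64 words.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: four 8-byte words, as on amd64. */
struct example {
	uint64_t word0;		/* offset 0x00 */
	uint64_t word1;		/* offset 0x08 */
	uint64_t word2;		/* offset 0x10 */
	uint64_t word3;		/* offset 0x18 */
};

/* The fourth word sits 0x18 bytes past the start of the structure. */
static_assert(offsetof(struct example, word3) == 0x18,
    "fourth word is at offset 0x18");

/*
 * Address the CPU would read for p->word3, given the numeric value
 * of p.  Nothing is dereferenced here; this is only the arithmetic.
 */
static uintptr_t
fault_address(uintptr_t base)
{
	return (base + offsetof(struct example, word3));
}
```

With a null base, fault_address(0) gives 0x18, which matches the fault virtual address in the log.  Any field at offset 0x18 of any structure would produce the same address, so this narrows the search rather than identifying the field.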

It's in your module because the instruction pointer address is far above the addresses of the kernel functions in the stack trace.  Stack traces in crash reports are misleading: they tend to omit the function that triggered the crash.  The address of vfs_domount_first is 0xffffffff80cf03e0 (0xffffffff80cf0651 - 0x271).  That's the function that called your module.  The address of the faulting instruction is 0xffffffff82b2427b.  That's in your module.
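
The address arithmetic can be sketched as follows.  The helper names are made up for this example, and the frame addresses are copied from the backtrace in the log.  This only checks the numbers; it does not look at the running kernel.

```c
#include <stddef.h>
#include <stdint.h>

/* Return addresses from frames #0 through #11 of the backtrace. */
static const uint64_t frames[] = {
	0xffffffff80c694a5, 0xffffffff80c1bb5f, 0xffffffff80c1b9d3,
	0xffffffff810afdf5, 0xffffffff810afe4f, 0xffffffff810875b8,
	0xffffffff80cf0651, 0xffffffff80cece9d, 0xffffffff80cec2d8,
	0xffffffff80ceb9a9, 0xffffffff810b06ec, 0xffffffff81087ecb,
};

/* Start of a function, given an address inside it and its "+0x..." offset. */
static uint64_t
function_start(uint64_t addr, uint64_t offset)
{
	return (addr - offset);
}

/* Highest address that appears in the backtrace. */
static uint64_t
highest_frame(void)
{
	uint64_t max = 0;

	for (size_t i = 0; i < sizeof(frames) / sizeof(frames[0]); i++)
		if (frames[i] > max)
			max = frames[i];
	return (max);
}

/* Nonzero if the instruction pointer lies above every frame in the trace. */
static int
beyond_all_frames(uint64_t ip)
{
	return (ip > highest_frame());
}
```

function_start(0xffffffff80cf0651, 0x271) gives 0xffffffff80cf03e0, and beyond_all_frames(0xffffffff82b2427b) is true: the faulting instruction is more than 0x1a00000 bytes above the highest address in the trace.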