Re: Direct dumped kernel cores
- In reply to: Justin Hibbits : "Re: Direct dumped kernel cores"
Date: Fri, 01 Nov 2024 01:48:53 UTC
On Thu, Oct 31, 2024, 7:11 PM Justin Hibbits <jhibbits@freebsd.org> wrote:

> On Thu, 31 Oct 2024 16:32:51 -0600
> Warner Losh <imp@bsdimp.com> wrote:
>
> > On Thu, Oct 31, 2024 at 4:24 PM Justin Hibbits <jhibbits@freebsd.org>
> > wrote:
> >
> > > Hi everyone,
> > >
> > > At Juniper we've been using a so-called 'rescue' kernel for dumping
> > > vmcores directly to the filesystem after a panic. We're now
> > > contributing this feature, implemented by Klara Systems, to
> > > FreeBSD, and looking for feedback. I posted a review
> > > at https://reviews.freebsd.org/D47358 for anyone interested.
> > >
> > > Interesting bits to keep in mind:
> > > * It requires a 2-stage build process, one to build the rescue
> > >   kernel, the other to build the main kernel, which embeds the rescue
> > >   kernel inside its image. This might need some further work.
> > > * Thus far it's been implemented for amd64 and arm64, once proven
> > >   out, other architectures (powerpc64/le, riscv64) can follow suit.
> > > * Kernel environment bits to pass down to the rescue kernel are
> > >   prefixed `debug.rescue.`, for instance
> > >   `debug.rescue.vfs.root.mountfrom`.
> > >
> >
> > First off, this is kinda cool. I've wanted this occasionally when my
> > swap partition is too small (though in my case, it was easy enough to
> > add another drive to the system that was panicking and dump to that).
> >
> > I do have a question: I'm curious why you didn't follow the Linux
> > lead of having
> > a kexec_load(2) system call to load the 'rescue kernel' to make this
> > more generic.
> > That would make the leap to having full kexec support (eg
> > reboot(CMD_KEXEC) a lot easier to implement.
> >
> > Warner
>
> One problem with trying to kexec_load() a rescue kernel is that the
> rescue kernel needs its own memory to work with, a contiguous block, so
> needs to be loaded early, or at least reserved early. Without its
> reserved memory it would be stomping over the 'host' kernel's
> memory.
> That said, I do like that direction, and it's definitely worth
> exploring.
>

That's exactly what kexec_load does. When the crash happens, the current
kernel constructs a new memory map and passes it to the preloaded crash
kernel, so the crash kernel knows which memory it can safely use, along
with the information it needs to do the crash dump.

For the replacement kernel, the reboot copies a miniloader that copies
the kernel to the load address, tears the CPU down to the warm reset
state, and jumps to the trampoline used to start the kernel. Loader.kboot
writes that trampoline, creates the EFI-like metadata and a memory map,
and then calls reboot to boot into the new kernel.

Warner

> - Justin
>
> > >
> > > There are many more details in the review summary.
> > >
> > > We'd love to get feedback from anyone interested.
> > >
> > > Thanks,
> > > Justin Hibbits
> > >
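
To make the memory-map step above concrete, here is a minimal sketch in C of how a crashing kernel might derive the map it hands to a preloaded crash kernel: only the window reserved at boot is marked usable, and everything else is marked reserved so it is preserved for the dump. All names here (`struct mem_region`, `build_crash_map`, the region types) are hypothetical and for illustration only; this is not the code in D47358, nor the Linux kexec implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types for illustration; not FreeBSD or Linux structures. */
enum region_type { REGION_USABLE, REGION_RESERVED };

struct mem_region {
	uint64_t start;		/* inclusive physical address */
	uint64_t end;		/* exclusive physical address */
	enum region_type type;
};

/* Append one region to the output map; returns 0 if the map is full. */
static int
emit(struct mem_region *out, size_t nout, int *n, uint64_t start,
    uint64_t end, enum region_type type)
{
	if ((size_t)*n >= nout)
		return (0);
	out[*n].start = start;
	out[*n].end = end;
	out[*n].type = type;
	(*n)++;
	return (1);
}

/*
 * Build the map passed to the crash kernel.  The part of each host region
 * that overlaps the window reserved at boot, [resv_start, resv_end), is
 * emitted as usable; the rest is emitted as reserved so the crash kernel
 * leaves the panicked kernel's memory intact while dumping it.
 * Returns the number of output regions, or -1 if 'out' is too small.
 */
int
build_crash_map(const struct mem_region *host, size_t nhost,
    uint64_t resv_start, uint64_t resv_end, struct mem_region *out,
    size_t nout)
{
	int n = 0;

	assert(resv_start < resv_end);
	for (size_t i = 0; i < nhost; i++) {
		uint64_t lo = host[i].start > resv_start ?
		    host[i].start : resv_start;
		uint64_t hi = host[i].end < resv_end ?
		    host[i].end : resv_end;

		if (lo >= hi) {
			/* No overlap with the reserved window. */
			if (!emit(out, nout, &n, host[i].start, host[i].end,
			    REGION_RESERVED))
				return (-1);
			continue;
		}
		if (host[i].start < lo && !emit(out, nout, &n,
		    host[i].start, lo, REGION_RESERVED))
			return (-1);
		if (!emit(out, nout, &n, lo, hi, REGION_USABLE))
			return (-1);
		if (hi < host[i].end && !emit(out, nout, &n, hi,
		    host[i].end, REGION_RESERVED))
			return (-1);
	}
	return (n);
}
```

The sketch also shows why the window has to be set aside early: if the host kernel had already allocated pages inside that range, marking it usable for the crash kernel would let it overwrite the very state the dump is meant to capture.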