[Bug 273715] dumpon: Kernel panic on boot when enabling dumpon over IP
Date: Mon, 02 Oct 2023 14:10:36 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=273715

--- Comment #10 from Michael Gmelin <grembo@FreeBSD.org> ---
(In reply to Mark Johnston from comment #9)
> How did you come to that conclusion?

By the art of baseless speculation. I compared the actual 13.1 and 13.2 boot sequences, and on both, the interfaces are renamed after dumpon is called. So forget that.

I spent the time to actually read your patch and the previous change that led to it, so this all makes sense now (Captain Obvious reiterating):

https://cgit.freebsd.org/src/commit/sys/netinet/netdump/netdump_client.c?id=38a36057ae56c8023878f3c3c2185bafc2896964

introduced a check for whether the device supports debugnet. This check couldn't deal with non-existent interfaces, which is the bug you fixed.

Before the check was introduced, setting the renamed interface name was fine, since by the time the panic I tested with happened - which is after device renaming - it just worked(tm). If there had been a panic between dumpon and device renaming, netdump would have failed of course (but that's a rare thing to happen).

Now with the check in place, it first crashed (due to the bug reported) and now - with the fix - will not catch on, as an interface named "untrusted" cannot be found.

I see various options for resolving this:

1. Always allow using the physical device name (so it would somehow poke around and find it)
2. Add a flag to dumpon (e.g., `-f` for force) to skip the device exists/device supports debugnet check
3. Allow specifying multiple devices for netdump_client to try when dumping
4. Determine based on the device name if it can exist early (feels wonky)

For my setup, 2. would be sufficient and easy to apply (as this means that all I have to do to adapt my base rc.conf for a new host is modify the ifconfig_<devname>_name line in rc.conf).
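To make option 2 a bit more concrete, here is a rough userland model of the behaviour it describes. This is only a sketch and not the netdump code: `struct iface`, `netdump_configure_model()`, the `conf_result` values and the `force` parameter are all hypothetical names made up for illustration, and the `-f` flag they stand in for does not exist in dumpon today.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of the interfaces known at some point during boot. */
struct iface {
	const char *name;
	bool supports_debugnet;
};

/* Hypothetical result codes, not taken from the kernel. */
enum conf_result { CONF_OK, CONF_DEFERRED, CONF_NO_IFACE, CONF_NO_DEBUGNET };

/* Linear search by name; returns NULL if no such interface exists (yet). */
static const struct iface *
find_iface(const struct iface *ifs, size_t n, const char *name)
{
	for (size_t i = 0; i < n; i++)
		if (strcmp(ifs[i].name, name) == 0)
			return (&ifs[i]);
	return (NULL);
}

/*
 * Model of configuring netdump for "name".
 * Without force, the interface must exist and support debugnet right now.
 * With force (the proposed behaviour), the name is accepted unchecked and
 * the lookup is left until dump time.
 */
static enum conf_result
netdump_configure_model(const struct iface *ifs, size_t n, const char *name,
    bool force)
{
	const struct iface *ifp;

	if (force)
		return (CONF_DEFERRED);
	ifp = find_iface(ifs, n, name);
	if (ifp == NULL)
		return (CONF_NO_IFACE);
	if (!ifp->supports_debugnet)
		return (CONF_NO_DEBUGNET);
	return (CONF_OK);
}
```

In this model, asking for "untrusted" before the rename has happened returns `CONF_NO_IFACE` without `force` and `CONF_DEFERRED` with it, which is the difference the flag is meant to make.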
It would also be quite easy to implement, I assume (DIOCGKERNELDUMP would need to be extended though - unless some encoding is done in one of the existing parameters, like "dumpon -s 192.168.1.1 -s 192.168.1.2 untrusted:nocheck"). Alternatively, this could perhaps be handled using a sysctl.

-- 
You are receiving this mail because:
You are the assignee for the bug.