making SW_WATCHDOG dynamic
Andriy Gapon
avg at FreeBSD.org
Wed Dec 27 13:46:23 UTC 2017
On 26/12/2017 16:25, Mike Karels wrote:
> There is a kernel option, SW_WATCHDOG, which adds a low-level software
> watchdog in hardclock. By default, the kernel and watchdogd support
> only hardware-based watchdogs. There is also a callout-based software
> watchdog that can be enabled by watchdogd with an ioctl if --softwatchdog
> is specified, but watchdogd doesn't switch on its own. The SW_WATCHDOG
> option adds a lower-level software watchdog to the hardware-based mechanism,
> but it adds it unconditionally. I propose to include the SW_WATCHDOG
> facility by default, but enable it only if there is no hardware watchdog.
I think that this is a good idea, although I would not necessarily tie the
software watchdog to the absence of a hardware watchdog. That is probably a good
default policy, but I would also allow the software watchdog to be enabled or
disabled explicitly (e.g. via a sysctl).
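To make the policy concrete, here is a rough user-space sketch of the decision
only. The enum values and the function name are made up for illustration; they
are not an existing sysctl or kernel interface, and a real implementation would
hook into the watchdog registration code instead.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical tunable values; not an existing FreeBSD sysctl. */
enum sw_wd_policy {
	SW_WD_AUTO = -1,	/* default: arm only if no hardware watchdog */
	SW_WD_OFF = 0,		/* explicitly disabled by the administrator */
	SW_WD_ON = 1		/* explicitly enabled by the administrator */
};

/*
 * Decide whether the software watchdog should be armed, given the
 * configured policy and whether a hardware watchdog is present.
 */
static bool
sw_watchdog_should_arm(enum sw_wd_policy policy, bool have_hw_watchdog)
{
	assert(policy >= SW_WD_AUTO && policy <= SW_WD_ON);

	if (policy == SW_WD_AUTO)
		return (!have_hw_watchdog);
	return (policy == SW_WD_ON);
}
```

With the default (auto) setting this reproduces the proposed behaviour, while
an explicit on / off setting overrides it regardless of the hardware.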
I also think that we should support enabling several watchdog timers with
different timeouts. Each of them can serve a different purpose. E.g., a
software or hardware NMI-sending watchdog can be used to get diagnostic data out
of a hung system while a resetting watchdog can be used to ensure fail-safe
operation.
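As an illustration of what I mean, a minimal user-space simulation of several
timers with different timeouts might look like the following. All of the names
(struct wd_timer, wd_tick_all and so on) are hypothetical and the "actions" are
only recorded, not performed; this is a sketch of the bookkeeping, not a
proposal for the actual kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* What a timer would do on expiry; only recorded in this sketch. */
enum wd_action { WD_ACT_NMI, WD_ACT_RESET };

struct wd_timer {
	enum wd_action	action;		/* action to take on expiry */
	unsigned	timeout_ticks;	/* configured timeout */
	unsigned	remaining;	/* ticks left until expiry */
	int		fired;		/* latched once the timer expires */
};

static void
wd_init(struct wd_timer *t, enum wd_action action, unsigned timeout_ticks)
{
	assert(timeout_ticks > 0);
	t->action = action;
	t->timeout_ticks = timeout_ticks;
	t->remaining = timeout_ticks;
	t->fired = 0;
}

/* "Pat" every timer that has not fired yet, restarting its countdown. */
static void
wd_pat_all(struct wd_timer *timers, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (!timers[i].fired)
			timers[i].remaining = timers[i].timeout_ticks;
}

/* Advance all timers by one tick; return how many expired on this tick. */
static unsigned
wd_tick_all(struct wd_timer *timers, size_t n)
{
	unsigned expired = 0;

	for (size_t i = 0; i < n; i++) {
		if (timers[i].fired)
			continue;
		if (--timers[i].remaining == 0) {
			timers[i].fired = 1;
			expired++;
		}
	}
	return (expired);
}
```

With an NMI timer set to a shorter timeout than the reset timer, a hang would
first trigger the diagnostic action and only later the reset.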
> I'm interested in any comments, suggestions, or background; feel free to
> mail me off the list. If there are multiple people interested, I'll
> forward messages to that group.
>
> I want to make the change because I have found SW_WATCHDOG quite useful
> at $JOB, and it's annoying to have to build a custom kernel just for this
> (not just once, but every time there is a kernel patch).
Makes sense.
> Also, I'm curious why we have two software watchdog facilities. The
> --softwatchdog facility has various options on expiration, such as
> printf/log/panic; I don't know why anything other than panic/reboot
> would be desirable, though. I already contacted some of the people who
> have left fingerprints on watchdog. Also, if anyone wants to review
> the code, let me know.
I guess that the second software watchdog was added to achieve what I suggested
above. Of course, it would have been nicer to re-use SW_WATCHDOG for that
purpose and to add more generic support for configuring multiple watchdog
timers with different timeouts. But I guess that adding a new single-purpose
software watchdog was much easier to do.
P.S.
And maybe just using the second software watchdog would be good enough for what
you are doing?
--
Andriy Gapon
More information about the freebsd-arch mailing list