Re: watchdog timer programming (progress)

From: mike tancsa <mike_at_sentex.net>
Date: Wed, 02 Oct 2024 12:59:09 UTC
<trimming a bunch>

On 10/2/2024 5:13 AM, Stephane Rochoy wrote:
>>
>> it seems the soft timeout value action is never overridden for some 
>> reason.
>>
>> This kinda feels like a bug / pr ?
>
> Well, honestly I'm puzzled:
> - on one hand, watchdog.c doesn't seem to use wd_softtimeout_act
> - and on the other hand, hardclock seems to directly call
>  watchdog_fire, which just does kdb_enter or panic.
>
> Note that wd_timeout_cb seems to be about both pretimeout and
> timeout handling.
>

I was able to get things to work the way I want by setting the 
pretimeout action. But I need to set things 'right' the first time I 
call it.  If I do


0{p9999}# watchdogd --pretimeout-action panic --softtimeout-action panic -t 10
0{p9999}# killall -9 watchdogd
0{p9999}# KDB: stack backtrace:
#0 0xffffffff80b7fefd at kdb_backtrace+0x5d
#1 0xffffffff80abec93 at hardclock+0x103
#2 0xffffffff80abfe8b at handleevents+0xab
#3 0xffffffff80ac0b7c at timercb+0x24c
#4 0xffffffff810d0ebb at lapic_handle_timer+0xab
#5 0xffffffff80fd8a71 at Xtimerint+0xb1
#6 0xffffffff804b3685 at acpi_cpu_idle+0x2c5
#7 0xffffffff80fc48f6 at cpu_idle_acpi+0x46
#8 0xffffffff80fc49ad at cpu_idle+0x9d
#9 0xffffffff80b67bb6 at sched_idletd+0x576
#10 0xffffffff80aecf7f at fork_exit+0x7f
#11 0xffffffff80fd7dae at fork_trampoline+0xe

0{p9999}#

0{p9999}# watchdogd --pretimeout-action panic --softtimeout --softtimeout-action panic -t 10
watchdogd: setting WDIOC_SETSOFT 1: Invalid argument
watchdogd: patting the dog: Invalid argument
71{p9999}#

But if I reboot the box and make sure nothing is set, and start the daemon

watchdogd --pretimeout-action panic --softtimeout --softtimeout-action panic -t 10

it works
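
For anyone who wants to script this, here is a rough sketch of "only 
start it from a clean state". The function names wd_cmd and 
wd_start_clean are made up for illustration, and the pgrep check is 
only a partial guard: in the session above, killing the daemon was not 
enough and a reboot was needed before the --softtimeout invocation was 
accepted.

```shell
#!/bin/sh
# Sketch only. wd_cmd and wd_start_clean are hypothetical helper names.

# Print the watchdogd command line used in this thread for a given
# timeout value (passed straight to -t).
wd_cmd() {
    echo "watchdogd --pretimeout-action panic --softtimeout --softtimeout-action panic -t $1"
}

# Start watchdogd only if no instance is already running.
# Assumption: a running instance is the most likely source of
# previously set watchdog state; a reboot may still be required.
wd_start_clean() {
    if pgrep -x watchdogd >/dev/null 2>&1; then
        echo "watchdogd already running; stop it (or reboot) first" >&2
        return 1
    fi
    # Unquoted on purpose so the string splits into command and arguments.
    $(wd_cmd "$1")
}
```

Calling "wd_start_clean 10" would then run the same command as above, 
or refuse if a watchdogd process is found.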

0{p9999}# watchdogd --pretimeout-action panic --softtimeout --softtimeout-action panic -t 10
0{p9999}# killall -9 watchdogd
0{p9999}# panic: watchdog soft-timeout, WD_SOFT_PANIC set
cpuid = 0
time = 1727873819
KDB: stack backtrace:
#0 0xffffffff80b7fefd at kdb_backtrace+0x5d
#1 0xffffffff80b32bd1 at vpanic+0x131
#2 0xffffffff80b32a93 at panic+0x43
#3 0xffffffff809827bb at wd_timeout_cb+0x6b
#4 0xffffffff80b50b0c at softclock_call_cc+0x12c
#5 0xffffffff80b52355 at softclock_thread+0xe5
#6 0xffffffff80aecf7f at fork_exit+0x7f
#7 0xffffffff80fd7dae at fork_trampoline+0xe
Timeout initializing vt_vga
Uptime: 50s
Automatic reboot in 15 seconds - press a key on the console to abort


I think some of the dead ends I ran into were due to this. My 
stock image has watchdogd_enable="YES", and having that start up would 
set something that would then lead to dead ends. But if I JUST start up

watchdogd --pretimeout-action panic --softtimeout --softtimeout-action panic -t 10

it works.  I wonder if this warrants a PR for the docs at least. Anyway, 
thanks again for helping me work through all this!
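
If the rc-started daemon is what sets the conflicting state, one 
possible way around it is to have rc start watchdogd with the same 
flags in the first place. This is an untested sketch: it assumes the 
stock rc script passes watchdogd_flags through to watchdogd unchanged, 
and that putting -t in the flags is acceptable on your version.

```shell
# /etc/rc.conf fragment (sketch, untested)
# Assumption: watchdogd_flags is handed to watchdogd as-is by the rc script.
watchdogd_enable="YES"
watchdogd_flags="--pretimeout-action panic --softtimeout --softtimeout-action panic -t 10"
```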


     ---Mike