How to use 'systemd' to implement a system-level watchdog timer

Chuck_Evergreen · November 20, 2018, 3:17am

All,

We are developing a product which will be remotely located; therefore, we require a robust method to recover from kernel panics, a job usually performed by a software function which periodically “kicks” a watchdog timer hardware element so that it does not automatically perform a system reboot.

We propose to use the capabilities of ‘systemd’ to perform this watchdog function automatically. According to the ‘systemd’ documentation found here, setting ‘RuntimeWatchdogSec’ in ‘system.conf’ (inside /etc/systemd/) to a non-zero value, say 20 (sec), will cause the i.MX6DL to reset if the watchdog hardware is not contacted within 20 seconds; ‘systemd’ itself contacts it at least once every 10 seconds, i.e. half the timeout.
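As a rough illustration of the timing involved, the sketch below reads ‘RuntimeWatchdogSec’ from a config file and derives the interval at which ‘systemd’ would contact the watchdog (half the timeout, per the ‘systemd’ documentation). The function names watchdog_timeout and ping_interval are made up for this example, and only a bare integer value such as 20 is handled, not forms like 20s or 1min.

```shell
#!/bin/sh
# Print the RuntimeWatchdogSec value (whole seconds) from a systemd manager
# config file. Hypothetical helper: handles only a bare integer value.
watchdog_timeout() {
    conf="${1:-/etc/systemd/system.conf}"
    # Ignore commented-out lines; keep the last active assignment.
    sed -n 's/^[[:space:]]*RuntimeWatchdogSec[[:space:]]*=[[:space:]]*\([0-9][0-9]*\)[[:space:]]*$/\1/p' "$conf" | tail -n 1
}

# systemd contacts the watchdog at least once per half timeout.
ping_interval() {
    echo $(( $1 / 2 ))
}

# Example use on the target (commented out so the file can be sourced safely):
# t=$(watchdog_timeout) && echo "reset after ${t}s, pinged every $(ping_interval "$t")s"
```

With ‘RuntimeWatchdogSec=20’ this would report a 20 second hardware timeout and a 10 second ping interval.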

Questions:

Is our understanding of this ‘systemd’ function correct?
Has anyone else implemented a system-level watchdog timer using ‘systemd’?
In order to test this functionality, what is the best way to “kill” the kernel?
Can anyone provide implementation DOs or DON’Ts?

Thank you,

-Chuck

marcel.tx · November 20, 2018, 7:57am

Answers:

1) Yes, we do believe so.
2) I don’t think so. I’m also not sure which exact systemd version is required and whether or not ours has that functionality enabled at all.
3) If sysrq is enabled, run echo c > /proc/sysrq-trigger; otherwise, just killing init with kill -9 1 may do.
4) Not that I can think of right now.
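The sysrq approach from answer 3 could be wrapped roughly as follows. This is only a sketch: sysrq_available and crash_kernel are hypothetical helper names, and the explicit --yes guard is an addition so that the crash cannot be triggered by accident. It must be run as root on the target.

```shell
#!/bin/sh
# Return 0 if the sysrq trigger file exists and is writable.
# The path is a parameter so the check can be tried against a plain file.
sysrq_available() {
    trigger="${1:-/proc/sysrq-trigger}"
    [ -w "$trigger" ]
}

# Deliberately crash the kernel. Requires an explicit "--yes" argument.
crash_kernel() {
    if [ "$1" != "--yes" ]; then
        echo "refusing to crash without --yes" >&2
        return 2
    fi
    trigger="${2:-/proc/sysrq-trigger}"
    if ! sysrq_available "$trigger"; then
        echo "no writable $trigger; kernel may lack CONFIG_MAGIC_SYSRQ" >&2
        return 1
    fi
    sync                  # flush filesystem buffers before crashing
    echo c > "$trigger"   # 'c' asks the kernel to crash itself
}

# On the target, as root (commented out on purpose):
# crash_kernel --yes
```

If the watchdog is armed, the board should reset on its own some time after the crash; if it stays hung, the watchdog is not doing its job.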

Chuck_Evergreen · November 23, 2018, 4:42am

Marcel,

In the ‘system.conf’ file (within /etc/systemd/), I have set ‘RuntimeWatchdogSec=20’, which should cause the hardware to reset within 20 seconds whenever the kernel stops, i.e. ‘panics’. Unfortunately, since ‘sysrq’ is not enabled on the SoM and ‘kill -9 1’ does not stop the kernel, I have no way to test whether this watchdog timer scheme works.

Any other suggestions as to how I may create a ‘kernel panic’ after boot up and startup of our autorun application?
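Short of forcing a panic, one partial check is to confirm that a watchdog device exists and that ‘systemd’ reports using it. The sketch below assumes the usual /dev/watchdog* device naming; watchdog_devices is a hypothetical helper, and the exact wording of the log messages varies between ‘systemd’ versions.

```shell
#!/bin/sh
# List watchdog device nodes under a directory (default: /dev).
watchdog_devices() {
    dir="${1:-/dev}"
    for node in "$dir"/watchdog*; do
        # An unmatched glob stays literal, so test that the node exists.
        [ -e "$node" ] && echo "$node"
    done
    return 0
}

# On the target (commented out so the file can be sourced safely):
# watchdog_devices
# journalctl -b | grep -i watchdog   # look for systemd's watchdog messages
```

This only shows that the watchdog was opened and configured, not that a real kernel hang leads to a reset.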

Thanks,

-Chuck

jaski.tx · November 23, 2018, 1:36pm

Hi @Chuck_Evergreen, you should enable sysrq for the kernel.

Chuck_Evergreen · November 23, 2018, 3:27pm

Jaski,

Can I enable ‘sysrq’ by loading a module at runtime? If so, can you tell me how best to do that? Or do I have to rebuild the kernel?

Thank you,

-Chuck

philippe.tx · November 23, 2018, 4:33pm

Hi @Chuck_Evergreen

Yes, to use echo c > /proc/sysrq-trigger to panic your kernel, you need to rebuild your kernel. You can find a guideline here on how to do that.

After you have run the defconfig for your board, enter menuconfig with make menuconfig. Hit / (slash) to open a search prompt, enter MAGIC_SYSRQ, and jump to the setting with 1. Enable the setting with space and exit menuconfig. You can then proceed with building the kernel as described in the guideline I linked.
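Before starting the build, it may be worth confirming that the option actually ended up in the kernel configuration. A minimal sketch, assuming the config lives in the usual .config file at the top of the kernel source tree (has_magic_sysrq is a hypothetical helper name):

```shell
#!/bin/sh
# Return 0 if the given kernel config file enables CONFIG_MAGIC_SYSRQ.
has_magic_sysrq() {
    grep -q '^CONFIG_MAGIC_SYSRQ=y$' "$1"
}

# In the kernel source tree after menuconfig (commented out on purpose):
# has_magic_sysrq .config && echo "MAGIC_SYSRQ enabled"
#
# On a running target, if the kernel exposes /proc/config.gz
# (this needs CONFIG_IKCONFIG_PROC), a similar check is:
# zcat /proc/config.gz | grep MAGIC_SYSRQ
```

Once the rebuilt kernel is booted, /proc/sysrq-trigger should exist on the target.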

Chuck_Evergreen · November 23, 2018, 4:42pm

Philippe,

Thanks for this…

-Chuck

jaski.tx · November 26, 2018, 7:13am

You are welcome.