I don’t understand, is this the deep sleep mode? What’s the output of
cat /sys/power/mem_sleep
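For anyone reading that output: the file lists the supported suspend variants on one line and puts the one currently in use in square brackets, for example `s2idle [deep]`. Below is a minimal C sketch that pulls out the bracketed entry, assuming exactly that format. The helper name `active_mem_sleep` is made up for illustration and is not a kernel or libc API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy the bracketed (currently selected) entry of a mem_sleep-style
 * line into out. Returns 0 on success, -1 if there is no complete
 * "[...]" entry or it does not fit into out (including the NUL).
 * Hypothetical helper for illustration only.
 */
int active_mem_sleep(const char *line, char *out, size_t out_len)
{
    const char *open, *close;
    size_t len;

    assert(line != NULL && out != NULL);

    open = strchr(line, '[');
    if (open == NULL)
        return -1;
    close = strchr(open, ']');
    if (close == NULL)
        return -1;

    len = (size_t)(close - open - 1);
    if (len + 1 > out_len)
        return -1;

    memcpy(out, open + 1, len);
    out[len] = '\0';
    return 0;
}
```

Feeding it the line read from /sys/power/mem_sleep tells you whether the board will use deep or s2idle on the next `echo mem > /sys/power/state`.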
The other alternative you propose seems a bit risky because that would imply no watchdog monitoring during the bootloader stage.
Yes, that’s correct. It may be possible to change the bootloader to enable the WDW bit when it’s enabling the watchdog. But that would only be relevant if the issue you’re seeing is the one I reproduced, which I’m not sure is the case.
This is interesting. If it’s the same for @rodring10, then I really don’t know what’s going on. I tried to reproduce the behavior you both are describing on our Reference Image, and the only way that worked was when mem_sleep was set to s2idle. The reason for this behavior is already described in my previous response.
From my research, the imx2_wdt driver already does exactly this. Please take a look at the imx2_wdt_suspend function:
/* Disable watchdog if it is active or non-active but still running */
static int __maybe_unused imx2_wdt_suspend(struct device *dev)
{
	struct watchdog_device *wdog = dev_get_drvdata(dev);
	struct imx2_wdt_device *wdev = watchdog_get_drvdata(wdog);

	/* The watchdog IP block is running */
	if (imx2_wdt_is_running(wdev)) {
		/*
		 * Don't update wdog->timeout, we'll restore the current value
		 * during resume.
		 */
		__imx2_wdt_set_timeout(wdog, IMX2_WDT_MAX_TIME);
		imx2_wdt_ping(wdog);
	}

	if (wdev->no_ping) {
		clk_disable_unprepare(wdev->clk);
		wdev->clk_is_on = false;
	}

	return 0;
}
I also checked, and in my case the function runs during suspend.
When I was looking into it, back in June, I too put a print inside static int __maybe_unused imx2_wdt_suspend(struct device *dev) to verify that function was being called, and it was.
I remember talking to a few AI bots about the issue. One idea was that something else was holding the clock active that feeds the wdt. I remember stripping my device tree right back to bare bones and I still couldn’t get it to stay asleep.
So it seems what Rodrigo and I have in common is using Buildroot rather than Yocto. I wonder if there is something set by the init system causing it not to work. I’m using BusyBox for init. I could maybe try changing to something else and see if it starts working.
I’ve also loaded in a Toradex image and verified it can sleep indefinitely, so it’s not related to hardware.
Thinking about it now, I never actually confirmed that the watchdog was running in the Toradex image. Rafael, could you please confirm it is running?
Yes, the watchdog is running, I also confirmed that when I was trying to reproduce the issue described here.
I think this would only make a difference if your sleep mode was s2idle. In deep mode it’s not necessary, as can be seen on our Reference Image.
During my tests, one hypothesis I had was that your kernel configuration might be different from ours. I removed the WDT driver completely from the kernel, and I still got the same result of being able to sleep indefinitely. Without the WDT driver, the module will be reset by the watchdog constantly unless it’s sleeping. The suspend function of the WDT driver will not be called in this case.
This somewhat supports the idea there’s a shared clock somewhere that’s being disabled by our Reference Image on suspend, but it’s not being disabled in your case.
Thank you @rafael.tx and @phil for your comments.
Rafael, what I meant in my previous post is that my system currently has sleep mode = deep.
I couldn’t find a solution so far.
I wonder if you guys at Toradex are applying any patches during the build process that could change the behaviour for your image compared to the one Phil and I are building with Buildroot?
I wonder if I can find where exactly in the kernel it puts the system into the low-power mode that should cause the watchdog to halt, and verify it’s actually going into that state.
Did you compare your kernel configuration to the one I posted? Have you progressed on solving this issue?
I just wanted to add that, beginning with BSP 7, we started using a kernel cache configuration repository, which you can use to base your configurations on:
I run the following small script from the kernel directory to configure it for manual build. This assumes that the kernel cache repository was cloned at the home directory:
I am working on some other stuff right now (the application running on Linux), but I will get back to the watchdog thing in a few days and let you know.
I was able to do some quick comparisons and noticed the kernel configuration you attached has a few different settings regarding the Watchdog compared to mine so I want to modify my config file and run some tests.
Used your imx8mp.dtsi and imx8mp-verdin.dtsi files
Still no success.
To be clear about how I put it to sleep: I power it on, then simply execute echo mem > /sys/power/state, and it immediately sleeps. It then reboots 128 seconds later.
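The 128 seconds lines up with IMX2_WDT_MAX_TIME, the value imx2_wdt_suspend programs right before suspending. As far as I recall from the mainline driver, the timeout field is 8 bits wide in bits 15:8 of the control register and counts in half-second steps, so 128 s is the largest value it can hold. A small sketch of that conversion, with the field layout treated as an assumption and all names below made up for illustration:

```c
#include <assert.h>
#include <stdint.h>

#define WDT_MAX_TIME_S 128u                    /* mirrors IMX2_WDT_MAX_TIME */
#define WCR_WT_SHIFT   8                       /* ASSUMPTION: WT field in bits 15:8 */
#define WCR_WT_MASK    (0xFFu << WCR_WT_SHIFT)

/* Seconds -> WT field bits, in 0.5 s units with an offset of one step. */
uint16_t wdt_sec_to_wcr_wt(unsigned int seconds)
{
    assert(seconds >= 1 && seconds <= WDT_MAX_TIME_S);
    return (uint16_t)(((seconds * 2u - 1u) << WCR_WT_SHIFT) & WCR_WT_MASK);
}

/* WT field bits -> timeout expressed in half seconds. */
unsigned int wdt_wcr_wt_to_half_seconds(uint16_t wcr)
{
    return ((wcr & WCR_WT_MASK) >> WCR_WT_SHIFT) + 1u;
}
```

With these assumptions, 128 s fills the field completely (all eight bits set), which is consistent with a reset exactly 128 seconds after suspend if the counter keeps running while the SoC sleeps.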
It could be related to the init system or mdev?
I will also update to using the exact same kernel and bootloader as you. I last updated 6 months ago now (3 months prior to posting this). I only just considered that it could have been something fixed in-between times.
So, on our end we have compared the kernel.config used by Toradex (attached above by Rafael) to the one we are using, and although there are some differences, it seems none of them are very relevant to the issue.
Also, since Phil has tried the kernel configuration file provided by Rafael and it is still not working, the issue is probably not there.
In our case we are, in theory, already using the same u-boot, kernel and device tree as Toradex uses on their Yocto image, with no luck.
Currently we have implemented a sleep module that sleeps in chunks of 100 seconds to make sure the watchdog doesn’t reboot the board, but it is not ideal to wake up 36 times in an hour as it affects power consumption (which we still need to quantify).
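The arithmetic behind that workaround, as a rough sketch: the chunk length has to stay below the watchdog timeout (128 s during suspend), and the number of chunks per hour follows from a ceiling division. The function names are hypothetical and only illustrate the calculation, not the actual sleep module.

```c
#include <assert.h>

/* Number of sleep chunks needed to cover total_s seconds (rounded up). */
unsigned int sleep_chunks(unsigned int total_s, unsigned int chunk_s)
{
    assert(chunk_s > 0);
    return (total_s + chunk_s - 1u) / chunk_s;
}

/* A chunk is only safe if the board wakes up before the watchdog expires. */
int chunk_is_safe(unsigned int chunk_s, unsigned int wdt_timeout_s)
{
    return chunk_s < wdt_timeout_s;
}
```

For one hour with 100 s chunks this gives 36 wake-ups. Pushing the chunk closer to 128 s would only bring that down to 29 or so, so the real saving comes from getting the watchdog to stop during suspend.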
Another piece to the puzzle. I loaded on the easy installer recovery kernel.
It appears not to use systemd (/etc/init.d/ is present), runs Linux 5.5, and uses the imx2_wdt.c driver for the watchdog. It sets the watchdog timeout down to 60 s too.
I verified the watchdog kernel module was running too.
This software setup can sleep indefinitely.
Linux 5.5 was quite a while ago so it can’t be a recent change that has added support.
I wonder if, by chance, Rodrigo and I have something in user space that is preventing the system from going to sleep properly.
I’ll do some tests killing off everything before sleeping and will report back.
I’m also running low on ideas. The only way I could reproduce something similar to what you’re seeing is when the sleep mode was set to s2idle.
I doubt that the init system could have an influence, but at this point I wouldn’t be too surprised either.
What I understood from my analysis of the driver is the following:
On probe, it will check whether the watchdog is running and set up the watchdog subsystem to start pinging the watchdog.
I can offer you the patch I created to add some debugging information to the driver. Maybe if you add it to your kernel and post the outputs when it runs, we can get other ideas of what’s going on.
EDIT: Ignore this message, it was only working because the watchdog wasn’t actually running lol. I incorrectly assumed it would start the dog if it wasn’t running.
!!!
Okay, I think I understand what is going on.
If I disable the watchdog support in u-boot and then let Linux start the dog with the fsl,suspend-in-wait flag set, it sets bit 7 (WDW):
WDW
Watchdog Disable for Wait. This bit determines the operation of WDOG during Low Power WAIT mode.
This is a write once only bit.
0 - Continue WDOG timer operation (Default).
1 - Suspend WDOG timer operation.
What I don’t understand is this: the u-boot config you linked me earlier has the watchdog turned on and autostarting. WDW is a write-once bit, so how can you have it set?
Is there something else overriding this config as part of your Yocto build, or something in the u-boot environment?
Anyway, I don’t think it really matters, but it would be good to understand.
Thanks again for your help, I really appreciate it.
I’m going to patch uboot so it sets that flag and see if that works. Will report back.