Long-running Chromium sessions result in crashes on Verdin iMX8M Mini

Hello,

We have been experiencing occasional Chromium kiosk crashes on our Verdin iMX8M Mini systems.
After a certain amount of time (between 3 and 7 days), the browser shows the “aw snap” screen and the previously loaded web app has been killed.
Currently, we are experimenting with a couple of alternate setups running various web apps (Google, YouTube, internal ones), as well as many different combinations of system services running in parallel, to try to isolate the problematic case.

It is very likely that every instance of the “aw snap” crash we have experienced is caused by a system out-of-memory (OOM) condition, in which the kernel terminates a running Chromium subprocess to free up system RAM.
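
One way to check this after a crash is to scan the kernel log for OOM kill records. The following is only a minimal sketch: it assumes the usual "Killed process <pid> (<name>)" wording that the Linux OOM killer writes to the kernel log, and the helper names `find_oom_kills` and `read_kernel_log` are made up for illustration.

```python
import re
import subprocess

# Assumption: the kernel logs OOM kills with the usual
# "Killed process <pid> (<name>)" wording.
KILL_RE = re.compile(r"Killed process (\d+) \(([^)]+)\)")


def find_oom_kills(log_lines):
    """Return (pid, process_name) for each OOM kill record in the given lines."""
    kills = []
    for line in log_lines:
        match = KILL_RE.search(line)
        if match:
            kills.append((int(match.group(1)), match.group(2)))
    return kills


def read_kernel_log():
    """Read the kernel ring buffer via dmesg (may require root on some systems)."""
    result = subprocess.run(["dmesg"], capture_output=True, text=True, check=True)
    return result.stdout.splitlines()
```

On the device, `find_oom_kills(read_kernel_log())` would list the processes the kernel terminated since boot. An empty list after an “aw snap” screen would suggest that the crash had a different cause, or that the ring buffer has already wrapped.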

What we are interested in:
Has Toradex run prolonged stress/stability tests on the Verdin iMX8M Mini module (2GB), and do you have any verification results for the Chromium/Weston Docker stack?
For example:

  • Successful runs of 2 weeks without issues or memory leaks
  • Expected average RAM/CPU usage

Image versions we have internally used are:

  • torizon/weston-vivante:2
  • torizon/kiosk-mode-browser-vivante:2.3.1

Any info would be welcome!

Greetings @ebrodlic,

While we do test our containers for basic functionality, we currently don’t do any long-term stability testing on them, and definitely nothing in the range of a week of operation. That said, it’s possible that issues like this exist.

It is very likely that every instance of the “aw snap” crash we have experienced is caused by a system out-of-memory (OOM) condition, in which the kernel terminates a running Chromium subprocess to free up system RAM.

Just to clarify, is this your suspicion, or have you confirmed that low memory is an issue when Chromium crashes?

In the meantime, I’ll report this issue internally for further investigation, though further details would be appreciated. For example, were you just running Chromium idle on a web page for a number of days before observing the crash, or is there a specific workload that causes the issue to occur?

On a final note, I recall that in the past you had issues with the alternative Cog web browser. Since then there have been some improvements to this browser. Have you had the chance to evaluate them? The Cog browser should be lighter weight than Chromium, assuming the issue really is memory consumption.

Also, I noticed you’re still using the older kiosk-mode-browser-vivante container image. We’ve since separated this container into dedicated containers for Chromium and Cog, with further improvements and optimizations for both; see “Web Browser / Kiosk Mode with TorizonCore” on the Toradex Developer Center.

Have you tried the newer container image for Chromium? If so, are you still experiencing the same issues with it?

Best Regards,
Jeremias

Hello @jeremias.tx

Just to clarify, is this your suspicion, or have you confirmed that low memory is an issue when Chromium crashes?

I have confirmed that there have been crashes where dmesg shows that the kernel hit an OOM condition and terminated part of the Chromium process tree.
What I am currently trying to identify is:
1 - Is it the same pattern absolutely every time? (Currently there seems to be a very high correlation.)
2 - How much does memory usage actually increase, and what might be the trigger?
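
To quantify the second point, memory can be sampled periodically and compared across days. The following is a minimal sketch under a few assumptions: it runs where the Chromium processes are visible under /proc (for example on the host), the process names contain "chrom", and summing VmRSS is good enough for spotting a trend (it counts shared pages more than once). The function names and the CSV layout are illustrative only.

```python
import time
from pathlib import Path


def parse_meminfo(text):
    """Parse /proc/meminfo content into a dict of field name -> integer value."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[name.strip()] = int(parts[0])
    return fields


def rss_kb_from_status(text):
    """Return (process name, VmRSS in kB) from /proc/<pid>/status content."""
    name, rss = "", 0
    for line in text.splitlines():
        if line.startswith("Name:"):
            name = line.split(":", 1)[1].strip()
        elif line.startswith("VmRSS:"):
            rss = int(line.split()[1])
    return name, rss


def chromium_rss_kb(proc_root="/proc", needle="chrom"):
    """Sum VmRSS over all processes whose name contains `needle` (an assumption)."""
    total = 0
    for status in Path(proc_root).glob("[0-9]*/status"):
        try:
            name, rss = rss_kb_from_status(status.read_text())
        except OSError:
            continue  # the process exited while we were scanning
        if needle in name.lower():
            total += rss
    return total


def log_samples(out_path, interval_s=60, count=None):
    """Append 'unix_time,mem_available_kb,chromium_rss_kb' lines to out_path."""
    taken = 0
    while count is None or taken < count:
        mem = parse_meminfo(Path("/proc/meminfo").read_text())
        with open(out_path, "a") as out:
            out.write(f"{int(time.time())},{mem.get('MemAvailable', 0)},{chromium_rss_kb()}\n")
        taken += 1
        if count is None or taken < count:
            time.sleep(interval_s)
```

Calling `log_samples("/tmp/chromium_mem.csv")` (the path is just an example) would append one line per minute until stopped. A steadily falling MemAvailable together with a rising Chromium RSS in the days before an OOM event would point to a leak in the browser process tree rather than in another service.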

We have run a couple of different pages, some without any interaction at all, and still managed to reproduce the problem. The pages we mostly used are google.com, youtube.com and our internally built app.

I will be adding all of the test data that I collect to this ticket to help explain what’s going on.
As for Cog: at our current stage, Chrome offers advantages, namely in multimedia and DevTools, so we decided to stick with it until we revisit this.

Also, I noticed you’re still using the older kiosk-mode-browser-vivante container image. We’ve since separated this container into dedicated containers for Chromium and Cog.

Yes, I have seen that there is a newer version available, but have not yet tried it.
It is in our pipeline to do so relatively quickly, so hopefully I can expand upon that as well after the tests are performed.

Interesting findings so far. We appreciate the testing you’re doing on your side; it will help our own investigation here.

Please continue to share any further findings from your testing. I’ll raise this Chromium out-of-memory issue internally with the team and see if there’s anything that can be done.

Best Regards,
Jeremias

Did you ever get a chance to run these further tests and further characterize the issue?

Hi @ebrodlic !

Do you have any news about this topic?

Best regards,