How to avoid metadata writes to eMMC from docker-engine with read-only containers

Hi,

We’ve been using Torizon OS with containers for a few years. One thing has bothered me the whole time, and that’s what I’m looking for advice on:
All metadata of the containers is written to the eMMC by the Docker engine. That means on each state change of a container (e.g. start/stop), something is written.
Because health checks are basically just another container to start on each check, they can become write intensive (with a 5s interval, we got several GBs written to eMMC per day).

Nowadays all our containers are defined as read-only, health checks are managed on application level, and we don’t care about having Docker metadata kept across power cycles (except for container images and the docker-compose definition). But still, on each start of the system, several MBs are written.
Do you have any advice for Torizon OS (6.8 or greater) on how to completely eliminate writes to eMMC from the Docker engine, e.g. by moving everything except the container image layers to an ephemeral location (tmpfs, although limited in size)?
Moving Docker data to external storage (SD card) could be an option, as described in "How to Store Docker Data on an External Storage Device (USB/SD Card)" on the Toradex Developer Center, but a) it will probably break the OTA/offline update mechanism from the Torizon cloud, and b) we use the SD card for other things.

Thanks for hints and ideas,
BR
Lukas

Hi, I’m the Toradex AI Assistant.

I might make mistakes or occasionally get you wrong. Our support team will step in if needed.


Hello Lukas,

I understand your concern about Docker metadata writes to eMMC, especially with frequent health checks causing wear on the storage. Unfortunately, the available sources don’t contain specific guidance for completely eliminating Docker metadata writes to eMMC in Torizon OS while maintaining OTA functionality.

What the Documentation Covers

The sources mention storing Docker data on external storage devices (1), but as you correctly noted, this approach could potentially interfere with OTA updates and your SD card usage requirements.

The documentation also covers container registry management and pull-through cache solutions (1) to reduce network-related container operations, but these don’t address the core metadata write issue you’re experiencing.

Current Limitations

Based on the available information, there isn’t a documented method in Torizon OS to:

  • Move only Docker metadata to tmpfs while keeping container images on eMMC
  • Selectively redirect Docker engine writes to ephemeral storage
  • Maintain full OTA compatibility while eliminating all Docker metadata writes

Recommendation

For your specific use case involving frequent health checks and eMMC wear concerns on Torizon OS 6.8 or greater, I’d recommend reaching out to our community for more detailed guidance, as this appears to be an advanced customization scenario that may require specific configuration not covered in the standard documentation.

You can visit our Community to discuss this with our engineering team and other users who might have faced similar challenges.

Would you like me to help you with any other aspects of container management or Torizon OS configuration that are covered in our documentation?

Best regards,
Toradex AI Assistant.


Hello @lzaugg_parametric,

Moving Docker data to external storage should be a viable alternative.
The OTA/offline updates should work normally, as long as the external media is connected.
If you plan to use the SD card for other things, it would likely not be the best option though.

There is no way to do this without making changes to some of the system components, such as Docker.
Docker keeps its storage in a fixed set of directories that mix persistent data, such as container images, with state data.

Having a tmpfs on relevant Docker directories would require reconfiguration and re-pulling of containers on each boot.
One idea I had to avoid this was to configure a ram-backed overlayfs on top of /var/lib/docker, but that would stop the updates from working.

Are you sure the writes that you see are due to docker state data and not some other cause?
Docker logs could use a different driver to avoid writing on the disk, if you are not already doing that.

Another point that is worth checking is the expected lifetime of your device with the current write load.
If the eMMC can tolerate the current write load for over 10 years, the current configuration may be good enough, but this calculation depends on how long the device is expected to be in the field.
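
As a rough illustration of that calculation, here is a minimal Python sketch. The function name and all numbers in it (capacity, rated program/erase cycles, daily write volume, write amplification factor) are hypothetical placeholders, not values for any specific Toradex module; the real figures have to come from the eMMC datasheet and from measurements on the device.

```python
def emmc_lifetime_years(capacity_gb, pe_cycles, daily_writes_gb, write_amplification):
    """Return a rough lifetime estimate in years for a flash device.

    capacity_gb:         usable capacity of the eMMC (assumption, see datasheet)
    pe_cycles:           rated program/erase cycles per cell (assumption, see datasheet)
    daily_writes_gb:     host writes per day, as measured on the device
    write_amplification: factor by which the flash controller multiplies host
                         writes internally (assumption, workload dependent)
    """
    # Total host-visible data the flash can absorb before reaching its rated wear.
    total_endurance_gb = capacity_gb * pe_cycles / write_amplification
    # Convert the number of days until that point into years.
    return total_endurance_gb / daily_writes_gb / 365.0


# Hypothetical example: 16 GB eMMC, 3000 P/E cycles, 3 GB written per day,
# write amplification factor of 4.
years = emmc_lifetime_years(16, 3000, 3.0, 4.0)
print(f"estimated lifetime: {years:.1f} years")
```

Small, frequent metadata writes tend to have a higher write amplification than large sequential ones, so the result is quite sensitive to that factor.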

Best Regards,
Bruno

Hi @bruno.tx,

Thanks for your reply.

I am confident that the writes come from Docker. That is why we investigated further (comparing eMMC writes with per-process writes) and arrived at the current docker-compose configuration.

There are two reasons for eliminating Docker state writes (except image pulls) for our use cases:

  1. eMMC lifetime
    This should be under control now with read-only containers and no health checks on our side. Even with some writes (e.g. due to container start), the wear is negligible and the eMMC should last for 10 years.
    However, it is concerning that a regular Torizon OS with approximately 10 Docker services and some health checks running every 5 seconds results in significant eMMC wear (no writing to eMMC from the application side at all); not to mention the additional I/O pressure. We have a system at a customer site that has been running for almost a year (with the health check configuration still in place) and it currently shows an eMMC lifetime of 50%. The Docker logger is set to journald and journald logs to tmpfs afaik (default Torizon OS).

  2. system resilience to power loss
    As long as the system writes data in larger chunks to eMMC on startup (focusing on the Docker engine at present), the possibility of a corrupt file (hopefully not the fs) exists.
    We experienced some corrupt Docker states due to a power loss during the start of the docker-compose services once.
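
For reference, the lifetime figure from point 1 can be extrapolated linearly. The sketch below is only an illustration in Python: it assumes the wear estimate is read from the `life_time` sysfs attribute of the eMMC (e.g. `/sys/block/mmcblk0/device/life_time`, where the device name is an assumption), which reports two hex values in 10% steps, and it assumes the write load stays constant. The function names are hypothetical.

```python
def parse_life_time(text):
    """Parse the two hex values (type A, type B) of the life_time attribute."""
    type_a, type_b = (int(value, 16) for value in text.split())
    return type_a, type_b


def used_fraction_upper_bound(value):
    """Map a life-time estimate value to the upper bound of the used fraction.

    Values 0x01..0x0A stand for 0-10 % ... 90-100 % of device life time used.
    """
    if not 1 <= value <= 0x0A:
        raise ValueError(f"life-time value out of range: {value:#04x}")
    return value / 10.0


def remaining_years(used_fraction, elapsed_years):
    """Linear extrapolation: years left if the wear rate stays the same."""
    return elapsed_years * (1.0 - used_fraction) / used_fraction


# Hypothetical reading after one year in the field.
type_a, type_b = parse_life_time("0x05 0x03")
worst = max(used_fraction_upper_bound(type_a), used_fraction_upper_bound(type_b))
print(f"remaining: {remaining_years(worst, 1.0):.1f} years")
```

Because the estimate only has a 10% resolution, the extrapolation is coarse, especially early in the life of a device.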

In summary, we have the following options:
a) using the SD Card as the Docker storage: it appears to be the only immediate solution (thank you for clarifying that OTA/Offline updates should work).
b) a custom solution with a mix of overlay tmpfs and persistent mounts within the Docker data-root: this seems like a trial-and-error solution that requires engineering first and will likely break Torizon Updates.
c) not using Docker

Currently we are considering option c) for our regular system, which involves either developing our own application update and management mechanism or moving our applications into the BSP itself.

In that regard, I would like to understand why Toradex lists the Torizon OS minimal BSP as a reference BSP only and not for production use.

Additionally, I would like to know which other services (apart from the Docker engine) write to the eMMC regularly in the default Torizon OS configuration. I am aware of systemd-timesyncd, aktualizr and NetworkManager.

Please let me know if you have any other suggestions. Your input is greatly appreciated.

Best regards,
Lukas

Hello @lzaugg_parametric,

Thanks for the additional information.

This is correct for the default configuration in Torizon OS.


From your description, the main issue at this point is the corrupted Docker state that you can experience after power losses, as the eMMC wear was mitigated to acceptable levels.
Considering that your containers are not persistent, removing all containers on start could be an alternative.
The images would be kept, so the containers can be started again.
Otherwise, going the non-Docker route is also a valid option.
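
A minimal sketch of the "remove all containers on start" idea, written in Python for illustration. It only uses the standard `docker ps -aq` and `docker rm -f` commands. The function name is hypothetical, and how the script is hooked into the boot sequence (for example a unit that runs before the docker-compose services are started) is an assumption that is not covered here. Note that removing containers itself writes to the Docker data directory, so this targets stale state after a power loss, not the write volume.

```python
import shutil
import subprocess


def remove_all_containers(run=subprocess.run):
    """Force-remove every container (running or stopped); images are kept.

    `run` is injectable so the function can be exercised without Docker.
    Returns the list of container IDs that were passed to `docker rm`.
    """
    # List the IDs of all containers, including stopped ones.
    listing = run(["docker", "ps", "-aq"], capture_output=True, text=True, check=True)
    container_ids = listing.stdout.split()
    if container_ids:
        # Remove the containers; image layers are not touched by this command.
        run(["docker", "rm", "-f", *container_ids], check=True)
    return container_ids


# Only talk to Docker when run as a script on a system that has the CLI.
if __name__ == "__main__" and shutil.which("docker"):
    removed = remove_all_containers()
    print(f"removed {len(removed)} container(s)")
```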


Torizon OS Minimal is listed as a reference image and not as production ready because customizing it with Yocto is required.
Pre-built Torizon OS images with Docker can be used with little customization using TorizonCore Builder.
Pre-built Torizon OS Minimal images, on the other hand, should never be used in production as they are; they require more substantial customization using Yocto.

If you want to go the no-docker route, Torizon OS Minimal should be the easiest way to go.
It does require some Yocto knowledge, but would allow you to keep devices with the current update system for the OS.
The main difference is that the application will be part of this OS update.
It would also be possible to come up with another approach, such as using subsystem updates.


We do not have a complete list of the services that write to the eMMC at this point.
Further testing would be needed to give an exhaustive list.
If you want to check on your system, using a tool such as iotop should help with that.
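
As a complement to iotop (which shows writes per process), the total amount written to a block device can be read from `/proc/diskstats`. The following is a minimal Python sketch. The device name `mmcblk0` and the function names are assumptions and may differ on your module.

```python
import time

SECTOR_BYTES = 512  # /proc/diskstats counts in 512-byte sectors


def sectors_written(diskstats_text, device):
    """Return the 'sectors written' counter for `device` from diskstats text."""
    for line in diskstats_text.splitlines():
        fields = line.split()
        # Layout: major, minor, name, then the I/O counters;
        # the seventh counter (index 9) is the number of sectors written.
        if len(fields) >= 10 and fields[2] == device:
            return int(fields[9])
    raise KeyError(f"device not found: {device}")


def measure_written_bytes(device="mmcblk0", interval_s=60.0, path="/proc/diskstats"):
    """Return the number of bytes written to `device` during `interval_s` seconds."""
    with open(path) as stats:
        before = sectors_written(stats.read(), device)
    time.sleep(interval_s)
    with open(path) as stats:
        after = sectors_written(stats.read(), device)
    return (after - before) * SECTOR_BYTES
```

Calling `measure_written_bytes("mmcblk0", 3600)` once with the Docker services running and once with them stopped gives a rough idea of how much of the write load comes from the rest of the system.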

Best Regards,
Bruno
