Offline Update and short-lived containers (TOR-2304)

nclark · February 20, 2025, 12:46pm

I have added a short-lived container to my application, and it does not appear in my Docker image list after applying an offline update. After some research, I now understand this is due to TOR-2304.

An example workaround is to add a sleep to the container to prevent it from ending before the update process completes.

However, how do I know how long the “sleep” period should be?
Presumably the time taken to prune the redundant images will depend on the number and size of the images, so a fixed sleep may not be suitable.

Is there a method to determine when the offline update process has completed?

ToradexAI · February 20, 2025, 12:47pm

Hi, I’m the Toradex AI Assistant. The Toradex Support team will follow up with you shortly!

Hello,

To determine when an offline update process has completed on your Torizon system, you can monitor the logs generated by Aktualizr, the update client responsible for managing updates. These logs record the start and completion of update processes and can be read with the journalctl command, which lets you follow the progress of an update in real time.

For managing short-lived containers during updates, consider implementing a dynamic approach rather than relying on a fixed sleep duration. You can create a script or use systemd services to monitor the update process. For instance, your script could check the logs for a specific completion message or monitor the status of the update-related systemd service.

Additionally, incorporating health checks or watchdog mechanisms within your container can help ensure it remains active until the update process is confirmed complete. This strategy can prevent the container from terminating prematurely.

By using these monitoring techniques, you can more accurately determine when the update process has finished and manage your short-lived containers accordingly.

Best regards,
Toradex AI Assistant.

jeremias.tx · February 24, 2025, 7:13pm

Greetings @nclark,

However, how do I know how long the “sleep” period should be?

So the sleep period should only need to cover the time between the update process running docker-compose up and the following docker system prune. I would say a sleep in the range of 1 minute should cover nearly all cases, unless your container stack takes a long time to process for some reason. You could bump the sleep up to 5 or 10 minutes to be very sure; I can’t imagine any practical container stack taking that long between the up command and the prune command, as these commands are run one after the other in the update process.

Another method is to have your short-lived container run something like tail -F anything. This will keep the container running forever, essentially turning it into a long-lived container. That said, this container would then keep running and never actually exit and stop. If you want the container to eventually exit and stop, then the sleep method is probably preferred: once the duration of the sleep has passed, the container will stop and exit as expected.
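
As a rough illustration of the sleep method, here is a minimal Python sketch of a container entrypoint that does its one-shot work and then stays alive until a grace period has elapsed since start. The names run_then_linger, grace_from_env and the GRACE_SECONDS environment variable are hypothetical, not part of Torizon; the 60 second default simply mirrors the 1 minute suggestion above.

```python
import os
import time
from typing import Callable


def grace_from_env(default: float = 60.0) -> float:
    """Read the grace period from GRACE_SECONDS (a hypothetical variable name)."""
    return float(os.environ.get("GRACE_SECONDS", default))


def run_then_linger(task: Callable[[], None], grace_seconds: float) -> float:
    """Run task, then sleep until at least grace_seconds have passed since start.

    Returns the total elapsed time in seconds.
    """
    start = time.monotonic()
    task()
    # Only sleep for whatever part of the grace period the task did not use up.
    remaining = grace_seconds - (time.monotonic() - start)
    if remaining > 0:
        time.sleep(remaining)
    return time.monotonic() - start
```

An entrypoint would call something like run_then_linger(do_work, grace_from_env()), where do_work is your own one-shot function. Measuring from container start rather than from the end of the task means a slow task does not add extra idle time on top of the grace period.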

Is there a method to determine when the offline update process has completed?

Officially, the update is considered successful when the new docker-compose.yml file takes its place at /var/sota/storage/docker-compose/docker-compose.yml in the device filesystem.
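
If you would rather wait on that file than use a fixed sleep, a minimal Python sketch could poll it for a content change. This is only an illustration under assumptions: the function names, the timeout and the polling interval are hypothetical, the path has to be visible to whatever runs the script (for example via a read-only bind mount if it runs inside a container), and this thread does not state whether the prune has already finished at the moment the file is replaced.

```python
import hashlib
import time
from pathlib import Path
from typing import Optional

# Path quoted in the post above.
COMPOSE_PATH = Path("/var/sota/storage/docker-compose/docker-compose.yml")


def file_digest(path: Path) -> Optional[str]:
    """Return the SHA-256 hex digest of the file, or None if it does not exist."""
    try:
        return hashlib.sha256(path.read_bytes()).hexdigest()
    except FileNotFoundError:
        return None


def wait_for_change(path: Path, baseline: Optional[str],
                    timeout: float = 600.0, interval: float = 2.0) -> bool:
    """Poll until the file exists with a digest different from baseline.

    Returns True if a change was seen, False if the timeout expired first.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        current = file_digest(path)
        # Ignore a momentarily missing file while it is being replaced.
        if current is not None and current != baseline:
            return True
        time.sleep(interval)
    return False
```

Typical use would be to record baseline = file_digest(COMPOSE_PATH) when the container starts, then call wait_for_change(COMPOSE_PATH, baseline) before exiting. A timeout is kept so the container still exits if no update is in progress.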

Best Regards,
Jeremias