OTA update doesn't complete from Torizon Cloud

I’m able to use Torizon Cloud to evaluate the OTA update feature (Torizon OS and/or docker-compose.yaml).
I noticed a situation where the OTA update can no longer be completed. I know it’s a little bit strange, and maybe the behavior is “by design”, but I would like some feedback from Toradex.
Here are the steps to reproduce:

  • Verdin iMX8M-Plus + Verdin Development Board
  • install Torizon OS OTA
  • install a custom docker-compose.yaml OTA
  • connect to the SoM via SSH from the local development PC
  • edit /var/sota/storage/docker-compose/docker-compose.yaml changing the tag of one of the container images
  • try to push a new docker-compose.yaml update OTA from Torizon Cloud
    The update process stays forever in this stage

Is this expected?

Hi, I’m the Toradex AI Assistant. The Toradex Support team will follow up with you shortly!

The behavior you’re experiencing is likely due to the manual modification of the docker-compose.yaml file on the device. When you manually edit the docker-compose.yaml file in /var/sota/storage/docker-compose/, it can cause inconsistencies with the state expected by the Torizon OTA update system. Here’s a detailed explanation:

Why This Happens:

  1. State Mismatch: The Torizon OTA system, which uses Aktualizr and OSTree, maintains a specific state of the system and its components. When you manually change the docker-compose.yaml file, it creates a state that the OTA system does not recognize.
  2. Integrity Checks: The OTA system performs integrity checks to ensure that the updates are applied correctly and securely. Manual changes can cause these checks to fail or hang, as the system is trying to reconcile the expected state with the actual state.

Steps to Avoid This Issue:

  1. Avoid Manual Changes: Do not manually edit the docker-compose.yaml file directly on the device. Instead, make all changes through the Torizon Cloud interface or the appropriate OTA mechanisms.
  2. Revert Manual Changes: If you have already made manual changes, revert them to the original state before attempting another OTA update.

How to Recover:

  1. Revert to Original docker-compose.yaml: SSH into the device and revert the docker-compose.yaml file to its original state before the manual edit.
  2. Reboot the Device: Sometimes, a simple reboot can help the system to reinitialize and clear any temporary inconsistencies.
  3. Retry the OTA Update: Once the device is back to its expected state, retry the OTA update from Torizon Cloud.

Additional Recommendations:

  • Use TorizonCore Builder: For creating and managing custom docker-compose.yaml files, use TorizonCore Builder or the Torizon IDE extensions. This ensures that all changes are tracked and managed correctly.
  • Monitor Logs: Check the logs of Aktualizr and Docker to get more insights into what might be causing the update to hang. This can be done using commands like journalctl -u aktualizr and docker logs <container_id>.

If the issue persists even after following these steps, it might be worth reaching out to Toradex support with detailed logs and steps you have taken for further assistance.

Greetings @vix,

I noticed a situation where the OTA update cannot be completed anymore.

When you say “anymore”, are you implying that you were able to perform successful updates before, but now they’re not completing for some reason?

edit /var/sota/storage/docker-compose/docker-compose.yaml changing the tag of one of the container images

This might not be important. But is there a reason you manually edited the file if you were going to perform an OTA update right after?

Is this expected?

Well there isn’t any detailed info here to show what is going on. We’ll need the logs from the Aktualizr update client before we can examine anything in detail.

Best Regards,
Jeremias

Yes, before I manually changed docker-compose.yaml, the OTA updates completed successfully.

I’m in the development phase, so I have one OS image that I download from the cloud (to speed up the process of installing a given image).
Then I need to change something, test something else, and so on.
For this reason I change something manually on the device.
Then I integrate these changes in a new release of the image and I would like to install it via OTA.
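
One way to keep this kind of manual experiment reversible is to copy the compose file aside before editing it and diff against that copy afterwards. The following is only a sketch: the function names and the snapshot location are hypothetical (they are not Torizon tools), and the default compose path is the one mentioned in this thread.

```shell
# Hypothetical helpers, not part of Torizon OS. The default compose path is
# the one mentioned in this thread; the snapshot location is an assumption.
COMPOSE_FILE=/var/sota/storage/docker-compose/docker-compose.yml
SNAPSHOT_FILE=${HOME:-/tmp}/docker-compose.yml.orig

# Copy the compose file aside, unless a snapshot already exists.
compose_snapshot() {
    src=${1:-$COMPOSE_FILE}; snap=${2:-$SNAPSHOT_FILE}
    [ -f "$src" ] || { echo "no such file: $src" >&2; return 1; }
    [ -f "$snap" ] || cp -p "$src" "$snap"
}

# Print a unified diff of the manual edits (exit status 1 means "changed").
compose_changes() {
    src=${1:-$COMPOSE_FILE}; snap=${2:-$SNAPSHOT_FILE}
    diff -u "$snap" "$src"
}

# Overwrite the compose file with the snapshot, discarding manual edits.
compose_restore() {
    src=${1:-$COMPOSE_FILE}; snap=${2:-$SNAPSHOT_FILE}
    [ -f "$snap" ] || { echo "no snapshot: $snap" >&2; return 1; }
    cp -p "$snap" "$src"
}
```

A possible session would be `compose_snapshot` before editing, `compose_changes` to see what should go into the next release, and `compose_restore` to return to the last OTA-installed state. Writing to the compose file may require root privileges on the device.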

Here is the log

torizon@verdin-imx8mp-14777722:~$ sudo journalctl -f -u aktualizr* -b --no-pager
Sep 23 06:39:28 verdin-imx8mp-14777722 aktualizr-torizon[976]: Certificate issuer: CN=ota-devices-CA
Sep 23 06:39:28 verdin-imx8mp-14777722 aktualizr-torizon[976]: Certificate valid from: Sep 16 13:00:43 2024 GMT until: Sep 16 13:00:43 2124 GMT
Sep 23 06:39:28 verdin-imx8mp-14777722 aktualizr-torizon[976]: PROXY: initializing...
Sep 23 06:39:28 verdin-imx8mp-14777722 aktualizr-torizon[976]: PROXY: using TCP port 8850.
Sep 23 06:39:28 verdin-imx8mp-14777722 aktualizr-torizon[976]: Offline Updates are disabled
Sep 23 06:39:28 verdin-imx8mp-14777722 aktualizr-torizon[976]: PROXY: starting thread.
Sep 23 06:39:28 verdin-imx8mp-14777722 aktualizr-torizon[976]: PROXY: listening to connections...
Sep 23 06:39:30 verdin-imx8mp-14777722 aktualizr-torizon[976]: Event: SendDeviceDataComplete
Sep 23 06:39:30 verdin-imx8mp-14777722 aktualizr-torizon[976]: Event: UpdateCheckComplete, Result - No updates available
Sep 23 06:44:30 verdin-imx8mp-14777722 aktualizr-torizon[976]: Event: UpdateCheckComplete, Result - No updates available

But on torizon.io dashboard I see

After I manually reverted docker-compose.yaml on the device to the original one (the last one updated successfully via OTA), everything is OK:

  • same journalctl log on the device
  • this image on the cloud dashboard

Okay, so let me summarize to make sure I understand. You’re saying that after you manually modified /var/sota/storage/docker-compose/docker-compose.yml, the update no longer worked, but once you reverted the modification the update worked again? Is that right?

Well, I tried to reproduce this behavior by manually modifying the file on my own system and then performing an update. On my side it seems to work fine. If I manually modify the compose file it does not seem to affect the success of the update. After the update the new compose file from Torizon Cloud correctly replaces the old compose file.

Here is the log

Is this the log from when the update was occurring? I don’t see anything in your logs related to an update. It just looks like the logs for Aktualizr while it’s idle waiting for an update.

Could you get the logs that are produced during the update process?
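
As a rough sketch of how such a capture could be narrowed down, the helper below reads journal text on standard input and prints only the result of each update check. The function name is hypothetical, and the only log format it relies on is the `Event: UpdateCheckComplete, Result - ...` line quoted earlier in this thread.

```shell
# Hypothetical filter: print the "Result" part of every Aktualizr
# "Event: UpdateCheckComplete" line read from standard input.
update_check_results() {
    sed -n 's/.*Event: UpdateCheckComplete, Result - //p'
}

# Possible use on the device (the unit pattern is quoted so that the shell
# does not expand it):
#   sudo journalctl -u 'aktualizr*' -b --no-pager | update_check_results
```

If every printed line reads "No updates available" while the dashboard shows an update in progress, the device has not seen the update yet, which matches the situation described above.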

As soon as I manually reverted docker-compose.yaml on the device to the original one

Just to clarify, you’re saying *.yaml, but on Torizon OS it’s actually *.yml. Are you just mistyping, or are you actually creating a new file with *.yaml instead?

Best Regards,
Jeremias

Yes

No. It’s the log from when the Torizon Cloud platform shows that an update is in progress, while the aktualizr log shows that no update is in progress.

Interesting question. I realized that on the device the file is .yml (and I edited this one). But on Torizon Cloud I uploaded a .yaml file (it’s allowed).
I hope this is fine.

I tried again this morning and I was able to complete the update successfully.
I don’t know what happened last week (more than once).
I’ll let you know if I find a way to reproduce the issue.

Interesting question. I realized that on the device the file is .yml (and I edited this one). But on Torizon Cloud I uploaded a .yaml file (it’s allowed).
I hope this is fine.

It should not matter which extension you upload it with. Whenever the file gets downloaded to the device, it will always be *.yml by default. I just wanted to make sure you were editing the correct file.

I tried again this morning and I was able to complete the update successfully.
I don’t know what happened last week (more than once).
I’ll let you know if I find a way to reproduce the issue.

That’s strange. Some kind of transient issue? Or maybe another factor/variable that we aren’t considering there?

With regard to manually modifying the compose file, it looks like this shouldn’t affect the update. In my test it had no effect, at least. Well, do let me know if you find any further info; otherwise this is a bit of a dead end for me regarding paths to investigate.

Best Regards,
Jeremias

I suspect this is the case, but it’s up to me to find out what.
If I find out what happened, I’ll let you know.
In the meantime, thank you for your support.

Okay then, do please inform us if you uncover anything new here.

Best Regards,
Jeremias