I’m using Torizon Cloud to evaluate the OTA update feature (Torizon OS and/or docker-compose.yaml).
I noticed a situation where the OTA update can no longer be completed. I know it’s a little bit strange, and maybe the behavior is “by design”, but I would like feedback from Toradex.
Here are the steps to reproduce:
Verdin iMX8M-Plus + Verdin Development Board
install Torizon OS OTA
install a custom docker-compose.yaml OTA
connect to the SoM in SSH from local development PC
edit /var/sota/storage/docker-compose/docker-compose.yaml changing the tag of one of the container images
try to push a new docker-compose.yaml update OTA from Torizon Cloud
The update process stays forever in this stage
Hi, I’m the Toradex AI Assistant. The Toradex Support team will follow up with you shortly!
The behavior you’re experiencing is likely due to the manual modification of the docker-compose.yaml file on the device. When you manually edit the docker-compose.yaml file in /var/sota/storage/docker-compose/, it can cause inconsistencies with the state expected by the Torizon OTA update system. Here’s a detailed explanation:
Why This Happens:
State Mismatch: The Torizon OTA system, which uses Aktualizr and OSTree, maintains a specific state of the system and its components. When you manually change the docker-compose.yaml file, it creates a state that the OTA system does not recognize.
Integrity Checks: The OTA system performs integrity checks to ensure that the updates are applied correctly and securely. Manual changes can cause these checks to fail or hang, as the system is trying to reconcile the expected state with the actual state.
Steps to Avoid This Issue:
Avoid Manual Changes: Do not manually edit the docker-compose.yaml file directly on the device. Instead, make all changes through the Torizon Cloud interface or the appropriate OTA mechanisms.
Revert Manual Changes: If you have already made manual changes, revert them to the original state before attempting another OTA update.
How to Recover:
Revert to Original docker-compose.yaml: SSH into the device and revert the docker-compose.yaml file to its original state before the manual edit.
Reboot the Device: Sometimes, a simple reboot can help the system to reinitialize and clear any temporary inconsistencies.
Retry the OTA Update: Once the device is back to its expected state, retry the OTA update from Torizon Cloud.
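Before reverting, it can help to see exactly which image references differ between a saved copy of the compose file and the edited one. Below is a minimal sketch in Python, assuming you kept a copy of the original file and that each service declares its image on a plain `image:` line. The function names and the `example/...` image names are made up for illustration; nothing here is part of Torizon or Aktualizr.

```python
import re

# Matches lines such as "    image: example/app:1.0" (commented-out lines are skipped).
IMAGE_LINE = re.compile(r"^[ \t]*image:[ \t]*(\S+)", re.MULTILINE)


def image_refs(compose_text: str) -> list:
    """Return every value of an `image:` key, in file order, with quotes stripped."""
    return [ref.strip("'\"") for ref in IMAGE_LINE.findall(compose_text)]


def changed_images(original: str, edited: str) -> list:
    """Pair image references by position and return the pairs that differ."""
    pairs = zip(image_refs(original), image_refs(edited))
    return [(old, new) for old, new in pairs if old != new]


# Hypothetical compose contents, only to show the comparison.
ORIGINAL = """\
services:
  app:
    image: example/app:1.0
  db:
    image: example/db:2.3
"""

EDITED = ORIGINAL.replace("example/app:1.0", "example/app:1.1")

for old, new in changed_images(ORIGINAL, EDITED):
    print(f"{old} -> {new}")
```

On the device you would read the two texts from the saved copy and from the file under /var/sota/storage/docker-compose/ instead of using the inline strings. Pairing by position only works if services were not added, removed, or reordered.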
Additional Recommendations:
Use TorizonCore Builder: For creating and managing custom docker-compose.yaml files, use TorizonCore Builder or the Torizon IDE extensions. This ensures that all changes are tracked and managed correctly.
Monitor Logs: Check the logs of Aktualizr and Docker to get more insights into what might be causing the update to hang. This can be done using commands like journalctl -u aktualizr and docker logs <container_id>.
If the issue persists even after following these steps, it might be worth reaching out to Toradex support with detailed logs and steps you have taken for further assistance.
I noticed a situation where the OTA update cannot be completed anymore.
When you say “anymore”, are you implying that you were able to perform successful updates before, but now they’re not completing for some reason?
edit /var/sota/storage/docker-compose/docker-compose.yaml changing the tag of one of the container images
This might not be important. But is there a reason you manually edited the file if you were going to perform an OTA update right after?
Is this expected?
Well, there isn’t any detailed info here to show what is going on. We’ll need the logs from the Aktualizr update client before we can examine anything in detail.
Yes, before I manually changed docker-compose.yaml, the OTA updates completed successfully.
I’m in the development phase, so I have one OS image that I download from the cloud (to speed up the process of installing a given image).
Then I need to change something, test something else, and so on.
For this reason I change things manually on the device.
Then I integrate these changes into a new release of the image, which I would like to install via OTA.
Okay, so let me summarize to make sure I understand. You’re saying that after you manually modified /var/sota/storage/docker-compose/docker-compose.yml, the update no longer worked, but once you reverted the modification the update worked again? Is that right?
Well, I tried to reproduce this behavior by manually modifying the file on my own system and then performing an update. On my side it seems to work fine. If I manually modify the compose file it does not seem to affect the success of the update. After the update the new compose file from Torizon Cloud correctly replaces the old compose file.
Here is the log
Is this the log from when the update was occurring? I don’t see anything in your logs related to an update. It just looks like the logs for Aktualizr while it’s idle waiting for an update.
Could you get the logs that are produced during the update process?
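One way to capture only the relevant window is to note the time just before starting the update in Torizon Cloud and then ask journald for everything the client logged since then. Here is a rough sketch in Python, assuming the unit is named `aktualizr` (as in the `journalctl -u aktualizr` command mentioned earlier in this thread) and that Python is available wherever you run it. The helper names are made up for illustration.

```python
import subprocess
from typing import List, Optional


def journal_cmd(unit: str, since: str, until: Optional[str] = None) -> List[str]:
    """Build a journalctl command limited to one unit and a time window."""
    cmd = ["journalctl", "--no-pager", "-u", unit, "--since", since]
    if until is not None:
        cmd += ["--until", until]
    return cmd


def collect_logs(unit: str, since: str, until: Optional[str] = None) -> str:
    """Run journalctl and return its output as text (raises if the command fails)."""
    result = subprocess.run(
        journal_cmd(unit, since, until),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

For example, `collect_logs("aktualizr", "-15min")` would return what the unit logged in the last 15 minutes. The equivalent one-liner on the device is `journalctl --no-pager -u aktualizr --since -15min`.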
As soon as I manually reverted docker-compose.yaml on the device to the original one
Just to clarify, you’re saying *.yaml, but on Torizon OS it’s actually *.yml. Are you just mistyping, or are you actually creating a new file with *.yaml instead?
No. It’s the log from when the Torizon Cloud platform shows that an update is in progress, but the Aktualizr log shows that no update is in progress.
Interesting question. I realized that on the device the file is .yml (and that is the one I edited), but on Torizon Cloud I uploaded a .yaml file (it’s allowed).
I hope this is fine.
I tried again this morning and I was able to complete the update successfully.
I don’t know what happened last week (more than once).
I’ll let you know if I find a way to reproduce the issue.
It should not matter which extension you upload the file with. Whenever the file gets downloaded to the device it will always be *.yml by default. I just wanted to make sure you were editing the correct file.
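If you want to double-check which compose file is actually present on the device, a small sketch like the following lists whichever of the two spellings exists in a directory. It assumes Python is available where you run it, and the function name is made up for illustration. The directory to pass in would be /var/sota/storage/docker-compose, as mentioned earlier in this thread.

```python
from pathlib import Path

# The two spellings discussed in this thread.
COMPOSE_NAMES = ("docker-compose.yml", "docker-compose.yaml")


def find_compose_files(directory: str) -> list:
    """Return the compose file names that exist as regular files in `directory`."""
    base = Path(directory)
    return [name for name in COMPOSE_NAMES if (base / name).is_file()]
```

If both names come back, a stray *.yaml file was probably created by hand next to the *.yml one that the update client manages.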
That’s strange. Some kind of transient issue? Or maybe another factor/variable that we aren’t considering there?
With regard to manually modifying the compose file, it looks like this shouldn’t affect the update. In my test, at least, it had no effect. Do let me know if you find any further info; otherwise this is a bit of a dead end for me in terms of paths to investigate.