Updating my image through Torizon Cloud failed

tarekmhnds · July 3, 2025, 8:58am

Hello,

I have tried many times to download my image from my application in Torizon Cloud to my Apalis iMX8 board (with an Ixora carrier board), but it always fails.

A few seconds after the update starts, the same thing always happens: the device goes offline, and shortly afterwards the update fails, even though the device stays connected to the network on my side.

About two weeks ago, right after I uploaded it, this image was downloaded successfully, but on only one Apalis board.

I have Torizon OS 6 on my board. Updating the OS to:
kirkstone/apalis-imx8/torizon/torizon-core docker/release
6.2.0+build.2
succeeded.

I tried replacing the Apalis module with another one that I have, and I also replaced the Ixora carrier board, but the image update always fails.

I also tried disconnecting my device from Torizon Cloud and reconnecting it, but got the same result.

How can I solve this problem please?
What should I look for or check?

Best regards,
Tarek

ToradexAI · July 3, 2025, 8:58am

Hi, I’m the Toradex AI Assistant.

I might make mistakes. Our support team will step in if needed. If I answered your question, please tick the “Solution” checkbox. For follow-up questions to me, please use @ToradexAI in your post.

Hello Tarek,

Thank you for your detailed description of the issue. It looks like you are experiencing repeated OTA update failures with your Apalis iMX8 board, even after troubleshooting steps like switching hardware and reconnecting to Torizon Cloud.

Given what you described, here are some steps and checks you can perform:

1. Network Stability

Even if your device seems connected from your side, please make sure the board has continuous network access throughout the entire update. Intermittent network dropouts can cause the OTA update to fail, since Torizon OS devices poll for updates roughly every 5 minutes and require a stable connection during the process.

2. System Logs

Collect and analyze logs from your device during and after the update attempt. You can run the following command via SSH to check logs related to the Aktualizr client (the update agent):
```
journalctl -u aktualizr.service
```
Also, inspect general kernel and system logs:
```
dmesg
journalctl -xe
```

3. Power Supply

Unstable power during the update process can make the device go offline and roll back the update. Please double-check your power supply for voltage drops or interruptions.

4. Image Signature and Integrity

Torizon OTA requires images to be signed and verified for integrity and authenticity. If your uploaded image has expired signatures or has been corrupted, the update will fail. It is worth re-uploading a fresh version of your application/container to Torizon Cloud and triggering the deployment again.

5. Device Status on Torizon Cloud

After a failed update, check the device status and any error messages in the Torizon Cloud web interface. Sometimes the Cloud dashboard shows more specific logs or failure reasons.

6. Rollback Behavior

Torizon OS uses atomic updates with built-in rollback, so a failed OTA update won’t brick your system. Continual failures, however, indicate an underlying issue that needs to be resolved.

7. Other Troubleshooting

- Restart the board and retry the update after a clean boot.
- Check for a firewall or proxy on your network that could interfere with the TLS/HTTPS connections required for secure OTA updates.
- Confirm you are using a supported OS version and that your image is compatible with Torizon OS 6.

If you still cannot determine the cause from the steps above, please provide:

- Any error messages from the update attempts (from system logs or Torizon Cloud dashboard)
- The output of journalctl -u aktualizr.service
- A description of your network setup (direct connection, behind NAT, VPN, etc.)

Let me know what you discover or if you can share more logs or information.

Best regards,
Toradex AI Assistant.

tarekmhnds · July 3, 2025, 11:14am

Hello @ToradexAI ,

The offline state appears in Torizon Cloud while the update is in progress.

For your steps:

1- I have a stable network with 50Mbps download speed.

2- journalctl -u aktualizr.service gives:
-- No entries --.

I checked journalctl -xe during the update and after it:
During the update everything was correct: It was downloading the packages and extracting them, but after the update failed I got these messages:

Jul 03 10:56:18 apalis-imx8-15610356 aktualizr-torizon[5402]: Total reclaimed space: 3.614GB
Jul 03 10:56:18 apalis-imx8-15610356 aktualizr-torizon[1069]: Event: InstallTargetComplete, Result - Error
Jul 03 10:56:18 apalis-imx8-15610356 aktualizr-torizon[1069]: Event: AllInstallsComplete, Result - docker-compose:INSTALL_FAILED
Jul 03 10:56:18 apalis-imx8-15610356 aktualizr-torizon[1069]: Update install completed. Releasing the update lock…
Jul 03 10:56:19 apalis-imx8-15610356 aktualizr-torizon[1069]: Event: SendDeviceDataComplete
Jul 03 10:56:19 apalis-imx8-15610356 aktualizr-torizon[1069]: Current versions in storage and reported by OSTree do not match
Jul 03 10:56:19 apalis-imx8-15610356 aktualizr-torizon[1069]: Event: PutManifestComplete, Result - Success
Jul 03 10:56:22 apalis-imx8-15610356 aktualizr-torizon[1069]: PROXY: sending device data to Torizon OTA.
Jul 03 10:56:22 apalis-imx8-15610356 aktualizr-torizon[1069]: Event: SendDeviceDataComplete
Jul 03 10:56:49 apalis-imx8-15610356 aktualizr-torizon[1069]: Current versions in storage and reported by OSTree do not match
Jul 03 10:56:50 apalis-imx8-15610356 aktualizr-torizon[1069]: Event: UpdateCheckComplete, Result - No updates available
Jul 03 10:57:20 apalis-imx8-15610356 aktualizr-torizon[1069]: Current versions in storage and reported by OSTree do not match
Jul 03 10:57:21 apalis-imx8-15610356 aktualizr-torizon[1069]: Event: UpdateCheckComplete, Result - No updates available
Jul 03 10:57:51 apalis-imx8-15610356 aktualizr-torizon[1069]: Current versions in storage and reported by OSTree do not match

3- Power supply is stable.

4- I’m not sure if this step is important, because I tried this image previously on one Apalis board.

5- After a failed update, the device status goes back to Online in Torizon Cloud, with an “Update failed” message.

6- The board didn’t restart or roll back after the update failed.

7- * I restarted my board, and even swapped it for another one, many times.
* I’m not sure which firewall settings I should have. I have Windows 11 with default settings, and there is no proxy on my network.
* My image was tested with Torizon OS 6, the same OS that I now have on my Apalis board.

Do you have any ideas about the messages from journalctl -xe please?
Or do you have any other ideas to solve this problem?

Best regards,
Tarek

ToradexAI · July 3, 2025, 11:15am

Hello Tarek,

Thank you for the detailed information and for providing the log messages. The messages you are seeing indicate that the update is failing during the container deployment phase, specifically with “docker-compose:INSTALL_FAILED” and the warning “Current versions in storage and reported by OSTree do not match”. This suggests an issue at the application container installation/update stage.
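To pull the failing steps out of a long journal, a small filter can help. The following is only a sketch under stated assumptions: the function name `filter_update_failures` is hypothetical, and the two patterns are simply the failure strings visible in the log excerpt you posted, so other failure messages would not be caught.

```shell
#!/bin/sh
# Sketch: keep only journal lines that report a failed update step.
# Assumption: the patterns below come from the log excerpt in this thread
# ("Result - Error" and "INSTALL_FAILED"); they are not an exhaustive list.
filter_update_failures() {
    # Reads log lines on stdin and prints the matching ones on stdout.
    grep -E 'Result - Error|INSTALL_FAILED'
}
```

On the board it could be fed from the journal, for example `journalctl -t aktualizr-torizon --no-pager | filter_update_failures`, where `-t` selects entries by the `aktualizr-torizon` identifier that appears in your log lines.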

Here’s how to proceed:

1. Diagnose Docker and Container State

- The message “Total reclaimed space: 3.614GB” implies old containers/images were cleaned up, which is expected during an update.
- The mismatch between “storage and OSTree” versions typically hints that the system expects a different state than what is actually present. Most often, this is triggered if part of the update completes (like downloading images), but part fails (like deploying the new application stack).

Action Items:

- Check if Docker is running properly. Log in to your board and run:
```
systemctl status docker
docker ps -a
```
- Check for errors in the Docker daemon logs:
```
journalctl -u docker.service
```

2. Check for Application/Compose File Issues

- Sometimes, the docker-compose file may reference images or configuration that are not compatible with the current system, leading to deployment failures.
- Test by deploying a very simple container (such as hello-world or a minimal busybox container) to isolate whether the problem is with your application or with the system as a whole.

3. Resource Constraints

Ensure you still have enough free space for the deployment, even after reclaiming. Check disk usage with:
```
df -h
```
Confirm that memory and CPU resources aren’t exhausted by running:
```
free -h
top
```
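If you want to repeat the disk check before every deployment, it can be wrapped in a small helper. This is a minimal sketch, not a Torizon tool: the function names `free_kb` and `has_free_space` are hypothetical, and the path and threshold in the usage note are assumptions you should adapt to your image size.

```shell
#!/bin/sh
# free_kb PATH: print the available space, in KiB, on the filesystem holding PATH.
free_kb() {
    # -P forces the POSIX one-line-per-filesystem format, -k reports 1024-byte blocks;
    # the fourth column of the data line is the "Available" value.
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# has_free_space PATH MIN_MB: succeed only if at least MIN_MB MiB are available.
has_free_space() {
    avail_kb=$(free_kb "$1")
    [ -n "$avail_kb" ] && [ "$avail_kb" -ge $(( $2 * 1024 )) ]
}
```

For example, `has_free_space /var/lib/docker 4096 || echo "low disk space"` would warn when less than 4 GiB is free, assuming Docker uses its default data directory `/var/lib/docker` on your board; the 4096 MiB threshold is only an illustration.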

4. Attempt a Simple Deployment

From Torizon Cloud or locally, try to deploy a minimal application container using a simple docker-compose.yml. If this succeeds, the issue is likely within your application compose file or images.

5. Double-Check Network

Even with a stable connection, it’s worth double-checking for potential interruptions during the download. Look for timeouts or connectivity drops in the update client’s log right after triggering an update. Since journalctl -u aktualizr.service returned no entries on your board and your log lines are tagged aktualizr-torizon, filter by that identifier instead, for example with journalctl -t aktualizr-torizon.

6. Next Steps

If the above steps do not clarify the issue, please provide the following information:

- The contents of your docker-compose.yml file.
- The output of docker ps -a and df -h directly after a failed update.
- Any error messages from journalctl -u docker.service during the update attempt.

Let me know the outcomes or if you can share more detail from your compose file and docker logs. This can help identify if the problem is with the container images, the compose file, or system resource/execution state.

Best regards,
Toradex AI Assistant.

tarekmhnds · July 3, 2025, 12:09pm

Hello @ToradexAI ,

I found the issue:

In the volumes section of my docker-compose.yml I have these two bind mounts:

- source: /home/torizon/settings.json
  target: /home/torizon/settings.json
  type: bind
- source: /home/torizon/VanSystemLog.txt
  target: /home/torizon/VanSystemLog.txt
  type: bind

These two files hold the settings and the log.
I found that I have to create both files on my board under:
/home/torizon
before I download my image from Torizon Cloud, otherwise the update fails.

Best regards,
Tarek

ToradexAI · July 3, 2025, 12:10pm

Hello Tarek,

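Thank you for sharing your solution. Yes, that explains the docker-compose:INSTALL_FAILED result. With the long volume syntax you are using (type: bind), Docker does not create a missing source path on the host, so if /home/torizon/settings.json or /home/torizon/VanSystemLog.txt does not exist, the bind mount cannot be set up and the container deployment fails. (With the short source:target syntax, Docker instead creates an empty directory at the missing path, which is also not what you want for a file.)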
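A quick pre-flight check on the board can catch this before an update is triggered. The sketch below is a rough heuristic, not a YAML parser: the function names `list_bind_sources` and `missing_bind_sources` are hypothetical, and it assumes each bind source is written as an unquoted absolute path on its own `source:` line, as in your compose file.

```shell
#!/bin/sh
# list_bind_sources FILE: print every absolute path found on a "source:" line.
# Assumption: paths are unquoted and start with "/", so named volumes are skipped.
list_bind_sources() {
    sed -n 's|^[[:space:]]*-\{0,1\}[[:space:]]*source:[[:space:]]*\(/[^[:space:]]*\).*$|\1|p' "$1"
}

# missing_bind_sources FILE: print the listed paths that do not exist on this host.
missing_bind_sources() {
    list_bind_sources "$1" | while IFS= read -r src; do
        [ -e "$src" ] || printf '%s\n' "$src"
    done
}
```

Running `missing_bind_sources docker-compose.yml` on the device itself (the paths are host paths) prints nothing when every bind source is present, and prints one line per missing file otherwise.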

For future reference, here are two approaches you could consider:

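- Create the required files on the board before deployment (your current solution).
- Use named volumes instead of bind mounts. Docker creates a named volume automatically if it doesn’t exist. Note that a named volume is mounted as a directory, so settings.json and VanSystemLog.txt would live inside that directory rather than being mounted as individual files.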
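The first approach can be scripted so that it is not forgotten when a new board is provisioned. This is a minimal sketch under stated assumptions: the function name `ensure_bind_files` is hypothetical, and it creates empty files, so whether your application accepts an empty settings.json is something only you can confirm.

```shell
#!/bin/sh
# ensure_bind_files PATH...: make sure each PATH exists as a regular file.
ensure_bind_files() {
    for f in "$@"; do
        # A directory at this path (for example one created by an earlier
        # short-syntax mount) would break a file bind mount, so stop here.
        if [ -d "$f" ]; then
            echo "error: $f is a directory, expected a file" >&2
            return 1
        fi
        mkdir -p "$(dirname "$f")" || return 1
        # Create an empty file only when nothing is there; existing data is kept.
        [ -e "$f" ] || touch "$f" || return 1
    done
}
```

For your compose file the call would be `ensure_bind_files /home/torizon/settings.json /home/torizon/VanSystemLog.txt`, run on the board before triggering the update from Torizon Cloud.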

Your experience will help others who might face similar issues with bind mounts in Torizon containers.

Best regards,
Toradex AI Assistant