Compose_http_timeout

Issues Upgrading Base OS via https://app.torizon.io/ for some devices.
We are seeing an issue where some of our devices roll back their Base OS version after an upgrade, even though they initially report the updated version. We do see a failure reported for the upgrade in the UI; however, it does not provide any further information.
After some investigation, we found the following errors in the Aktualizr logs:

aktualizr-torizon[18116]: ERROR: for registration UnixHTTPConnectionPool(host='localhost', port=None): Read timed out. (read timeout=60)
aktualizr-torizon[18116]: ERROR: for senceiveio UnixHTTPConnectionPool(host='localhost', port=None): Read timed out. (read timeout=60)
aktualizr-torizon[18116]: An HTTP request took too long to complete. Retry with --verbose to obtain debug information.
aktualizr-torizon[18116]: If you encounter this issue regularly because of slow network conditions, consider setting COMPOSE_HTTP_TIMEOUT to a higher value (current value: 60).

Can we get some assistance on how best to resolve this problem? We are not entirely sure where to change the value of COMPOSE_HTTP_TIMEOUT.

Hi @DuncanNapier ,

Thanks for reaching out and welcome to the Toradex Community. :tada:

Which version of Torizon is your base OS and what version are you trying to upgrade to?

So after the update, did they roll back, or are they still on the updated version?

Have a good day!

Best Regards
Kevin

We’re upgrading to:
Base OS
Type: colibri-imx7-emmc
Version: a custom version, 4.2.260 (5.7.0 Torizon-Core)

It rolls back to another custom version, 1.0.3.0 (5.7.0 Torizon-Core).

Hope that’s helpful.

Just to confirm: the devices appear to have been upgraded, but after a period of time they revert to a previous version of the Base OS.

Hi @DuncanNapier ,

Sorry for the late response.

How’s your internet connection? Are you using WiFi?

Best Regards
Kevin

The devices rely on a 4G connection for comms. This is fairly reliable, and the number of issues during our updates would suggest that the connection is not the problem; if it were, we would expect to see issues in our logs.

Hi @DuncanNapier !

Thanks for the information.

As a sanity check, would you be able to reproduce this issue using a Wi-Fi connection?

Best regards,

Reproduction in itself is complicated: we don’t know how this is happening, and we see no sign in advance that it will happen. These devices are deployed on customer sites.
Because we’re using a 4G network, we are aware that we may be subject to poor signal quality, so when seeing the error message

aktualizr-torizon[18116]: ERROR: for registration UnixHTTPConnectionPool(host='localhost', port=None): Read timed out. (read timeout=60)
aktualizr-torizon[18116]: ERROR: for senceiveio UnixHTTPConnectionPool(host='localhost', port=None): Read timed out. (read timeout=60)
aktualizr-torizon[18116]: An HTTP request took too long to complete. Retry with --verbose to obtain debug information.
aktualizr-torizon[18116]: If you encounter this issue regularly because of slow network conditions, consider setting COMPOSE_HTTP_TIMEOUT to a higher value (current value: 60).

we hope to mitigate the issue by setting COMPOSE_HTTP_TIMEOUT to something larger than 60 s. Is this something we can change, and if so, where?
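
To judge how often these read timeouts occur, and for which services, the log lines quoted above can be tallied. The following is only a minimal sketch: it assumes the aktualizr-torizon log output has been saved as plain text, and the function name `count_compose_timeouts` is hypothetical, not part of any Torizon tooling.

```python
import re
from collections import Counter

# Matches the docker-compose timeout lines relayed by aktualizr-torizon, e.g.
# "ERROR: for registration UnixHTTPConnectionPool(...): Read timed out. (read timeout=60)"
TIMEOUT_RE = re.compile(
    r"ERROR: for (?P<service>\S+)\s+UnixHTTPConnectionPool\(.*?\): "
    r"Read timed out\. \(read timeout=(?P<timeout>\d+)\)"
)


def count_compose_timeouts(lines):
    """Return a Counter mapping (service, timeout_seconds) to occurrences."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            key = (match.group("service"), int(match.group("timeout")))
            counts[key] += 1
    return counts
```

Passing the lines of a saved log file to `count_compose_timeouts` gives a per-service count of timeouts, together with the timeout value that was in effect when each one occurred.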

Hi @DuncanNapier !

I asked internally for some tips.

In the meantime, as a possible source of new information, could you please run a speedtest on the affected module(s)?

docker run -it --rm drewmoseley/speedtest-cli:latest

This is a Docker image from our colleague @drew.tx.

Best regards,

I’ll run this the next time we have a failing module. Our own workaround of resetting the devices means we have few failing modules at the moment.