Compose_http_timeout

DuncanNapier · February 13, 2024, 1:23pm

Issues Upgrading Base OS via https://app.torizon.io/ for some devices.
We are seeing an issue where some of our devices roll back their BaseOS version after an upgrade, even though they initially report the updated version. We do see a failure reported for the upgrade in the UI; however, it does not provide any information.
After some investigation, we see the following error in the Aktualizr logs:

aktualizr-torizon[18116]: ERROR: for registration UnixHTTPConnectionPool(host='localhost', port=None): Read timed out. (read timeout=60)
aktualizr-torizon[18116]: ERROR: for senceiveio UnixHTTPConnectionPool(host='localhost', port=None): Read timed out. (read timeout=60)
aktualizr-torizon[18116]: An HTTP request took too long to complete. Retry with --verbose to obtain debug information.
aktualizr-torizon[18116]: If you encounter this issue regularly because of slow network conditions, consider setting COMPOSE_HTTP_TIMEOUT to a higher value (current value: 60).
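
When the same error shows up across several devices, it can help to count which services time out and with what timeout value. The following is a minimal sketch, assuming the logs have been exported as plain text; the function name `count_timeouts` and the regular expression are illustrative only and are not part of Aktualizr or docker-compose.

```python
import re
from collections import Counter

# Matches timeout errors of the form shown in the excerpt above.
# ".*?" skips the connection pool arguments, so the quote style does not matter.
TIMEOUT_RE = re.compile(
    r"ERROR: for (?P<service>\S+)\s+UnixHTTPConnectionPool\(.*?\): "
    r"Read timed out\. \(read timeout=(?P<timeout>\d+)\)"
)


def count_timeouts(lines):
    """Return a Counter mapping (service, timeout_seconds) to occurrences."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            key = (match.group("service"), int(match.group("timeout")))
            counts[key] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Hypothetical usage: pipe exported log text into the script on stdin.
    for (service, timeout), n in sorted(count_timeouts(sys.stdin).items()):
        print(f"{service}: {n} timeout(s) at {timeout}s")
```

A report like this does not explain the rollback by itself, but it shows whether the timeouts are tied to particular containers or affect all of them.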

Can we get some assistance on how best to resolve this problem? We are not entirely sure where to change the value of COMPOSE_HTTP_TIMEOUT.

kevin.tx · February 13, 2024, 1:40pm

Hi @DuncanNapier ,

Thanks for reaching out and welcome to the Toradex Community.

Which version of Torizon is your base OS and what version are you trying to upgrade to?

So after the update, did they roll back, or are they on the updated version?

Have a good day!

Best Regards
Kevin

DuncanNapier · February 13, 2024, 2:10pm

We’re upgrading to:
Base OS type: colibri-imx7-emmc
Version: custom version of 4.2.260 (5.7.0 Torizon-Core)

It rolls back to, again, a custom version: 1.0.3.0 (5.7.0 Torizon-Core)

Hope that’s helpful.

DuncanNapier · February 15, 2024, 2:15pm

Just to confirm: they appear as if they have been upgraded, but after a period of time they revert to a previous version of the BaseOS.

kevin.tx · February 26, 2024, 1:44pm

Hi @DuncanNapier ,

Sorry for the late response.

How’s your internet connection? Are you using WiFi?

Best Regards
Kevin

DuncanNapier · February 28, 2024, 10:45am

The devices rely on a 4G connection for comms. This is fairly reliable, and the number of issues during our updates would suggest that the connection is not the problem. If it were, we would expect to see issues in our logs.

henrique.tx · February 28, 2024, 9:10pm

Hi @DuncanNapier !

Thanks for the information.

As a sanity check, would you be able to reproduce this issue using a Wi-Fi connection?

Best regards,

DuncanNapier · February 29, 2024, 9:04am

Reproduction in itself is complicated. We don’t know how this is happening, and we have no sign that it is about to happen. These devices are deployed on customer sites.
Because we’re using a 4G network, we are aware that we may be subject to poor signal quality, so when seeing the error message

aktualizr-torizon[18116]: ERROR: for registration UnixHTTPConnectionPool(host='localhost', port=None): Read timed out. (read timeout=60)
aktualizr-torizon[18116]: ERROR: for senceiveio UnixHTTPConnectionPool(host='localhost', port=None): Read timed out. (read timeout=60)
aktualizr-torizon[18116]: An HTTP request took too long to complete. Retry with --verbose to obtain debug information.
aktualizr-torizon[18116]: If you encounter this issue regularly because of slow network conditions, consider setting COMPOSE_HTTP_TIMEOUT to a higher value (current value: 60).

we hope to mitigate the issue by setting COMPOSE_HTTP_TIMEOUT to something larger than 60s. Is this something we can change, and if so, where?

henrique.tx · February 29, 2024, 11:39am

Hi @DuncanNapier !

I asked internally for some tips.

In the meantime, as a possible source of new information, could you please run a speedtest on the affected module(s)?

docker run -it --rm drewmoseley/speedtest-cli:latest

This is a Docker image from our colleague @drew.tx.

Best regards,

DuncanNapier · March 6, 2024, 12:12pm

I’ll run this the next time we have a failing module. Our own workaround of resetting the devices means we have few failing modules at the moment.