OTA issues from Torizon Platform and GitHub Actions

Hi,

I am getting a failed build in GitHub Actions, which reports:

Status: Downloaded newer image for torizon/torizoncore-builder:early-access
You are running an early access version of TorizonCore Builder.
An unexpected Exception occured. Please provide the following stack trace to
the Toradex TorizonCore support team:


Traceback (most recent call last):
  File "/builder/torizoncore-builder", line 221, in <module>
    mainargs.func(mainargs)
  File "/builder/tcbuilder/cli/platform.py", line 538, in do_platform_push
    package_info, compatible_with = _check_compatible_with_param(args.compatible_with, credentials)
  File "/builder/tcbuilder/cli/platform.py", line 472, in _check_compatible_with_param
    return translate_compatible_packages(credentials, criteria)
  File "/builder/tcbuilder/backend/platform.py", line 1144, in translate_compatible_packages
    server_creds = sotaops.ServerCredentials(credentials)
  File "/builder/tcbuilder/backend/sotaops.py", line 36, in __init__
    self._load()
  File "/builder/tcbuilder/backend/sotaops.py", line 40, in _load
    with ZipFile(fname, "r") as archive:
  File "/usr/lib/python3.9/zipfile.py", line 1257, in __init__
    self._RealGetContents()
  File "/usr/lib/python3.9/zipfile.py", line 1324, in _RealGetContents
    raise BadZipFile("File is not a zip file")
zipfile.BadZipFile: File is not a zip file

> TASK tcb-platform-publish exited with error code 254 <
Error: Process completed with exit code 1.

Furthermore, I ran the tcb-platform-publish task, which completed successfully, and the update shows up on the Torizon Platform. However, when I try to update the fleet, the update is queued but then fails on the device. The journalctl output is as follows:

Jul 23 09:15:16 verdin-imx8mm-14756428 aktualizr-torizon[896]: Current versions in storage and reported by OSTree do not match
Jul 23 09:15:19 verdin-imx8mm-14756428 aktualizr-torizon[896]: Current version for ECU ID: 1e915376f5515483a365569f9f43a1efffa6ea1dc485163a2de734aadf71bfe2 is unknown
Jul 23 09:15:19 verdin-imx8mm-14756428 aktualizr-torizon[896]: New updates found in Director metadata. Checking Image repo metadata...
Jul 23 09:15:20 verdin-imx8mm-14756428 aktualizr-torizon[896]: 1 new update found in both Director and Image repo metadata.
Jul 23 09:15:20 verdin-imx8mm-14756428 aktualizr-torizon[896]: Event: UpdateCheckComplete, Result - Updates available
Jul 23 09:15:20 verdin-imx8mm-14756428 aktualizr-torizon[896]: Update available. Acquiring the update lock...
Jul 23 09:15:20 verdin-imx8mm-14756428 aktualizr-torizon[896]: Current version for ECU ID: 1e915376f5515483a365569f9f43a1efffa6ea1dc485163a2de734aadf71bfe2 is unknown
Jul 23 09:15:21 verdin-imx8mm-14756428 aktualizr-torizon[896]: Event: DownloadProgressReport, Progress at 100%
Jul 23 09:15:21 verdin-imx8mm-14756428 aktualizr-torizon[896]: Event: DownloadTargetComplete, Result - Success
Jul 23 09:15:21 verdin-imx8mm-14756428 aktualizr-torizon[896]: Event: AllDownloadsComplete, Result - Success
Jul 23 09:15:21 verdin-imx8mm-14756428 aktualizr-torizon[896]: Current version for ECU ID: 1e915376f5515483a365569f9f43a1efffa6ea1dc485163a2de734aadf71bfe2 is unknown
Jul 23 09:15:21 verdin-imx8mm-14756428 aktualizr-torizon[896]: Waiting for Secondaries to connect to start installation...
Jul 23 09:15:23 verdin-imx8mm-14756428 aktualizr-torizon[896]: No update to install on Primary
Jul 23 09:15:23 verdin-imx8mm-14756428 aktualizr-torizon[896]: Event: InstallStarted
Jul 23 09:15:23 verdin-imx8mm-14756428 aktualizr-torizon[896]: Updating containers via docker-compose
Jul 23 09:15:23 verdin-imx8mm-14756428 aktualizr-torizon[896]: Running docker-compose pull
Jul 23 09:15:23 verdin-imx8mm-14756428 aktualizr-torizon[896]: Running command: /usr/bin/docker-compose --file /var/sota/storage/docker-compose/docker-compose.yml.tmp pull --no-parallel
Jul 23 09:15:23 verdin-imx8mm-14756428 aktualizr-torizon[1842]:  geopaxapp-svc Pulling
Jul 23 09:15:24 verdin-imx8mm-14756428 dockerd[739]: time="2023-07-23T09:15:24.478829249Z" level=warning msg="reference for unknown type: " digest="sha256:f09d1d40239281aa0f4c8088676faa1f6da34b625b53b190282788101233e86e" remote="docker.io/geopaxpvtltd/geopaxapp-svc@sha256:f09d1d40239281aa0f4c8088676faa1f6da34b625b53b190282788101233e86e"
Jul 23 09:15:26 verdin-imx8mm-14756428 dockerd[739]: time="2023-07-23T09:15:26.415496733Z" level=error msg="Not continuing with pull after error: errors:\ndenied: requested access to the resource is denied\nunauthorized: authentication required\n"
Jul 23 09:15:26 verdin-imx8mm-14756428 dockerd[739]: time="2023-07-23T09:15:26.417149558Z" level=info msg="Ignoring extra error returned from registry: unauthorized: authentication required"
Jul 23 09:15:26 verdin-imx8mm-14756428 aktualizr-torizon[1842]:  geopaxapp-svc Error
Jul 23 09:15:26 verdin-imx8mm-14756428 aktualizr-torizon[1842]: Error response from daemon: pull access denied for geopaxpvtltd/geopaxapp-svc, repository does not exist or may require 'docker login': denied: requested access to the resource is denied
Jul 23 09:15:26 verdin-imx8mm-14756428 aktualizr-torizon[896]: Error running docker-compose pull
Jul 23 09:15:26 verdin-imx8mm-14756428 aktualizr-torizon[896]: Rolling back container update
Jul 23 09:15:26 verdin-imx8mm-14756428 aktualizr-torizon[896]: Event: InstallTargetComplete, Result - Error
Jul 23 09:15:26 verdin-imx8mm-14756428 aktualizr-torizon[896]: Event: AllInstallsComplete, Result - docker-compose:INSTALL_FAILED
Jul 23 09:15:26 verdin-imx8mm-14756428 aktualizr-torizon[896]: Update install completed. Releasing the update lock...
Jul 23 09:15:26 verdin-imx8mm-14756428 aktualizr-torizon[896]: Current versions in storage and reported by OSTree do not match
Jul 23 09:15:27 verdin-imx8mm-14756428 aktualizr-torizon[896]: Event: PutManifestComplete, Result - Success

Apparently Docker Hub denies access when the device tries to pull the image.

Furthermore, when the docker repository is changed to private, the tcb-platform-publish task fails with the following message:

 *  Executing task: DOCKER_HOST= source ./.conf/tcb-env-setup.sh -s /home/ha-01/GeopaxApp/storage -t early-access 

Warning: If you intend to use torizoncore-builder as a server (listening to ports), then you should pass extra parameters to "docker run" (via the -- switch).
Setting up TorizonCore Builder with version early-access.

You are running an early access version of TorizonCore Builder.
Access to manifest for image 'geopaxpvtltd/geopaxapp-svc:imagePreProduction' was not authorized; be sure to pass a proper username/password pair for the registry.
Error: Could not determine digest for image 'geopaxpvtltd/geopaxapp-svc:imagePreProduction'.

 *  The terminal process "/usr/bin/bash '-c', 'DOCKER_HOST= source ./.conf/tcb-env-setup.sh -s /home/ha-01/GeopaxApp/storage -t early-access'" terminated with exit code: 255. 
 *  Terminal will be reused by tasks, press any key to close it. 

In this case, the image is published to the repository, but the tcb-platform-publish task fails.

Furthermore, while experimenting with OTA updates, I queued an update to the kirkstone 6.3.0 image. The update completed successfully, but it left my dev environment in a state where I can no longer debug my application container from VS Code with the ApolloX extension, using the same Dockerfile that worked before. The error seems to occur while creating a folder to mount. The error output is below:

 *  Executing task: sshpass -p hasann99 ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no torizon@192.168.0.158 LOCAL_REGISTRY=192.168.0.247 TAG=arm64 docker-compose up -d geopaxapp-svc-debug 

Warning: Permanently added '192.168.0.158' (ED25519) to the list of known hosts.
time="2023-07-23T15:17:59Z" level=warning msg="The \"DOCKER_LOGIN\" variable is not set. Defaulting to a blank string."
 Network torizon_default  Creating
 Network torizon_default  Created
 Container torizon-geopaxapp-svc-debug-1  Creating
 Container torizon-geopaxapp-svc-debug-1  Created
 Container torizon-geopaxapp-svc-debug-1  Starting
Error response from daemon: error while creating mount source path '/appdata/config': mkdir /appdata: operation not permitted

 *  The terminal process "sshpass '-p', 'hasann99', 'ssh', '-o', 'UserKnownHostsFile=/dev/null', '-o', 'StrictHostKeyChecking=no', 'torizon@192.168.0.158', 'LOCAL_REGISTRY=192.168.0.247 TAG=arm64 docker-compose up -d geopaxapp-svc-debug'" terminated with exit code: 1. 

The compose file associated with it is as below:

version: "3.9"
services:
  geopaxapp-svc-debug:
    build:
      context: .
      dockerfile: Dockerfile.debug
    image: ${LOCAL_REGISTRY}:5002/geopaxapp-svc-debug:${TAG}
    user: root
    ports:
      - 2230:2230
      - 8000:8000
      - 8443:8443
      - 2101:2101
    devices:
      - "/dev/gpiochip4:/dev/gpiochip4"
      - "/dev/ttyACM0:/dev/ttyACM0"
      - "/dev/ttyACM1:/dev/ttyACM1"
      - "/dev/verdin-uart1:/dev/verdin-uart1"
      - "/dev/verdin-uart2:/dev/verdin-uart2"
    volumes:
      - "/var/run/dbus:/var/run/dbus"
      - "/var/run/sdp:/var/run/sdp"
      - "/sys/block:/sys/block"
      - "/dev:/dev"
      - "/mnt:/mnt"
      - "/appdata/config:/appdata/config"
      - "/appdata/log:/appdata/log"
      - "/appdata/data:/appdata/data"

  geopaxapp-svc:
    build:
      context: .
      dockerfile: Dockerfile
    image: ${DOCKER_LOGIN}/geopaxapp-svc:${TAG}
    ports:
      - 8000:8000
      - 8443:8443
      - 2101:2101
    devices:
      - "/dev/gpiochip4:/dev/gpiochip4"
      - "/dev/ttyACM0:/dev/ttyACM0"
      - "/dev/ttyACM1:/dev/ttyACM1"
      - "/dev/verdin-uart1:/dev/verdin-uart1"
      - "/dev/verdin-uart2:/dev/verdin-uart2"
    volumes:
      - "/var/run/dbus:/var/run/dbus"
      - "/var/run/sdp:/var/run/sdp"
      - "/sys/block:/sys/block"
      - "/dev:/dev"
      - "/mnt:/mnt"
      - "/appdata/config:/appdata/config"
      - "/appdata/log:/appdata/log"
      - "/appdata/data:/appdata/data"

Hello @geopaxpvtltd ,
It seems that you are using an early access version of torizoncore-builder. Do you get the same error if you use a non-early access version?

Best regards,
Josep

I resolved the mount error by changing the Docker host path to the home directory.
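
For anyone hitting the same "mkdir /appdata: operation not permitted" error, the change amounts to moving the host side of each /appdata bind mount under a writable directory. A minimal sketch of that rewrite follows. The exact directory used was not stated, so /home/torizon is an assumption, and rehome_appdata_mounts is a hypothetical helper name:

```python
from typing import List


def rehome_appdata_mounts(volumes: List[str],
                          new_base: str = "/home/torizon") -> List[str]:
    """Move the host side of '/appdata/...' bind mounts under new_base.

    new_base is an assumed writable directory, not a value from the thread.
    """
    rewritten = []
    for entry in volumes:
        # "host:container" -> split on the first colon only
        host, sep, container = entry.partition(":")
        if sep and (host == "/appdata" or host.startswith("/appdata/")):
            host = new_base.rstrip("/") + host  # container path is unchanged
        rewritten.append(host + sep + container)
    return rewritten
```

Passing the volumes list from the compose file above would leave entries such as "/dev:/dev" untouched and only change the three /appdata mounts.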

I resolved this by using this guide: Using Private Registries With the Torizon Cloud | Toradex Developer Center

But now the issue is that the update is queued on the device, then downloaded and extracted, but it fails with the following error in journalctl:

Jul 24 07:57:50 verdin-imx8mm-14756428 aktualizr-torizon[997]: No update to install on Primary
Jul 24 07:57:50 verdin-imx8mm-14756428 aktualizr-torizon[997]: Updating containers via docker-compose
Jul 24 07:57:50 verdin-imx8mm-14756428 aktualizr-torizon[997]: Running docker-compose down
Jul 24 07:57:50 verdin-imx8mm-14756428 aktualizr-torizon[997]: Running command: /usr/bin/docker-compose --file /var/sota/storage/docker-compose/docker-compose.yml -p torizon down
Jul 24 07:57:50 verdin-imx8mm-14756428 aktualizr-torizon[3784]:  Network torizon_default  Removing
Jul 24 07:57:50 verdin-imx8mm-14756428 aktualizr-torizon[3784]:  Network torizon_default  Error
Jul 24 07:57:50 verdin-imx8mm-14756428 aktualizr-torizon[3784]: failed to remove network torizon_default: Error response from daemon: error while removing network: network torizon_default id 04d16e6dcf124907034d6b69800625c7b372daa81fa780da3dac4b6fa8fd8d14 has active endpoints
Jul 24 07:57:51 verdin-imx8mm-14756428 aktualizr-torizon[997]: docker-compose down of old image failed
Jul 24 07:57:51 verdin-imx8mm-14756428 aktualizr-torizon[997]: Event: InstallTargetComplete, Result - Error
Jul 24 07:57:51 verdin-imx8mm-14756428 aktualizr-torizon[997]: Event: AllInstallsComplete, Result - docker-compose:INSTALL_FAILED
Jul 24 07:57:51 verdin-imx8mm-14756428 aktualizr-torizon[997]: Update install completed. Releasing the update lock...
Jul 24 07:57:52 verdin-imx8mm-14756428 aktualizr-torizon[997]: Event: PutManifestComplete, Result - Success

I am unable to resolve this.

Regarding this, I am unsure what you mean. I am using the version available to me from the vscode extensions. I am working on a Cpp Console template by Toradex. The following is the extension.

Edit: @josep.tx I found the setting to disable the early access version. I changed it to 3.7.0 but the error still remains:

 *  Executing task: DOCKER_HOST= source ./.conf/tcb-env-setup.sh -s /home/ha-01/GeopaxApp/storage -t 3.7.0 

Warning: If you intend to use torizoncore-builder as a server (listening to ports), then you should pass extra parameters to "docker run" (via the -- switch).
Setting up TorizonCore Builder with version 3.7.0.

Error: Can't determine digest for image 'geopaxpvtltd/geopaxapp-svc:imagePreProduction'.

 *  The terminal process "/usr/bin/bash '-c', 'DOCKER_HOST= source ./.conf/tcb-env-setup.sh -s /home/ha-01/GeopaxApp/storage -t 3.7.0'" terminated with exit code: 255. 
 *  Terminal will be reused by tasks, press any key to close it. 

Hi @geopaxpvtltd ,

Just for the record, is your host machine Windows or Linux?

Best Regards
Kevin

The host machine is Windows, using WSL2 with the ApolloX extension.

Hello @kevin.tx, were you able to reproduce the issue, or is it just me?

I resolved this by narrowing the error down to volume mounting issues in the docker-compose file for the production container.

 *  Executing task: DOCKER_HOST= source ./.conf/tcb-env-setup.sh -s /home/ha-01/GeopaxApp/storage -t 3.7.0 

Warning: If you intend to use torizoncore-builder as a server (listening to ports), then you should pass extra parameters to "docker run" (via the -- switch).
Setting up TorizonCore Builder with version 3.7.0.

Error: Can't determine digest for image 'geopaxpvtltd/geopaxapp-svc:imagePreProduction'.

 *  The terminal process "/usr/bin/bash '-c', 'DOCKER_HOST= source ./.conf/tcb-env-setup.sh -s /home/ha-01/GeopaxApp/storage -t 3.7.0'" terminated with exit code: 255. 
 *  Terminal will be reused by tasks, press any key to close it.

This problem still remains. It works only if I have the docker hub registry set to public.

Hello @kevin.tx ,

Can I ask for an update regarding this?

BR,
Hassan.

The tcb-platform-publish task currently expects a public container registry. You can modify the task definition in .vscode/tasks.json to instead expect a private registry. The task works by invoking TorizonCore Builder, specifically the platform push command.

You can modify the task definition directly to include the --login parameter with the login information for your registry. You can find information on the various arguments and options of TorizonCore Builder here: TorizonCore Builder Tool - Commands Manual | Toradex Developer Center
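
As a rough illustration of what the modified task ends up running, here is a minimal sketch that assembles a platform push command line with the registry login taken from environment variables rather than hard-coded in tasks.json. The --login USERNAME PASSWORD form and the --credentials option should be checked against the commands manual mentioned above. The DOCKER_PASSWORD variable name and the build_push_command helper are assumptions made for this example only:

```python
import os
from typing import List, Mapping


def build_push_command(compose_file: str, credentials: str,
                       env: Mapping[str, str] = os.environ) -> List[str]:
    """Build an argument list for 'torizoncore-builder platform push'."""
    command = ["torizoncore-builder", "platform", "push",
               "--credentials", credentials]
    user = env.get("DOCKER_LOGIN", "")
    password = env.get("DOCKER_PASSWORD", "")  # assumed variable name
    if user and password:
        # Only pass --login when both values are set (private registry case).
        command += ["--login", user, password]
    command.append(compose_file)
    return command
```

With both variables set, the list contains "--login", the user name and the password just before the compose file. With either one missing, the command is built without a login, which matches the public registry case.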

Best Regards,
Jeremias

Thank you, Again! :+1: This issue is resolved.

Edit: The job on GitHub Actions fails on the tcb-platform-publish task with the following error:

Traceback (most recent call last):
  File "/builder/torizoncore-builder", line 221, in <module>
    mainargs.func(mainargs)
  File "/builder/tcbuilder/cli/platform.py", line 506, in do_platform_push
    package_info, compatible_with = translate_compatible_packages(credentials, criteria)
  File "/builder/tcbuilder/backend/platform.py", line 1125, in translate_compatible_packages
    server_creds = sotaops.ServerCredentials(credentials)
  File "/builder/tcbuilder/backend/sotaops.py", line 36, in __init__
    self._load()
  File "/builder/tcbuilder/backend/sotaops.py", line 40, in _load
    with ZipFile(fname, "r") as archive:
  File "/usr/lib/python3.9/zipfile.py", line 1257, in __init__
    self._RealGetContents()
  File "/usr/lib/python3.9/zipfile.py", line 1324, in _RealGetContents
    raise BadZipFile("File is not a zip file")
zipfile.BadZipFile: File is not a zip file

> TASK tcb-platform-publish exited with error code 254 <
Error: Process completed with exit code 1.

It is about a zip file, but to my knowledge I don't have any in the project except for credentials.zip, and that is in .gitignore.

Hmm, it looks like TorizonCore Builder is trying to unpack the credentials.zip file, which is either not there or not a zip file. Since you said:

i dont have any in the project except for the credentials.zip. but that is in the gitignore.

It sounds like this file isn’t being made available to GitHub Actions. That said, you should be careful about publishing this file in any repo, since it contains security information related to your account that could be abused by others.

Best Regards,
Jeremias

I understand the security concerns, but for testing I tried including credentials.zip in the repo and the build still fails. Maybe it is looking for something else?

Hey @geopaxpvtltd

Yeah, you should not, or better said never, add your credentials.zip file to the git repo; it could easily be leaked.

Read this: https://github.com/toradex/torizon-experimental-torizon-ide-v2-docs/blob/main/GITHUB-ACTIONS.md

This doc has the instructions for safely adding your credentials.zip for use with the GitHub Actions integration.

If you have any other questions or issues let me know.

BR,

hello @matheus.tx

Thanks for your input. I already followed the document and have the secrets saved in the repo.

both with and without the credentials.zip

BR.

OK, so you did this step, right? https://github.com/toradex/torizon-experimental-torizon-ide-v2-docs/blob/main/GITHUB-ACTIONS.md#adding-credentialszip-as-cicd-variable

Could you please run this to check whether your command is generating something weird?

base64 -w 0 ./credentials.zip > base64.txt
cat base64.txt | base64 -d > credentials2.zip

After that, check whether credentials2.zip is a valid zip file.
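
The same round trip can also be checked from Python, whose zipfile module is what raised the BadZipFile error in the traceback. This is a minimal sketch, assuming the secret holds the unwrapped base64 text produced by base64 -w 0; the helper name roundtrip_is_valid_zip is made up for illustration:

```python
import base64
import io
import zipfile


def roundtrip_is_valid_zip(raw: bytes) -> bool:
    """Encode bytes as base64, decode them again, and test for a zip archive."""
    encoded = base64.b64encode(raw)      # like: base64 -w 0 ./credentials.zip
    decoded = base64.b64decode(encoded)  # like: base64 -d
    # The round trip must be lossless and the result must still look like a zip.
    return decoded == raw and zipfile.is_zipfile(io.BytesIO(decoded))
```

To use it, read credentials.zip in binary mode and pass the bytes to the function. A result of False means the file would fail in the same way once decoded in the CI job.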

Let me know the result.

BR,

yes, i did.

yes, it apparently is:

these are the contents:

OK, because what the integration does is exactly this decode command on the base64 string that you added to the PLATFORM_CREDENDIALS secret.

And the funny thing is that, while writing this for you, I see that in the docs PLATFORM_CREDENDIALS should be PLATFORM_CREDENTIALS, with a T instead of a D :sweat_smile:.
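
A misnamed secret would be consistent with the BadZipFile error: GitHub Actions expands a secret that does not exist to an empty string, so decoding it yields zero bytes, and Python's zipfile rejects that. Below is a minimal sketch of that failure mode. How the integration actually writes the decoded file is an assumption here, and open_credentials is a hypothetical helper:

```python
import base64
import io
import zipfile
from typing import Optional


def open_credentials(secret_value: str) -> Optional[zipfile.ZipFile]:
    """Decode a base64 secret and open it as a zip; return None if it is not one."""
    decoded = base64.b64decode(secret_value)  # an unset secret arrives as ""
    try:
        return zipfile.ZipFile(io.BytesIO(decoded), "r")
    except zipfile.BadZipFile:
        # Same exception type as in the traceback: the bytes are not a zip archive.
        return None
```

Calling open_credentials("") returns None, while a correctly encoded credentials.zip returns an archive object whose namelist() can be inspected.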

Could you please add the PLATFORM_CREDENTIALS secret and try again?
Let me know the result.

BR,

lol.

I changed it to PLATFORM_CREDENTIALS
but the output is still the same.

Digest: sha256:034afbc45946f0ca468705f0070b0e8291dc2787048f9b38c01d76facba39bd5
Status: Downloaded newer image for torizon/torizoncore-builder:3.7.0
An unexpected Exception occured. Please provide the following stack trace to
the Toradex TorizonCore support team:


Traceback (most recent call last):
  File "/builder/torizoncore-builder", line 221, in <module>
    mainargs.func(mainargs)
  File "/builder/tcbuilder/cli/platform.py", line 512, in do_platform_push
    platform.push_compose(
  File "/builder/tcbuilder/backend/platform.py", line 1166, in push_compose
    lock_file = canonicalize_compose_file(compose_file, force)
  File "/builder/tcbuilder/backend/platform.py", line 1304, in canonicalize_compose_file
    set_images_hash(compose_file_data)
  File "/builder/tcbuilder/backend/platform.py", line 1263, in set_images_hash
    response, image_digest = registry.get_manifest(image_name, ret_digest=True)
  File "/builder/tcbuilder/backend/registryops.py", line 429, in get_manifest
    self._request_token(headers=res.headers)
  File "/builder/tcbuilder/backend/registryops.py", line 368, in _request_token
    self.token_cache[scope] = res_json["token"]
KeyError: 'token'

> TASK tcb-platform-publish exited with error code 254 <
Error: Process completed with exit code 1.

This is good: now we have a different stack trace, nothing related to BadZipFile. So the credentials.zip file is now a valid one.

Now the issue is something in registryops.py. Let me take a look and I will get back to you.

BR,

Hey @geopaxpvtltd, I read in the comments above that you set the version in settings.json back to 3.7.0. Could you please try with early-access again?

Let me know.

BR,