Torizon OTA update fails with Docker Compose but gives no meaningful error message

I’m trying to use Torizon OTA with the docker-compose file shown below. I have probably made a mistake somewhere, but I can’t tell where. When I initiate an update, it tells me:

[screenshot]

Clicking More Info shows me:

[screenshot]

Clicking Show Update Events gives me:

[screenshot]

If I click to show the current Docker Compose file, I see exactly what I uploaded:

version: "2.4"
services:
  pilot:
    container_name: pilot
    networks:
      - backend
      - frontend
    volumes:
      - "/mnt/gojimedia:/mnt/gojimedia"
    ports:
      - "80:80"
      - "8809:8809"
      - "8810:8810"
      - "8811:8811"
    image: johntoebes/goji:goji-server

  weston:
    # pass environment variables to the container
    # in this example, you accept the NXP EULA with regards to use of Vivante
    # binaries for GPU-accelerated graphics
    environment:
      - ACCEPT_FSL_EULA=1
    container_name: weston

    # For i.MX 8-based modules use this image:
    image: torizon/weston-vivante:1

    # Required to get udev events from host udevd via netlink
    network_mode: host
    volumes:
      - type: bind
        source: /tmp
        target: /tmp
      - type: bind
        source: /dev
        target: /dev
      - type: bind
        source: /run/udev
        target: /run/udev
    cap_add:
      - CAP_SYS_TTY_CONFIG

    # Add device access rights through cgroup rules
    device_cgroup_rules:
      # ... for tty0
      - 'c 4:0 rmw'
      # ... for tty7
      - 'c 4:7 rmw'
      # ... for /dev/input devices
      - 'c 13:* rmw'
      - 'c 199:* rmw'
      # ... for /dev/dri devices
      - 'c 226:* rmw'

  kiosk:

    container_name: kiosk
    image: torizon/kiosk-mode-browser:1

    # run a custom command, equivalent to CMD on a Dockerfile
    command: --window-mode http://pilot/map?bulkhead

    networks:
      - frontend
    volumes:
      - type: bind
        source: /tmp
        target: /tmp
      - type: bind
        source: /var/run/dbus
        target: /var/run/dbus
      - type: bind
        source: /dev/dri
        target: /dev/dri

    # only bring-up this container if others are successfully started
    depends_on:
      - weston
      - pilot

    # size of shared memory between containers
    shm_size: '256mb'

    device_cgroup_rules:
      # ... for /dev/dri devices
      - 'c 226:* rmw'

networks:
  backend:
    internal: true
  frontend:
    internal: false

I’ve tried a few more times and validated the docker-compose file, so I know it is right. If I look on the box after the failed update, I see:

[screenshot]

This seems to imply that the update did something: the other containers are gone, but the containers that remain don’t appear to have anything to do with my docker-compose file.

Greetings @toebes,

Let’s try to get more information about what is failing in the update. Judging by your information, it seems the update failed at the last step (installation). Please try the following while an update is in progress: run journalctl -f -u aktualizr* on your device. This follows the log output of the update client. Any errors that occur during the update process should show up in these logs.
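The log-following step can be sketched as a pair of small shell functions. The unit pattern is quoted so that the shell hands it to journalctl unexpanded. The `update_errors` helper and its keyword list are illustrative assumptions, not something aktualizr provides.

```shell
#!/bin/sh
# Follow the update client's journal while an OTA update is running.
# Quoting 'aktualizr*' keeps the shell from expanding the pattern against
# files in the current directory; journalctl matches the unit names itself.
follow_update_logs() {
    journalctl -f -u 'aktualizr*'
}

# Hypothetical helper: keep only lines that look like failures.
# The keyword list is a guess; adjust it to what your logs actually contain.
update_errors() {
    grep -i -E 'error|fail|denied|unauthorized'
}

# Usage on the device (not executed by this snippet):
#   follow_update_logs | update_errors
```

Running `follow_update_logs` on its own shows the full, unfiltered output, which is usually the better starting point when it is not yet clear what the failure looks like.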

For further debugging the following information may be useful: Torizon Remote Updates Technical Overview

That article lists the steps executed during installation of a docker-compose update. If the logs are not clear, it can help pinpoint the exact step at which the installation fails.

Best regards,
Jeremias

OK, that was helpful. It doesn’t like my private image, so I found the article Using Private Registries With the Torizon Platform, which got me past that point. Thinking about supporting a lot of these boxes in the field, a feature request for the team would be for aktualizr failures to be reported to the Torizon OTA dashboard.
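Since the failure turned out to be an image the device could not pull, one way to catch this before starting an update is to try pulling every image named in the compose file by hand. What follows is a minimal sketch, assuming shell access to the device and assuming that `docker pull` there uses the same registry credentials as the update client (worth checking against the private-registry article). `compose_images` and `check_pulls` are hypothetical helper names.

```shell
#!/bin/sh
# List the image references in a compose file (hypothetical helper).
# Assumes each service uses a single-line "image: <ref>" entry, as in the
# compose file in this thread; this is not a general YAML parser.
compose_images() {
    awk '$1 == "image:" { print $2 }' "$1"
}

# Try to pull every image and report the ones that cannot be fetched.
check_pulls() {
    for img in $(compose_images "$1"); do
        if docker pull "$img" >/dev/null 2>&1; then
            echo "ok      $img"
        else
            echo "FAILED  $img"
        fi
    done
}

# Usage on the device (not executed by this snippet):
#   check_pulls docker-compose.yml
```

A `FAILED` line for a private image points at missing or wrong registry credentials rather than at the compose file itself.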

I did find an odd message when installing that is worth looking at. Notice the missing R before "Removing" below, the extra R on the lines after it, and the missing D in "Deleted Containers".

Mar 16 20:39:57 apalis-imx8-06945459 aktualizr-torizon[1016]: Running command: /usr/bin/docker-compose --file /var/sota/storage/docker-compose/docker-compose.yml.tmp -p torizon up --detach --remove-orphans
Mar 16 20:39:59 apalis-imx8-06945459 aktualizr-torizon[33108]: Creating network "torizon_backend" with the default driver
Mar 16 20:39:59 apalis-imx8-06945459 aktualizr-torizon[33108]: Creating network "torizon_frontend" with the default driver
Mar 16 20:39:59 apalis-imx8-06945459 aktualizr-torizon[33108]: Creating pilot ...
Mar 16 20:39:59 apalis-imx8-06945459 aktualizr-torizon[33108]: Creating weston ...
Mar 16 20:40:06 apalis-imx8-06945459 aktualizr-torizon[33108]: [95B blob data]
Mar 16 20:40:08 apalis-imx8-06945459 aktualizr-torizon[33108]: [38B blob data]
Mar 16 20:40:08 apalis-imx8-06945459 aktualizr-torizon[1016]: [38B blob data]
Mar 16 20:40:08 apalis-imx8-06945459 aktualizr-torizon[1016]: emoving not used containers, networks and images
Mar 16 20:40:08 apalis-imx8-06945459 aktualizr-torizon[1016]: R
Mar 16 20:40:08 apalis-imx8-06945459 aktualizr-torizon[33503]: R
Mar 16 20:40:08 apalis-imx8-06945459 aktualizr-torizon[33503]: eleted Containers:
Mar 16 20:40:08 apalis-imx8-06945459 aktualizr-torizon[33503]: ffbaaa9d2994a8c1cbfaa2fac1a3b751f330e14060ec6f54b1473c5f52973c8f
Mar 16 20:40:08 apalis-imx8-06945459 aktualizr-torizon[33503]: Deleted Images:
Mar 16 20:40:08 apalis-imx8-06945459 aktualizr-torizon[33503]: untagged: torizon/weston-vivante@sha256:827417ba996cf20c4676bae19c0128e8ea27f41a620dd0b84174427f06779797
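The `[95B blob data]` entries in the log above are how journalctl abbreviates messages that contain non-printable bytes. A sketch for inspecting them is shown below. The function names are hypothetical, and whether stray control characters also explain the split "R" and "D" lines is an assumption to verify, not a confirmed cause.

```shell
#!/bin/sh
# Print the update client's messages for the current boot in full.
#   -b      current boot only
#   -a      show fields in full even if they contain unprintable characters
#   -o cat  print only the message text, without timestamp and unit prefix
show_update_messages() {
    journalctl -b -a -o cat -u 'aktualizr*'
}

# Make control characters visible (a carriage return shows up as ^M).
# -v is supported by GNU and BSD cat; other implementations may differ.
visible() {
    cat -v
}

# Usage on the device (not executed by this snippet):
#   show_update_messages | visible
```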

Yeah, journald sometimes splices the log lines strangely, but it’s a relatively minor problem. Sounds like you resolved the issue with the updates, so that’s good to hear. Are there any other issues, or was it just the private registry?

Best Regards,
Jeremias