Offline updates built with TCB 3.11 fail

Hi, I am running offline updates and the update/ folder is recognized as a valid update, but later it fails with the following message:

validateMetadata: bad config file extension [RE]
Loading of tarballs aborted!
Offline loading failed: Failed to load docker tarball 5c8d0b0d0a010e8e48d230ab255698e4cc654dad05c24fd0dd60929be26c51a7.tar
Event: AllInstallsComplete, Result - INSTALL_FAILED

I know there is a known bug in TCB 3.12 (Torizon OS Issue Tracker) and there is already a thread about it, https://community.toradex.com/t/offline-updates-errors-in-docker-secondary-due-to-missing-metadata/, but I see the issue with TCB 3.11 as well.

Here are the full logs:

Aktualizr version tdx-d71f18a1 starting
Reading config: "/usr/lib/sota/conf.d/20-sota-device-cred.toml"
Reading config: "/usr/lib/sota/conf.d/30-rollback.toml"
Reading config: "/usr/lib/sota/conf.d/40-hardware-id.toml"
Reading config: "/usr/lib/sota/conf.d/50-secondaries.toml"
Reading config: "/usr/lib/sota/conf.d/60-polling-interval.toml"
Reading config: "/usr/lib/sota/conf.d/70-reboot.toml"
Reading config: "/etc/sota/conf.d/99-offline-updates.toml"
Use existing SQL storage: "/var/sota/sql.db"
Couldn`t import data: empty path received
Couldn't import client certificate: "/var/sota/import/client.pem" doesn't exist.
Couldn't import client TLS key: "/var/sota/import/pkey.pem" doesn't exist.
Initializing docker-compose Secondaries...
Use existing SQL storage: "/var/sota/storage/docker-compose/sql.db"
No valid metadata found in storage.
Adding Secondary with ECU serial: b54e9743578f3a85a05349167c36cbd50b16ba5bc015be65e0b93baa429211a1 with hardware ID: docker-compose
Initializing torizon-generic Secondaries...
Use existing SQL storage: "/var/sota/storage/bootloader/sql.db"
curl error 7 (http code 0): Couldn't connect to server
No valid metadata found in storage.
Adding Secondary with ECU serial: ff56c58aee29d65eb9ae3483ed055b2df8a958c05b95992a7ead20cf620dd429 with hardware ID: verdin-imx8mp-bootloader
Offline Updates are enabled
fetchMetaOffUpd() called with source_path: "/var/lib/docker/volumes/torizon_application/_data/update"
Current version for ECU ID: b54e9743578f3a85a05349167c36cbd50b16ba5bc015be65e0b93baa429211a1 is unknown
Current version for ECU ID: 70d27503aeaec3b221498b9fe8b58fe40fbe9f7874c20f8e69b632738f22892d is unknown
New updates found in Director metadata. Checking Image repo metadata...
2 new updates found in both Director and Image repo metadata.
Event: UpdateCheckComplete, Result - Updates available
Update available. Acquiring the update lock...
Current version for ECU ID: b54e9743578f3a85a05349167c36cbd50b16ba5bc015be65e0b93baa429211a1 is unknown
Current version for ECU ID: 70d27503aeaec3b221498b9fe8b58fe40fbe9f7874c20f8e69b632738f22892d is unknown
curl error 7 (http code 0): Couldn't connect to server
curl error 7 (http code 0): Couldn't connect to server
Image already fetched; skipping fetching
curl error 7 (http code 0): Couldn't connect to server
curl error 7 (http code 0): Couldn't connect to server
curl error 7 (http code 0): Couldn't connect to server
Event: DownloadTargetComplete, Result - Success
curl error 7 (http code 0): Couldn't connect to server
curl error 7 (http code 0): Couldn't connect to server
curl error 7 (http code 0): Couldn't connect to server
curl error 7 (http code 0): Couldn't connect to server
curl error 7 (http code 0): Couldn't connect to server
curl error 7 (http code 0): Couldn't connect to server
Event: DownloadTargetComplete, Result - Success
Event: AllDownloadsComplete, Result - Success
curl error 7 (http code 0): Couldn't connect to server
Current version for ECU ID: b54e9743578f3a85a05349167c36cbd50b16ba5bc015be65e0b93baa429211a1 is unknown
Current version for ECU ID: 70d27503aeaec3b221498b9fe8b58fe40fbe9f7874c20f8e69b632738f22892d is unknown
Waiting for Secondaries to connect to start installation...
curl error 7 (http code 0): Couldn't connect to server
Event: InstallStarted
curl error 7 (http code 0): Couldn't connect to server
docker-compose file matches expected digest
Loading metadata from tarball: "/var/lib/docker/volumes/torizon_application/_data/update/images/acae5b78faad5f7cd4fe6dfe77d9ce2bca384a434b694e68d05b85dae0f71af9.images/5c8d0b0d0a010e8e48d230ab255698e4cc654dad05c24fd0dd60929be26c51a7.tar"
curl error 7 (http code 0): Couldn't connect to server
curl error 7 (http code 0): Couldn't connect to server
curl error 7 (http code 0): Couldn't connect to server
Successfully loaded metadata from tarball: "/var/lib/docker/volumes/torizon_application/_data/update/images/acae5b78faad5f7cd4fe6dfe77d9ce2bca384a434b694e68d05b85dae0f71af9.images/5c8d0b0d0a010e8e48d230ab255698e4cc654dad05c24fd0dd60929be26c51a7.tar"
validateMetadata: bad config file extension [RE]
Loading of tarballs aborted!
Offline loading failed: Failed to load docker tarball 5c8d0b0d0a010e8e48d230ab255698e4cc654dad05c24fd0dd60929be26c51a7.tar
Event: AllInstallsComplete, Result - INSTALL_FAILED
Update install completed. Releasing the update lock...
curl error 7 (http code 0): Couldn't connect to server
curl error 7 (http code 0): Couldn't connect to server
curl error 7 (http code 0): Couldn't connect to server

Hi, I’m the Toradex AI Assistant. The Toradex Support team will follow up with you shortly!


Hello,

Thank you for reaching out with your issue regarding offline updates on TorizonCore Builder 3.11. The error message “validateMetadata: bad config file extension [RE]” indicates a problem with the configuration file extensions or metadata in your update package, which is causing the update process to fail.

Here are some steps you can take to troubleshoot and resolve this issue:

  1. Check File Extensions: Ensure that all configuration and metadata files in your update package have the correct extensions. Incorrect extensions can lead to validation errors.

  2. Validate Metadata: Confirm that the metadata for your Docker images and other components is correctly formatted and complete. Ensure all necessary metadata files are present and properly signed.

  3. Review Docker Tarball: Verify the integrity of the Docker tarball mentioned in the error message. Consider re-exporting the Docker image to a tarball to ensure it is complete and correctly included in the update package.

  4. Consider Known Issues: While you mentioned a known bug in TCB 3.12, similar issues might exist in TCB 3.11. Review any available release notes for TCB 3.11 for known issues or patches related to offline updates.

  5. Recreate the Lockbox: If the issue persists, try recreating the Lockbox using TorizonCore Builder. Ensure all steps are followed correctly, including the canonicalization of the docker-compose file and proper signing of metadata.

By following these steps, you should be able to identify and correct the issue with your offline update process. If the problem continues, consider reviewing the update package creation process to ensure all components are correctly prepared and validated before deployment.

I hope this helps resolve your issue. Please feel free to reach out if you have any further questions.

Best regards,
Toradex AI Assistant

Please, also check if these links can help you solve your problem:

[aktualizr, offline updates] Found in archive a file with bad file type: 40960.

Additional info:
I am running updates on a custom Tezi image (TorizonCore 6.6.0-devel-20241118120946+build.0) based on 6.6.0, with the Aktualizr package reverted to the version where offline updates work with the once command (as discussed in "Yocto build TorizonOS doesn't recognize lockbox").
Lockboxes built with any TCB version between 3.9 and 3.12 (I also tried early-access) are correctly recognized as valid updates, but all of them fail the update process with this error.
The Lockbox build process didn’t change.

More updates:
I checked the exact place in the aktualizr docker Secondary code.

The error happens because manifest.json inside the docker .tar file no longer gives the config file a “.json” suffix. For example, this is a correct manifest.json that passes the check:

[{"Config":"bdb98bab30dd7dfe8a61489b4696d74f2647f55e31ed9ef272af8fa29848c2a1.json","RepoTags":["<registry path to docker image>:digest_sha256_99fa472fc4940552fc77d0cf7c3f94a0becfe097c1108fd4c382cfc4f25d9f76"],"Layers":["7f45b9f7d0513a43224efb200f9e282989160108a10810c6ebfea6249a1b9f52/layer.tar"]}] 

And now the manifest.json looks like this (trimmed):

[{"Config":"blobs/sha256/e6b8730997a9864ea189c9bd903d326100d3f06d607cb476f6f15013aff9916e","RepoTags":["<some docker image path>:digest_sha256_5c8d0b0d0a010e8e48d230ab255698e4cc654dad05c24fd0dd60929be26c51a7"],"Layers":["blobs/sha256/ff80ec55a37609daeaf3d3843aa8105f6c6b18984cf42badda290363444bdef3","blobs/sha256/30de07ac3f3ccdb48bcf5d295992e5906102534941ec755eb931067f06224dab","blobs/sha256/84b7ae1ff6e4fdb8019922978c64392e84c432967510fe060e560e532c1bcd3f
.... rest is not relevant

When I changed manifest.json and added the .json extension to the config file path, it worked just fine. As a temporary solution I’ll modify the Lockbox contents to have the .json extension, but I hope to have this fixed, or at least to get support for the new format backported.
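
To check a tarball before copying a Lockbox to the device, the observation above can be turned into a small diagnostic. This is a minimal Python sketch, not the aktualizr implementation: the function names are hypothetical, and the rule "Config must end in .json" is an assumption inferred from the error message and from the two manifests quoted in this post. It only reads the archive and does not modify it.

```python
import json
import tarfile


def config_paths(tar_path):
    """Return the "Config" value of every image entry in manifest.json."""
    with tarfile.open(tar_path) as tar:
        member = tar.extractfile("manifest.json")
        if member is None:
            raise ValueError("manifest.json is not a regular file")
        manifest = json.load(member)
    return [entry["Config"] for entry in manifest]


def suspicious_configs(tar_path):
    """Return the Config paths that lack a .json suffix.

    Assumption: the failing check only looks at the file suffix.
    """
    return [c for c in config_paths(tar_path) if not c.endswith(".json")]
```

Calling `suspicious_configs()` on a tarball from the Lockbox's images directory returns an empty list when every Config path ends in .json, and the offending paths otherwise.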

More updates: the workaround worked on the default Tezi image (6.4.0), but on the custom image it fails with an error saying the layers' sha256 hash is invalid, although the layer files were not changed; only the tar archive was modified. Could somebody confirm that the root cause is correct, and are there perhaps alternative solutions?

Greetings @ollamh,

Are you sure you observed this with Lockboxes generated by TCB 3.11? This issue was only ever observed with Lockboxes generated with TCB 3.12. The change that caused this issue was not present yet in 3.11 so I’m a bit confused on how you observed this there.

Does this happen reliably for you? And are you sure the Lockbox itself was generated from scratch with TCB 3.11?

Best Regards,
Jeremias

Hi @jeremias.tx !

I used a simpler setup, no custom images, no outdated pinned Aktualizr versions.

Base image - https://tezi.toradex.com/artifactory/torizoncore-oe-prod-frankfurt/kirkstone-6.x.y/release/22/verdin-imx8mp/torizon/torizon-core-docker/oedeploy/torizon-core-docker-verdin-imx8mp-Tezi_6.8.0+build.22.tar

I used the -t parameter to select TCB version 3.10.

Here are the logs to confirm that TCB was of the correct version.


Setting up TorizonCore Builder with version 3.10.

Pulling TorizonCore Builder...
3.10: Pulling from torizon/torizoncore-builder
Digest: sha256:33acb23fbfce8e46ebfb58501056c9f0ccf83d6469d3f14b5f48f7f271d75867
Status: Image is up to date for torizon/torizoncore-builder:3.10
docker.io/torizon/torizoncore-builder:3.10
Done!

Setup complete! TorizonCore Builder is now ready to use.
TorizonCore Builder internal status and image customizations will be stored in Docker volume named 'storage'.
********************
Important: When you run TorizonCore Builder, the tool can only access the files inside the current working directory. Files and directories outside of the current working directory, or links to files and directories outside of the current working directory, won't be visible to TorizonCore Builder. So please make sure that, when running TorizonCore Builder, all files and directories passed as parameters are within the current working directory.
Your current working directory is: /home/ollamh/projects/aktina/code/abc4-torizon-builder/build
********************
For more information, run 'torizoncore-builder -h' or go to https://developer.toradex.com/knowledge-base/torizoncore-builder-tool
Fetching 'root.json' from image repository.
Fetching 'root.json' from director repository.
Shared data archive 'shared-data.tar.gz' successfully generated.
WARNING: This script is outdated. To update it, run 'wget -o tcb-env-setup.sh https://raw.githubusercontent.com/toradex/tcb-env-setup/master/tcb-env-setup.sh' 

Setting up TorizonCore Builder with version 3.10.

Pulling TorizonCore Builder...
3.10: Pulling from torizon/torizoncore-builder
Digest: sha256:33acb23fbfce8e46ebfb58501056c9f0ccf83d6469d3f14b5f48f7f271d75867
Status: Image is up to date for torizon/torizoncore-builder:3.10
docker.io/torizon/torizoncore-builder:3.10
Done!

Setup complete! TorizonCore Builder is now ready to use.
TorizonCore Builder internal status and image customizations will be stored in Docker volume named 'storage'.
********************
Important: When you run TorizonCore Builder, the tool can only access the files inside the current working directory. Files and directories outside of the current working directory, or links to files and directories outside of the current working directory, won't be visible to TorizonCore Builder. So please make sure that, when running TorizonCore Builder, all files and directories passed as parameters are within the current working directory.
Your current working directory is: /home/ollamh/projects/aktina/code/abc4-torizon-builder/build
********************
For more information, run 'torizoncore-builder -h' or go to https://developer.toradex.com/knowledge-base/torizoncore-builder-tool
Unpacking Toradex Easy Installer image.
Copying Toradex Easy Installer image.
Unpacking TorizonCore Toradex Easy Installer image.
Importing OSTree revision 234782eeec0096bb381f28d28c75c9ab0e2e358ac5007b827fca87a7a033185f from local repository...
1060 metadata, 9467 content objects imported; 584.0 MB content written
0 metadata, 0 content objects imported; 0 bytes content written
Unpacked OSTree from Toradex Easy Installer image:
  Commit checksum: 234782eeec0096bb381f28d28c75c9ab0e2e358ac5007b827fca87a7a033185f
  TorizonCore Version: 6.8.0+build.22
Creating Docker Container bundle...

Starting DIND container
Using Docker host "tcp://127.0.0.1:22376"
Connecting to Docker Daemon at "tcp://127.0.0.1:22376"
...

Then, I reset the board and installed the system.
Then, I created the new Lockbox based on the same base image, copied the update files to USB stick and provided the input for aktualizr-torizon (manually run).

Here are the logs from aktualizr-torizon
aktualizr-torizon.log (148.1 KB)

Did I do everything correctly?

Okay, your latest reply is leaving me more confused. Why are you mentioning the version of the OS image? The only relevant version is the version of TCB; it’s TCB that is causing this issue, not Torizon OS. This issue is largely independent of whatever version of Torizon OS is running on your device.

“Then, I created the new Lockbox based on the same base image, copied the update files to USB stick and provided the input for aktualizr-torizon”

I’m not sure what you mean by this statement. What do you mean you created a Lockbox “based on the same base image”?

Your Lockbox contains a docker container image, yes? What do you mean “base image”?

Please observe my setup. I have a Lockbox defined on the torizon.io web UI. This Lockbox contains a single docker-compose package as follows:

services:
  hello-world:
    image: hello-world@sha256:ffb13da98453e0f04d33a6eee5bb8e46ee50d08ebe17735fc0779d0349e889e9
version: '2.4'

If I generate this Lockbox using TCB V3.12 then I see the following:

#manifest.json TCB V3.12
[
  {
    "Config": "blobs/sha256/7066d68bd2f224dbb7c3332da105b1dac81a75b47a869602096c27b6a75a525c",
    "RepoTags": [
      "hello-world:digest_sha256_40d0cfd0861719208ff9f7747ab3f97844eeca509df705db44a736df863b76af"
    ],
    "Layers": [
      "blobs/sha256/f30f4621b75c25a838c91d0cdd3803b0f2b2d27e048c8101a4669f4df6ba85f1"
    ],
    "LayerSources": {
      "sha256:f30f4621b75c25a838c91d0cdd3803b0f2b2d27e048c8101a4669f4df6ba85f1": {
        "mediaType": "application/vnd.oci.image.layer.v1.tar",
        "size": 6656,
        "digest": "sha256:f30f4621b75c25a838c91d0cdd3803b0f2b2d27e048c8101a4669f4df6ba85f1"
      }
    }
  }
]

As you said, the Config field doesn’t have a *.json extension on the file path, which causes issues during the offline update.

But if I generate the same Lockbox with TCB 3.11, then I see this:

#manifest.json TCB 3.11
[
  {
    "Config": "7066d68bd2f224dbb7c3332da105b1dac81a75b47a869602096c27b6a75a525c.json",
    "RepoTags": [
      "hello-world:digest_sha256_40d0cfd0861719208ff9f7747ab3f97844eeca509df705db44a736df863b76af"
    ],
    "Layers": [
      "6dd2b069cd18b4fb22312c4cd3f9acedcd05caf72e815070f0402721f38f996e/layer.tar"
    ]
  }
]

As you can see, now there is the expected *.json extension, and this Lockbox will work in the offline update as expected. In both cases I’m just running the platform lockbox command, nothing else. On closer look, your TCB logs are also confusing: I don’t see anything in those logs that looks like you generating a Lockbox.

Can you understand now why I’m confused by your explanation and observation of events? As far as is currently known, the issue only occurs if the Lockbox gets created with TCB 3.12.

Best Regards,
Jeremias
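
The difference between the two manifests above can be told apart programmatically. The following Python sketch labels each entry by the style of its Config path, using the two manifests from this post (trimmed to the Config field) as sample data. The function name and the label strings are made up for illustration, and the rule is only inferred from these two samples.

```python
import json


def classify(manifest_text):
    """Label each image entry in a manifest.json document by its Config path style."""
    labels = []
    for entry in json.loads(manifest_text):
        config = entry["Config"]
        if config.startswith("blobs/sha256/"):
            # Style seen in the Lockbox generated with TCB 3.12.
            labels.append("blob-path")
        elif config.endswith(".json"):
            # Style seen in the Lockbox generated with TCB 3.11.
            labels.append("json-suffix")
        else:
            labels.append("unknown")
    return labels


# Sample data: Config values copied from the two manifests quoted above.
TCB_3_12_MANIFEST = json.dumps([
    {"Config": "blobs/sha256/7066d68bd2f224dbb7c3332da105b1dac81a75b47a869602096c27b6a75a525c"}
])
TCB_3_11_MANIFEST = json.dumps([
    {"Config": "7066d68bd2f224dbb7c3332da105b1dac81a75b47a869602096c27b6a75a525c.json"}
])

print(classify(TCB_3_12_MANIFEST))  # ['blob-path']
print(classify(TCB_3_11_MANIFEST))  # ['json-suffix']
```

A "blob-path" result for a freshly generated Lockbox would suggest it was built with TCB 3.12, whatever version was intended.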

Hi @jeremias.tx, thank you for checking and explaining. That’s my bad: I didn’t look carefully at one part of the build process, and it turned out I was still using version 3 of TCB, which currently points to version 3.12. So I suppose we can close this one, or even delete it, as it may mislead people. TCB 3.11 is working correctly.

Glad I was able to help clarify.

Best Regards,
Jeremias
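
The mix-up that resolved this thread (one build step still requesting version 3, which resolved to 3.12) is the kind of thing a build pipeline can check for itself. The Python sketch below extracts the version from the "Setting up TorizonCore Builder with version ..." line that appears in the setup log quoted earlier in the thread and flags major-only values. The function names are hypothetical, and the log line format is assumed to match that quoted log; other versions of the setup script may print something different.

```python
import re

# Pattern based on the setup log line quoted earlier in this thread.
_SETUP_LINE = re.compile(
    r"Setting up TorizonCore Builder with version (\d+(?:\.\d+)*)\."
)


def requested_versions(log_text):
    """Return every TCB version announced by the setup script in a build log."""
    return _SETUP_LINE.findall(log_text)


def is_major_only(version):
    """True for a value like "3", which names only a major version."""
    return "." not in version


def floating_versions(log_text):
    """Return the announced versions that name only a major version."""
    return [v for v in requested_versions(log_text) if is_major_only(v)]
```

Running `floating_versions()` over the complete build log, covering every step that sets up TCB, returns an empty list only when each step announced a version with at least a minor component.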