Upstream metadata expiry breaks lockboxes?

Hello, thanks for the continued feedback on our new API!

I am a little confused about what is being asked for here, though. Curl doesn’t usually propagate non-200 HTTP errors as a non-zero exit status, which seems to be what you are trying to accomplish. I would’ve expected this to fail with the old API as well?

For some of our internal Toradex scripts we do something similar to the solutions discussed here: “Can I make cURL fail with an exitCode different than 0 if the HTTP status code is not 200?” (Super User)

For example, the OTA provisioning script does this:

function RegisterDevice() {

  echo "== Registering device (deviceID: ${DEVICE_ID}) in system, and downloading credentials."
  cd "$(mktemp -d)" || exit 1
  http_code=$(curl -s -w '%{http_code}' --max-time 30 -X POST -H "Authorization: Bearer $PROVISIONING_TOKEN" "$AUTOPROV_URL" -d "{\"device_id\": \"${DEVICE_ID}\", \"device_name\": \"${DEVICE_NAME}\"}" -o device.zip)

  if [[ ! $http_code -eq 200 ]]; then
    # if we failed, device.zip will be a text file with the errors
    echo -e "${RED}"
    echo "Failed to download token :("
    echo "HTTP ERROR ${http_code}"
    cat device.zip
    echo -e "${NC}"
    exit 1
  fi
}

Now, that being said, it’s possible I completely misunderstood your question… Our API endpoints should be responding with the proper HTTP errors (like 404 for that non-existent endpoint, etc.); if they aren’t, it would be considered a bug.

Thanks for the follow-up. Sounds like I’ll have to look into this more; I was basing my comments off our prior script(s), which (it sounds like incorrectly) made that behavioural assumption. My mistake if it’s a PEBKAC error.

Ah, based on your link it appears we lost the -f argument which is satisfactory for our particular use case.
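For reference, a minimal sketch of the -f approach mentioned above. The helper name download_or_fail and the placeholder URL are hypothetical and not taken from any Toradex script; only the curl flags themselves are standard.

```shell
#!/usr/bin/env bash
# Sketch: let curl itself report HTTP errors through its exit status.
# download_or_fail is a hypothetical helper; URL and output path are
# supplied by the caller.
download_or_fail() {
  local url="$1" out="$2"
  # -f / --fail : exit non-zero (code 22) when the server answers with HTTP >= 400
  # -sS         : hide the progress meter but still print error messages
  curl -fsS --connect-timeout 5 --max-time 30 -o "$out" "$url"
}

# Example use in a CI step (placeholder URL):
#   download_or_fail "https://example.invalid/api/endpoint" response.json || exit 1
```

One trade-off: with -f, curl does not save the body of an error response, so a script that prints the server's error text (as the provisioning script above does with device.zip) loses that detail. Newer curl versions offer --fail-with-body for that case; whether it is available depends on the curl version installed.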

Glad we were able to help clarify.


Hi @jeremias.tx ,

Unfortunately I need to resurrect the original issue in a different form, as we are now encountering lockboxes that again refuse to install - this time though due to the root metadata being expired. Inspection of the JSON shows the expiry of the root files isn’t being bumped out when the lockbox is created; we ended up with a lockbox built on July 10th (with an expiry of 1 year) that no longer works because the root metadata expired on August 25th of this year. Even in lockboxes built after that date, the root metadata is only ever extended by a year and not to the expiry date of the lockbox itself.

What can we do to mitigate this issue?

Regards,

Hi @bw908,

We’re currently looking into this. Just to ask some clarifying questions about your situation.

First of all, when you talk about Lockbox creation/building, could you specify exactly which actions you mean?

For example is “creation” defining the Lockbox on the web UI and “building” is when you download the files with TorizonCore Builder? Or do you mean something else?

Another question the team had: are you seeing this with completely new Lockboxes that you are defining in the web UI, with old Lockboxes that you are updating, or with both?

Best Regards,
Jeremias

Apologies for the confusion. In our case I am using those (define/build) interchangeably: in our CI setup, we both define the lockbox (via an API v2 call) and then build it in the same process.

We see this both on rebuilt lockboxes (where we bump the new expiry via the API and then rebuild it) as well as newly created ones, e.g. a new customer-facing software version which did not have a lockbox before.

Best,
~BW908

Thank you for the clarification. This was already brought to the attention of our cloud team. They’ll investigate and see if something strange is going on with regards to the expiry not being appropriately bumped for the metadata.

I’ll give an update once something is available to share.

Best Regards,
Jeremias


Hi @bw908 ,

We are looking into this and noticed that the lockboxes in your account don’t have any expired metadata… did you find a workaround?

Also, can you specify which root metadata is expired? The latest director root metadata or the latest image-repo root metadata? (or both :grimacing: )

Thanks,
ben

Hi @ben.tx ,

Building a lockbox does not produce one that is already expired, but they will expire before their time:

In our case, it was the latest director/#.root.json file in the lockbox that had expired. The lockbox was built in July 2024 and included files up to director/2.root.json, which contained the following:
"version":2,"expires":"2024-08-24T15:42:44Z","consistent_snapshot":false,"_type":"Root"

so it was only working for ~1.5 months before it started getting rejected by (offline) systems that had not received any other director metadata updates. (A workaround was to install a newer lockbox with 3.root.json which doesn’t expire until Aug 24, 2025, and then downgrade).

However, for our “current” lockboxes generated after August 2024, the director root metadata is still valid for a shorter period than the lockbox itself:

  • 3.root.json now expires 2025-08-24
  • [lockbox_name].json has "expires":"2025-10-29T17:58:01Z"

which suggests that the lockbox will start failing to install again after August 2025 unless it is either rebuilt or the system otherwise gets a director root metadata update.
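The comparison described above can be scripted so that a CI job flags the mismatch before a lockbox ships. The following is only a sketch: the function names are hypothetical, the director/N.root.json layout and the "expires" field are taken from the post above, and it assumes the first "expires" entry in each file is the relevant one and that both timestamps use the same YYYY-MM-DDTHH:MM:SSZ format.

```shell
#!/usr/bin/env bash
# Sketch: warn when a lockbox's director root metadata expires before the
# lockbox itself. All function names here are hypothetical.

# Print the first "expires" timestamp found in a metadata file.
# grep keeps this dependency-free; a real JSON parser would be more robust.
metadata_expires() {
  grep -o '"expires"[[:space:]]*:[[:space:]]*"[^"]*"' "$1" | head -n 1 | cut -d'"' -f4
}

# Print the path of the highest-numbered N.root.json in a directory.
latest_root() {
  local name
  name=$(ls "$1" | grep -E '^[0-9]+\.root\.json$' | sort -n | tail -n 1)
  [ -n "$name" ] && printf '%s/%s\n' "$1" "$name"
}

# Reduce a timestamp such as 2025-10-29T17:58:01Z to its digits so that two
# timestamps in the same format can be compared as integers.
ts_digits() {
  printf '%s' "$1" | tr -cd '0-9'
}

# Usage: check_lockbox_root <director dir> <lockbox json>
# Returns 0 if the root metadata lasts at least as long as the lockbox,
# 1 if it expires earlier, 2 if something could not be read.
check_lockbox_root() {
  local root_file root_exp box_exp
  root_file=$(latest_root "$1") || { echo "no root metadata in $1" >&2; return 2; }
  root_exp=$(metadata_expires "$root_file")
  box_exp=$(metadata_expires "$2")
  if [ -z "$root_exp" ] || [ -z "$box_exp" ]; then
    echo "could not read an expires field" >&2
    return 2
  fi
  if [ "$(ts_digits "$root_exp")" -lt "$(ts_digits "$box_exp")" ]; then
    echo "WARNING: $root_file expires $root_exp, before lockbox expiry $box_exp"
    return 1
  fi
  echo "OK: $root_file expires $root_exp, lockbox expires $box_exp"
}
```

Both paths are passed in explicitly because the location of [lockbox_name].json relative to the director directory is not stated here. A build step could call check_lockbox_root on the downloaded lockbox and fail the pipeline on a non-zero return.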

Everything else seems to be in order on inspection - the only other item which does not appear to have a bumped-out expiry that corresponds to the lockbox expiry date is image-repo/tdx-containers.json ("expires":"2025-01-01T00:01:00Z") but my understanding is this one does not impact the update itself if we are not using any of the toradex containers.

I hope that helps clarify. If it’s helpful for you to get your hands on one of these lockboxes for further inspection let me know and we’ll see what we can do.

Regards,
~BW908

I appreciate the response.

We are still working on this. I’m fairly confident we have the root cause, but I’m still verifying it before I can be 100% sure.

Thanks,
Ben


Any updates on this @ben.tx?

I checked in with our team.

We have a potential fix available internally. However, our team wants to run some more tests to make sure the issue is properly addressed and that the change doesn’t break other things. This change would in theory affect everyone’s Lockbox metadata going forward, so we just want to be careful before deploying it publicly.

Best Regards,
Jeremias


Just giving another update. The team has an initial improvement/fix on the server side. However, it was discovered additional changes would be needed to have the metadata expiration be more robust when used in a Lockbox. The team will work on these follow-up tasks. Also this may need some changes in the TorizonCore Builder tool as well. Just a heads-up in case you are using a specific version of TorizonCore Builder in your automation.

In any case, I just wanted to let you know that the team is still actively working on this but, as with many things in software, additional complications tend to pop up.

Best Regards,
Jeremias


Thanks for the update. We do pin a specific version of TCB in our build process for environment consistency but this is not a major issue if we need to move to a newer version for the fix.

~BW

Hi @bw908,

We have a good update to bring you. The team made some adjustments to the metadata generation in the server backend. Furthermore, there was a change to TorizonCore Builder to adjust how the metadata gets fetched: backend/platform: fetch latest director root.json on lockbox download · torizon/torizoncore-builder@a7e893d · GitHub

These changes should help ensure that the root metadata generated as part of the Lockbox doesn’t expire prematurely. Note that there is currently no TorizonCore Builder release that contains the above commit. We don’t have any plans yet for a release, since we just did one last week before this fix came in. That said, we publish early-access builds of TorizonCore Builder every weekend that contain the latest commits. You could use one of these (after the next weekend at the time of writing) to get access to this fix. Alternatively, you could build the container image yourself.

That said, it was recently discovered that the latest TorizonCore Builder has an issue that causes generated Lockboxes containing docker-compose packages to not work correctly during an offline update. This was caused by the following commit: backend/bundle: Bump DIND container image version to 25.0.3 · torizon/torizoncore-builder@aeb51f6 · GitHub

So, keep that in mind if you use the latest version here. The team is aware of this other issue and is working on a solution.

Best Regards,
Jeremias


Thanks for the update!

Is this a sporadic issue that is fixed by re-building the lockbox or a persistent one that depends on some other condition? (Just trying to understand potential implications of using the early-access build).

Regards,
~BW