Lockboxes not valid after a

mnano · August 30, 2024, 6:57am

I downloaded a lockbox for offline update of the application a week ago. But after a while I get this error when trying to update:

Aug 30 06:36:26 verdin-imx8mp-14772918 aktualizr-torizon[925]: fetchMetaOffUpd() called with source_path: "/media/USB/update"
Aug 30 06:36:26 verdin-imx8mp-14772918 aktualizr-torizon[925]: Invalid role in root.json
Aug 30 06:36:26 verdin-imx8mp-14772918 aktualizr-torizon[925]: Invalid role in root.json
Aug 30 06:36:26 verdin-imx8mp-14772918 aktualizr-torizon[925]: Director metadata update failed: The version of role offline-updates does not match the entry in Snapshot metadata.
Aug 30 06:36:26 verdin-imx8mp-14772918 aktualizr-torizon[925]: Event: UpdateCheckComplete, Result - Error

This might have to do with our use case:
Our Current Setup:

We are testing offline updates in a system that simulates how devices will be updated in the field.
We’re using a lockbox created with Torizon Cloud to update devices to specific versions.
There are 2 types of updates: formal and non-formal.
- Formal updates use the lockboxes created by our CI, which are versioned. Going back and forth between those works. But after a non-formal update, I can’t update the device anymore.
- Non-formal updates. This is where a dev has built an executable. The test system takes that executable and makes it a Docker image. Then it pushes a docker-compose file using that Docker image to Torizon to make it a lockbox. That lockbox is downloaded, sent to the target device and mounted to simulate a USB drive.

Could you explain what this error means and whether our use case is to blame for it?

ToradexAI · August 30, 2024, 6:58am

Hi, I’m the Toradex AI Assistant. I’m in early beta, and I would appreciate your feedback.
The Toradex Support team will follow up with you shortly!

The error you’re encountering, specifically “Invalid role in root.json” and “The version of role offline-updates does not match the entry in Snapshot metadata,” indicates a mismatch or inconsistency in the metadata of the lockbox. This can happen due to several reasons, especially in a setup where both formal and non-formal updates are being used.

Here are some potential causes and solutions:

Metadata Mismatch:
- When you create a lockbox, it includes metadata that ensures the integrity and authenticity of the update. If the metadata in the lockbox does not match the expected metadata on the device, the update will fail.
- Ensure that the metadata for both formal and non-formal updates is consistent. This means that any non-formal updates should be properly versioned and signed to match the expected format.
Snapshot Metadata:
- The snapshot metadata keeps track of the versions of all roles (including offline-updates). If a non-formal update modifies the snapshot metadata in an unexpected way, it can cause subsequent updates to fail.
- Make sure that when creating non-formal updates, the snapshot metadata is correctly updated and consistent with the formal updates.
Role Definitions:
- The error “Invalid role in root.json” suggests that there might be an issue with the role definitions in the root.json file of the lockbox.
- Verify that the root.json file in your lockbox correctly defines all necessary roles and that these roles are consistent with the ones expected by the device.
Lockbox Revocation:
- If a lockbox is revoked or modified, the device will no longer accept it. Ensure that the lockbox you are using has not been revoked or modified in a way that makes it invalid.
- You can check the status of the lockbox in the Torizon Cloud Web Interface to ensure it is still valid.
Testing Environment:
- Since you are simulating updates in a test environment, ensure that the environment closely matches the production setup. Any discrepancies in the environment can lead to unexpected issues.

To resolve the issue, you can try the following steps:

Recreate the Lockbox:
- Recreate the lockbox for the non-formal update, ensuring that all metadata is correctly generated and consistent with the formal updates.
Check Metadata Consistency:
- Verify the consistency of the metadata in the lockbox, especially the root.json and snapshot metadata.

bw908 · August 30, 2024, 5:02pm

Are you re-using the same lockbox “name” each time for one of your lockboxes?

We used to see this sporadically when we were constantly modifying the definition of the same (“Software_Test”) lockbox with new revisions for internal deployment. Updates to newer builds went fine, but we’d see this issue when trying to downgrade.

My (non-official) understanding is that the update system expects the version information in the lockboxes to only ever increase to support new versions. In our case, if we attempted to “downgrade” to an older Software_Test lockbox we would see this error because the metadata version was both decreasing (older lockbox build) as well as “different” from what the OTA system knew about that lockbox (because the definition had changed). This went away when we progressed past internal development and started making “formal” lockboxes for release which had the build version in the name (“Product_v1.0.0”, “…v1.0.1” and so on).

bruno.tx · August 30, 2024, 6:03pm

Hello @mnano,

This means there’s something wrong or outdated with the security metadata that is on the Lockbox you are trying to install. You can try fixing this by re-downloading the Lockbox with TorizonCore Builder. This will re-create the Lockbox, but with up-to-date security metadata.

Let’s say that you have lockbox A installed on the device and you want to install lockbox B.
If you get that error, this means that:

- Lockbox B was downloaded before lockbox A was created and/or downloaded.
- The security metadata somehow changed between the download of lockbox B and the download of lockbox A. This can mean that a new lockbox was created or a lockbox was revoked.

For this reason, the Lockbox B you downloaded before cannot be installed, as the security metadata would need to be downgraded, which would be a security flaw.

You can still install lockbox B if it was not revoked, as long as you download it again from the Torizon Cloud with the updated security metadata.
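
The rule described above can be illustrated with a simple version comparison. This is only a minimal sketch of the idea, not how aktualizr actually verifies metadata: the function name and the integer version numbers are assumptions made up for the example, and the real check covers several signed roles.

```python
def can_install(device_metadata_version: int, lockbox_metadata_version: int) -> bool:
    """Simplified rollback check (hypothetical model).

    The device refuses any lockbox whose security metadata is older than
    the metadata it has already seen.
    """
    return lockbox_metadata_version >= device_metadata_version


# Made-up numbers: lockbox B was downloaded at metadata version 4,
# lockbox A later at version 7, and A is already installed on the device.
device_version = 7

print(can_install(device_version, 4))  # stale copy of lockbox B
print(can_install(device_version, 8))  # lockbox B downloaded again later
```

In this model the stale copy of lockbox B is rejected, while a fresh download of the same lockbox carries newer metadata and is accepted.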

This topic can be tricky, so please let me know if my explanation was not clear or if you have follow-up questions.

Best Regards,
Bruno

mnano · September 2, 2024, 7:12am

Thank you for the explanation.

If I understand correctly, this means that every time a lockbox is created or revoked, the security metadata of all other lockboxes is updated.

From this understanding, the correct way of handling for us is:

- (Formal versions) Our CI creates lockboxes that are tagged V1.0.1, V1.0.2, … Every time a new one is created, we need to re-download all of them if we want to store them on our network drive. Consequently, it’s probably better to store just the name of the lockbox.
- (Non-formal versions) Our test system creates a lockbox from a developer’s executable of our app. Thus, all lockboxes get their security metadata updated. This means the test system needs to download lockboxes for formal versions directly from Torizon Cloud.
- We can update a device with older security metadata using a lockbox with newer security metadata, but not the opposite. This means we can’t really cache update packages with the test system.

For example, I can imagine a few things:

- A customer downloads V1.1.0 at some point and then updates their device from V1.0.5 to V1.1.0 with a USB drive containing the lockbox. There is a bug in the app, so they want to downgrade. They cannot use their older USB drive containing the V1.0.5 lockbox; they need to re-download a V1.0.5 lockbox.
- A customer with multiple devices downloads a V1.1.0 update at some point. They update some devices but not all. One year later they update the ones that weren’t updated. As long as the lockbox has not expired, it should still be valid, right? The security metadata does not change other than when an update occurs?
- Does the security metadata get updated when a lockbox expires?

In our field (medical), it regularly happens that customers rarely update their devices. So I am trying to estimate the impact of this mechanism. If my understanding is correct, it should be fine as long as every update is downloaded directly from the Torizon platform, whether you downgrade or upgrade.

Or are the security metadata of all other lockboxes untouched when one gets created/revoked/recreated?

bruno.tx · September 2, 2024, 3:40pm

Hello @mnano,

The general understanding you put here is correct, but I am unsure about the exact triggers of new security metadata.
It is also important to mention that there are a few pitfalls when dealing with lockboxes that may lead to reaching a version limit for the metadata.

I will ask for someone from the Torizon team to comment on this specific use case so we can be sure that such problems will not occur.

Best Regards,
Bruno

bruno.tx · September 4, 2024, 6:34pm

Hello @mnano,

I spoke with one of the Torizon OS developers so I can give some more specific answers here. I will try to reply point by point in an order that should help you understand the full picture, but if you have any follow-up questions, please let me know.

To be precise, the security metadata of your whole repository is updated.
And each account on the Torizon Cloud has a unique repository, with all the packages (OS, applications, subsystems, etc) in that account.

This is correct, the security metadata cannot be downgraded to protect against rollback-based attacks.
If a lockbox with known-vulnerable or problematic software was revoked and the system has up to date metadata with this information, it will not be possible to rollback the security metadata to a version where the problematic lockbox was not revoked and could be installed.

Yes, in general that should be correct.

The security metadata has an expiry date as well, so this will also influence the ability to install the lockbox.
As long as the security metadata’s expiry date and the lockbox expiry date have not been reached the customer will be able to install the update.
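
The two conditions above can be written as a small check. This is a simplified sketch with made-up dates; the function and parameter names are hypothetical and do not correspond to any Torizon or aktualizr API.

```python
from datetime import datetime, timezone


def is_installable(now: datetime, metadata_expires: datetime, lockbox_expires: datetime) -> bool:
    """Both expiry dates must still be in the future (simplified model)."""
    return now < metadata_expires and now < lockbox_expires


# Made-up dates for illustration only.
metadata_expires = datetime(2025, 6, 1, tzinfo=timezone.utc)
lockbox_expires = datetime(2026, 1, 1, tzinfo=timezone.utc)

# Before both expiry dates: installable.
print(is_installable(datetime(2025, 3, 1, tzinfo=timezone.utc), metadata_expires, lockbox_expires))
# After the metadata expiry but before the lockbox expiry: not installable.
print(is_installable(datetime(2025, 9, 1, tzinfo=timezone.utc), metadata_expires, lockbox_expires))
```

Whichever of the two dates comes first determines how long a downloaded lockbox stays usable.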

The security metadata is for the whole repository/account, so changes to other lockboxes will reflect on it.
Your understanding is correct, there should be no problems installing lockboxes which have not been revoked, as long as they are downloaded from the Torizon Cloud.

This would be a valid approach, but you need to be careful about how this will be implemented to not run into the pitfall I mentioned previously.
There is a limit to the file size of the security metadata, and when you get to hundreds of unique lockboxes this will become a problem.
The way to avoid this is to not create multiple lockboxes, but create multiple versions of lockboxes.
For example, you can have effectively as many versions of the “non-formal” lockbox to test, as long as they are all versions of that lockbox, not unique lockboxes.
For the “formal” lockboxes, you could create one for each release, as long as the releases are planned in such a way that for the lifetime of the product the lockbox limit will not be a problem.
If the release schedule is once every quarter and the product has a lifetime of 20 years, you get 80 different release lockboxes, which should not be a problem.
Please note that even revoked lockboxes are added to the security metadata and therefore increase its file size, so revoking unused lockboxes is not really a solution.
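
The planning arithmetic above can be sketched as follows. The helper is hypothetical and only counts unique lockbox names over the product lifetime; it says nothing about where the actual limit lies.

```python
def planned_lockboxes(releases_per_year: int, lifetime_years: int, shared_dev_lockboxes: int = 0) -> int:
    """Count the unique lockboxes a release plan creates over the product lifetime.

    Each formal release adds one unique lockbox. A non-formal lockbox that is
    updated under the same name counts only once, however often it is updated.
    """
    return releases_per_year * lifetime_years + shared_dev_lockboxes


# Quarterly releases over a 20-year product lifetime, as in the example above.
print(planned_lockboxes(4, 20))  # 80

# The same plan plus one reused non-formal lockbox for testing.
print(planned_lockboxes(4, 20, shared_dev_lockboxes=1))  # 81
```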

If you have follow-up questions, please reach out as this topic can be quite complex.

Best Regards,
Bruno

mnano · September 5, 2024, 5:58am

Hello,

Thank you very much for the detailed answers. It is a lot clearer to me now.

I still have questions about the last part:

bruno.tx:

The way to avoid this is to not create multiple lockboxes, but create multiple versions of lockboxes.

What does “create multiple versions of lockboxes” mean? I can create multiple versions of a package, but for a lockbox I can only define a new one. Is that what happens when you create a lockbox with the same name as an existing one? I was under the impression that this was not possible with the API, which is why, for non-formal lockboxes, I thought I needed to revoke the previous one and create a new one.

So in my Torizon Cloud Platform, I would have 1 lockbox per release + 1 lockbox for non-formal dev version. For testing, we would just create the non-formal dev lockbox?

Best regards,

bruno.tx · September 5, 2024, 10:44am

Hello @mnano,

You are correct that the lockboxes are not explicitly versioned, but you can update an existing lockbox by using the same name when defining a new lockbox.
This is also possible via the API, as you can check here.
There is no need to revoke an existing lockbox before updating it.

Yes, that would be a viable solution.
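
The naming scheme discussed above could be sketched like this. It is a hypothetical helper that only decides which name a build should use; it does not call the Torizon Cloud API, and the "_dev" and "_v" suffixes are assumptions chosen for the example.

```python
from typing import Optional


def lockbox_name(product: str, release_version: Optional[str] = None) -> str:
    """Pick a lockbox name (hypothetical naming policy, not a Torizon API).

    Formal releases get one uniquely named lockbox each. Every non-formal
    build reuses the same name, so the existing lockbox is updated instead
    of a new unique lockbox being added to the repository metadata.
    """
    if release_version is None:
        return f"{product}_dev"
    return f"{product}_v{release_version}"


print(lockbox_name("Product", "1.0.1"))  # Product_v1.0.1
print(lockbox_name("Product"))           # Product_dev
```

With a policy like this, the number of unique lockboxes grows only with formal releases, not with every test build.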

Best Regards,
Bruno

bw908 · October 7, 2024, 7:44pm

There is a limit to the file size of the security metadata, and when you get to hundreds of unique lockboxes this will become a problem.

Approximately what order of magnitude is this limit? We currently have several offline-only products feeding into our platform account with regular bugfix/patch releases, with more to come. We currently have about 100 lockbox definitions.

It sounds like you are proposing we could “buy” some time by making internal (SQA/dev) releases re-use a lockbox definition, but this sounds like it may not fully mitigate the concern for us, only delay it.

What are the practical consequences of hitting this limit, and can it be mitigated if encountered?

bruno.tx · October 8, 2024, 6:51am

Hello @bw908,

The limit on the actual number of lockboxes will depend on the name of the lockboxes.
Smaller lockbox names will have a smaller impact on the filesize.

However, as you already have about 100 lockboxes, I would strongly recommend that you review your current workflow.
If the security metadata file reaches the size limit, you will no longer be able to create new lockboxes.
There is no way to mitigate this.

This will depend on the lifetime of the product, how long and with which cadence releases are expected.
The team that supports customers in your region is currently at Embedded World North America in Texas, but I will ask them to reach out to you afterwards so we can avoid issues with lockboxes.

Best Regards,
Bruno

jeremias.tx · October 11, 2024, 6:26pm

Greetings @bw908,

Just wanted to check in with you. If you still have some concerns about the Lockbox limit then we are open to discussing this further with you if that would help.

What my colleague Bruno has said here so far is true. To add some context, so far we’ve only had 1 known customer hit this limit, though this was mostly due to somewhat improper automation creating far more Lockboxes than were needed for their use case. In this case we were able to get them back into a good state by manually changing the databases on our server side. That said, this is not something we want to do, for obvious reasons.

Also, it should be noted that this “limit” is not something we are arbitrarily enforcing by our own choice. The security standard we adhere to demands that the security metadata files have a fixed maximum size. Since these files list every Lockbox, this creates an indirect limit on the number of Lockboxes.

All that to say the limit does technically need to exist due to the security standard we follow. Again though if you have any concerns about this topic, we are happy to discuss this with you further.
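
To illustrate why a fixed maximum file size turns into a limit on the number of Lockboxes, and why shorter names help, here is a rough sketch. All numbers are placeholders invented for the example; they are not the real size limit or the real per-entry size, and the actual metadata layout is not modeled.

```python
def max_lockboxes(size_limit_bytes: int, per_entry_overhead: int, name_length: int) -> int:
    """Estimate how many entries fit in a size-limited file.

    Assumes every entry costs a fixed overhead plus the length of its name.
    """
    return size_limit_bytes // (per_entry_overhead + name_length)


LIMIT = 100_000   # hypothetical size limit in bytes, NOT the real value
OVERHEAD = 200    # hypothetical bytes per entry, NOT the real value

short_names = max_lockboxes(LIMIT, OVERHEAD, len("v1.0.1"))
long_names = max_lockboxes(LIMIT, OVERHEAD, len("Product_Family_Release_v1.0.1"))

# Shorter names leave room for more entries under the same size limit.
print(short_names > long_names)
```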

Best Regards,
Jeremias

bw908 · October 11, 2024, 6:35pm

Hi @jeremias.tx ,

Yes, let’s connect on this issue outside of this thread. Many of our products either have no networking hardware or are used by customers that do not grant them internet access, so lockboxes are a key component in our use case rather than a nice-to-have. I think there are a number of mitigations to this issue, but of course not all of them may be practical.

BW908

jeremias.tx · October 11, 2024, 9:41pm

Yes, let’s connect on this issue outside of this thread

Sure, let me talk with the team internally and see if we can get something organized.

Best Regards,
Jeremias