Corrupt image when building Torizon minimal

Hello

We are building a custom Torizon minimal OS image for our devices. We noticed that the build sometimes produces a “corrupt” image.

The build process finishes without errors and the image is uploaded to Torizon Cloud successfully. But when we install it on a device, we see many errors like the following:

[FAILED] Failed to start Network Name Resolution.
[FAILED] Failed to start Network Time Synchronization.
[FAILED] Failed to start Network Name Resolution.
[FAILED] Failed to start Network Time Synchronization.
[FAILED] Failed to start Network Name Resolution.
[FAILED] Failed to start Network Time Synchronization.
[FAILED] Failed to start Network Name Resolution.
[FAILED] Failed to start Network Time Synchronization.
[FAILED] Failed to start Network Name Resolution.
[FAILED] Failed to start Network Name Resolution.
[FAILED] Failed to start Network Time Synchronization.
[FAILED] Failed to start Network Time Synchronization.
[FAILED] Failed to start Network Configuration.
[FAILED] Failed to start Avahi mDNS/DNS-SD Stack.
[FAILED] Failed to start D-Bus System Message Bus.
[DEPEND] Dependency failed for Network Manager.
[DEPEND] Dependency failed for Network Manager Wait Online.
[FAILED] Failed to start D-Bus System Message Bus.
[FAILED] Failed to start Network Configuration.
[FAILED] Failed to start D-Bus System Message Bus.
[FAILED] Failed to start Network Configuration.
[FAILED] Failed to start D-Bus System Message Bus.
[FAILED] Failed to start D-Bus System Message Bus.
[FAILED] Failed to start Network Configuration.
[FAILED] Failed to start D-Bus System Message Bus.
[FAILED] Failed to start Bluetooth service.
[FAILED] Failed to start Network Configuration.
[FAILED] Failed to start Network Configuration.
[FAILED] Failed to start System Logger Daemon "default" instance.
[FAILED] Failed to start Disk device monitor based on udisks2.
[FAILED] Failed to start System Logger Daemon "default" instance.
[FAILED] Failed to start System Logger Daemon "default" instance.
[FAILED] Failed to start System Logger Daemon "default" instance.
[FAILED] Failed to start System Logger Daemon "default" instance.
[FAILED] Failed to start System Logger Daemon "default" instance.

The device can no longer connect to the cloud agent, so we can’t install another, working update either.

We dug a bit deeper and found that the group file seems to be incomplete:

# NSS altfiles module is installed. Default user, group and shadow files are in
# /usr/lib/
adm:x:4:torizon
dialout:x:20:torizon
sudo:x:27:torizon
audio:x:29:torizon
video:x:44:torizon
plugdev:x:46:torizon
gpio:x:49:torizon
i2cdev:x:51:torizon
spidev:x:52:torizon
pwm:x:54:torizon
users:x:100:torizon
input:x:101:torizon
torizon:x:1000:
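
(For anyone debugging a similar state: a quick way to confirm that group resolution itself is broken is to query NSS on the device, which merges /etc/group with the altfiles defaults in /usr/lib/. The specific group names checked below, systemd-resolve and messagebus, are assumptions based on which services are failing.)

```shell
# getent resolves groups through all configured NSS sources (files + altfiles),
# so on a healthy image the groups backing the failing services should appear.
getent group | sort | head -n 20
# Spot-check the groups assumed to back systemd-resolved and D-Bus:
getent group systemd-resolve messagebus || echo "expected service groups are missing"
```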

We are using a fork of your torizon-manifest repo and build with bitbake. My assumption is that it happens when we switch branches and start a new build. Do you have more insight into why this happens?

Best Regards,

Ian

Hi, I’m the Toradex AI Assistant.

I might make mistakes or occasionally misunderstand you. Our support team will step in if needed.


I’d be glad to help troubleshoot this issue with your custom Torizon minimal OS build! To provide the most accurate guidance, could you let me know:

  • Which hardware platform are you using?
  • Which exact version of Torizon OS are you building from?

You can get this information by running the tdx-info command on a working device. More details about this command are available at: Getting Device Information with Tdx-Info | Toradex Developer Center

Initial Analysis

Based on the errors you’re seeing, this appears to be a filesystem corruption issue affecting system groups and potentially other core system files (1). The incomplete /etc/group file you’ve identified is likely causing the cascade of service failures, as many system services depend on proper user/group resolution (1).

Potential Causes for “Corrupt” Images

When building Torizon OS with Yocto, several factors can lead to incomplete or corrupted builds (2):

1. Build State Contamination

Switching branches without proper cleanup can leave residual build artifacts that interfere with new builds (2). The Yocto build system maintains extensive state information that may not be fully reset between branch changes.

2. Incomplete Build Cleanup

The build system stores intermediate files in the build/tmp directory, and these may not be properly cleaned when switching contexts (2).

Recommended Solutions

1. Clean Build Environment

When switching branches in your forked torizon-manifest repo, perform a complete build cleanup (2):

$ rm -rf build/tmp
$ rm -rf build/cache

(2)
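
If wiping tmp and cache is not enough, the shared state cache can also carry stale task results across branch switches. A minimal sketch, assuming the default build directory layout (adjust BUILDDIR to your environment):

```shell
# Assumption: default Yocto layout with everything under ./build.
BUILDDIR=${BUILDDIR:-build}
# tmp: build output; cache: parser caches; sstate-cache: reusable task results
# that can go stale when switching branches.
rm -rf "$BUILDDIR/tmp" "$BUILDDIR/cache" "$BUILDDIR/sstate-cache"
```

Note that removing sstate-cache forces a full rebuild, so it is slow, but it rules out stale shared state as the cause.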

2. Verify Git State

Since Torizon OS requires all custom layers to be version controlled by Git for OSTree integration, ensure your branch switches are complete (3):

$ git status
$ git clean -fd

(3)
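
Since a torizon-manifest checkout usually contains several layer repositories, it can also help to verify that each one is clean after a branch switch. A sketch for a single repository (if you use the repo tool, repo forall can run the same check across all layers):

```shell
# Count leftover modified/untracked entries after a branch switch; anything
# non-zero here can leak into the next build.
leftovers=$(git status --porcelain 2>/dev/null | wc -l)
echo "untracked/modified entries: $leftovers"
```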

3. Use Fresh Build Directory

Consider using separate build directories for different branches to avoid cross-contamination (2).
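
One way to do this is to derive the build directory name from the current branch, so two branches can never share tmp or caches. The naming scheme below is purely illustrative:

```shell
# Derive a per-branch build directory name (illustrative scheme only).
branch=$(git rev-parse --abbrev-ref HEAD 2>/dev/null || echo unknown)
builddir="build-$(echo "$branch" | tr / -)"
mkdir -p "$builddir"
echo "building in: $builddir"
```

You would then point the Yocto environment setup script at this directory when initializing the build.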

4. Verify Layer Configuration

Check that your conf/bblayers.conf is properly updated for the target branch and that all required layers are present (2).
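
With an initialized bitbake environment, bitbake-layers show-layers lists the effective layer configuration. As a lighter check that works without initializing the environment, you can inspect the layer list directly (the path below assumes the default conf location):

```shell
# Quick sanity check of the configured layers without running bitbake
# (path assumes the default conf/ location inside the build directory).
grep -A 20 BBLAYERS conf/bblayers.conf 2>/dev/null \
  || echo "conf/bblayers.conf not found - run the setup script first"
```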

Once I know your specific hardware platform and Torizon OS version, I can provide more targeted troubleshooting steps for your build environment.

Best regards, Toradex AI Assistant.



If I answered your question, please tick the “Solution” checkbox.

For follow-up questions to me, please use @ToradexAI in your post.

Hello @i.staehli,

This error is not expected.

To be clear, the error only manifests when one of these “broken” OS packages is sent as an update to the device, correct?
Is the OS update rolled back, or does the device get stuck?

If you flash a device with a TEZI image that was generated on the same Yocto build, does it also have the same problem?

Another point that may be relevant is the build environment.
Does this occur only on a specific build machine, or across different developers’ machines?
Is anything related to the sstate or build temporary files configured differently from the default?

Best Regards,
Bruno

Hi @bruno.tx

  1. Yes, the errors on the device only appear after the broken OS package is sent to the device
  2. The OS update is not rolled back and we can’t install another update, because it’s not working properly
  3. When flashing with TEZI we get the same errors on the device
  4. So far only I have encountered this problem, but we all use the same server to build our images
  5. I’m not familiar with how the sstate or temporary build files can be configured, but we shouldn’t have changed anything from default here