Problems when migrating the M4 firmware from Yocto to Torizon

Hi there,

We are trying to migrate our firmware to Torizon, but we are finding it more difficult than expected. We have been able to make the echo example work in Torizon 7.3 by using the MCUXpresso NXP MCU Tools (v25.6).

Then, we tried to make a simplified version of our firmware (which has been working on Yocto for years) work on Torizon, without success. The way it fails is quite weird: we can make it work if we load the echo example first and then reboot the SoM by running the “reboot” Linux command. However, if we power off the module, then Linux does not boot any more…

We are running out of ideas, so any input is welcome. Lately, we have checked the U-Boot environment and found some differences between Yocto and Torizon, but we hardly understand their meaning. Does anyone know why the U-Boot environment has changed?

Thanks in advance.

Hi, I’m the Toradex AI Assistant.

I might make mistakes or occasionally get you wrong. Our support team will step in if needed.


U-Boot Environment Differences Between Yocto and Torizon

Understanding the Environment Differences

The U-Boot environment differences you’re seeing between Yocto and Torizon are expected and stem from their different architectural approaches(1)(2).

Key Differences in U-Boot Environment

U-Boot environment variables control bootloader behavior, influencing aspects such as boot order, console settings, and hardware initialization during system startup(3). The differences between Yocto and Torizon environments occur because:

  1. Different Distribution Approaches: Torizon is a ready-to-use industrial Linux platform that simplifies embedded software development(4), while Yocto provides a more customizable build framework(2)

  2. Boot Configuration Variations: Each distribution may have different default values for critical variables like boot_targets, console, fdtfile, and other boot-related parameters(3)

Important Considerations for Environment Variables

Environment Persistence Behavior

When working with U-Boot environment variables, be aware that(1):

  • Changes made with setenv only affect the current U-Boot instance in RAM
  • To persist changes, you must use saveenv to write to non-volatile memory
  • After saveenv, the entire environment is saved, not just the changed variable

Potential Issues with Environment Changes

Your boot issues after power-off could be related to environment variable persistence. When saveenv is used, it can have unexpected consequences(1):

  • Variables that are normally set dynamically at boot may become statically set
  • This can break the dynamic aspects of device tree loading or other boot processes

Troubleshooting Steps

1. Compare Environment Variables

Use the following commands to examine your current environment:

printenv  # List all currently set variables
printenv boot_targets  # Check specific boot order
printenv fdtfile  # Check device tree file selection

(3)

2. Check for Saved Environment

Your firmware loading behavior (working after echo example + reboot, but not after power-off) suggests there might be environment variables being saved that interfere with normal boot(1).

3. Reset to Default Environment

If you suspect corrupted or incorrect saved environment variables, you may need to reset to defaults. However, be careful as this will remove any custom configurations.

Best Practices

Avoid Copying Entire Environments

Never copy the entire environment from one module to another, as this can permanently change hardware-specific settings like MAC addresses(1). Select only the variables that need to be changed.

Understanding Variable Sources

U-Boot environment comes from two main sources(1):

  1. Default environment: Built into U-Boot binary at compilation time
  2. Saved environment: Stored in non-volatile memory and loaded if CRC is valid
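To make the "loaded if CRC is valid" rule concrete, here is a minimal Python sketch. It assumes the common non-redundant layout (a 4-byte little-endian CRC32 followed by NUL-separated name=value strings, ended by an extra NUL). Redundant environments add a flag byte after the CRC, which this sketch does not handle. The function names, the 64-byte size, and the 0xFF padding are assumptions chosen for illustration, not values taken from a Torizon image.

```python
import struct
import zlib


def build_env_blob(variables, size=64, pad=b"\xff"):
    """Pack variables behind a CRC32 header (assumed non-redundant layout)."""
    body = b"".join(
        f"{name}={value}".encode() + b"\0" for name, value in variables.items()
    )
    body += b"\0"  # an empty string marks the end of the variable list
    if len(body) > size - 4:
        raise ValueError("environment does not fit in the given size")
    body = body.ljust(size - 4, pad)
    return struct.pack("<I", zlib.crc32(body)) + body


def saved_env_is_valid(blob):
    """Return True if the stored CRC matches the CRC of the data area."""
    (stored_crc,) = struct.unpack_from("<I", blob, 0)
    return stored_crc == zlib.crc32(blob[4:])


def parse_env(blob):
    """Return the variables of a blob as a dict, stopping at the end marker."""
    variables = {}
    for entry in blob[4:].split(b"\0"):
        if not entry:
            break
        name, _, value = entry.decode().partition("=")
        variables[name] = value
    return variables


# Hypothetical example: one variable, then a single flipped byte.
blob = build_env_blob({"m4_0_image": "custom.bin"})
corrupted = blob[:10] + bytes([blob[10] ^ 0xFF]) + blob[11:]
print(saved_env_is_valid(blob), saved_env_is_valid(corrupted))
```

When the check fails, U-Boot falls back to the built-in default environment, which is why a damaged saved environment shows up as "all my changes are gone" rather than as a boot failure.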

Recommendations

  1. Document Current Settings: Capture the working environment after your echo example + reboot scenario
  2. Compare with Fresh Boot: Compare this with the environment from a fresh Torizon installation
  3. Identify Critical Differences: Focus on boot-related variables that might affect your firmware loading
  4. Test Incremental Changes: Make small, targeted changes rather than bulk environment modifications
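The comparison described in these recommendations can be sketched in Python as follows. It assumes you have captured the printenv output of both systems as text (for example from the serial console). The function names and the sample values are hypothetical and only illustrate the shape of the data.

```python
def parse_printenv(text):
    """Parse 'name=value' lines from a captured printenv dump."""
    env = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        # Skip lines without '=' such as the trailing "Environment size" line.
        if sep and name and " " not in name:
            env[name] = value
    return env


def diff_env(old, new):
    """Return (only_in_old, only_in_new, changed) for two parsed environments."""
    only_old = {k: old[k] for k in old.keys() - new.keys()}
    only_new = {k: new[k] for k in new.keys() - old.keys()}
    changed = {
        k: (old[k], new[k]) for k in old.keys() & new.keys() if old[k] != new[k]
    }
    return only_old, only_new, changed


# Hypothetical dumps; real ones contain many more variables.
yocto_dump = """bootcmd=run distro_bootcmd
m4_0_image=hello_world.bin
Environment size: 100/8188 bytes"""
torizon_dump = """bootcmd=run m4boot_0; run distro_bootcmd
m4_0_image=hello_world.bin
fdtfile=example.dtb"""

only_yocto, only_torizon, changed = diff_env(
    parse_printenv(yocto_dump), parse_printenv(torizon_dump)
)
for name, (before, after) in sorted(changed.items()):
    print(f"{name}: {before!r} -> {after!r}")
```

Working from the three result sets keeps the focus on the handful of variables that actually differ, instead of copying a whole environment across.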

The environment differences between Yocto and Torizon are by design, reflecting their different use cases and optimization approaches. Your firmware integration challenge likely requires adapting to Torizon’s specific boot configuration rather than trying to replicate the exact Yocto environment.

Best regards, Toradex AI Assistant.


Did you like my answer? :+1: :-1:

If I answered your question, please tick the “Solution” checkbox.

For follow-up questions to me, please use @ToradexAI in your post.

Hi @ToradexAI ,

The only change we are making to the environment is modifying the m4_0_image parameter, so that our custom firmware is executed instead of the echo example. I see no reason why this should prevent Linux from booting, unless there’s something wrong in the firmware itself. However, we have used it intensively in Yocto without problems, so I don’t think that’s the reason. It may also be that the MCUXpresso build was the problem, but we used it for the echo example and it worked…

Any further input is welcome.

V

Cortex-M Firmware Loading Issue Analysis

Based on your description, the issue appears to be related to how the Cortex-M firmware is being loaded and managed between different boot scenarios. Let me analyze the potential causes:

Memory Configuration Considerations

When migrating from Yocto to Torizon, one critical aspect is ensuring proper memory area configuration for Heterogeneous Multi-core Processing (HMP)(1). The memory map allocation for firmware, data, and shared buffers needs to be correctly configured between the Cortex-A and Cortex-M cores(1).

Known Issue with EXT4LOAD in Torizon OS 7

There’s a documented issue specifically related to Torizon OS 7 and firmware loading methods(2). The EXT4LOAD method that was functional in Torizon OS 6 has been reported as non-functional in specific versions of Torizon OS 7, including version 7.0.0+build.1(2). This could explain why your firmware loading behavior is inconsistent.

Potential Root Causes

1. Memory Area Conflicts

Your firmware might be using memory regions that conflict with Torizon’s memory layout. When you load the echo example first, it may be setting up the memory areas correctly, but your custom firmware might have different memory requirements(1).

2. Resource Sharing Configuration

The pin configuration and resource sharing between Cortex-M and Cortex-A cores might differ between your Yocto and Torizon setups(1). This includes:

  • Pin function and settings sharing across cores
  • Memory map allocation differences
  • Boot sequence variations

3. RemoteProc Framework Differences

Torizon uses the RemoteProc Linux kernel framework to boot and control remote processors(1). The way your firmware interacts with this framework might differ from the Yocto implementation.

Troubleshooting Steps

1. Verify Memory Areas Configuration

Check if your firmware’s memory configuration matches Torizon’s expected layout. Review the memory map on your Toradex SoM to ensure proper allocation for firmware, data, and shared buffers(1).

2. Compare Firmware Loading Methods

Since there are known issues with EXT4LOAD in Torizon OS 7(2), consider alternative loading methods:

  • Using RemoteProc framework for firmware loading(1)
  • Checking if your loading method is compatible with Torizon OS 7

3. Resource Sharing Validation

Ensure your pin configuration and hardware resource sharing is properly configured for Torizon(1). This includes:

  • Pin function specifications
  • Memory area allocations
  • Inter-processor communication setup

4. Boot Sequence Analysis

The fact that it works after loading the echo example suggests a boot sequence or initialization order issue. The echo example might be performing some initialization that your firmware relies on but doesn’t perform itself.

Recommendations

  1. Test with Different Loading Methods: Try using RemoteProc instead of direct U-Boot loading if you’re currently using EXT4LOAD(1)
  2. Memory Layout Verification: Compare the memory layouts between your working Yocto setup and Torizon
  3. Incremental Migration: Start with the echo example and gradually modify it toward your firmware functionality
  4. Boot Sequence Documentation: Document exactly what the echo example does during initialization that might be missing from your firmware

The issue likely stems from differences in how Torizon OS 7 handles Cortex-M firmware loading compared to your previous Yocto setup, particularly given the known regressions in certain loading methods(2).

Best regards, Toradex AI Assistant.



@ToradexAI

  1. I’d say the memory areas are correct. I’ve seen that loadaddr and kernel_addr_r take the same value in the U-Boot environment, but that doesn’t seem to be a problem for the echo example…
  2. We are not using ext4load (I think), nor remoteproc, which is unavailable for my Colibri module (I think).
  3. Not sure how to check this. Could you provide further details?
  4. bootcmd = 'run m4boot_0; run distro_bootcmd'

V

Colibri Module Cortex-M Firmware Loading Analysis

Based on your setup with bootcmd = 'run m4boot_0; run distro_bootcmd' and your Colibri module, let me provide specific guidance:

1. Memory Address Configuration

The fact that loadaddr and kernel_addr_r have the same value is actually normal in the U-Boot environment (1). These variables serve different purposes:

  • loadaddr: Contains the memory address where data (e.g., kernel image) can be loaded (1)
  • kernel_addr_r: Contains the memory address where the kernel is loaded (1)

Both being set to the same value indicates they’re using the default load address for temporary operations.

2. Your Loading Method (Not EXT4LOAD)

Since you’re using run m4boot_0 in your bootcmd, you’re using the correct method for Colibri modules. This command is specifically designed for loading Cortex-M firmware (2). The m4boot_0 command loads and starts the Cortex-M application (2).

3. Resource Sharing Verification

For Colibri modules, particularly the iMX7, resource sharing is critical. Here’s how to verify:

Device Tree Configuration

Your firmware might be accessing peripherals that Linux also tries to use. For example, if your firmware uses UART_B, you need to disable it in the Linux device tree (3). This is done by setting the status property to disabled using device tree overlays (3).

Resource Domain Controller (RDC)

On iMX7-based Colibri modules, the Resource Domain Controller controls access to peripherals between cores (3):

  • By default, the Cortex-A7 core is in domain 0 with all peripherals assigned to it (3)
  • When FreeRTOS firmware starts, the Cortex-M4 reassigns itself and required peripherals to domain 1 (3)

4. Boot Sequence Analysis

Your bootcmd = 'run m4boot_0; run distro_bootcmd' configuration is correct (2). The sequence:

  1. Loads and starts Cortex-M firmware via m4boot_0 (2)
  2. Proceeds with Linux boot via distro_bootcmd (4)

Potential Issue: Firmware Initialization

Since your firmware works after the echo example but fails on cold boot, the issue likely lies in initialization differences between your firmware and the echo example. The echo example might be performing hardware initialization that your firmware depends on but doesn’t implement.

Troubleshooting Steps

Check Device Tree Conflicts

Ensure any peripherals your firmware uses are disabled in the Linux device tree (3). This prevents external aborts when Linux tries to access peripherals already claimed by the Cortex-M firmware.

Compare Initialization Sequences

The echo example likely performs proper hardware initialization that your firmware might be missing. Review your firmware’s hardware initialization against the echo example.

Verify Memory Areas

For Colibri iMX7, ensure your firmware is linked to run in the correct memory area (typically TCM) (3). All examples are linked to run in the TCM area (3).

Recommendation

Since you mentioned this firmware worked extensively with Yocto, the issue is likely related to different device tree configurations or hardware initialization sequences between Yocto and Torizon. Focus on:

  1. Comparing device tree configurations between your working Yocto setup and Torizon
  2. Ensuring proper peripheral disable/enable settings in the device tree
  3. Verifying your firmware performs all necessary hardware initialization

The boot sequence you’re using (run m4boot_0; run distro_bootcmd) is the standard and correct approach for Colibri modules (2).

Best regards, Toradex AI Assistant.



@ToradexAI

Regarding your recommendations:

  1. and 2. The device tree we are using in Torizon is a “translation” of the one we have been using in Yocto, so I’d say it is not causing the problem.
  3. The Torizon firmware is the same as the Yocto one. Does the former require additional initializations?

By the way, we are using a Colibri iMX8QXP module.

V

Colibri iMX8QXP Specific Considerations

Thank you for clarifying that you’re using a Colibri iMX8QXP module (1). This changes the analysis significantly, as the iMX8QXP has different architecture and capabilities compared to the iMX7.

Heterogeneous Multicore Architecture

The Colibri iMX8QXP features up to 4x Arm Cortex-A35 cores and 1x Cortex-M4 core (1). The heterogeneous multicore architecture allows offloading of critical hard real-time tasks to the Cortex-M4F processors for extremely low latency (1).

Key Differences from Your Previous Analysis

  1. Different SoC Architecture: The iMX8QXP uses a different approach for multicore processing compared to iMX7
  2. Memory Configuration: The memory layout and initialization requirements may differ between Yocto and Torizon on this specific SoC
  3. Boot Sequence Timing: The timing between Cortex-M4 and Cortex-A core initialization might be critical

Potential Root Cause Analysis

Since you mentioned:

  • Your device tree is a “translation” from Yocto to Torizon
  • The firmware is identical between both systems
  • It works after echo example + reboot but fails on cold boot

This strongly suggests a boot sequence timing or initialization order issue specific to the iMX8QXP architecture.

Possible Issues

  1. Cortex-M4 Initialization Timing: The iMX8QXP may require specific timing between when the Cortex-M4 firmware starts and when Linux begins its boot process

  2. System Controller (SCU) Interaction: The iMX8QXP uses a System Controller Unit that manages power, clocks, and resources. Your firmware might need different SCU interactions in Torizon vs Yocto

  3. Power Management Differences: Torizon might have different power management policies that affect how the Cortex-M4 core is initialized

Troubleshooting Recommendations

1. Boot Timing Analysis

Compare the exact timing of when your firmware starts versus when Linux begins booting. The echo example might be completing faster, allowing proper sequencing.

2. Check U-Boot Environment Variables

Examine these specific variables that might affect iMX8QXP boot timing:

  • m4_0_image (which you’re already modifying)
  • Any timing-related variables
  • Power management settings

3. Firmware Size and Complexity

Since the echo example works but your firmware doesn’t, consider:

  • Firmware size: Larger firmware takes longer to load and initialize
  • Initialization complexity: Your firmware might be performing operations that interfere with Linux boot timing
  • Resource usage: Your firmware might be claiming resources that Linux expects to be available

4. Device Tree Validation

Even though you’ve “translated” the device tree, verify that:

  • Memory reservations are correctly configured for the iMX8QXP
  • The /reserved-memory node properly allocates space for your firmware (2)
  • Any Cortex-M4 specific configurations match Torizon’s expectations
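A quick way to sanity-check memory reservations is to list every region involved and test them pairwise for overlap. The Python sketch below does that. The region names and all addresses are made up for illustration and are not Colibri iMX8QXP values, so substitute the numbers from your device tree, linker script and U-Boot environment.

```python
from itertools import combinations


def overlaps(a, b):
    """True if two (start, size) regions share at least one byte."""
    a_start, a_size = a
    b_start, b_size = b
    return a_start < b_start + b_size and b_start < a_start + a_size


def find_conflicts(regions):
    """Return the pairs of region names whose address ranges overlap."""
    return [
        (name_a, name_b)
        for (name_a, region_a), (name_b, region_b) in combinations(regions.items(), 2)
        if overlaps(region_a, region_b)
    ]


# Hypothetical layout: (start address, size in bytes).
regions = {
    "m4_firmware": (0x8800_0000, 0x0080_0000),
    "rpmsg_vrings": (0x9000_0000, 0x0001_0000),
    "kernel_load": (0x8820_0000, 0x0200_0000),
}

for name_a, name_b in find_conflicts(regions):
    print(f"conflict: {name_a} overlaps {name_b}")
```

In this invented layout the firmware region and the kernel load area collide, which is the kind of conflict a missing or misplaced reserved-memory entry would allow.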

Specific Questions to Investigate

  1. Does your firmware perform any lengthy initialization routines that the echo example doesn’t?
  2. Are you using any specific peripherals that might have different initialization requirements on Torizon?
  3. What’s the exact size difference between your firmware and the echo example?

The fact that it works after a reboot (warm start) but not after power-off (cold start) is the key clue - this typically indicates initialization sequence or timing dependencies that are resolved during the warm boot cycle.

Best regards, Toradex AI Assistant.



@ToradexAI

  1. No
  2. No
  3. The echo bin file is 37K and our custom firmware is 42K

V

Firmware Size Analysis

The 5 KB difference between the echo example (37K) and your custom firmware (42K) is relatively small and shouldn’t cause loading issues by itself, which makes firmware size unlikely to be the primary cause.

Deeper Investigation Required

Given that your firmware:

  • Is only slightly larger than the working echo example
  • Uses no lengthy initialization routines
  • Accesses no special peripherals
  • Uses the same device tree translation from Yocto

The issue likely stems from subtle differences in how the firmware interacts with the boot sequence on the Colibri iMX8QXP.

Potential Investigation Areas

1. Memory Layout Validation

Even with the same device tree translation, verify that your firmware’s memory allocation exactly matches what Torizon expects. The reserved memory areas for Cortex-M4 might have subtle differences.

2. Boot Sequence Dependencies

Your firmware might have implicit dependencies on system state that the echo example establishes but your firmware doesn’t. This could include:

  • Clock initialization sequences
  • Power domain configurations
  • System Controller Unit (SCU) states

3. Firmware Entry Point and Linking

Compare how your firmware is linked and its entry point configuration versus the echo example. Even small differences in memory mapping or startup code could cause issues.

4. Hardware Abstraction Layer (HAL) Differences

If you’re using different HAL libraries or SDK versions between your Yocto and Torizon builds, there might be subtle initialization differences.

Recommended Next Steps

  1. Compare startup code: Examine the startup assembly code and initialization routines between your firmware and the echo example
  2. Memory map verification: Ensure your linker script produces identical memory layout to what worked in Yocto
  3. Boot timing analysis: Add debug output to determine exactly where the boot process fails when your firmware is loaded on cold boot
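The startup-code and memory-map comparison in the steps above can begin with the vector table. On a Cortex-M core the first two 32-bit words of the vector table are the initial stack pointer and the reset handler address (with bit 0 set to indicate Thumb code). The Python sketch below reads them from a raw .bin file, assuming the image starts with its vector table. The helper names and the region bounds in the example are assumptions, so take the real bounds from your linker script.

```python
import struct


def read_vector_table(image):
    """Return (initial_sp, reset_vector) from the first 8 bytes of a raw image."""
    if len(image) < 8:
        raise ValueError("image too short to contain a vector table")
    return struct.unpack_from("<II", image, 0)


def describe(image, code_region, ram_region):
    """Check the vector table against (start, end) bounds of code and RAM."""
    initial_sp, reset_vector = read_vector_table(image)
    reset_handler = reset_vector & ~1  # clear the Thumb bit to get the address
    return {
        "initial_sp": initial_sp,
        "reset_handler": reset_handler,
        "thumb_bit_set": bool(reset_vector & 1),
        "reset_in_code_region": code_region[0] <= reset_handler < code_region[1],
        # The stack pointer may legitimately equal the end of RAM.
        "sp_in_ram_region": ram_region[0] < initial_sp <= ram_region[1],
    }


# Hypothetical image header and example bounds (replace with your own).
example_image = struct.pack("<II", 0x2002_0000, 0x1FFE_0401) + bytes(8)
report = describe(
    example_image,
    code_region=(0x1FFE_0000, 0x2000_0000),
    ram_region=(0x2000_0000, 0x2002_0000),
)
for key, value in report.items():
    print(key, hex(value) if isinstance(value, int) and not isinstance(value, bool) else value)
```

Running this on both the echo example and the custom firmware shows at a glance whether the two binaries are linked for the same memory regions.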

The fact that it works after reboot but not after power-off suggests the issue is in the cold boot initialization sequence rather than the firmware functionality itself.

Best regards, Toradex AI Assistant.



Well, any human developer please have a look at this. The AI is not helpful anymore…

Hey @vic,

I’m not sure if this is what is causing the issue, but take a look at this other recent post regarding the echo example:

Can you compare and check if you are facing the same issue? If you are, then this is an issue on NXP’s side. The temporary solution is to either start the Cortex-M firmware from Linux with remoteproc or rebuild U-Boot with pd_ignore_unused. This should, however, be solved automatically in a future version of Torizon, since the upstream kernel doesn’t face this issue anymore.

Still, I find it weird that your board is not booting after power off. Do you see any error messages when the system fails? It would help to have logs to diagnose the problem. Let me know if any of this helps.

Best regards,

I understand you’re looking for human developer input now, and that’s completely reasonable given the complexity of this issue. However, let me provide one final analysis based on the information you’ve shared and the context provided by the human developer’s reference to the NXP Community post.

Potential Root Cause: Power Domain Issue

The reference to the NXP Community post about “remote proc kick failed when rpmsg is launched by U-Boot” suggests this could be a power domain management issue specific to newer MCUXpresso SDK versions and Torizon OS 7.x.

Key Observations Supporting This Theory

  1. Your firmware works after echo example + reboot: This suggests the power domains are properly configured after the first boot cycle
  2. Fails on cold boot: Power domains might not be properly initialized for your custom firmware on cold start
  3. Same firmware works in Yocto: Different power management policies between distributions
  4. MCUXpresso SDK version differences: Newer SDK versions may have different power domain handling

Immediate Testing Recommendations

Based on the human developer’s suggestion, try these approaches:

Option 1: Use RemoteProc Loading

Instead of loading via U-Boot (m4boot_0), try loading your firmware from Linux using the RemoteProc framework(1). This bypasses the U-Boot power domain initialization issues.

Option 2: Verify Power Domain Configuration

Check if your custom firmware properly handles power domain initialization that the echo example might be performing implicitly.

Option 3: Compare SDK Versions

Verify if you’re using the same MCUXpresso SDK version that worked with the echo example. The issue might be related to SDK version differences between your Yocto build and current Torizon build.

Debug Information Needed

To help the human developers diagnose this further, please provide:

  1. Boot logs: Complete serial console output when the system fails to boot with your custom firmware
  2. MCUXpresso SDK versions: Exact versions used for both echo example and your firmware
  3. Error messages: Any specific error messages during the failed boot sequence

The fact that this appears to be a known issue with a temporary workaround (pd_ignore_unused in U-Boot) suggests this will be resolved in future Torizon versions, but the immediate solutions above should help you continue development.

Best regards, Toradex AI Assistant.



Hi Leonardo,

Actually @Vic and I are working on this issue together :smiley: so yes, we have got pd_ignore_unused built in.

Herewith a log output from a cold boot when it is failing :-
U-BootLog_Fail.txt (6.2 KB)

This is a log when we launch with the working rpmsg_echo running on the M4 :-
U-BootLog_Good.txt (2.6 KB)

It may be worth noting that today, the first time I started the device with our failing firmware, Linux did boot. The RPMSG failed to initialise though. I think I have seen this happen before. It is clearly only once in a blue moon. Some timing issue perhaps?

Still, the reboot vs. cold boot difference I guess indicates that the rpmsg_echo is initialising something (hardware?) that our TestRPMSG firmware is not, and that is not being reset by the reboot command. These things are beyond my level of knowledge though :smiley:

If you know more about the addresses mentioned in the errors of the log, we might tease something from there.

TIA

@Orion Hi - we were starting to wonder if we might be the only ones facing these issues.

What worked fine ?

Right. We have baulked at effectively re-writing our source code starting from the working example project due to time constraints, but currently it seems the only option left open to us if we are to transfer to Torizon.

Hey @Timo @vic,

It’s a bit hard to get the full context of the issue through so many responses, so let me know if I understood everything correctly: After migrating to Torizon, your custom firmware for Cortex-M does not work, except if you run the echo example first, then enable your custom firmware and reboot the system. After a power cycle, the kernel doesn’t boot, but you observed that this is actually not consistent and Linux does boot sometimes with your custom firmware. You guys are already using the pd_ignore_unused for U-Boot, so the issue is not what I pointed out earlier. Is this all correct?

From the logs you shared, the kernel is trying to access an unmapped virtual address (DABT error with level 3 translation fault). This is difficult to debug without the firmware you are loading into the Cortex-M. Does this error ever happen when you try to run an example code from NXP?
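For readers who want to interpret such a data abort themselves, the fault type is encoded in the ESR value that the arm64 kernel prints with the oops. Below is a minimal Python sketch of the decoding, covering only the data-abort exception classes and the translation, access flag and permission faults. The function name is hypothetical and the value 0x96000007 is an illustrative example, not a value taken from the attached logs.

```python
EXCEPTION_CLASSES = {
    0x24: "Data Abort from a lower Exception level",
    0x25: "Data Abort taken without a change in Exception level",
}

# Upper four bits of the 6-bit data fault status code (DFSC).
FAULT_KINDS = {
    0b0001: "translation fault",
    0b0010: "access flag fault",
    0b0011: "permission fault",
}


def decode_esr(esr):
    """Split an AArch64 ESR value into exception class and fault status."""
    ec = (esr >> 26) & 0x3F   # exception class, bits [31:26]
    il = (esr >> 25) & 0x1    # instruction length bit, 1 means 32-bit
    dfsc = esr & 0x3F         # data fault status code, bits [5:0]
    kind = FAULT_KINDS.get(dfsc >> 2)
    fault = f"{kind}, level {dfsc & 0x3}" if kind else None
    return {
        "ec": ec,
        "ec_name": EXCEPTION_CLASSES.get(ec, "not a data abort (not decoded here)"),
        "il_32bit": bool(il),
        "dfsc": dfsc,
        "fault": fault,
    }


decoded = decode_esr(0x96000007)  # illustrative value only
print(decoded["ec_name"], "/", decoded["fault"])
```

A level 3 translation fault means the page-table walk reached the last level and found no valid entry, i.e. the kernel touched a virtual address that is simply not mapped.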

I’ll be doing some testing from our side and come back with updates if I find something out.

Best regards,

Yes - that’s all correct.

When we say “it works” we are generally referring to the /dev/ttyRPMSG30 interface, although from a cold boot the kernel fails as you see.

I have only seen the kernel boot twice, after probably 100 tests…

So, what is it in our M4 binary which causes the kernel to try to do this?! Or is it that our M4 binary is somehow reserving this address space?!

I could send you the binary and even source code, and you might also talk to your Drew Moseley.

Hey @Timo, @vic,

I did some testing on my side and managed to reproduce what seems to be a similar issue running the MCUXpresso hello world (running it from U-Boot with the pd_ignore_unused configuration). I’m talking to the R&D team and we are currently investigating it.

In the meantime, can you check if your firmware runs correctly when starting it with RemoteProc? I saw that you mentioned you think it’s unavailable in your module. Are you using a Colibri iMX8X? It should be available. I suggest following our guide on How to Use RemoteProc.
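As a rough illustration of what starting firmware through remoteproc involves, the Python sketch below drives the standard Linux remoteproc sysfs files ("firmware" and "state"). It assumes the firmware file is already in the kernel's firmware search path (normally /lib/firmware) and that the Cortex-M4 shows up as remoteproc0. The function name and both defaults are assumptions, so check which instance exists on your module.

```python
from pathlib import Path


def start_remote_firmware(
    firmware_name, instance="remoteproc0", sysfs_root="/sys/class/remoteproc"
):
    """Select a firmware file and start the remote processor via sysfs."""
    node = Path(sysfs_root) / instance
    if not node.is_dir():
        raise FileNotFoundError(f"{node} not found: remoteproc is not available")

    state_file = node / "state"
    # A running core must be stopped before new firmware can be selected.
    if state_file.read_text().strip() == "running":
        state_file.write_text("stop")

    (node / "firmware").write_text(firmware_name)
    state_file.write_text("start")

    # On a real system this reads back "running" once the core has started.
    return state_file.read_text().strip()
```

On the target this would be called as root, for example start_remote_firmware("custom_firmware.elf"), where the file name is a placeholder. A FileNotFoundError here is a quick indication that the remoteproc driver is not enabled in the running device tree or kernel.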

Let me know if this solves the issue. I’ll report back if I have any more updates regarding the U-Boot issue.

Best regards,

Hi @leonardo.costa.tx ,

It’s a bit odd that this issue shows up even with the hello_world example, since that one should normally work out of the box. We went directly to the echo example as we were interested in testing RPMsg (perhaps we were lucky to make it work). Given that, could it be that the root cause isn’t actually the M4 firmware?

Regarding remoteproc, the documentation says that the HMP overlay only enables RPMsg for the Colibri module, so we assumed that remoteproc isn’t supported. The post we found seems to confirm that. We still tried starting the firmware with remoteproc just in case, but no luck.

Thanks for your support!

V