Verdin iMX8MP 4GB v1.1B kernel boot issue with M7 environment variables set

Hi,

We’re having a boot issue with the Verdin iMX8MP v1.1B (00631101) SOMs using the same custom image that we’ve been using without issues on the Verdin iMX8MP v1.1A (00631100) SOMs. Our custom Torizon image version is 6.5.0. To get better information on the issue, I ran some tests using downloaded TorizonCore images rather than our custom image. I ran the images through TorizonCore Builder and added the verdin-imx8mp_hmp_overlay.dts overlay from device-tree-overlays.git. I tested v6.5.0, v6.6.1, and v6.7.0 with v1.1A and v1.1B SOMs. The v1.1B SOMs will not boot with any of these image versions, while the v1.1A SOMs boot with all of them. The 6.5.0 image boot fails at the following point:

[ 1.048320] Unable to handle kernel paging request at virtual address ffff80000a4ccfff
[ 1.056247] Mem abort info:
[ 1.059057] ESR = 0x0000000096000007
[ 1.062809] EC = 0x25: DABT (current EL), IL = 32 bits
[ 1.068131] SET = 0, FnV = 0
[ 1.071185] EA = 0, S1PTW = 0
[ 1.074326] FSC = 0x07: level 3 translation fault
[ 1.079212] Data abort info:
[ 1.082093] ISV = 0, ISS = 0x00000007
[ 1.085928] CM = 0, WnR = 0
[ 1.088919] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000049c2d000
[ 1.095622] [ffff80000a4ccfff] pgd=100000013ffff003, p4d=100000013ffff003, pud=100000013fffe003, pmd=100000010019d003, pte=0000000000000000
[ 1.108175] Internal error: Oops: 96000007 [#1] PREEMPT SMP
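For readers cross-checking the abort information above, the fields can be recomputed from the raw ESR value. The sketch below is not part of the original report. It assumes the standard AArch64 ESR_ELx layout (EC in bits [31:26], IL in bit 25, ISS in bits [24:0], with WnR at ISS[6] and the fault status code in ISS[5:0] for data aborts), and decode_esr is a hypothetical helper name.

```python
# Sketch: split an AArch64 ESR_ELx value into its top-level fields.
# Assumed layout: EC = bits [31:26], IL = bit 25, ISS = bits [24:0].
# For a data abort, WnR is ISS[6] and the fault status code is ISS[5:0].

def decode_esr(esr: int) -> dict:
    """Return the main ESR fields as a dict of integers."""
    iss = esr & 0x1FFFFFF
    return {
        "EC": (esr >> 26) & 0x3F,   # exception class
        "IL": (esr >> 25) & 0x1,    # 1 means a 32-bit instruction
        "ISS": iss,                 # instruction-specific syndrome
        "WnR": (iss >> 6) & 0x1,    # 0 = read access, 1 = write access
        "FSC": iss & 0x3F,          # fault status code
    }


if __name__ == "__main__":
    # ESR value taken from the oops in the log above.
    for name, value in decode_esr(0x96000007).items():
        print(f"{name} = {value:#x}")
```

The decoded values should match the EC, ISS, WnR and FSC lines the kernel printed: a read access that hit a level 3 translation fault.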

We are setting the following U-Boot environment variables:

m4boot=ext4load mmc 2:1 0x48200000 /ostree/deploy/torizon/var/pwmcontroller.bin; cp.b 0x48200000 0x7e0000 20000; dcache flush; bootaux 0x7e0000
bootcmd=run m4boot; run bootcmd_mmc2

The boot logs for all tests are attached below:

Successful 6.5.0 on v1.1A:
Boot-6.5.0-build.8_IMX8mp4g-v1.1A.txt (2.8 KB)

Failed 6.5.0 on v1.1B:
Boot-6.5.0-build.8_IMX8mp4g-v1.1B.txt (6.1 KB)

Failed 6.6.1 on v1.1B:
Boot-6.6.1+build.14_IMX8mp4g-v1.1B.txt (6.1 KB)

Failed 6.7.0 on v1.1B:
Boot-6.7.0-devel-202405+build.23_IMX8mp4g-v1.1B.txt (5.8 KB)

Thank you.

John

Greetings @johnb,

Given your observations here, it would appear there is some difference between the 1.1A and 1.1B hardware revisions that causes your boot to fail.

At the moment I don’t have the specific hardware revisions that you do, so I can’t attempt a reproduction at this time. In the meantime, could you try some tests and answer some questions for me?

First of all, in your bootcmd variable I see you run m4boot and then run bootcmd_mmc2. From your logs I can see that in both the 1.1B and 1.1A cases it failed to load /ostree/deploy/torizon/var/pwmcontroller.bin. Is that expected for you?

Next, could you check whether the same behavior happens with our reference BSP images? Torizon OS is built on top of our BSP, so this would help narrow down the issue. If the issue still happens on our reference BSP, it is inherent to the BSP itself; if not, it must be something specific to Torizon OS.

Best Regards,
Jeremias

Hi Jeremias,

We copy the M7 binary to /var in a later configuration step, so the "failed to load" message is expected. The 1.1A SOM will run the pwmcontroller.bin binary on the M7 without issues.

I repeated the tests using Reference-Minimal-Image_6.5.0+build.9. I first booted the image and modified overlays.txt to include the verdin-imx8mp_hmp_overlay.dtbo overlay. The U-Boot environment variables are set as follows:

m4boot=ext4load mmc 2:2 0x48200000 /var/pwmcontroller.bin; cp.b 0x48200000 0x7e0000 20000; dcache flush; bootaux 0x7e0000
bootcmd=run m4boot; run distro_bootcmd

The two SOM versions behave the same way as in the earlier tests. The v1.1A SOM boots without issues and ran the M7 binary after I copied it over. On the v1.1B SOM I’m seeing the same kernel paging request error, at line 396 of the boot log. Here are both boot logs:

Boot-Reference-Minimal-Image_6.5.0+build9_IMX8MP4G-v1.1A (41.4 KB)
Boot-Reference-Minimal-Image_6.5.0+build9_IMX8MP4G-v1.1B (26.6 KB)

Thank you.

Best regards,
John

Given that you were able to see the same issue with 1.1B on our reference BSP, it would appear the issue is somewhere in our base BSP. That does help narrow things down a bit.

Let me report this to our BSP team and see if something can be discovered. Thank you for reporting this possible issue to us.

Best Regards,
Jeremias

Hello @johnb,
I saw from the logs you provided that it’s remoteproc that’s failing to initialize while trying to allocate the resources:

433 [    1.889786] imx6q-pcie 33800000.pcie: Detected iATU regions: 4 outbound, 4 inbound
434 [    1.897948]  rproc_handle_resources.constprop.0+0xb8/0x1a4
435 [    1.897954]  rproc_boot+0x45c/0x614     

Considering it’s the same binary that’s being loaded in both cases, I can’t explain the different behavior between the two SoM versions.

Could you share the pwmcontroller.bin file with us? It would be even better if you could share the source code as well, since that could help us reproduce the issue on our end. You can share it privately if you prefer.

Best regards,
Rafael

Hi Rafael,

In order to further isolate the issue I re-ran the tests using the hello_world.bin from SDK_2_13_0_MIMX8ML8xxxKZ:

hello_world.tar.xz (163.6 KB)

The difference between v1.1A and v1.1B was the same as before. SOM v1.1A booted hello_world.bin without issues and v1.1B had the same error:

Boot-Ref-Min-Image_6.5.0+build9_IMX8MP4G-v1.1B-hello_world (25.6 KB)

I also noticed that the boot failure depends on m4boot being set. With m4boot set, the boot fails even when only the kernel is being booted. Once m4boot is cleared, it boots fine.

Thank you.

Best regards,
John

Hi @johnb,

Our team is still investigating the issue you brought up here, and we were able to reproduce it easily on 1.1B, as you described. As an interim solution on your 1.1B units, you can try executing the following in U-Boot:

Verdin iMX8MP # mw.w 0x550ff000 0 64

This forces the memory region starting at 0x550ff000 to be set to 0, which puts it in a good state to work with the M7 core.
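To see how much memory this command touches, the following sketch computes the byte range it writes. It assumes U-Boot's usual conventions, namely that numeric arguments are parsed as hexadecimal, that the .w suffix selects 16-bit accesses, and that a missing suffix means 32-bit accesses. mw_span is a hypothetical helper, not a U-Boot or Toradex API.

```python
# Sketch: work out which bytes a U-Boot "mw" command writes.
# Assumptions: arguments are hexadecimal, and the suffix picks the access
# width (.b = 1 byte, .w = 2 bytes, .l = 4 bytes, .q = 8 bytes).

UNIT_BYTES = {"b": 1, "w": 2, "l": 4, "q": 8}


def mw_span(command: str) -> tuple[int, int]:
    """Return (first, last) byte address written by an 'mw' command line."""
    name, addr, _value, *rest = command.split()
    suffix = name.split(".")[1] if "." in name else "l"
    start = int(addr, 16)                      # accepts an optional 0x prefix
    count = int(rest[0], 16) if rest else 1    # count defaults to one unit
    return start, start + count * UNIT_BYTES[suffix] - 1


if __name__ == "__main__":
    first, last = mw_span("mw.w 0x550ff000 0 64")
    print(f"{first:#x}..{last:#x} ({last - first + 1} bytes)")
```

Under those assumptions the count 64 means 0x64 = 100 halfwords, so the command zeroes 200 bytes starting at 0x550ff000.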

Best Regards,
Jeremias

Hi Jeremias,

I ran the test you suggested on a v1.1B with the Verdin-iMX8MP_Reference-Minimal-Image-Tezi_6.5.0+build.9 image and hello_world.bin. After executing mw.w 0x550ff000 0 64 the kernel boots with the M7 normally.

Thank you.

Best regards,
John

Thank you for confirming the interim solution. Our team is still working on a final solution for this. We’ll try and inform you once we have something.

Best Regards,
Jeremias

Hi @johnb,

Just to put a final note on this situation: our team has decided that running mw.w 0x550ff000 0 64 is the best solution for this. We have updated our documentation to reflect this: How to Load Compiled Binaries into Cortex-M | Toradex Developer Center

You should be fine to set this on both 1.1A and 1.1B. On 1.1B it will fix the issue, while on 1.1A it shouldn’t have any adverse effects.

Best Regards,
Jeremias