Zephyr Remoteproc (verdin-imx8mp on verdin-dev)

I am trying to use Linux remoteproc with a Verdin iMX8MP on a Verdin Development Board. I am building a Yocto Project (YP) image with the following layers/versions:

meta-toradex-nxp     = "kirkstone-6.x.y:364341c23ea49fb875e84f980b208699311aaf9c"
meta-freescale       = "kirkstone:1c7f17f6063d0b747d94a17059b176f3ebdb3e3e"
meta-freescale-3rdparty = "kirkstone:e1ec96f0b1d89adab9ec1f224e7d6dcb0ef565c0"
meta-toradex-bsp-common = "kirkstone-6.x.y:6ee1533ec892e81a3f5937315ab132aa46fac3c7"
meta-oe              
meta-filesystems     
meta-gnome           
meta-xfce            
meta-networking      
meta-multimedia      
meta-python          
meta-initramfs       = "kirkstone:a9c25bef8882a69fd35ea7dbac3d978ce440fe06"
meta-freescale-distro = "kirkstone:d5bbb487b2816dfc74984a78b67f7361ce404253"
meta-qt6             = "dev:cfbcd5f4f1121f1d222ff2bdd7161c06b7809a2d"
meta-toradex-distro  = "kirkstone-6.x.y:2712e46aadd2ec74711d69503b61dba3548e0aa4"
meta-poky            
meta                 = "kirkstone:d3e378397395542179a8497fbff4b610f6f7e802"

I am using this device tree include to enable remoteproc/rpmsg:

#include <dt-bindings/clock/imx8mp-clock.h>
#include <dt-bindings/gpio/gpio.h>

/ {
	aliases {
		i2c0 = &i2c1;
		i2c1 = &i2c2;
		i2c2 = &i2c_rpbus_3;
	};

	reserved-memory {
		#address-cells = <2>;
		#size-cells = <2>;
		ranges;

		/delete-node/ linux,cma;

		m4_reserved: m4@0x80000000 {
			no-map;
			reg = <0 0x80000000 0 0x1000000>;
		};

		m7_itcm: m4@0x7E0000 {
			no-map;
			reg = <0 0x7E0000 0 0x20000>;
		};

		m7_dtcm: m4@0x800000 {
			no-map;
			reg = <0 0x800000 0 0x20000>;
		};

		vdev0vring0: vdev0vring0@55000000 {
			reg = <0 0x55000000 0 0x8000>;
			no-map;
		};

		vdev0vring1: vdev0vring1@55008000 {
			reg = <0 0x55008000 0 0x8000>;
			no-map;
		};

		vdevbuffer: vdevbuffer@55400000 {
			compatible = "shared-dma-pool";
			reg = <0 0x55400000 0 0x100000>;
			no-map;
		};

		rsc_table: rsc_table@550ff000 {
			reg = <0 0x550ff000 0 0x1000>;
			no-map;
		};
	};

	imx8mp-cm7 {
		compatible = "fsl,imx8mn-cm7";
		rsc-da = <0x55000000>;
		clocks = <&clk IMX8MP_CLK_M7_DIV>;
		mbox-names = "tx", "rx", "rxdb";
		mboxes = <&mu 0 1
			      &mu 1 1
			      &mu 3 1>;
		memory-region = <&vdevbuffer>, <&vdev0vring0>, <&vdev0vring1>, <&rsc_table>, <&m4_reserved>, <&m7_itcm>, <&m7_dtcm>;
		status = "okay";
	};
};

/*
 * ATTENTION: M7 may use IPs like below
 * ECSPI0/ECSPI2, FLEXCAN, GPIO1/GPIO5, GPT1, I2C3, I2S3, UART4,
 * PWM4, SDMA1/SDMA2
 */
&ecspi2 {
	status = "disabled";
};

&flexcan1 {
	status = "disabled";
};

&flexspi {
	status = "disabled";
};

/delete-node/ &i2c3;

&pwm4{
	status = "disabled";
};

&sai3 {
	status = "disabled";
};

&sdma3{
	status = "disabled";
};

&uart4 {
	status = "disabled";
};
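Since everything below depends on the reserved-memory layout, a quick sanity check is to confirm that none of the carve-outs overlap. This is a minimal userspace sketch using the base/size pairs from the include above; struct region, regions_overlap and count_overlaps are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One reserved-memory carve-out: base address and size in bytes. */
struct region {
	const char *name;
	uint64_t base;
	uint64_t size;
};

/* Values copied from the reserved-memory node above. */
static const struct region reserved[] = {
	{ "m4_reserved", 0x80000000ULL, 0x1000000ULL },
	{ "m7_itcm",     0x007E0000ULL, 0x20000ULL },
	{ "m7_dtcm",     0x00800000ULL, 0x20000ULL },
	{ "vdev0vring0", 0x55000000ULL, 0x8000ULL },
	{ "vdev0vring1", 0x55008000ULL, 0x8000ULL },
	{ "vdevbuffer",  0x55400000ULL, 0x100000ULL },
	{ "rsc_table",   0x550ff000ULL, 0x1000ULL },
};

#define NUM_RESERVED (sizeof(reserved) / sizeof(reserved[0]))

/* True when the half-open ranges [base, base + size) intersect. */
static bool regions_overlap(const struct region *a, const struct region *b)
{
	return a->base < b->base + b->size && b->base < a->base + a->size;
}

/* Count overlapping pairs in a table; 0 means the layout is disjoint. */
static size_t count_overlaps(const struct region *r, size_t n)
{
	size_t hits = 0;

	assert(r != NULL);
	for (size_t i = 0; i < n; i++)
		for (size_t j = i + 1; j < n; j++)
			if (regions_overlap(&r[i], &r[j]))
				hits++;
	return hits;
}
```

Calling count_overlaps(reserved, NUM_RESERVED) on this table reports no overlapping pairs; the ITCM and DTCM windows are adjacent but do not intersect.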

I am building a Zephyr image with

west build -p always -b mimx8mp_evk_ddr zephyr/samples/philosophers

for the Philosophers test application. I prefer that application as it continuously outputs the philosopher status to the serial console of the M core. I also build the application for itcm to test both.

Starting either application for ddr or itcm from the u-boot prompt works fine. Starting Linux after shows the running M core attached to remoteproc:

# cat /sys/class/remoteproc/remoteproc0/state 
attached

Stopping the M core also works fine by

# echo stop >/sys/class/remoteproc/remoteproc0/state
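For scripted testing, the same sysfs writes can be done from a small C helper. This is only a sketch: write_attr and read_attr are hypothetical helper names, and the path is passed in by the caller, so the state file shown above (or the firmware attribute in the same remoteproc0 directory) is an assumption about the target system rather than something the code verifies.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Write a string to a sysfs-style attribute file.
 * Returns 0 on success, -1 on failure.
 */
static int write_attr(const char *path, const char *value)
{
	FILE *f;
	int ok;

	assert(path != NULL && value != NULL);
	f = fopen(path, "w");
	if (!f)
		return -1;
	ok = fputs(value, f) >= 0;
	/* fclose() flushes the buffer, so a failed write surfaces here. */
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}

/* Read the first line of an attribute into buf, stripping the newline. */
static int read_attr(const char *path, char *buf, size_t len)
{
	FILE *f;

	assert(path != NULL && buf != NULL && len > 0);
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (!fgets(buf, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}
```

For example, write_attr("/sys/class/remoteproc/remoteproc0/state", "stop") mirrors the echo command above, and read_attr on the same path returns the state string that cat shows.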

However, starting the M core again either with the itcm or ddr elf causes a page fault in Linux:

For ddr:

# echo start >/sys/class/remoteproc/remoteproc0/state
[  120.544522] remoteproc remoteproc0: powering up imx-rproc
[  120.550578] remoteproc remoteproc0: Firmware is an elf32 file
[  120.550596] remoteproc remoteproc0: Booting fw image rproc-imx-rproc-fw, size 676512
[  120.558483] imx-rproc imx8mp-cm7: iommu not present
[  120.558516] remoteproc remoteproc0: No resource table in elf
[  120.564240] imx-rproc imx8mp-cm7: map memory: 0000000091bca557+100000
[  120.564283] imx-rproc imx8mp-cm7: map memory: 00000000d2321245+8000
[  120.564295] imx-rproc imx8mp-cm7: map memory: 00000000c7b96d40+8000
[  120.564305] imx-rproc imx8mp-cm7: map memory: 00000000c8f03ea6+1000000
[  120.564316] imx-rproc imx8mp-cm7: map memory: 00000000bc20cfac+20000
[  120.564327] imx-rproc imx8mp-cm7: map memory: 00000000ea51ea99+20000
[  120.564339] remoteproc remoteproc0: phdr: type 1 da 0x80000000 memsz 0x6e9c filesz 0x6e9c
[  120.564349] remoteproc remoteproc0: da = 0x80000000 len = 0x6e9c va = 0x00000000b21df214
[  120.564382] Unable to handle kernel paging request at virtual address ffff80000c006e5c
[  120.572359] Mem abort info:
[  120.575171]   ESR = 0x96000061
[  120.578363]   EC = 0x25: DABT (current EL), IL = 32 bits
[  120.583692]   SET = 0, FnV = 0
[  120.586759]   EA = 0, S1PTW = 0
[  120.589926]   FSC = 0x21: alignment fault
[  120.593949] Data abort info:
[  120.596841]   ISV = 0, ISS = 0x00000061
[  120.600703]   CM = 0, WnR = 1
[  120.603671] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000049888000
[  120.610384] [ffff80000c006e5c] pgd=100000013ffff003, p4d=100000013ffff003, pud=100000013fffe003, pmd=0068000080000711
[  120.621023] Internal error: Oops: 96000061 [#1] PREEMPT SMP
[  120.626602] Modules linked in: cfg80211 fsl_jr_uio caam_jr caamkeyblob_desc caamhash_desc caamalg_desc crypto_engine rng_core authenc libdes bluetooth rfkill hid_multitouch crct10dif_ce snd_soc_imx_hdmi sec_mipi_dsim_imx sec_dsim dw_hdmi_cec snd_soc_fsl_sai snd_soc_nau8822 snd_e
[  120.626672] CPU: 3 PID: 874 Comm: sh Tainted: G           O      5.15.77-6.1.0-devel+git.349786b46e61 #1
[  120.626679] Hardware name: Toradex Verdin iMX8M Plus on Verdin Development Board (DT)
[  120.626682] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[  120.626689] pc : __memcpy+0x188/0x250
[  120.626699] lr : rproc_elf_load_segments+0x184/0x260
[  120.626707] sp : ffff80000b8bbb20
[  120.626709] x29: ffff80000b8bbb20 x28: 0000000000006e9c x27: 0000000000006e9c
[  120.626717] x26: 0000000080000000 x25: 00000000000000d4 x24: 0000000000000001
[  120.626725] x23: ffff0000c11e7000 x22: 0000000000000020 x21: 0000000000000005
[  120.626732] x20: 0000000000000001 x19: ffff80000bad5054 x18: ffffffffffffffff
[  120.626740] x17: 303030307830203d x16: 2061762063396536 x15: 72796870655a2067
[  120.626747] x14: 6e69746f6f42202a x13: 623738663038672d x12: 303635332d302e32
[  120.626755] x11: 000a2a2a2a206236 x10: 3038343962373866 x9 : 3038672d30363533
[  120.626763] x8 : 2d302e322e33762d x7 : 72796870657a2064 x6 : 6c69756220534f20
[  120.626770] x5 : ffff80000c006e9c x4 : ffff80000badbf70 x3 : ffff80000c006e40
[  120.626778] x2 : ffffffffffffffcc x1 : ffff80000badbf54 x0 : ffff80000c000000
[  120.626786] Call trace:
[  120.626789]  __memcpy+0x188/0x250
[  120.626794]  imx_rproc_elf_load_segments+0x20/0x40
[  120.779785]  rproc_start+0x30/0x168
[  120.779795]  rproc_boot+0x344/0x5f0
[  120.779800]  state_store+0x44/0x104
[  120.779805]  dev_attr_store+0x18/0x30
[  120.779812]  sysfs_kf_write+0x44/0x54
[  120.779819]  kernfs_fop_write_iter+0x118/0x1ac
[  120.779825]  new_sync_write+0xe8/0x184
[  120.779831]  vfs_write+0x22c/0x290
[  120.779835]  ksys_write+0x68/0xf4
[  120.779839]  __arm64_sys_write+0x1c/0x2c
[  120.779844]  invoke_syscall+0x48/0x114
[  120.779851]  el0_svc_common.constprop.0+0xd4/0xfc
[  120.779856]  do_el0_svc+0x28/0x90
[  120.779861]  el0_svc+0x28/0x80
[  120.779867]  el0t_64_sync_handler+0xa4/0x130
[  120.779872]  el0t_64_sync+0x1a0/0x1a4
[  120.779883] Code: a97e2488 a9032c6a a97f2c8a a904346c (a93c3cae) 
[  120.779888] ---[ end trace 6d13cbdfead451d0 ]---

For itcm:

# echo start >/sys/class/remoteproc/remoteproc0/state 
[   86.857238] remoteproc remoteproc0: powering up imx-rproc
[   86.881319] remoteproc remoteproc0: Firmware is an elf32 file
[   86.881339] remoteproc remoteproc0: Booting fw image rproc-imx-rproc-fw, size 676496
[   86.889187] imx-rproc imx8mp-cm7: iommu not present
[   86.889223] remoteproc remoteproc0: No resource table in elf
[   86.894994] imx-rproc imx8mp-cm7: map memory: 0000000048d67ebf+100000
[   86.895044] imx-rproc imx8mp-cm7: map memory: 00000000b777ff91+8000
[   86.895065] imx-rproc imx8mp-cm7: map memory: 00000000a047c2fb+8000
[   86.895077] imx-rproc imx8mp-cm7: map memory: 000000001459a6a0+1000000
[   86.895201] imx-rproc imx8mp-cm7: map memory: 00000000b2e789f0+20000
[   86.895212] imx-rproc imx8mp-cm7: map memory: 00000000637ae398+20000
[   86.895222] remoteproc remoteproc0: phdr: type 1 da 0x0 memsz 0x6e84 filesz 0x6e84
[   86.895230] remoteproc remoteproc0: da = 0x0 len = 0x6e84 va = 0x00000000db7dc377
[   86.895298] remoteproc remoteproc0: phdr: type 1 da 0x6e84 memsz 0x12 filesz 0x12
[   86.895304] remoteproc remoteproc0: da = 0x6e84 len = 0x12 va = 0x00000000bdcd6411
[   86.895309] remoteproc remoteproc0: phdr: type 1 da 0x6e96 memsz 0x4 filesz 0x4
[   86.895315] remoteproc remoteproc0: da = 0x6e96 len = 0x4 va = 0x000000007e8d31ef
[   86.895321] remoteproc remoteproc0: phdr: type 1 da 0x20000018 memsz 0x43c0 filesz 0x0
[   86.895327] remoteproc remoteproc0: da = 0x20000018 len = 0x43c0 va = 0x0000000037d3ce0e
[   86.895380] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
[   86.904206] Mem abort info:
[   86.907027]   ESR = 0x96000004
[   86.910101]   EC = 0x25: DABT (current EL), IL = 32 bits
[   86.915449]   SET = 0, FnV = 0
[   86.918519]   EA = 0, S1PTW = 0
[   86.921672]   FSC = 0x04: level 0 translation fault
[   86.926575] Data abort info:
[   86.929469]   ISV = 0, ISS = 0x00000004
[   86.933318]   CM = 0, WnR = 0
[   86.936287] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000106616000
[   86.942751] [0000000000000000] pgd=0000000000000000, p4d=0000000000000000
[   86.949560] Internal error: Oops: 96000004 [#1] PREEMPT SMP
[   86.955136] Modules linked in: cfg80211 fsl_jr_uio caam_jr caamkeyblob_desc caamhash_desc caamalg_desc crypto_engine rng_core authenc libdes bluetooth rfkill hid_multitouch crct10dif_ce snd_soc_imx_hdmi snd_soc_nau8822 dw_hdmi_cec lm75 ina2xx lontium_lt8912b flexcan snd_soc_fsle
[   86.955205] CPU: 3 PID: 880 Comm: sh Tainted: G           O      5.15.77-6.1.0-devel+git.349786b46e61 #1
[   86.955211] Hardware name: Toradex Verdin iMX8M Plus on Verdin Development Board (DT)
[   87.007503] pstate: a0000005 (NzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[   87.007509] pc : __memcpy+0x110/0x250
[   87.007519] lr : rproc_start+0x84/0x168
[   87.007526] sp : ffff80000db5bbd0
[   87.007528] x29: ffff80000db5bbd0 x28: ffff0000c175b800 x27: 0000000000000000
[   87.007536] x26: 0000000000000000 x25: ffff0000c13d8c80 x24: ffff0000c1783f80
[   87.007544] x23: ffff0000c1048340 x22: ffff0000c1783f80 x21: ffff0000c1048038
[   87.007551] x20: ffff800009e1d000 x19: ffff0000c1048000 x18: ffffffffffffffff
[   87.007559] x17: 303030307830203d x16: 2061762030633334 x15: 7830203d206e656c
[   87.007567] x14: ffff800009e1d000 x13: ffff8000099e2500 x12: 0000000000000666
[   87.007575] x11: 0000000000000222 x10: ffff8000099e2500 x9 : ffff8000099e2500
[   87.007582] x8 : 00000000ffffefff x7 : ffff800009a3a500 x6 : ffff800009a3a500
[   87.007590] x5 : ffff800009e1d400 x4 : 0000000000000400 x3 : 0000000000000000
[   87.007597] x2 : 0000000000000400 x1 : 0000000000000000 x0 : ffff800009e1d000
[   87.007605] Call trace:
[   87.007608]  __memcpy+0x110/0x250
[   87.007613]  rproc_boot+0x344/0x5f0
[   87.007619]  state_store+0x44/0x104
[   87.007624]  dev_attr_store+0x18/0x30
[   87.007631]  sysfs_kf_write+0x44/0x54
[   87.007639]  kernfs_fop_write_iter+0x118/0x1ac
[   87.007645]  new_sync_write+0xe8/0x184
[   87.007651]  vfs_write+0x22c/0x290
[   87.007655]  ksys_write+0x68/0xf4
[   87.007660]  __arm64_sys_write+0x1c/0x2c
[   87.007664]  invoke_syscall+0x48/0x114
[   87.007671]  el0_svc_common.constprop.0+0xd4/0xfc
[   87.007676]  do_el0_svc+0x28/0x90
[   87.007681]  el0_svc+0x28/0x80
[   87.007687]  el0t_64_sync_handler+0xa4/0x130
[   87.007693]  el0t_64_sync+0x1a0/0x1a4
[   87.007701] Code: cb01000e b4fffc2e eb0201df 540004a3 (a940342c) 
[   87.007706] ---[ end trace d0ace529e7c99988 ]---
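The two oopses are not the same kind of fault, and the ESR values make that visible. The sketch below splits an ESR value into the fields the kernel prints (EC, IL, ISS, FSC, WnR), using the standard ESR_ELx bit positions for a data abort; struct esr_fields and decode_esr are hypothetical names. For the ddr case (0x96000061) it yields an alignment fault on a write; for the itcm case (0x96000004) it yields a translation fault on a read. Together with x1 = 0 and lr pointing into rproc_start, the second one looks like a copy from a NULL source, which may be related to the "No resource table in elf" message, though that is an inference from the log and not something confirmed here.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of ESR_ELx that the oops printout above already names. */
struct esr_fields {
	unsigned ec;  /* exception class, bits [31:26] */
	unsigned il;  /* instruction length bit, bit 25 */
	unsigned iss; /* instruction specific syndrome, bits [24:0] */
	unsigned fsc; /* fault status code, ISS bits [5:0] */
	unsigned wnr; /* write-not-read, ISS bit 6 */
};

/* Split a 32-bit ESR value into the fields used for data aborts. */
static struct esr_fields decode_esr(uint32_t esr)
{
	struct esr_fields f;

	f.ec  = (esr >> 26) & 0x3f;
	f.il  = (esr >> 25) & 0x1;
	f.iss = esr & 0x1ffffff;
	f.fsc = f.iss & 0x3f;
	f.wnr = (f.iss >> 6) & 0x1;
	return f;
}
```

An alignment fault during memcpy is typical of unaligned accesses to memory mapped with Device attributes on arm64, so how the DDR region gets mapped by the driver would be worth checking.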

I have read and studied pretty much all the information I was able to find on the web.

It is entirely possible that the elf files are not created correctly by the Zephyr build process.

This is the readelf -S dump for the ddr image:

# readelf -S zephyr.philosophers.mimx8mp_evk_ddr.elf
There are 30 section headers, starting at offset 0xa4df0:

Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] rom_start         PROGBITS        80000000 0000d4 0002bc 00 WAX  0   0  4
  [ 2] text              PROGBITS        800002bc 000390 0051fc 00  AX  0   0  4
  [ 3] .ARM.exidx        ARM_EXIDX       800054b8 00558c 000008 00  AL  2   0  4
  [ 4] initlevel         PROGBITS        800054c0 005594 000038 00   A  0   0  4
  [ 5] devices           PROGBITS        800054f8 0055cc 000048 00   A  0   0  4
  [ 6] sw_isr_table      PROGBITS        80005540 005614 0004f8 00  WA  0   0  4
  [ 7] device_handles    PROGBITS        80005a38 005b0c 000016 00   A  0   0  2
  [ 8] log_const_se[...] PROGBITS        80005a50 005b24 000020 00   A  0   0  4
  [ 9] zephyr_dbg_info   PROGBITS        80005a70 005b44 000040 00  WA  0   0  4
  [10] rodata            PROGBITS        80005ab0 005b84 0013ec 00   A  0   0  4
  [11] .ramfunc          PROGBITS        80200000 006f86 000000 00   W  0   0  1
  [12] datas             PROGBITS        80200000 006f70 00000c 00  WA  0   0  4
  [13] device_states     PROGBITS        8020000c 006f7c 000006 00  WA  0   0  1
  [14] bss               NOBITS          80200018 006f88 00067a 00  WA  0   0  8
  [15] noinit            NOBITS          80200698 006f88 003d40 00  WA  0   0  8
  [16] .comment          PROGBITS        00000000 006f86 000020 01  MS  0   0  1
  [17] .debug_aranges    PROGBITS        00000000 006fa8 000fd8 00      0   0  8
  [18] .debug_info       PROGBITS        00000000 007f80 04a890 00      0   0  1
  [19] .debug_abbrev     PROGBITS        00000000 052810 008b90 00      0   0  1
  [20] .debug_line       PROGBITS        00000000 05b3a0 01792f 00      0   0  1
  [21] .debug_frame      PROGBITS        00000000 072cd0 002464 00      0   0  4
  [22] .debug_str        PROGBITS        00000000 075134 00be6b 01  MS  0   0  1
  [23] .debug_loc        PROGBITS        00000000 080f9f 016fa6 00      0   0  1
  [24] .debug_ranges     PROGBITS        00000000 097f48 003c08 00      0   0  8
  [25] .ARM.attributes   ARM_ATTRIBUTES  00000000 09bb50 00002e 00      0   0  1
  [26] .last_section     PROGBITS        80006eae 006f82 000004 00   A  0   0  1
  [27] .symtab           SYMTAB          00000000 09bb80 004f90 10     28 659  4
  [28] .strtab           STRTAB          00000000 0a0b10 00419d 00      0   0  1
  [29] .shstrtab         STRTAB          00000000 0a4cad 000142 00      0   0  1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
  L (link order), O (extra OS processing required), G (group), T (TLS),
  C (compressed), x (unknown), o (OS specific), E (exclude),
  D (mbind), y (purecode), p (processor specific)

The start address is 0x8000.0000 which seems correct to me. However, the DDR maps into the A53 address space and into the M7 address space at the same start address albeit with different address space sizes.

Where I am lost is at the address mapping:

[  120.564349] remoteproc remoteproc0: da = 0x80000000 len = 0x6e9c va = 0x00000000b21df214
[  120.564382] Unable to handle kernel paging request at virtual address ffff80000c006e5c

If the virtual address after mapping is va = 0x00000000b21df214 why would the paging request occur at ffff80000c006e5c?
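A possible reading of these two lines, assuming x0 in the register dump still holds memcpy's destination pointer: the va printed by remoteproc goes through %p, which the kernel hashes by default, so it is not expected to match the real mapping. The fault address, by contrast, is a real kernel virtual address, and it sits 0x6e5c bytes into a destination starting at x0 = ffff80000c000000, i.e. inside the 0x6e9c-byte segment being copied. fault_offset below is a hypothetical helper for that arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Offset of a faulting address inside a copy destination, or -1 when the
 * address lies outside [base, base + len).
 */
static int64_t fault_offset(uint64_t base, uint64_t len, uint64_t fault)
{
	assert(len <= INT64_MAX);
	if (fault < base || fault - base >= len)
		return -1;
	return (int64_t)(fault - base);
}
```

With base = 0xffff80000c000000, len = 0x6e9c and fault = 0xffff80000c006e5c the helper returns 0x6e5c, so the fault is inside the copy target and not at some unrelated address.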

And this one is for the itcm elf:

# readelf -S zephyr.philosophers.mimx8mp_evk_itcm.elf
There are 30 section headers, starting at offset 0xa4de0:

Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] rom_start         PROGBITS        00000000 0000d4 0002bc 00 WAX  0   0  4
  [ 2] text              PROGBITS        000002bc 000390 0051fc 00  AX  0   0  4
  [ 3] .ARM.exidx        ARM_EXIDX       000054b8 00558c 000008 00  AL  2   0  4
  [ 4] initlevel         PROGBITS        000054c0 005594 000038 00   A  0   0  4
  [ 5] devices           PROGBITS        000054f8 0055cc 000048 00   A  0   0  4
  [ 6] sw_isr_table      PROGBITS        00005540 005614 0004f8 00  WA  0   0  4
  [ 7] device_handles    PROGBITS        00005a38 005b0c 000016 00   A  0   0  2
  [ 8] log_const_se[...] PROGBITS        00005a50 005b24 000020 00   A  0   0  4
  [ 9] zephyr_dbg_info   PROGBITS        00005a70 005b44 000040 00  WA  0   0  4
  [10] rodata            PROGBITS        00005ab0 005b84 0013d4 00   A  0   0  4
  [11] .ramfunc          PROGBITS        20000000 006f6e 000000 00   W  0   0  1
  [12] datas             PROGBITS        20000000 006f58 00000c 00  WA  0   0  4
  [13] device_states     PROGBITS        2000000c 006f64 000006 00  WA  0   0  1
  [14] bss               NOBITS          20000018 006f70 00067a 00  WA  0   0  8
  [15] noinit            NOBITS          20000698 006f70 003d40 00  WA  0   0  8
  [16] .comment          PROGBITS        00000000 006f6e 000020 01  MS  0   0  1
  [17] .debug_aranges    PROGBITS        00000000 006f90 000fd8 00      0   0  8
  [18] .debug_info       PROGBITS        00000000 007f68 04a890 00      0   0  1
  [19] .debug_abbrev     PROGBITS        00000000 0527f8 008b90 00      0   0  1
  [20] .debug_line       PROGBITS        00000000 05b388 01792f 00      0   0  1
  [21] .debug_frame      PROGBITS        00000000 072cb8 002464 00      0   0  4
  [22] .debug_str        PROGBITS        00000000 07511c 00be6b 01  MS  0   0  1
  [23] .debug_loc        PROGBITS        00000000 080f87 016fa6 00      0   0  1
  [24] .debug_ranges     PROGBITS        00000000 097f30 003c08 00      0   0  8
  [25] .ARM.attributes   ARM_ATTRIBUTES  00000000 09bb38 00002e 00      0   0  1
  [26] .last_section     PROGBITS        00006e96 006f6a 000004 00   A  0   0  1
  [27] .symtab           SYMTAB          00000000 09bb68 004f90 10     28 659  4
  [28] .strtab           STRTAB          00000000 0a0af8 0041a4 00      0   0  1
  [29] .shstrtab         STRTAB          00000000 0a4c9c 000142 00      0   0  1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
  L (link order), O (extra OS processing required), G (group), T (TLS),
  C (compressed), x (unknown), o (OS specific), E (exclude),
  D (mbind), y (purecode), p (processor specific)

Here the start address is 0x0000.0000 which is correct for the ITCM on the M7 but on the A53 it should be 0x007e.0000.

Since @marcel.tx was apparently able to make this work for the 8M Plus according to the presentation, I would appreciate some suggestions.

Hi @RudolfStreif,

I’m sorry for the delay. I reproduced the issue and will let you know once I have more information.

Thanks for your patience.

Best Regards,
Hiago.

Thank you, @hfranco.tx. Much appreciated. I looked at the memory allocation code in imx_rproc.c to see how this works. Unfortunately, I have not had sufficient time to fully go through the various mapping steps. It feels a little more complicated than it ought to be, given that a physical address in the M7 space simply needs to be mapped to one in the A53 space. However, it’s entirely possible that I am missing some intricacies.

I also posted a similar message to this one on the NXP community board. If NXP puts the auxiliary core on the chip they should also provide software means to use it. :slight_smile: However, since the kernel is a Toradex fork of the NXP kernel, there might be some code that has not been ported yet. I haven’t done that analysis yet, but you probably know more about the Toradex kernel tree ancestry.

Thank you for your support. Much appreciated.
Rudi

Hi @RudolfStreif,

Sorry for keeping you waiting. I want to give you an update that I’ve created a ticket internally to investigate this issue better and I will keep you updated on the status.

imx_rproc.c basically reads the ELF headers and tries to convert the memory addresses between the two processors. For example, according to the NXP reference manual, the TCM is accessible from the Cortex-A side at address 0x007E_0000, while the same memory is accessible at address 0x0000_0000 from the Cortex-M7 side. In the driver, we can see this conversion:

static const struct imx_rproc_att imx_rproc_att_imx8mn[] = {
	/* dev addr , sys addr  , size	    , flags */
	/* ITCM   */
	{ 0x00000000, 0x007E0000, 0x00020000, ATT_OWN | ATT_IOMEM },
	/* OCRAM_S */
	{ 0x00180000, 0x00180000, 0x00009000, 0 },
	/* OCRAM */
	{ 0x00900000, 0x00900000, 0x00020000, 0 },
	/* OCRAM */
	{ 0x00920000, 0x00920000, 0x00020000, 0 },
	/* OCRAM */
	{ 0x00940000, 0x00940000, 0x00050000, 0 },
	/* QSPI Code - alias */
	{ 0x08000000, 0x08000000, 0x08000000, 0 },
	/* DDR (Code) - alias */
	{ 0x10000000, 0x40000000, 0x0FFE0000, 0 },
	/* DTCM */
	{ 0x20000000, 0x00800000, 0x00020000, ATT_OWN | ATT_IOMEM },
	/* OCRAM_S - alias */
	{ 0x20180000, 0x00180000, 0x00008000, ATT_OWN },
	/* OCRAM */
	{ 0x20200000, 0x00900000, 0x00020000, ATT_OWN },
	/* OCRAM */
	{ 0x20220000, 0x00920000, 0x00020000, ATT_OWN },
	/* OCRAM */
	{ 0x20240000, 0x00940000, 0x00040000, ATT_OWN },
	/* DDR (Data) */
	{ 0x40000000, 0x40000000, 0x80000000, 0 },
};
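To make the lookup concrete, here is a minimal userspace sketch of a device-address to system-address translation over a subset of the table above. It is not the driver's code: struct att_entry and da_to_sys are hypothetical names, the flags column is dropped, and the bounds check is a plain inclusive range test.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct att_entry {
	uint32_t da;   /* address as seen by the Cortex-M7 */
	uint32_t sa;   /* address as seen by the Cortex-A53 */
	uint32_t size; /* length of the window in bytes */
};

/* Subset of the table quoted above; flags omitted for brevity. */
static const struct att_entry att[] = {
	{ 0x00000000, 0x007E0000, 0x00020000 }, /* ITCM */
	{ 0x20000000, 0x00800000, 0x00020000 }, /* DTCM */
	{ 0x40000000, 0x40000000, 0x80000000 }, /* DDR (Data) */
};

/*
 * Translate an M7 address range into the A53 view.
 * Returns 0 and stores the result in *sys, or -1 if no window covers
 * the whole range [da, da + len).
 */
static int da_to_sys(uint64_t da, uint64_t len, uint64_t *sys)
{
	assert(sys != NULL);
	for (size_t i = 0; i < sizeof(att) / sizeof(att[0]); i++) {
		uint64_t start = att[i].da;
		uint64_t end = start + att[i].size;

		if (da >= start && da + len <= end) {
			*sys = att[i].sa + (da - start);
			return 0;
		}
	}
	return -1;
}
```

Fed with the segments from the logs, da 0x0 maps to 0x007E0000, da 0x20000018 maps to 0x00800018, and da 0x80000000 maps to itself through the DDR (Data) window.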

So I’m investigating the ELF that has been generated by the Zephyr SDK, and I also suspect there is a problem with the Resource Domain Controller. This is the part responsible for allocating the peripherals to the different processors, so it’s possible that the Cortex-M7 is trying to access something that is under the Cortex-A domain and this is creating problems (kernel panics, freezing…).

Can you share with me the link for the NXP ticket? We are using a fork from 5.15 on BSP 6, so it should be updated with their drivers.

Thanks for your patience.

Best Regards,
Hiago.

Hello @hfranco.tx,

Thank you for your response.

So I’m investigating the ELF that has been generated by the Zephyr SDK, and I also suspect there is a problem with the Resource Domain Controller. This is the part responsible for allocating the peripherals to the different processors, so it’s possible that the Cortex-M7 is trying to access something that is under the Cortex-A domain and this is creating problems (kernel panics, freezing…).

Thank you. As I understand the output, the fault happens when the Linux kernel tries to write the Zephyr segments to the shared memory. If the Zephyr binary were accessing addresses under the Cortex-A domain, I would expect the crash to happen when I load the Zephyr binary and start it from u-boot. When I then start the Linux kernel, I would expect either the Linux kernel or the Zephyr program on the M7 to crash. But that is not the case.

The address mapping table imx_rproc_att_imx8mn (which is the same for the mp SoC) looks correct when compared to the data sheet. It seems to be a straightforward address map. However, it does not seem to be that simple when looking at static int imx_rproc_prepare(struct rproc *rproc) in imx_rproc.c, which eventually calls

mem = rproc_mem_entry_init(priv->dev, NULL, (dma_addr_t)rmem->base, rmem->size, da,
                                           imx_rproc_mem_alloc, imx_rproc_mem_release,
                                           it.node->name);

This is a function from the rproc framework which uses the SoC-specific callbacks imx_rproc_mem_alloc and imx_rproc_mem_release from imx_rproc.c.

When I start the M7 from u-boot and then stop it from Linux (with debug enabled), dmesg shows:

[   28.996986] remoteproc remoteproc0: stopped remote processor imx-rproc
[   29.003548] imx-rproc imx8mp-cm7: unmap memory: 0x0000000055400000
[   29.003573] imx-rproc imx8mp-cm7: unmap memory: 0x0000000055000000
[   29.003583] imx-rproc imx8mp-cm7: unmap memory: 0x0000000055008000
[   29.003592] imx-rproc imx8mp-cm7: unmap memory: 0x0000000080000000
[   29.003601] imx-rproc imx8mp-cm7: unmap memory: 0x00000000007e0000
[   29.003611] imx-rproc imx8mp-cm7: unmap memory: 0x0000000000800000

This output is from imx_rproc_mem_release and it shows the correct addresses. This is the code that prints the debug message:

dev_dbg(rproc->dev.parent, "unmap memory: %pa\n", &mem->dma);

The address is &mem->dma. When I start the M7 from Linux (with debug enabled), dmesg shows:

[  386.339797] remoteproc remoteproc0: powering up imx-rproc
[  386.364651] remoteproc remoteproc0: Firmware is an elf32 file
[  386.364670] remoteproc remoteproc0: Booting fw image zephyr.philosophers.mimx8mp_evk_ddr.elf, size 676512
[  386.374335] imx-rproc imx8mp-cm7: iommu not present
[  386.374367] remoteproc remoteproc0: No resource table in elf
[  386.380090] imx-rproc imx8mp-cm7: map memory: 00000000d1f628b8+100000
[  386.380128] imx-rproc imx8mp-cm7: map memory: 000000008c193bbd+8000
[  386.380142] imx-rproc imx8mp-cm7: map memory: 0000000092e31f02+8000
[  386.380152] imx-rproc imx8mp-cm7: map memory: 0000000058904677+1000000
[  386.380163] imx-rproc imx8mp-cm7: map memory: 000000005350f1d8+20000
[  386.380174] imx-rproc imx8mp-cm7: map memory: 000000000514bd46+20000

The map memory entries are printed by this code in imx_rproc_mem_alloc:

dev_dbg(dev, "map memory: %p+%zx\n", &mem->dma, mem->len);

The address pointers for the regions &mem->dma do not match the addresses from the memory regions. The sizes are correct.

When the remoteproc framework detects an already running M7 core (started from u-boot) it maps the addresses correctly. However, when it is starting the core the mapping goes wrong.
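One thing worth checking before concluding that the mapping itself is wrong: the two debug statements quoted above use different format specifiers. %pa takes a pointer to a phys_addr_t and prints the value stored there, whereas plain %p prints the pointer argument itself (hashed by default in current kernels). With &mem->dma as the argument, the "map memory" line would therefore show a hashed address of the struct field and not the region base. The userspace sketch below mimics both behaviours; struct mem_entry, format_pa and format_p are hypothetical names, not kernel APIs.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the single field of interest here. */
struct mem_entry {
	uint64_t dma;
};

/*
 * Mimic what the kernel's %pa does with its argument: follow the pointer
 * and format the 64-bit value it points at.
 */
static int format_pa(char *buf, size_t len, const uint64_t *pa)
{
	assert(buf != NULL && pa != NULL);
	return snprintf(buf, len, "0x%016" PRIx64, *pa);
}

/*
 * Mimic a plain %p: format the pointer itself, i.e. where the field lives.
 * (The kernel additionally hashes this value before printing it.)
 */
static int format_p(char *buf, size_t len, const void *ptr)
{
	assert(buf != NULL);
	return snprintf(buf, len, "0x%016" PRIxPTR, (uintptr_t)ptr);
}
```

For an entry whose dma field holds 0x55400000, format_pa produces the same text as the "unmap memory" line, while format_p produces an unrelated address. If that is what is happening in the kernel log, the differing "map memory" values would be a printing artifact and would not by themselves show a bad mapping.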

Can you share with me the link for the NXP ticket? We are using a fork from 5.15 on BSP 6, so it should be updated with their drivers.

The preview should show the response to it from NXP.

Hi @RudolfStreif,

Thank you for your detailed answer, I really appreciate that. This will help on my side as well.

I can see from the link you shared that NXP gave you an answer recently:

I hope you are doing well.

We tried the binaries provided by you.
It worked in u-boot using bootaux command.


We didn't observe any issues with imx8mp_m7_TCM_hello_world.elf and imx8mp_m7_TCM_rpmsg_lite_str_echo_rtos.elf.


It turns out that DDR-mapped elf support is not added to imx_rproc.

It should be mentioned that the NXP is planning to add the same in the future release.

I will first check on my side (and also check the mainline version) before drawing any conclusions. For now, we can say that this feature isn’t supported yet.

Best Regards,
Hiago.

Hi @hfranco.tx ,

Thank you. Yes, I saw their response and I had seen earlier posts that said that it works with TCM. I tried TCM too and had the same issues. I will try with their examples for TCM.

In my DTS the TCM is mapped too. But the debug output shows that these addresses are off too. The mailbox addresses don’t seem to look right either.

Hi @RudolfStreif,

Can you share your hardware version and also how much RAM you have, please?

Best Regards,
Hiago.

Hi @hfranco.tx,

Can you share your hardware version and also how much RAM you have, please?

Verdin IMX8MP Q 4GB IT, V1.1A, 07321142

I tried a TCM build too of the Zephyr philosopher application but with the same result. I posted a response on the NXP board as well as the TCM binaries. Unfortunately, they did not post the binaries that they say work for them which are RTOS builds.

Since it comes down to the device tree and the M7 binaries, I asked them to post the configuration that works for them.

I don’t have the RTOS build setup. But even if I did and built the examples, I would just have introduced another variable.

If I can get the TCM to work it’s a good starting point. At least it unblocks further development. I don’t know if the 128k of ITCM, DTCM will be sufficient but we will find out.

Thank you for your support,
Rudi

Hi @RudolfStreif,

Thanks for your information.

I have the same scenario on my side: TCM is not working either. I am still testing to see what I can find.

Yes, I agree. I will focus on making TCM work first, since NXP also said that DDR is possibly not supported yet. I believe 128 kB should be enough, since the remoteproc loader strips the ELF and loads only the code into the Cortex-M7, at least from what I could see in imx_rproc.c.

Best Regards,
Hiago.

Thank you for your support, @hfranco.tx. It is very much appreciated.

Yes, the 128k are definitely sufficient. The stripped binary is less than 30k.
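The size claim can be checked against the segment list that remoteproc printed for the itcm build. This is a small sketch with the da/memsz pairs copied from that log; struct segment, window_usage and TCM_SIZE are hypothetical names, and the 0x20000 window size is taken from the ITCM/DTCM entries quoted earlier in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TCM_SIZE 0x20000u /* 128 KiB per TCM window */

struct segment {
	uint32_t da;    /* load address on the M7 side */
	uint32_t memsz; /* bytes occupied in memory */
};

/* PT_LOAD segments reported by remoteproc for the itcm build. */
static const struct segment itcm_build[] = {
	{ 0x00000000, 0x6e84 },
	{ 0x00006e84, 0x12 },
	{ 0x00006e96, 0x4 },
	{ 0x20000018, 0x43c0 },
};

/* Highest end offset used inside the window starting at `base`. */
static uint32_t window_usage(const struct segment *s, size_t n, uint32_t base)
{
	uint32_t used = 0;

	assert(s != NULL);
	for (size_t i = 0; i < n; i++) {
		uint32_t end;

		if (s[i].da < base || s[i].da - base >= TCM_SIZE)
			continue; /* segment belongs to another window */
		end = s[i].da - base + s[i].memsz;
		if (end > used)
			used = end;
	}
	return used;
}
```

For this build the ITCM window (base 0x0) is used up to 0x6e9a, which is 28314 bytes, and the DTCM window (base 0x20000000) up to 0x43d8, so both stay well below TCM_SIZE.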

I have not heard from NXP on their open mailing list. I posted again. Since Toradex is building a lot of modules with NXP silicon, I am wondering if you have direct technical support channels.

My customer is of course also concerned about their project timeline.

Thank you.