Error in M4 OCRAM linker file

We have an issue with the generated ELF file that probably comes from the linker file. The problem is that the m_data section gets placed at the wrong memory location. I found it because the ELF loader in remoteproc reported an error:

[ 1944.561752] remoteproc remoteproc0: Booting fw image RTOScewo.elf, size 109220
[ 1944.576416] remoteproc remoteproc0: bad phdr da 0x91c2e0 mem 0x72e0
[ 1944.582750] remoteproc remoteproc0: Failed to load program segments: -22
[ 1944.589537] remoteproc remoteproc0: Boot failed: -22

If I understand this correctly, the ELF file tries to load a segment of size 0x72e0 to address 0x91c2e0. This would overflow the m_text region, which ends at 0x920000 (0x91c2e0 + 0x72e0 = 0x9235c0 > 0x920000).

/* Specify the memory areas */
MEMORY
{
  m_interrupts          (RX)  : ORIGIN = 0x00000000, LENGTH = 0x00000240
  m_text                (RX)  : ORIGIN = 0x00910000, LENGTH = 0x00010000
  m_data                (RW)  : ORIGIN = 0x20220000, LENGTH = 0x00020000 /* EPDC */
}
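A quick sketch of that arithmetic, using only the numbers from the remoteproc log and the MEMORY block above. The helper names are made up for illustration.

```python
# Values taken from the remoteproc log and the MEMORY block of the linker script.
SEGMENT_DA = 0x91C2E0       # device address reported in "bad phdr da"
SEGMENT_MEMSZ = 0x72E0      # memory size reported in "mem"
M_TEXT_ORIGIN = 0x00910000  # ORIGIN of m_text
M_TEXT_LENGTH = 0x00010000  # LENGTH of m_text


def region_end(origin, length):
    """Return the first address past the end of a memory region."""
    return origin + length


def overflows(start, size, origin, length):
    """True if [start, start + size) extends past the end of the region."""
    return start + size > region_end(origin, length)


segment_end = SEGMENT_DA + SEGMENT_MEMSZ
m_text_end = region_end(M_TEXT_ORIGIN, M_TEXT_LENGTH)

print(hex(segment_end))  # 0x9235c0
print(hex(m_text_end))   # 0x920000
print(overflows(SEGMENT_DA, SEGMENT_MEMSZ, M_TEXT_ORIGIN, M_TEXT_LENGTH))  # True
```

So with the memory size taken as the length, the segment ends 0x35c0 bytes past the end of m_text.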

I wondered why this was even possible and whether there were any memory size checks. Funny thing is, they exist, but they don’t catch this issue. If I check the ELF file with readelf, I find these program headers:

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD           0x001000 0x00000000 0x00000000 0x00240 0x00240 R   0x1000
  LOAD           0x002000 0x00910000 0x00910000 0x0c2e0 0x0c2e0 RWE 0x1000
  LOAD           0x00f000 0x20220000 0x0091c2e0 0x00210 0x072e0 RW  0x1000

So to me it seems that the m_data section gets placed into the m_text region. I don’t know how to resolve this, because I don’t know how the virtual address resolution is done. I checked the NXP manuals: 0x20220000 appears to be an alias for the 0x00920000 region, but why is the segment then placed in the m_text region anyway?

This answer can be helpful if one wants to do experiments with remoteproc

  __etext = .;    /* define a global symbol at end of code */
  __DATA_ROM = .; /* Symbol is used by startup for data initialization */

  .data : AT(__DATA_ROM)
  {
    . = ALIGN(4);
    __DATA_RAM = .;
    __data_start__ = .;      /* create a global symbol at data start */
    *(.data)                 /* .data sections */
    *(.data*)                /* .data* sections */
    KEEP(*(.jcr*))
    . = ALIGN(4);
    __data_end__ = .;        /* define a global symbol at data end */
  } > m_data

  __DATA_END = __DATA_ROM + (__data_end__ - __data_start__);
  text_end = ORIGIN(m_text) + LENGTH(m_text);
  ASSERT(__DATA_END <= text_end, "region m_text overflowed with text and data")

  /* Uninitialized data section */
  .bss :
  {
    /* This is used by the startup in order to initialize the .bss section */
    . = ALIGN(4);
    __START_BSS = .;
    __bss_start__ = .;
    *(.bss)
    *(.bss*)
    *(COMMON)
    . = ALIGN(4);
    __bss_end__ = .;
    __END_BSS = .;
  } > m_data
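To see why the ASSERT in the excerpt above does not fire, here is a small sketch that mirrors it with the numbers from the readelf output. It assumes that the FileSiz of the data segment (0x210) equals __data_end__ - __data_start__; the function names are hypothetical.

```python
# Assumed inputs, read off the readelf output earlier in the thread.
DATA_ROM = 0x0091C2E0               # PhysAddr (load address) of the data segment
DATA_INIT_SIZE = 0x210              # FileSiz; assumed to be __data_end__ - __data_start__
DATA_MEM_SIZE = 0x72E0              # MemSiz of the same segment
TEXT_END = 0x00910000 + 0x00010000  # ORIGIN(m_text) + LENGTH(m_text)


def linker_assert_passes(data_rom, init_size, text_end):
    """Mirror ASSERT(__DATA_END <= text_end, ...) from the linker script."""
    data_end = data_rom + init_size  # __DATA_END
    return data_end <= text_end


def memsz_check_passes(data_rom, mem_size, text_end):
    """The same bound, but using the full memory size of the segment."""
    return data_rom + mem_size <= text_end


print(linker_assert_passes(DATA_ROM, DATA_INIT_SIZE, TEXT_END))  # True
print(memsz_check_passes(DATA_ROM, DATA_MEM_SIZE, TEXT_END))     # False
```

The linker only counts the initialized .data bytes stored after the text, so its check passes, while a check based on the memory size fails.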

It seems the issue is that the .data section, which belongs to the m_data memory, is put at the end of the m_text memory. This causes the rest of the m_data contents to be appended contiguously after it. Assuming the .data section was intentionally added to the m_text memory, this should resolve the issue:

data_start = ORIGIN(m_data);
.bss : AT(data_start)

As a side question: what is the advantage of putting the .data section into the m_text memory?

WARNING: There is a currently unresolved bug when applying these changes related to the tty rpmsg driver. See Rpmsg not running stable on m4 - Technical Support - Toradex Community

OK, this was never really an issue; remoteproc just did its error check the wrong way. It used the memory size instead of the file size for the boundary check and reported an error even though the ELF file was fine.

Thanks for the update.

Which GCC version are you using? (Am I right that you are using the GCC linker?)

Can I verify what you are doing in your linker script with respect to memory placement:

 /* Specify the memory areas */
 MEMORY
 {
   m_interrupts          (RX)  : ORIGIN = 0x00000000, LENGTH = 0x00000240
   m_text                (RX)  : ORIGIN = 0x00910000, LENGTH = 0x00010000
   m_data                (RW)  : ORIGIN = 0x20220000, LENGTH = 0x00020000 /* EPDC */
 }

Your interrupts go to OCRAM_S (loaded to the place where the M4 looks for SP and PC by default after an M4 platform reset), your code to OCRAM, and your data to OCRAM_EPDC.
Is there a reason you used alias addresses for data and interrupts, but not for text?

This made me check my generated elf file, and re-check the linker script.

Could you share your full linker script, please? I want to understand how you get the misplaced .bss that you show here.

The snippets of the linker file that you posted above are the same as mine (except for the addresses used), yet my .bss is placed where it is expected.

It’s the unchanged linker file from the Toradex repository (freertos-toradex.git - FreeRTOS for the Cortex M4 core of Heterogeneous Multicore modules, branch colibri-imx7-m4-freertos-v8).

Yes, we use GCC, namely gcc-arm-none-eabi-4_9-2015q3.

Here is the linker file used Linker file

@andriscewo OK, to satisfy my curiosity about your case, I looked at the linker file, and luckily it is identical to the Freescale linker script I’m using. I adjusted my linker memory map to suit your layout and looked at what is produced.

I don’t see a misplacement problem. Below is my linked ELF and how I explain it; maybe you can match it against your case.

/* Specify the memory areas */
MEMORY
{
  m_interrupts          (RX)  : ORIGIN = 0x00000000, LENGTH = 0x00000240
  m_text                (RX)  : ORIGIN = 0x20210000, LENGTH = 0x00010000
  m_data                (RW)  : ORIGIN = 0x20220000, LENGTH = 0x00020000 /* EPDC */
}

(I used alias addresses throughout, unlike your/Toradex’s script, which is a bit clearer.)

Generated ELF segments:
[screenshot: ELF program headers (segments)]

Look at the memory/file size of the text segment, 0xDA00, and the physical address where the data starts, 0x2021DA00. That address is inside the text/code region, but what is stored there are the data initialization values, which may as well go into a ROM. See __DATA_ROM in the linker file. So where the code ends, the RO data starts.

From the symbol table:
[screenshot: symbol table]

You can see the RO data after the text in the text region, while the actual data section starts later, where the linker script places it: 0x20220000 (data start). The size of the RO data, from the addresses, is 0x3D8 (984 bytes), which is the file size of the data segment, as you can see in the segments table. The file size is what the firmware loader could/should actually load. For interrupts and text it is the same as the memory size; for data it naturally is not.
For my firmware blob, the linker says my data memory size is 34856 bytes, but only 984 bytes will be loaded, and those actually go into the text region.

So, for the rest of it, looking at the ELF sections:
[screenshot: ELF section headers]
The total data is (.data + .bss + .heap + .stack) 34856 bytes (memory size). Of this, the last three do not need to be loaded; they are uninitialized data.
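As a rough cross-check of these numbers, the sketch below recomputes where the initialization values land and how much of the data segment is never loaded. It uses only the figures quoted in this post; the variable names are illustrative.

```python
# Figures quoted in this post (alias addresses for the adjusted layout).
M_TEXT_ORIGIN = 0x20210000
M_DATA_ORIGIN = 0x20220000
TEXT_SIZE = 0xDA00    # file/memory size of the text segment
DATA_FILESZ = 0x3D8   # initialized .data bytes stored after the text
DATA_MEMSZ = 34856    # .data + .bss + .heap + .stack

# Load address of the .data initialization values: right after the code.
data_load_addr = M_TEXT_ORIGIN + TEXT_SIZE
# End of the initialization values; must still be inside m_text.
init_values_end = data_load_addr + DATA_FILESZ
# Bytes that exist only at run time and are not stored in the file.
not_loaded = DATA_MEMSZ - DATA_FILESZ

print(hex(data_load_addr))               # 0x2021da00
print(DATA_FILESZ)                       # 984
print(not_loaded)                        # 33872
print(init_values_end <= M_DATA_ORIGIN)  # True
```

Only 984 of the 34856 bytes come from the file, and they end at 0x2021ddd8, well before the end of the text region.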

As such, in my example everything seems to add up as it should. Is yours different? Do you really have a .bss placement issue?

About:

" #define __STARTUP_CLEAR_BSS… as it seems that the .bss section doesn’t get cleared that reliable."

Maybe I haven’t hit it yet, but so far I cannot say that the C library fails to clear/zero out what’s needed. (I might still run into the issue.)

You have the same behavior. Your .bss, .heap and .stack sections will be loaded into “ROM” at run time. It actually works because the M4 doesn’t use ROM at all, but 2 flashes. You can see that in your first picture. The virtual address is the one you defined, but the physical address where it will actually load everything is different. It is right after the m_text region ends.

This happens because the AT(__DATA_ROM) will create a virtual address and drag everything behind to that place not just the data. The address you see in the symbol table is probably just the virtual one. If your code gets big enough the remoteproc driver will issue an error that you write over the boundaries of your section. It works because the 2 memory regions are continuous for the OCRAM and TCM, but in the DDR case it would probably lead to problems. I hadn’t yet time to look more at this, but this hack could also cause other sorts of issues…

I don’t see the dragging. The ROM / read-only part is where it should be (the term “ROM” just means that it gets placed where it must not be modified). It’s irrelevant that the M4 has no ROM.

I see no hack.

The “virtual” address (which is not really virtual, at least it has nothing to do with Linux virtual memory …) sets the section boundaries, and the “physical” address only differs for placing the RO data.

… drag everything behind to that place not just the data.

What specifically is “everything”? After the code and init data are placed, you are left with … ?

And programming empty sections like .bss, heap and stack would be wasteful.

… regions are continuous for the OCRAM and TCM

I don’t think I used TCM above.

If your code gets big enough the remoteproc driver will issue an error that you write over the boundaries of your section.

Maybe it expects the ELF to be post-processed, or generated in a specific way, because it cannot handle the “usual” ELF. Just that.

OK, I think I have figured it out now.

I misunderstood this part because of the remoteproc error:

LOAD           0x00f000 0x20220000 0x0091c2e0 0x00210 0x072e0 RW  0x1000

remoteproc reported an error telling me that the segment writes outside the boundaries of the region because 0x0091c2e0 + 0x072e0 > 0x00920000, but it actually only has to load the file size, and 0x0091c2e0 + 0x00210 < 0x00920000. I didn’t realize that .bss, .heap and .stack don’t need to be loaded.

So it seems it was a remoteproc problem.

Great, so the provided OCRAM linker script is OK for its intended use; there is no error in it.

For remoteproc, I think we should start another thread, or post here: link text

From the code, remoteproc does appear to load only the file size, and in the trace you provided it fails elsewhere: at the point where it translates the device address of the segment (from the ELF’s physical address) to a system/kernel address, using the implementation-defined (= imx) memory layout. See remoteproc_elf_loader.c here: link text

/* grab the kernel address for this device address */
ptr = rproc_da_to_va(rproc, da, memsz);

And that uses the full memory size of the segment, which of course will not fly when you are loading within a region.

The imx rproc device-to-system mapping (link text) does handle offsets, and maps fine within a region provided a sane length is requested.

So it could/should be a bug: the last parameter to rproc_da_to_va should be filesz. (The test above that call also guarantees that filesz is less than or equal to memsz, so it’s safe…)
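To illustrate the difference, here is a minimal model of the lookup in Python. It is not the kernel code: the region table, `da_to_region` and `check_segment` are hypothetical stand-ins, and the rule that a request must fit entirely inside one region is an assumption based on the description above. Only the m_text and segment numbers come from this thread.

```python
from typing import NamedTuple, Optional


class Region(NamedTuple):
    name: str
    start: int   # device address of the region
    length: int  # size of the region in bytes


# Hypothetical mapping table with a single entry for the m_text region.
REGIONS = [Region("m_text", 0x00910000, 0x00010000)]


def da_to_region(da: int, length: int) -> Optional[Region]:
    """Stand-in for the address translation: succeed only if
    [da, da + length) lies completely inside one known region."""
    for region in REGIONS:
        if da >= region.start and da + length <= region.start + region.length:
            return region
    return None


def check_segment(da: int, filesz: int, memsz: int, use_filesz: bool) -> bool:
    """Return True if the segment would pass the lookup."""
    if filesz > memsz:  # sanity test that precedes the lookup
        return False
    length = filesz if use_filesz else memsz
    return da_to_region(da, length) is not None


# Data segment from the readelf output earlier in the thread.
DATA_SEGMENT = {"da": 0x0091C2E0, "filesz": 0x210, "memsz": 0x72E0}

print(check_segment(**DATA_SEGMENT, use_filesz=False))  # False, as in the log
print(check_segment(**DATA_SEGMENT, use_filesz=True))   # True
```

This model only covers the bounds check. What the loader does with the remaining memsz - filesz bytes after copying is outside the scope of this sketch.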

I think I’ll ask Toradex to review it (@stefan.tx or @marcel.tx). I wouldn’t mind sending them the (“huge”) patch for this.

Otherwise, you can “hack” your linker script to avoid the placement within the region.