Rpmsg not 100% reliable on VF61

I’m going to describe a really weird issue, and I know full well that it will be difficult to investigate and solve.

But maybe sharing ideas and effort could lead to something useful.

I implemented rpmsg using ARM.AMP 1.0.1-dev1 from ARM CMSIS_5 on the M4 core (I use the Keil DS-MDK IDE) and the rpmsg library included in Toradex CE Libraries 2.2b4428-20180525 on the A5 core running CE 6.

I found that rpmsg communication between the M4 and A5 cores of the VF61 is not reliable, in the following sense:

  • I compile the sources for the M4 and the communication works
  • I change something in the M4 application (either I increase the size of some buffers, without using or accessing the extra space, or I add other code to my sources, thereby increasing the code size) and rpmsg communication doesn’t work anymore

When I say “doesn’t work anymore” I mean that CE 6 doesn’t see the answer from the M4 core, even if I increase the rx timeout to 1 second or more.

Given this situation, I used GPIO pins to debug the M4 firmware, and I see that it works as expected: the M4 receives the message from the A5 and answers within milliseconds (as it does when the communication works).

This is really strange to me, so I suspected a problem in the rpmsg library for CE 6 (memory not cleared, or something like that).

I tried to use the Map_Mem library to map the RxRingAddr and TxRingAddr buffers on the CE 6 side, but I don’t see what I expected to see (even when communication works).

But I see that @andriscewo has a similar issue with different hardware and different OS on A-core.

I don’t know if the issues can be somehow related, or not.

I need help finding ideas on how to narrow down and debug what is happening.

Dear @vix,

Thank you for contacting support and for the detailed description.

That issue is about code section placement and execution problems on the M4 core. In your case, however, the M4 firmware runs successfully and the problem seems to be in the communication with the A5.

We already call Map_MapMemory for RxRingAddr and TxRingAddr inside the library, so you don’t need to do it on the application side.

Maybe this issue is related to using the correct RxRingAddr and TxRingAddr on both sides.

Do you have RPMsg library code?

If possible, please share a reproducible demo application and we will try to look into it.

We are doing Map_MapMemory for RxRingAddr and TxRingAddr inside the library, you don’t need to do it on your application side.

Is this available in a preliminary release that I can download and test?

Do you have RPMsg library code?

No, I don’t have.

If possible, please share reproducible demo application and we will try to look it.

I’ll do my best to create an example that is as simple as possible.
Is it OK for you if I share a compiled application for the M4 core and a simple (I hope so) example for the A5 core?

Dear @vix,

Is this available in a preliminary release that I can download and test?

Yes, it is available from Dec’12 -2016. Please use either the preliminary release or the standard library release.

Do you have RPMsg library code?

ok

I’ll try to do my best to create an example as simple as possible. Is ok for you if I share a compiled application for M4 core and a simple (I hope so) example for A5 core?

It is fine. Please share.

Hi @raja.tx

Can you clarify how I can access RxRingAddr and TxRingAddr from CE 6?

I use the following calls

//Rx shared memory address, this must be identical to the VRING0_BASE value in "\freertos-toradex\middleware\multicore\open-amp\porting\vf6xx_m4\platform_info.c"
returnValue = Rpmsg_SetConfigInt(hRpmsg, L"RxRingAddr",    0x3F070000,       StoreVolatile);
    
//Tx shared memory address, this must be identical to the VRING1_BASE value in "\freertos-toradex\middleware\multicore\open-amp\porting\vf6xx_m4\platform_info.c"
returnValue = Rpmsg_SetConfigInt(hRpmsg, L"TxRingAddr",    0x3F074000,       StoreVolatile);

to configure the rpmsg channel, but I don’t understand how to access these buffers (because 0x3F070000 and 0x3F074000 are physical addresses).

This is why I thought of using Map_Mem.

Dear @vix,

It is possible to get the virtual addresses of RxRingAddr and TxRingAddr in the application and access them, but you don’t really need to. Rpmsg_Read and Rpmsg_Write already use those memory areas to share data between the application and the M4 core.

Also, please set those memory ranges as non-cacheable, e.g.: Map_VirtualSetAttributes(proc->sh_buff.start_addr,proc->sh_buff.size,0x000,0x1CC,NULL)

Please refer to http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.100511_0401_10_en/Chunk539635961.html and the ARM-V7 TRM for more information.

Hi @raja.tx

I know that for normal usage, rpmsg communication works without the need to get the virtual address of RxRingAddr and TxRingAddr.

Maybe this can be a useful approach in case I need to do some advanced debugging.

I need some more detail about how to use Map_VirtualSetAttributes.

I asked about it some weeks ago, but in the end it was not necessary to use it.

Moreover the second link you posted is broken. Could you verify, please?

Dear @vix,

A shared memory region between heterogeneous cores should be uncached. That needs to be set using Map_VirtualSetAttributes.

Moreover the second link you posted is broken. Could you verify, please?

I corrected the ARM-v7 TRM link above, thanks.

Thanks for your help.

I’m going to look into this issue, but it will take a few weeks.

I’ll let you know as soon as I have news.

This issue is quite difficult to debug (unfortunately), but it seems to be somehow related to where data and code are placed in the VF61 memory.

In my application, the M4 uses OCRAM, TCMU and 2MB of DDR (reserved for the M4).

I know that the topic “rpmsg not running stable on m4” is about the i.MX7 and contains a lot of messages by now, but I wonder if my issue has something in common with it.

In the first message of the topic, @andriscewo wrote

It will quite often fail already at startup. Also some other rpmsg crashes that happen later might be related to this and we experience quite often that the rpmsg doesn’t initialize correctly.

If I understood correctly, that topic is now focused on some region of memory not being zeroized properly (but I may be wrong).

Is it possible that the VF61 or the Rpmsg library for VF61 suffers from the same potential issues as the i.MX7?

Hi vix,

I don’t know about the similarities between the VF61 and the iMX7, but this sounds quite familiar:

I change something in M4 application (i.e. either I increase the size of some buffers - without using and/or accessing the extra space; or I add other code to my sources, so increasing the code size) and rpmsg communication doesn’t work anymore

So here is a short summary for you, and something you could look into.

If I understood, that topic now is focused to some region of memory not zeroized properly (but I can be wrong).

Not exactly. The point is rather that it is being zeroed out too late, meaning there are still 0’s in the Linux side cache when the M4 startup code is executed. These can potentially overwrite some data in RAM that is used by the M4 application.
To check whether this or something similar could be the issue, you can take a look at your .data section in RAM after the M4 code has started. Check whether it is the same every time you boot; if it is not, you have an issue at hand.

Dear @vix,

If you are interested, we would like to share the Vybrid Rpmsg library code. The library code can easily be built and debugged using VS2008, like a native Win32 application. This will let you look specifically into the rpmsg code, and you may get some references or clues to solve the issue. Please let us know if you would like to take it and debug with it.

If you suspect that the cache is the reason, then please disable the M4 cache (refer to the Cache control register (LMEM_PSCCR) in the Vybrid Reference Manual). This application note may be interesting for you, as it relates to the topic you are working on (if you have not read it already).