Toradex CE library 2.4: CAN Messages Lost / System Blocked

Hello,

I am currently testing CAN bus behaviour on WEC7/Apalis. The OS image is based on 1.5b4.

I am using Toradex CE library 2.3. I test with an IO board that is connected via CAN bus. On this IO board I have made a loopback from an output to an input. Very rarely (maybe 1 in 100000) I do not get the answer back.

In the release notes for revision 2.4 (which obviously has not been released yet), under number 45299, you say that it is a known bug for Colibri modules that CAN messages are sometimes lost.

Is this a known bug for Colibri only, or also for Apalis?

Is this bug also in revision 2.3?

I have also tested 2.3b4567. After a few seconds the whole system was blocked (I could not even access the system via telnet). We have 2 threads, one for sending and one for receiving. As I saw under number 41015, you made modifications in 2.3 regarding this scenario.

Kind regards

Dear @Frax222

The issue about occasional loss of CAN messages (#45299) affects both Colibri and Apalis iMX6 modules. It was solved in the preliminary version of the libraries V2.3b4536.

  • The build number only ever increases, so the problem should also be solved in the version V2.3b4567 that you mentioned in your question.
  • V2.3 is older than any V2.3bxxxx preliminary version (sorry for the confusion), so V2.3 has this known bug.

Issue #41015 was only about the blocking of Can_Read() vs Can_Write(), which should never affect the whole system.

All in all it seems you triggered an unknown problem in the libraries. I would appreciate the source code of a test application which allows us to reproduce and debug the issue - ideally by connecting two Colibri / Apalis boards, so we don’t have a dependency on unknown external IO-boards.

Regards, Andy

Hello,

Thank you for the clarification. I will try to provide a test application, but this will take some time.

Kind regards

Hello,

Please find attached a test application that shows the problem.

This test application is based on an earlier test application for hardware acceleration of bitblt (Full HD; this now works when ipumem is increased).

You have to start blttest.exe on two systems. Default ID for CAN telegrams is 0x180 = 384. On the second system you have to enter the ID as command line argument, e.g. blttest.exe 385
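For readers without the attachment, here is a minimal sketch of how such an ID argument could be handled. It is not taken from blttest.exe: the function name parse_can_id, the constant names, and the restriction to 11-bit standard identifiers are assumptions made for illustration only.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define DEFAULT_CAN_ID 0x180u   /* 384, the default ID mentioned above */
#define MAX_STD_CAN_ID 0x7FFu   /* assumption: 11-bit standard identifiers */

/* Hypothetical helper: parse the optional command line argument.
 * arg == NULL means "no argument given", so the default ID is used.
 * Base 0 lets strtoul accept decimal ("385") and hex ("0x181");
 * note that a leading 0 would be read as octal.
 * Returns 1 on success, 0 if the text is not a valid ID. */
static int parse_can_id(const char *arg, uint32_t *id)
{
    char *end = NULL;
    unsigned long value;

    assert(id != NULL);
    if (arg == NULL) {
        *id = DEFAULT_CAN_ID;
        return 1;
    }
    errno = 0;
    value = strtoul(arg, &end, 0);
    if (end == arg || *end != '\0' || errno != 0 || value > MAX_STD_CAN_ID) {
        return 0;
    }
    *id = (uint32_t)value;
    return 1;
}
```

With this sketch, "blttest.exe 385" and "blttest.exe 0x181" would select the same ID, and starting without an argument would keep 0x180.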

There are 3 threads:

  • Thread10ms, which should run every 10ms, priority is 50. Each time it runs, this thread puts 5 telegrams into a buffer

  • ThreadCanSend, priority 52, checks the buffer and sends the data

  • ThreadCanRead, priority 49, reads incoming telegrams and checks whether a telegram is missing

With Toradex Library 2.3 I see that there are missed telegrams (not receiving subsequent numbers).

Also with 2.3b4536 I can see missed telegrams, but the systems do not block.

With 2.3b4567 the systems get blocked as soon as I connect CAN bus between the two systems.

There is a function DrawTextBig in the test application. The strange thing with this function is that, depending on the scaling factors, the 10ms thread sometimes does not run every 10ms (but only after 12ms). Maybe this also prevents the CAN bus receive procedure from reacting in time.
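One way to quantify this timing effect would be to log a millisecond timestamp on every run of the 10ms thread and look at the largest interval afterwards. The sketch below assumes the timestamps have already been collected into an array (for example from a millisecond tick counter such as GetTickCount()). The function name max_period_ms is hypothetical and not part of the test application.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: given n timestamps in milliseconds, taken once
 * per run of the periodic thread, return the longest interval between
 * two consecutive runs. Returns 0 if fewer than two samples exist.
 * Unsigned subtraction keeps the result valid if the tick counter
 * wraps around once between samples. */
static uint32_t max_period_ms(const uint32_t *ts, size_t n)
{
    uint32_t worst = 0;
    size_t i;

    assert(ts != NULL || n == 0);
    for (i = 1; i < n; i++) {
        uint32_t period = ts[i] - ts[i - 1];
        if (period > worst) {
            worst = period;
        }
    }
    return worst;
}
```

For timestamps 0, 10, 20, 32, 42 this returns 12, matching the kind of delay described above.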

I hope you can see the problems and find a solution.

Kind regards

Dear @Frax222
I am able to reproduce the issue here. It looks like it will take some time to come up with a solution. Do you have a deadline by which you need this fixed?
Regards, Andy

Hello Andy,

Do you think the end of August would be possible?

Kind regards

Dear @Frax222
From what we can judge now, we should be fine to have a solution by then.
Regards, Andy

Dear @Frax222,

Thank you for your patience. We solved the bug and would like to share the preliminary release with you for testing. Please download it from here: CAN Bug Fix.

Please let us know if that solves the issue.

Hello,

I have now tested the CE library of 12.09.2019 in our application. It works without problems.
Thank you for your assistance!

Kind regards