Apalis IMX6 CAN Lib Socket API issue with Ethernet

Team

I think I have found another issue.

I am using the SocketCAN API to read CAN data from the can0 interface through a non-blocking socket. Intermittently, usually after about 70 seconds, I hit a lull of 8-10 seconds during which the socket continuously returns the “Resource temporarily unavailable” error.

I have used candump, and it shows the incoming data arriving perfectly at 10 ms intervals.

Since my socket is non-blocking, I poll it in a loop every 10 ms, and the incoming CAN data also arrives at 10 ms intervals. With this setup I am bound to see a couple of “Resource temporarily unavailable” messages between frames. For example, in the good case below I get good data (i.e. IMU Data[4]: XXX) on roughly every 4th call to the socket:

GOOD CASE:
[Screenshot of console output]

BAD CASE (looks like Socket is struggling):

[Screenshot of console output]
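
For reference, the polling pattern described above can be sketched in C roughly as follows. This is not the poster's application code, which is not shown in the thread; the helper names `set_nonblocking` and `drain_frames` are hypothetical. The sketch assumes that each 10 ms tick should read every queued frame until `read()` reports EAGAIN, so that an empty queue is treated as a normal condition rather than an error.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>
#include <linux/can.h>

/* Hypothetical helper: switch an already open descriptor to non-blocking mode. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Hypothetical helper: read every frame currently queued on a non-blocking
 * descriptor. Returns the number of frames stored in `out` (0 means the
 * queue was empty, i.e. the first read() reported EAGAIN), or -1 on a
 * short read or a genuine error. */
int drain_frames(int fd, struct can_frame *out, size_t max_frames)
{
    size_t n = 0;

    assert(out != NULL);
    while (n < max_frames) {
        ssize_t got = read(fd, &out[n], sizeof(struct can_frame));
        if (got == (ssize_t)sizeof(struct can_frame)) {
            n++;                 /* one complete frame received */
            continue;
        }
        if (got < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
            break;               /* queue empty: normal for non-blocking I/O */
        if (got < 0 && errno == EINTR)
            continue;            /* interrupted by a signal: retry */
        return -1;
    }
    return (int)n;
}
```

Calling `drain_frames` once per tick and logging a timestamp whenever it returns 0 would show how long each lull actually lasts.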

What could cause the socket to behave this way?

Could it be external traffic? Is there any tool like Wireshark which I could run on the Toradex module to see what is going on?

Your timely help is appreciated.

Fellow TORADEXers

I figured out the root cause (roughly) for this issue.

It is the Ethernet cable connection.

Whenever the Ethernet cable is connected to our network, the CAN socket behaves erratically (lull periods), and when it is disconnected, everything works fine.

It sounds like the Ethernet driver is doing something fishy.

Can you folks please look into it?

Has the Ethernet driver been updated recently?

I am using Linux apalis-imx6 3.14.52-v2 binary created on 31 March 2016.

I need the following information from you folks:

Is there any utility like Wireshark that monitors traffic on the Apalis?
Why is it that once the Ethernet cable has been plugged in even once, the CAN socket behaves badly? It takes a complete reboot without the Ethernet cable to fix it.
Am I the only one suffering from this?

Hi Harish,

The observation with the Ethernet driver is interesting… Do you see the issue also with candump and Ethernet connected?

If not, it is kind of hard for us to verify the issue without having access to your application…

To answer the Wireshark part: Wireshark supports CAN bus. You can tunnel the traffic through SSH. First install tcpdump on the module:

opkg update && opkg install tcpdump

Then, from your Linux development host:

ssh root@<ip-of-module> tcpdump -i can0 -U -s0 -w - | wireshark -k -i -

I used that with Ethernet in the past (plus the additional rule ‘not port 22’ to avoid sniffing the ssh tunnel itself), but it seems to work with CAN too.

Just did a little experiment here: candump allows specifying the socket type using arguments, so you can pass SOCK_RAW (=3) combined with SOCK_NONBLOCK (=2048) from the command line. With that I get the expected -EAGAIN (Resource temporarily unavailable) return:

root@colibri-vf:~# candump can0 --type=2051 
interface = can0, family = 29, type = 2051, proto = 1
read: Resource temporarily unavailable
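
To show where the value 2051 comes from, here is a minimal C sketch that builds the same socket type in code. The function names `raw_nonblocking_type` and `open_can_nonblocking` are hypothetical, and the numeric value assumes Linux on ARM or x86 (a few other architectures define SOCK_NONBLOCK differently).

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <net/if.h>
#include <sys/socket.h>
#include <linux/can.h>

/* Hypothetical helper: the value passed to candump as --type above.
 * On ARM and x86 Linux this is 3 | 2048 = 2051. */
int raw_nonblocking_type(void)
{
    return SOCK_RAW | SOCK_NONBLOCK;
}

/* Hypothetical helper: open a raw, non-blocking CAN socket bound to the
 * named interface (e.g. "can0"). Returns the descriptor, or -1 on failure. */
int open_can_nonblocking(const char *ifname)
{
    struct sockaddr_can addr;
    int fd;

    assert(ifname != NULL);
    fd = socket(PF_CAN, raw_nonblocking_type(), CAN_RAW);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof addr);
    addr.can_family = AF_CAN;
    addr.can_ifindex = (int)if_nametoindex(ifname);   /* 0 if not found */
    if (addr.can_ifindex == 0 ||
        bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A `read()` on the returned descriptor fails with EAGAIN whenever no frame is queued, which is the same condition candump reports in the output above.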

Unfortunately the utility is not prepared to retry reading as you would do with NONBLOCK, but you can work around that by just invoking the utility every 10 ms:

while [ true ]; do candump can0 --type=2051; sleep 0.01; done

Is the issue reproducible for you using this method?

Yes, I do see the issue while candump is running with Ethernet connected. candump itself works fine; it is my application that gets starved periodically, as it uses the CAN socket lib, whereas candump just runs healthy.

I can do a remote sharing session with you folks if you want.

If it is an issue with the kernel, we need to be able to reproduce it here; remote access typically does not help with that.

What is the “CAN socket lib”? Is it different from how candump accesses the CAN bus (other than the blocking vs. non-blocking difference)?

To rule out problems with the driver or with non-blocking access, I would recommend altering candump to use a non-blocking socket, which should be fairly easy. candump.c is a single C file of about 250 lines:
http://git.pengutronix.de/?p=tools/canutils.git;a=blob;f=src/candump.c

If the issue is reproducible with this modification, we should be able to reproduce it here too.
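
A modified read loop along the lines suggested above might look like the sketch below. This is not the actual candump code; `read_frame_timeout` is a hypothetical helper, and using `poll()` to wait for data instead of sleeping between attempts is an assumption of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>
#include <linux/can.h>

/* Hypothetical helper: wait up to timeout_ms for one frame on a
 * (possibly non-blocking) descriptor.
 * Returns 1 if a frame was read, 0 on timeout, -1 on error.
 * Note: the full timeout restarts if a signal interrupts the wait. */
int read_frame_timeout(int fd, struct can_frame *frame, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };

    assert(frame != NULL);
    for (;;) {
        int rc = poll(&pfd, 1, timeout_ms);
        if (rc < 0) {
            if (errno == EINTR)
                continue;        /* interrupted: wait again */
            return -1;
        }
        if (rc == 0)
            return 0;            /* nothing arrived within timeout_ms */

        ssize_t got = read(fd, frame, sizeof(*frame));
        if (got == (ssize_t)sizeof(*frame))
            return 1;
        if (got < 0 && (errno == EAGAIN || errno == EINTR))
            continue;            /* spurious wakeup: go back to poll() */
        return -1;               /* short read or genuine error */
    }
}
```

With a 10 ms frame period, a return value of 0 for a timeout well above 10 ms would mark the start of a lull without flooding the log with EAGAIN messages.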