Can0 goes BUS-OFF after cansend

Hello,

I am having some problems with the CAN bus. I am using an SN65HVD230D transceiver.

I can connect to can0 and receive messages. can0 stays in the ERROR-ACTIVE state until I send the first message.
As soon as I do that, can0 goes to the BUS-OFF state.

I am using tdx-reference-multimedia-image 5.1.0-devel-202012

The changes I made to the device tree are:

In imx8qxp-colibri-eval-v3.dtsi I added:

&flexcan2 {
	status = "okay";
};

In imx8qxp-colibri.dtsi I changed the pin group to:

/* Colibri optional CAN on PS2 */
pinctrl_flexcan2: flexcan1grp {
	fsl,pins = <
		IMX8QXP_FLEXCAN1_TX_ADMA_FLEXCAN1_TX		0x06000021		/* SODIMM  55 */
		IMX8QXP_FLEXCAN1_RX_ADMA_FLEXCAN1_RX		0x06000021		/* SODIMM  63 */
	>;
};

I was able to send and receive messages with an earlier installation and the same device tree configuration, so I don’t think the transceiver is the problem.

I am sorry I cannot provide /etc/os-release, /etc/issue, and other board details, but I don’t currently have the device.

Could you give me some pointers on how to troubleshoot this, or tell me if you know of a similar issue and how to solve it?

Thank you for your help!

Hi @stravs,

I know you don’t have the /etc/os-release and /etc/issue information, but could you obtain that for us?

It helps us track this issue, because it’s hard to tell where the problem is without more information.

Best regards,
André Curvello

Also, could you confirm precisely the Colibri model you are using?

Hi @stravs

Did you add termination resistors to the CAN wires? Can you also post the output of “dmesg”? Are there any errors shown there?

Regards,
Stefan

Hi Stefan_e.tx

Yes, we have a terminated network: 120 ohms at both ends.
I will send the dmesg output tomorrow.
I will also check with an oscilloscope tomorrow.
For me the mystery is that it worked normally with 5.0.0 and now it doesn’t with 5.1.0.

I hope it is just a clock-speed issue that throws the baud rate off, and that this is what is causing the problems.

Thank you!

Hi Stefan

dmesg | fgrep -i can
[    5.053054] can: controller area network core (rev 20170425 abi 9)
[    5.064072] can: raw protocol (rev 20170425)
[    5.077027] slcan: serial line CAN interface driver
[    5.077037] slcan: 10 dynamic interface channels.
[    5.207472] CAN device driver interface
[    5.430698] IPv6: ADDRCONF(NETDEV_CHANGE): can0: link becomes ready
[   24.616487] ieee80211 phy0: mwifiex_cfg80211_sched_scan_start : Invalid Sched_scan parameters
[   51.619044] ieee80211 phy0: mwifiex_cfg80211_sched_scan_start : Invalid Sched_scan parameters
[  132.364763] ieee80211 phy0: mwifiex_cfg80211_sched_scan_start : Invalid Sched_scan parameters

Because can0 is not working, we are using a workaround with a USB CAN adapter; that’s why you also see slcan.

Thanks

Hi Stefan
I just tried to send a message on an empty bus and the same thing happened:
ERROR-ACTIVE before cansend and BUS-OFF after.
I wanted to check the baud rate with an oscilloscope, but I can’t do that if I can’t send a message.
I have no idea what is happening; everything looks OK, so if you have any ideas what to try, please tell me.
Thank you.

Hello,
First, thanks for your help!

Could you confirm precisely the Colibri model you are using?

Colibri iMX8X V1.0B, 2GB, WiFi

/etc/os-release

ID=tdx-xwayland
NAME="TDX Wayland with XWayland"
VERSION="5.1.0-devel-20201127115732+build.0 (dunfell)"
VERSION_ID=5.1.0-devel-20201127115732-build.0
PRETTY_NAME="TDX Wayland with XWayland 5.1.0-devel-20201127115732+build.0 (dunfell)"

/etc/issue

TDX Wayland with XWayland 5.1.0-devel-20201127174510+build.0 (dunfell) \n \l
Colibri-iMX8X-V10B_Reference-Multimedia-Image

ip -details link show can0 (before cansend can0)

4: can0: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UP mode DEFAULT group default qlen 10
    link/can  promiscuity 0 minmtu 0 maxmtu 0 
    can state ERROR-ACTIVE (berr-counter tx 0 rx 0) restart-ms 0 
	  bitrate 250000 sample-point 0.875 
	  tq 50 prop-seg 37 phase-seg1 32 phase-seg2 10 sjw 1
	  flexcan: tseg1 2..96 tseg2 2..32 sjw 1..16 brp 1..1024 brp-inc 1
	  flexcan: dtseg1 2..39 dtseg2 2..8 dsjw 1..4 dbrp 1..1024 dbrp-inc 1
	  clock 40000000 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535

ip -details link show can0 (after cansend can0)

4: can0: <NO-CARRIER,NOARP,UP,ECHO> mtu 16 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 10
    link/can  promiscuity 0 minmtu 0 maxmtu 0 
    can state BUS-OFF (berr-counter tx 0 rx 0) restart-ms 0 
	  bitrate 250000 sample-point 0.875 
	  tq 50 prop-seg 37 phase-seg1 32 phase-seg2 10 sjw 1
	  flexcan: tseg1 2..96 tseg2 2..32 sjw 1..16 brp 1..1024 brp-inc 1
	  flexcan: dtseg1 2..39 dtseg2 2..8 dsjw 1..4 dbrp 1..1024 dbrp-inc 1
	  clock 40000000 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535

Hope this helps. If you need anything more, just say so and I will send it.
Thanks again!
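
The bit-timing values in the output above can be cross-checked by hand, which helps rule out a wrong baud rate. The sketch below is only an illustration (the function names are made up for this note): it adds the fixed 1 tq sync segment to prop-seg, phase-seg1 and phase-seg2, then derives the bitrate and sample point from the tq length in nanoseconds.

```shell
#!/bin/sh
# Cross-check CAN bit timing as printed by `ip -details link show`.
# Arguments for every function: tq (ns), prop-seg, phase-seg1, phase-seg2.
# Helper names are hypothetical; they are not part of iproute2 or can-utils.

# Total number of time quanta in one bit (sync segment is always 1 tq).
bit_quanta() {
    echo $(( 1 + $2 + $3 + $4 ))
}

# Nominal bitrate in bit/s.
bitrate() {
    quanta=$(bit_quanta "$@")
    echo $(( 1000000000 / ($1 * quanta) ))
}

# Sample point in per-mille (875 corresponds to 0.875).
sample_point_permille() {
    quanta=$(bit_quanta "$@")
    echo $(( 1000 * (1 + $2 + $3) / quanta ))
}

# Values reported for can0: tq 50 prop-seg 37 phase-seg1 32 phase-seg2 10
bitrate 50 37 32 10                 # prints 250000
sample_point_permille 50 37 32 10   # prints 875
```

With 80 quanta of 50 ns each, one bit lasts 4000 ns, so the configured timing is consistent with the reported 250000 bit/s and 0.875 sample point. This only checks the configuration as reported by the driver; it does not prove what actually appears on the wire.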

I know for a fact that FlexCAN works just fine on the Apalis iMX8X, so I would advise looking at that device tree for inspiration.

Hi @stravs,

Just as an update, we are analyzing this issue carefully at Toradex in order to reproduce it and find workarounds or fixes.

Our responses from the middle to the end of December were delayed due to reduced personnel.

In the meantime, have you made any progress on this on your side?

Best regards, André Curvello

Hi,

Do you mean you see absolutely no activity on the CAN pins? I see you have bus-off recovery disabled: the output shows “can state BUS-OFF (berr-counter tx 0 rx 0) restart-ms 0”. Specifying a non-zero restart-ms as an argument to ip link set and then repeating cansend should allow you to measure the bit time with a scope (the shortest pulse length).

Edward
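
A minimal sketch of the suggestion above, assuming the interface is can0 at 250000 bit/s and picking 100 ms as an arbitrary restart delay. The helper names are hypothetical. The reconfiguration function needs root and real hardware, so it is only defined here, not called.

```shell
#!/bin/sh
# Length of one bit in nanoseconds for a given bitrate (bit/s).
# This is the shortest pulse one would expect to see on the scope.
bit_time_ns() {
    echo $(( 1000000000 / $1 ))
}

# Reconfigure a CAN interface with automatic bus-off recovery enabled.
# The interface has to be down while its CAN parameters are changed.
# Usage (as root): can_enable_restart can0 250000 100
can_enable_restart() {
    ip link set "$1" down
    ip link set "$1" type can bitrate "$2" restart-ms "$3"
    ip link set "$1" up
}

bit_time_ns 250000   # prints 4000, i.e. 4 us per bit at 250 kbit/s
```

After `can_enable_restart can0 250000 100`, repeating a frame such as `cansend can0 123#deadbeef` while probing the TX pin should show pulses whose shortest width is close to 4 us if the bitrate is correct.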

Hey @andrecurvello.tx

We solved the situation by using an external CAN to USB converter.

We still have one Aster board and one iMX8X module, and I will also try to reproduce this.

Will keep you updated.

Best regards Anze

Thanks for the feedback.

Also, could you confirm if you are using TorizonCore 4 or TorizonCore 5?

Just for your information, we recently had our first production release of TorizonCore 5.1.0.

Best regards,
André Curvello

Hi @stravs.

@jaski.tx tested all three FlexCAN interfaces at the same time and did not see any errors.

Could you update to TorizonCore 5.1.0+build.1 and check if you still see any errors?

Best regards,
André Curvello

Hello,

I seem to have the same problem on a Colibri iMX7 using flexcan1 (pins 55 and 63). The state goes to BUS-OFF as soon as I try to send the first message.

    # ip -details link show can0
    4: can0:  mtu 16 qdisc pfifo_fast state UP mode DEFAULT group default qlen 10
        link/can  promiscuity 0 minmtu 0 maxmtu 0
        can state ERROR-ACTIVE (berr-counter tx 0 rx 0) restart-ms 0
              bitrate 250000 sample-point 0.875
              tq 250 prop-seg 6 phase-seg1 7 phase-seg2 2 sjw 1
              flexcan: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..256 brp-inc 1
              clock 24000000numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
    # cansend can0 123#deadbeef
    # ip -details link show can0
    4: can0:  mtu 16 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 10
        link/can  promiscuity 0 minmtu 0 maxmtu 0
        can state BUS-OFF (berr-counter tx 0 rx 0) restart-ms 0
              bitrate 250000 sample-point 0.875
              tq 250 prop-seg 6 phase-seg1 7 phase-seg2 2 sjw 1
              flexcan: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..256 brp-inc 1
              clock 24000000numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
    # dmesg | fgrep -i can
    [   12.015535] CAN device driver interface
    [   12.045204] flexcan 30a00000.can: 30a00000.can supply xceiver not found, using dummy regulator
    [   80.563886] IPv6: ADDRCONF(NETDEV_CHANGE): can0: link becomes ready
    [  116.604819] can: controller area network core (rev 20170425 abi 9)
    [  116.616042] can: raw protocol (rev 20170425)
    [  116.618702] flexcan 30a00000.can can0: bus-off
    #

I’m running this from a Docker container with these details:

    # cat /etc/issue
    Debian GNU/Linux 10 \n \l
    # cat /etc/os-release
    PRETTY_NAME="Debian GNU/Linux 10 (buster)"
    NAME="Debian GNU/Linux"
    VERSION_ID="10"
    VERSION="10 (buster)"
    VERSION_CODENAME=buster
    ID=debian
    HOME_URL="https://www.debian.org/"
    SUPPORT_URL="https://www.debian.org/support"
    BUG_REPORT_URL="https://bugs.debian.org/"

And the TorizonCore version:

    ~$ cat /etc/os-release
    ID=torizon-upstream
    NAME="Torizoncore Upstream"
    VERSION="5.1.0+build.1 (dunfell)"
    VERSION_ID=5.1.0-build.1
    PRETTY_NAME="Torizoncore Upstream 5.1.0+build.1 (dunfell)"
    BUILD_ID="1"
    ANSI_COLOR="1;34"
    ~$ cat /etc/issue
    Torizoncore Upstream 5.1.0+build.1 \n \l

Hope this is relevant.
Has any solution to this issue been found?
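
The before/after check used throughout this thread can be scripted. The sketch below assumes the `can state ...` line format shown in the outputs above and uses a hypothetical helper name; it extracts just the state keyword from `ip -details link show` output read on standard input.

```shell
#!/bin/sh
# Print the CAN controller state (e.g. ERROR-ACTIVE, BUS-OFF) found in
# `ip -details link show <iface>` output supplied on stdin.
# Assumes a line of the form "can state <STATE> (berr-counter ...".
can_state() {
    sed -n 's/.*can state \([A-Z-]*\).*/\1/p'
}

# Demonstration with a line copied from the output in this thread.
printf '    can state BUS-OFF (berr-counter tx 0 rx 0) restart-ms 0\n' | can_state
```

On a live system the equivalent call would be `ip -details link show can0 | can_state`, run once before and once after `cansend`.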

Hi @stravs,

What does your setup for CAN communication look like?

  • Are you using 120 ohm terminators on both ends?
  • What is the bitrate of your communication?
  • Which devices are involved in the communication?

I ask because one of the situations in which the controller goes bus-off is an unstable CAN bus: after too many errors it stops transmitting to avoid disturbing the rest of the bus.
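
As an illustration of how error counts map to the states that `ip` reports, here is a small sketch. The function name is hypothetical, and the thresholds (96, 128, 256) are the usual CAN error-counter limits as reflected in the Linux state names, not something taken from this thread's logs.

```shell
#!/bin/sh
# Map transmit/receive error counters to a CAN controller state name.
# Usage: can_error_state <tx_errors> <rx_errors>
can_error_state() {
    tec=$1
    rec=$2
    if [ "$tec" -ge 256 ]; then
        echo BUS-OFF            # controller stops taking part in bus traffic
    elif [ "$tec" -ge 128 ] || [ "$rec" -ge 128 ]; then
        echo ERROR-PASSIVE
    elif [ "$tec" -ge 96 ] || [ "$rec" -ge 96 ]; then
        echo ERROR-WARNING
    else
        echo ERROR-ACTIVE       # normal operation
    fi
}

can_error_state 0 0      # prints ERROR-ACTIVE
can_error_state 256 0    # prints BUS-OFF
```

Since a failed transmission normally raises the transmit counter by 8, a short burst of consecutive transmit errors is enough to reach bus-off, which matches the interface dropping out right after the first cansend.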

Best regards,
André Curvello

Hi community,

I think we have found out what went wrong, and it had nothing to do with the OS.

It was probably a combination of EMI and the SN65HVD230D not having a resistor on its RS pin for slope control.

When the device was working, it was in our office (almost no EMI); when the problems occurred, the device was in its final location, where there was a lot of EMI (electric motors, high-voltage batteries…).

I hope this helps somebody.

Thanks again Toradex for helping.

Best regards Anze

Hi @stravs,

This is excellent feedback, and I’ve noted the Rs point (the slope-control resistor for the SN65HVD230D) for my reference.

Thank you!

Best regards,
André Curvello