CAN overrun errors when using UART

Hi everyone,

CAN overrun errors occur frequently when the UART is in use at the same time. We found that these errors happen when messages are exchanged between our device and another one via UART with almost no delay between transmissions. It can be clearly seen that most of the generated interrupts are UART ones.

We have tried several things to solve the problem, without success:

  • Adding a delay of 15 ms in our system before sending the mentioned messages (kernel mode)
  • Reading the incoming messages quickly in our system by using a dedicated thread that buffers those messages in an application buffer (user space)
  • Increasing the CAN input buffer via /proc/sys/net/core/rmem_max

Do you have any ideas how we could get rid of the CAN overrun errors?

Thank you very much in advance.

UART baud rate: 38400 baud
CAN baud rate: 500 kbps
VF61
Image version 2.8.6

Hi @j.mingorance !

Are you able to reproduce the CAN overruns on the latest release of BSP 2.8? You are using version 2.8.6, which is not the latest version.

If you are building using Yocto, you can refer to the section “Update an Existing Configuration” to update your layers: Build a Reference Image with Yocto Project/OpenEmbedded | Toradex Developer Center

Please be sure to follow the documentation for BSP 2.8; you can check this by the version number in the top-right corner of the Developer pages.

Best regards,

Hi @henrique.tx, thank you for your answer.

Unfortunately, BSP version 2.8.6 does not seem to solve my problem. CAN overrun errors are still present.

Any ideas?

Thank you in advance

Hi

@henrique.tx meant the latest release, 2.8.7, not 2.8.6.

What are your CAN message rate and CPU usage when you observe the overruns?

Please verify UART DMA is enabled.

The VF61 is capable of handling a fully loaded 1 Mbps bus of extended-ID messages, ~9k+ msgs/s, along with RS232, USB and Ethernet comms.

Regarding increasing rmem_max: this certainly shouldn’t matter. That setting applies to the whole net/core, not just to CAN. The default 176 kB seems big enough for a high rate of tiny struct canmsg’s, though perhaps it is being flooded with some extra data, I don’t know.
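
For context, rmem_max is only the upper limit on what a socket may request with SO_RCVBUF; a socket that never asks keeps the default size. The following is a minimal sketch, assuming the application wants to enlarge the buffer of its own socket. The helper name set_rcvbuf is hypothetical, and none of this addresses overruns that happen in the controller before frames ever reach a socket.

```c
#include <sys/socket.h>

/* Hypothetical helper: request a receive buffer of `bytes` for socket `fd`
 * and return the size the kernel reports afterwards, or -1 on error.
 * Linux caps the request at net.core.rmem_max and reports back twice the
 * stored value to account for bookkeeping overhead. */
int set_rcvbuf(int fd, int bytes)
{
    int granted = 0;
    socklen_t len = sizeof(granted);

    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof(bytes)) < 0)
        return -1;
    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &granted, &len) < 0)
        return -1;
    return granted;
}
```

Comparing the returned value with the requested one shows whether the limit in rmem_max was actually the constraint.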

Regarding the dedicated receive thread: that’s pretty much the default way most apps handle any receive stream. If you meant that you were using no threads and select() instead, then you shouldn’t wonder about poor performance. You need blocking calls, and thus a dedicated receive thread is unavoidable.
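
The blocking receive loop described above could be sketched roughly as follows. This is an illustration under assumptions, not the poster's code: the names can_ring, ring_push, ring_pop and can_rx_loop and the ring depth are hypothetical, the file descriptor is assumed to be an already bound SocketCAN raw socket, and thread creation is left out.

```c
#include <linux/can.h>
#include <stdatomic.h>
#include <unistd.h>

#define RING_SIZE 256u  /* assumed application buffer depth, power of two */

/* Single-producer/single-consumer ring: the receive thread pushes,
 * the application thread pops. */
struct can_ring {
    struct can_frame slot[RING_SIZE];
    atomic_uint head;  /* advanced only by the receive thread */
    atomic_uint tail;  /* advanced only by the consumer */
};

/* Returns 1 on success, 0 if the ring is full (the frame is dropped). */
int ring_push(struct can_ring *r, const struct can_frame *f)
{
    unsigned head = atomic_load(&r->head);

    if (head - atomic_load(&r->tail) == RING_SIZE)
        return 0;
    r->slot[head % RING_SIZE] = *f;
    atomic_store(&r->head, head + 1);
    return 1;
}

/* Returns 1 and fills *out if a frame was available, 0 if the ring is empty. */
int ring_pop(struct can_ring *r, struct can_frame *out)
{
    unsigned tail = atomic_load(&r->tail);

    if (tail == atomic_load(&r->head))
        return 0;
    *out = r->slot[tail % RING_SIZE];
    atomic_store(&r->tail, tail + 1);
    return 1;
}

/* Body of the dedicated receive thread: block in read() until a frame
 * arrives, store it, repeat. Returns the number of frames stored once
 * read() reports end-of-file, a short read, or an error. */
long can_rx_loop(int fd, struct can_ring *r)
{
    struct can_frame f;
    long stored = 0;

    while (read(fd, &f, sizeof(f)) == (ssize_t)sizeof(f))
        stored += ring_push(r, &f);
    return stored;
}
```

The loop does nothing but read and store, so the thread spends its time blocked in the kernel and wakes only when a frame is ready; all parsing and processing stays on the consumer side of the ring.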

Using several executables for CAN is another bad idea. Instead of wasting CPU power on frequent task switches for a single app, it will lose even more CPU time across several apps. Instead, you could send fewer but perhaps bigger messages from a message dispatcher to the other apps using some kind of IPC.

These lines in the flexcan driver seem never to have changed.

/* 8 for RX fifo and 2 error handling */
#define FLEXCAN_NAPI_WEIGHT		(8 + 2)

If one NAPI iteration weren’t interruptible, this setting would be fine for an 8-message-long Rx FIFO. I think NAPI is in fact interruptible, and under heavy load that may lead to unnecessary task switches.

	/* handle RX-FIFO */
	reg_iflag1 = flexcan_read(&regs->iflag1);
	while (reg_iflag1 & FLEXCAN_IFLAG_RX_FIFO_AVAILABLE &&
	       work_done < quota) {
		/*
		 * What if the task is switched here? A delayed continue due to
		 * high CPU load may leave the RX FIFO (nearly) full again, with
		 * the quota too small to drain it without a reschedule.
		 */
		work_done += flexcan_read_frame(dev);
		reg_iflag1 = flexcan_read(&regs->iflag1);
	}

I haven’t tried increasing that setting myself; it’s a fresh idea, and I’ll try it later.

Edit: The FlexCAN driver in more recent kernels defines FLEXCAN_QUIRK_USE_RX_MAILBOX for vf610 and many other families, so it uses dedicated MBs instead of the FIFO and doesn’t use the FLEXCAN_NAPI_WEIGHT setting on vf610. The FlexCAN FIFO is quite short, “up to 6 frames” instead of the 8 claimed by the driver, but you may still try doubling or tripling that setting in your old kernel from BSP 2.8.x.
Regarding what the VF61 is capable of, I meant with a more recent kernel that has the above-mentioned quirk. Out of curiosity I may retest it against the kernel from BSP 2.8.x, but not now.
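
To get a feel for how little time a 6-frame FIFO leaves, here is a rough back-of-the-envelope sketch. It assumes classic CAN data frames, ignores stuff bits, and the function names are made up for illustration; the thread does not state the actual frame format or bus load.

```c
/* Nominal bit count of a classic CAN data frame including the 3-bit
 * interframe space, ignoring stuff bits (stuffing only makes frames longer).
 * Standard ID: 44 overhead bits + 3; extended ID: 64 overhead bits + 3. */
unsigned can_frame_bits(int extended_id, unsigned data_bytes)
{
    unsigned overhead = extended_id ? 67u : 47u;

    return overhead + 8u * data_bytes;
}

/* Shortest time in microseconds for `depth` back-to-back frames to arrive
 * on a bus running at `bitrate_bps`. */
double fifo_fill_time_us(unsigned depth, unsigned bits_per_frame,
                         unsigned bitrate_bps)
{
    return (double)depth * bits_per_frame * 1e6 / bitrate_bps;
}
```

Assuming standard IDs with 8 data bytes (111 bits per frame) on the 500 kbps bus from the original post, fifo_fill_time_us(6, can_frame_bits(0, 8), 500000) works out to 1332 µs, so on a saturated bus the driver would have roughly 1.3 ms to start draining the FIFO before frames are lost.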

Regards
Edward

Thank you for your detailed response. DMA was not enabled, so I am trying to enable it. Do you know what steps I have to follow to enable DMA, in the kernel and perhaps in user space?

Thank you in advance

How did you check it? By looking at cat /proc/interrupts after some UART I/O, where zero interrupt events for fsl-lpuart and non-zero events for eDMA would indicate that DMA is in use? That’s quite easy, but perhaps there are better ways.

The device tree should provide the right settings, and the kernel config should have the DMA driver enabled. BSP 2.8.7 seems to have UART DMA enabled. I can’t see in the logs on git.toradex.com when VF DMA might have been enabled or disabled. Some very old BSP had it disabled, IIRC, but I don’t think BSP 2.8.x did.

Hello @j.mingorance ,
Were you able to solve your issue with the info provided by @Edward ?

Best regards,
Josep

Hello @josep.tx,

We were able to solve the problem by enabling DMA.

Thank you for your support.

Best regards