extreme CAN frame rate limitations, transmit queue full

Hello,

We are currently attempting to communicate with an Ingenia EVE-XCR motor driver from a Verdin IMX8MM Q 2GB WBIT V1.1A on a Dahlia V1.0C board using CANopen. We have successfully communicated with the motor driver with the cycle delay set to 50 milliseconds (20 Hz). However, we get transmission errors when the CAN cycle delay is set below 10 milliseconds (above 100 Hz). The behavior is identical when using a Verdin IMX8MM Q 2GB WBIT V1.0B on a custom development board.

Our timing loop is managed by Lely CANopen. These are the error messages printed at higher loop rates, which we believe are symptomatic of a low-level CAN hardware issue:

CAN frame successfully queued after dropping 1 frame
warning: CAN transmit queue full; dropping frame: Resource temporarily unavailable
CAN frame successfully queued after dropping 2 frames
warning: CAN transmit queue full; dropping frame: Resource temporarily unavailable
CAN frame successfully queued after dropping 1 frame
warning: CAN transmit queue full; dropping frame: Resource temporarily unavailable
CAN frame successfully queued after dropping 1 frame
warning: CAN transmit queue full; dropping frame: Resource temporarily unavailable
CAN frame successfully queued after dropping 1 frame
warning: CAN transmit queue full; dropping frame: Resource temporarily unavailable

During this error, candump produces the following output:

  can0  080   [0] 
  can0  080   [0] 
  can0  080   [0] 
  can0  080   [0] 
  can0  080   [0] 
  can0  080   [0] 
  can0  080   [0] 
  can0  080   [0] 
  can0  080   [0] 
  can0  080   [0] 
  can0  080   [0] 
  can0  080   [0] 
  can0  080   [0] 
  can0  080   [0] 

In this case, 0x080 is the CANopen SYNC message.

If the cycle delay is sufficiently high, no errors are present and candump shows the following output:

  can0  1A0   [1]  0A
  can0  2A0   [4]  50 42 F0 FF
  can0  3A0   [6]  50 42 00 00 00 00
  can0  320   [4]  0F 00 00 00
  can0  080   [0] 
  can0  1A0   [1]  0A
  can0  2A0   [4]  50 42 03 00
  can0  3A0   [6]  50 42 00 00 00 00
  can0  320   [4]  0F 00 00 00
  can0  080   [0] 
  can0  1A0   [1]  0A
  can0  2A0   [4]  50 42 FF FF
  can0  3A0   [6]  50 42 00 00 00 00
  can0  320   [4]  0F 00 00 00

We found that the minimum cycle delay before failure increases with how much information we try to send to the motor driver. The highest minimum delay we observed was 100,000 microseconds when running all desired tasks, and the lowest was 2,000 microseconds when sending no information. Note that the motor driver always sends some data to the Verdin (the data shown in the error-free candump above). Our CANopen implementation sends messages synchronously: every 1 millisecond, 8 messages are exchanged in a burst between the Verdin and our motor driver over the CAN bus.
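
For reference, our master setup reduces to roughly the following (a minimal sketch along the lines of the public Lely C++ tutorial, not our full application; the DCF path and node-ID are placeholders, and the communication cycle period itself is set via object 0x1006 in the master DCF rather than in this code):

  // Minimal sketch of a Lely CANopen master event loop, modeled on the
  // lely-core C++ tutorial. "master.dcf" and node-ID 1 are placeholders;
  // the cycle period (object 0x1006) and PDO mappings live in the DCF.
  #include <lely/ev/loop.hpp>
  #include <lely/io2/linux/can.hpp>
  #include <lely/io2/posix/poll.hpp>
  #include <lely/io2/sys/io.hpp>
  #include <lely/io2/sys/timer.hpp>
  #include <lely/coapp/master.hpp>

  int main() {
    lely::io::IoGuard io_guard;            // initialize the I/O library
    lely::io::Context ctx;                 // context owning all I/O objects
    lely::io::Poll poll(ctx);              // epoll-based reactor
    lely::ev::Loop loop(poll.get_poll());  // event loop executing the tasks
    auto exec = loop.get_executor();
    lely::io::Timer timer(poll, exec, CLOCK_MONOTONIC);
    lely::io::CanController ctrl("can0");  // SocketCAN interface
    lely::io::CanChannel chan(poll, exec);
    chan.open(ctrl);
    lely::canopen::AsyncMaster master(timer, chan, "master.dcf", "", 1);
    master.Reset();                        // boot the NMT slaves
    loop.run();                            // run until interrupted
  }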

The main program runs inside a Docker container on the Verdin.

Our system is as follows:

SOM:
Verdin IMX8MM Q 2GB WB IT, V1.1A

Carrier board:
Dahlia V1.0C

Motor Driver:
Ingenia EVE-XCR

CAN protocol:
CANopen

CAN library:
Lely CANopen in C++

OS:
5.4.115-rt57-5.3.0-devel

Running:
Docker containers

Device tree:
imx8mm-verdin-wifi-dev.dts on Toradex BSP 5.3, branch toradex_5.4-2.3.x-imx

Greetings @tissue,

Before we even begin to investigate: it wasn't entirely clear from your post, but what is your goal here? What kind of cycle frequency/delay do you require for your system?

Furthermore, do you think it's possible for us to reproduce this on our side here at Toradex in a minimal way, meaning with as little external software and hardware as possible? If this is truly a hardware issue/limitation, as you suspect, then ideally we'd want to reduce the issue down to the bare minimum so there are fewer variables to consider.

Best Regards,
Jeremias

Hello, I'm not sure whether my previous response got stuck in moderation, since I don't see it here, but essentially:

We'd like to communicate with the motor driver from the Verdin at 1 kHz. Reproducing the equivalent load on your test system could be as simple as:

  • 2 CANopen nodes
  • 1 kHz SYNC pulse
  • 4 TPDOs and 4 RPDOs per SYNC pulse

If you are able to achieve this rate on the bus, then we can safely conclude it's not a hardware problem on our side. Our cabling has the correct termination resistors, our devices are set to a 1 Mbit/s bit rate, and txqueuelen is set to 1,000.
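
For what it's worth, a bare-bones load generator for this test (plain SocketCAN, no CANopen stack; the COB-IDs and payload sizes below are placeholders mirroring our candump output, not our real application) might look something like this:

  // Hypothetical reproduction sketch: every 1 ms, send a SYNC (0x080)
  // plus four dummy "PDO" frames on can0 via raw SocketCAN. A real
  // slave on the bus would add its own TPDOs on top of this load.
  #include <linux/can.h>
  #include <linux/can/raw.h>
  #include <net/if.h>
  #include <sys/ioctl.h>
  #include <sys/socket.h>
  #include <cstdio>
  #include <cstring>
  #include <ctime>
  #include <unistd.h>

  int main() {
    int s = socket(PF_CAN, SOCK_RAW, CAN_RAW);
    if (s < 0) { perror("socket"); return 1; }

    ifreq ifr{};
    std::strcpy(ifr.ifr_name, "can0");
    if (ioctl(s, SIOCGIFINDEX, &ifr) < 0) { perror("ioctl"); return 1; }

    sockaddr_can addr{};
    addr.can_family = AF_CAN;
    addr.can_ifindex = ifr.ifr_ifindex;
    if (bind(s, reinterpret_cast<sockaddr *>(&addr), sizeof(addr)) < 0) {
      perror("bind");
      return 1;
    }

    const canid_t ids[] = {0x080, 0x1A0, 0x2A0, 0x3A0, 0x320};
    const __u8 dlcs[]   = {0, 8, 8, 8, 8};

    const timespec period{0, 1000000};  // 1 ms cycle
    for (;;) {
      for (int i = 0; i < 5; ++i) {
        can_frame frame{};
        frame.can_id  = ids[i];
        frame.can_dlc = dlcs[i];
        // ENOBUFS/EAGAIN here corresponds to the "transmit queue full" case.
        if (write(s, &frame, sizeof(frame)) != sizeof(frame)) perror("write");
      }
      clock_nanosleep(CLOCK_MONOTONIC, 0, &period, nullptr);
    }
  }

Running this against a second node (or a real slave), with candump watching the bus, should be enough to show whether frames survive at this rate.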

I performed some additional testing.

When I disable all PDOs, I can reliably get the slave to boot with a communication cycle period of 5 ms. However, as soon as I add all the PDOs back, the minimum communication cycle period required for the slave to boot and for frames to be sent properly is 12 ms.

I tested the same code on a BeagleBone Blue and was able to get a working communication cycle period as low as 2 ms.

This is what it looks like when the slave fails to boot:

[screenshot: candump output when the slave fails to boot]

8 messages every 1 ms, i.e. 8,000 messages/s, is quite a lot even at a 1 Mbit/s arbitration rate. One 11-bit-ID message takes 44 to 134 bit times depending on payload size and the stuff bits involved (44-134 microseconds at 1 Mbit/s). Take the reciprocal of the worst-case message length (1/134 µs ≈ 7,460 messages/s) and you already get less than 8,000 messages/s. Taking into account unavoidable Linux jitter, and therefore unavoidable send delays, your requirements are hard to meet.
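
For reference, a quick back-of-the-envelope check of those numbers (a sketch using the commonly cited worst-case bit-stuffing bound for classic CAN base frames, plus the 3-bit interframe space):

  // Worst-case on-wire length of a classic CAN base (11-bit ID) frame:
  // 44 fixed bits + 8 bits per data byte + worst-case stuff bits,
  // stuff = floor((34 + 8n - 1) / 4), plus a 3-bit interframe space.
  #include <cstdio>

  int main() {
    const double bit_us = 1.0;  // one bit time at 1 Mbit/s = 1 us
    for (int n = 0; n <= 8; n += 8) {  // 0-byte and 8-byte payloads
      const int bits = 44 + 8 * n + (34 + 8 * n - 1) / 4 + 3;
      const double frame_us = bits * bit_us;
      std::printf("%d data bytes: %3d bit times (%3.0f us) -> max %5.0f frames/s\n",
                  n, bits, frame_us, 1e6 / frame_us);
    }
    // Prints roughly 55 bits (~18,000 frames/s) for 0 bytes and
    // 135 bits (~7,400 frames/s) for 8 bytes, i.e. below the
    // 8,000 frames/s the 1 ms burst requires, before any OS latency.
  }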

Worth checking out this thread as well:
link text

@edwaugh Thanks, we already took a long look at that thread. And @Edward, yes, I agree that we're probably pushing the system close to its limits. To help alleviate the potential latency issues, we are running a real-time kernel with minimal other network traffic.

In the meantime, we pared the total number of PDOs down from 8 to 6. What's interesting to me is that when we reduce the cycle period, we fail to configure the slave during SDO setup, not during transmission of PDOs.

In addition to what @Edward has said, it should be noted that the Verdin i.MX8MM has no native CAN controllers; we implement CAN via CAN controllers attached over SPI. Once SPI is added to the equation, your requirements aren't very practical.

Furthermore, there is a known issue/limitation when using the SPI CAN controllers at near full load: a noticeable number of frames will be dropped. See: Toradex System/Computer on Modules - Linux BSP Release

Unfortunately, there is no good workaround for this at the moment. However, as a suggestion, perhaps you could try your implementation on the Verdin i.MX8MP; unlike the i.MX8MM, this module has a native integrated CAN controller, so it shouldn't drop nearly as many frames at higher bus loads. Of course, I can't make any promises, since it's hard to say until you run your exact use case on the hardware itself, but it's a suggestion.

Best Regards,
Jeremias

Thanks for the insight. In the end, we were able to reduce our PDOs to two and run synchronously at 250 Hz before bumping into txqueue overflow. We're currently looking into reconfiguring our device tree to see whether the SPI bus can also run at a higher frequency.

Edit: we actually have one Verdin Plus without the separate CAN controller chip, so we will try that and see if we can improve the comm rate.

Alright, glad you were able to find a compromise.

Just saw your edit, please let me know how the results look on 8MP. Whether they improve or not would be nice to know.

Was the Verdin Plus 8MP able to process CAN messages faster?

Yes, the i.MX8MP's internal CAN controller should process CAN messages faster than the external MCP2518FDT located on the Verdin 8MM module.

With the Verdin iMX8MP, we were able to achieve 6 CANopen PDOs at a steady 1,500 Hz with no issues. It seems the Mini's external CAN chip is crippling for anyone looking to do real-time motor control.

Hi @tissue, thank you for confirming and providing feedback.

Best Regards,
Jeremias