UART data loss on TorizonCore image

We are using Modbus RTU on UART2 with RS-485. We found that the TorizonCore image loses data at a baud rate of 230400.

Data size is 255 bytes. The lost span starts somewhere between bytes 30 and 33 and ends somewhere between bytes 57 and 60.

On the OpenEmbedded Core image this issue does not occur. All bytes are received.

Frame sent from PC:

0x01, 0x03, 0xfa, 0xba, 0xaa, 0x63, 0x77, 0xdb, 0xa0, 0xe4, 0x15, 0xa2, 0x5a, 0x1d, 0x72, 0x62, 0x69, 0xf4, 0x1b, 0x1f, 0x3f, 0x81, 0x1b, 0x3a, 0x30, 0x7b, 0xc9, 0xb4, 0x8b, 0xd2, 0x3f, 0x9f, 0x70, 0x17, 0x0c, 0x02, 0xfa, 0xbd, 0xea, 0x7f, 0x4d, 0x54, 0x91, 0xaa, 0x24, 0x6d, 0x1f, 0x99, 0x6e, 0x83, 0xd5, 0x8c, 0x5f, 0x16, 0x3f, 0x1d, 0x62, 0x60, 0x92, 0x99, 0x2d, 0x9b, 0xd7, 0x3c, 0x96, 0xc0, 0xed, 0xcd, 0x85, 0xb3, 0x38, 0x0b, 0x8a, 0x3b, 0xe5, 0x4b, 0x55, 0xd7, 0x53, 0x79, 0x2b, 0xf0, 0x6a, 0x38, 0xb4, 0xf3, 0x3e, 0xf0, 0xac, 0x3c, 0x34, 0xc2, 0x43, 0xea, 0x37, 0xc0, 0x43, 0xf8, 0x07, 0x0f, 0x7c, 0xe1, 0x4a, 0x1e, 0x5d, 0x80, 0xbf, 0x12, 0xe3, 0xed, 0x13, 0x72, 0x0c, 0x7f, 0xa7, 0x4b, 0xc6, 0xa9, 0x5a, 0x2e, 0xbe, 0xd7, 0x71, 0x4a, 0xe6, 0x45, 0xf0, 0x21, 0x8a, 0x3e, 0xf9, 0x55, 0xad, 0x38, 0x1c, 0x45, 0xe2, 0x8a, 0xba, 0x27, 0x7a, 0x85, 0x78, 0x1b, 0x17, 0x62, 0x14, 0xc4, 0x56, 0x6d, 0x75, 0x4b, 0x29, 0x43, 0x83, 0xeb, 0x04, 0x6e, 0x05, 0x32, 0x7a, 0x65, 0x3c, 0x10, 0xc6, 0xb4, 0x79, 0xb5, 0xc1, 0xcd, 0x91, 0x35, 0x2d, 0x5e, 0xed, 0x19, 0xb4, 0xff, 0x10, 0xf7, 0x89, 0xbb, 0x24, 0xcf, 0xee, 0xf6, 0x67, 0x12, 0x09, 0x33, 0xb5, 0x2e, 0xd7, 0x77, 0x36, 0x65, 0x56, 0x41, 0x44, 0x63, 0x66, 0x79, 0x94, 0xe0, 0xce, 0xdf, 0x06, 0xa7, 0xd9, 0xd3, 0x7a, 0x2d, 0x1b, 0xeb, 0xc9, 0x0d, 0xaf, 0x60, 0x5f, 0x64, 0x2a, 0x47, 0xff, 0x6c, 0xf0, 0x7c, 0xf9, 0x2d, 0x1b, 0xeb, 0xc9, 0x0d, 0xaf, 0x60, 0x5f, 0x64, 0x2a, 0x47, 0xff, 0x6c, 0xf0, 0x7c, 0xf9, 0x2d, 0x1b, 0xeb, 0xc9, 0x0d, 0xaf, 0x60, 0x5f, 0x64, 0x2a, 0x83, 0x51

Received frame in Colibri:

0x01, 0x03, 0xfa, 0xba, 0xaa, 0x63, 0x77, 0xdb, 0xa0, 0xe4, 0x15, 0xa2, 0x5a, 0x1d, 0x72, 0x62, 0x69, 0xf4, 0x1b, 0x1f, 0x3f, 0x81, 0x1b, 0x3a, 0x30, 0x7b, 0xc9, 0xb4, 0x8b, 0xd2, 0x3f, 0x9f, (...LOST DATA...) 0x92, 0x99, 0x2d, 0x9b, 0xd7, 0x3c, 0x96, 0xc0, 0xed, 0xcd, 0x85, 0xb3, 0x38, 0x0b, 0x8a, 0x3b, 0xe5, 0x4b, 0x55, 0xd7, 0x53, 0x79, 0x2b, 0xf0, 0x6a, 0x38, 0xb4, 0xf3, 0x3e, 0xf0, 0xac, 0x3c, 0x34, 0xc2, 0x43, 0xea, 0x37, 0xc0, 0x43, 0xf8, 0x07, 0x0f, 0x7c, 0xe1, 0x4a, 0x1e, 0x5d, 0x80, 0xbf, 0x12, 0xe3, 0xed, 0x13, 0x72, 0x0c, 0x7f, 0xa7, 0x4b, 0xc6, 0xa9, 0x5a, 0x2e, 0xbe, 0xd7, 0x71, 0x4a, 0xe6, 0x45, 0xf0, 0x21, 0x8a, 0x3e, 0xf9, 0x55, 0xad, 0x38, 0x1c, 0x45, 0xe2, 0x8a, 0xba, 0x27, 0x7a, 0x85, 0x78, 0x1b, 0x17, 0x62, 0x14, 0xc4, 0x56, 0x6d, 0x75, 0x4b, 0x29, 0x43, 0x83, 0xeb, 0x04, 0x6e, 0x05, 0x32, 0x7a, 0x65, 0x3c, 0x10, 0xc6, 0xb4, 0x79, 0xb5, 0xc1, 0xcd, 0x91, 0x35, 0x2d, 0x5e, 0xed, 0x19, 0xb4, 0xff, 0x10, 0xf7, 0x89, 0xbb, 0x24, 0xcf, 0xee, 0xf6, 0x67, 0x12, 0x09, 0x33, 0xb5, 0x2e, 0xd7, 0x77, 0x36, 0x65, 0x56, 0x41, 0x44, 0x63, 0x66, 0x79, 0x94, 0xe0, 0xce, 0xdf, 0x06, 0xa7, 0xd9, 0xd3, 0x7a, 0x2d, 0x1b, 0xeb, 0xc9, 0x0d, 0xaf, 0x60, 0x5f, 0x64, 0x2a, 0x47, 0xff, 0x6c, 0xf0, 0x7c, 0xf9, 0x2d, 0x1b, 0xeb, 0xc9, 0x0d, 0xaf, 0x60, 0x5f, 0x64, 0x2a, 0x47, 0xff, 0x6c, 0xf0, 0x7c, 0xf9, 0x2d, 0x1b, 0xeb, 0xc9, 0x0d, 0xaf, 0x60, 0x5f, 0x64, 0x2a, 0x83, 0x51
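To pin down exactly which bytes went missing, the two frames can be compared programmatically. The following is a minimal Python sketch and not part of the attached test programs. The function name find_lost_span is hypothetical, and the code assumes the loss is a single contiguous gap with no corrupted bytes around it.

```python
def find_lost_span(sent: bytes, received: bytes):
    """Return the 1-based, inclusive (first, last) positions in `sent`
    that are missing from `received`, or None if nothing is missing.

    Assumption: at most one contiguous run of bytes was dropped and the
    remaining bytes arrived unmodified.
    """
    missing = len(sent) - len(received)
    if missing < 0:
        raise ValueError("received frame is longer than the sent frame")
    if missing == 0:
        if sent != received:
            raise ValueError("frames differ, but no bytes are missing")
        return None

    # Length of the longest common prefix of both frames.
    prefix = 0
    while prefix < len(received) and sent[prefix] == received[prefix]:
        prefix += 1

    # Everything after the gap must match the tail of the sent frame.
    if received[prefix:] != sent[prefix + missing:]:
        raise ValueError("difference is not a single contiguous gap")

    return prefix + 1, prefix + missing


if __name__ == "__main__":
    # Small synthetic example: bytes 4 to 6 of a 10-byte frame are dropped.
    sent = bytes(range(1, 11))
    received = sent[:3] + sent[6:]
    print(find_lost_span(sent, received))
```

Applied to the two frames above (with the received frame pasted without the lost-data marker), it should report the span that the marker stands for.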

We simplified the scenario to use only UART2 (/dev/ttymxc1), without RS-485 and Modbus.

Compile serial_toradex.c and run it on the Colibri.

Run python3 serial_pc.py on the PC.

Attached files: serial_toradex.c and serial_pc.py
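The attachments are not reproduced in this thread. As a rough illustration of what the receiving side of such a test involves, here is a minimal Python sketch using only the standard library. It is not the attached serial_toradex.c. The helper names open_raw_uart and read_exact, the 8N1 framing without flow control, and the one-second idle timeout are all assumptions.

```python
import os
import select
import termios


def open_raw_uart(path="/dev/ttymxc1", baud=termios.B230400):
    """Open a serial device in raw 8N1 mode without flow control.

    The device path and baud constant are assumptions matching the
    scenario described above (UART2 at 230400 baud).
    """
    fd = os.open(path, os.O_RDWR | os.O_NOCTTY)
    # attrs = [iflag, oflag, cflag, lflag, ispeed, ospeed, cc]
    attrs = termios.tcgetattr(fd)
    attrs[0] = 0  # no input translation (no CR/NL mapping, no XON/XOFF)
    attrs[1] = 0  # no output post-processing
    attrs[2] = termios.CS8 | termios.CREAD | termios.CLOCAL  # 8 data bits, receiver on
    attrs[3] = 0  # non-canonical mode, no echo, no signal characters
    attrs[4] = baud
    attrs[5] = baud
    attrs[6][termios.VMIN] = 1   # read() returns once at least 1 byte is available
    attrs[6][termios.VTIME] = 0  # no inter-byte timer; timeouts are handled by select()
    termios.tcsetattr(fd, termios.TCSANOW, attrs)
    termios.tcflush(fd, termios.TCIFLUSH)  # discard stale input
    return fd


def read_exact(fd, count, timeout_s=1.0):
    """Read up to `count` bytes from `fd`.

    Stops early if no data arrives within `timeout_s`, so a short
    result indicates that bytes were lost or never sent.
    """
    buf = bytearray()
    while len(buf) < count:
        ready, _, _ = select.select([fd], [], [], timeout_s)
        if not ready:
            break  # idle timeout: sender stopped or bytes were dropped
        chunk = os.read(fd, count - len(buf))
        if not chunk:
            break  # end of file
        buf += chunk
    return bytes(buf)
```

On the module, one would call open_raw_uart() and then read_exact(fd, 255) and compare the length of the result with the 255 bytes sent. Fewer bytes than expected would correspond to the loss described above.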

Is the data loss on Torizon consistently reproducible?

One theory I have is the difference in kernel between the OE reference image and the Torizon image. By default, Torizon uses an upstream mainline kernel, while the OE image uses a downstream NXP-based kernel. However, you can build the OE image with the same upstream kernel by changing the distro, as described here: https://developer.toradex.com/knowledge-base/board-support-package/openembedded-core#Distro

Can you try testing again on the OE image using an upstream kernel and see if the results are similar to Torizon? Then at least we can isolate the issue to a specific kernel version/branch.

Best Regards,
Jeremias

The data loss is consistently reproducible on Torizon and on OE with the upstream mainline kernel.

I tested OE with the upstream mainline kernel, and the problem occurs.
I tested Torizon with the downstream kernel, and the problem does not occur.

There was just one thing that happened that did not work 100%.
When I built the image with DISTRO=“torizon”, something was different in the U-Boot environment: the kernel did not load, and U-Boot tried to load the kernel from a TFTP server.
What I did was take the u-boot-initial-env from the upstream image version and use it to replace the generated u-boot-initial-env. Are DISTRO=“torizon” and its u-boot-initial-env fully compatible with the OSTree partitioning?

Alright, I did some tests on my end. I wasn’t able to completely replicate your exact setup, but I got as close as I could.

I have an Aster carrier board but no iMX6ULL available. However, I do have an iMX6, which uses a similar ttymxc UART. I also modified your code slightly to use ttyUSB0 and ttymxc0 instead, since that was easier for me to hook up with the cables I had available. I made sure to disable this UART as the serial console, so there were no other signals on UART1 besides what is being sent by your code.

I ran your test code several times, and so far all bytes sent by the PC were received by the module with no errors, meaning I haven’t been able to reproduce this yet.

Out of curiosity, what happens if you try a different UART, such as UART1 like I did? Make sure to disable it as the serial console to avoid other noise.

Perhaps this is a 6ULL-specific issue? Or maybe it is specific to a particular UART?

Let me check internally whether anyone has the hardware to replicate your setup more closely than I can.

There was just one thing that happened that did not work 100%…

This isn’t surprising. While you can build with downstream Torizon, it’s not something we build and test regularly ourselves, so issues aren’t unexpected.

Best Regards,
Jeremias

I used UART1 (ttymxc0), but the problem still persists.

Alright, I got a colleague of mine with an iMX6ULL to try to reproduce this. This time they were able to reproduce your issue. Furthermore, my colleague noticed that if he lowered the baud rate in your test programs to 115200, it worked with no issues.

In summary, this seems to be a very specific issue with the iMX6ULL on the upstream kernel, possibly only at higher baud rates.
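For a sense of the timing involved at the two baud rates, the per-byte time on the wire can be worked out from the baud rate. This small sketch assumes 8N1 framing (10 bits per byte including start and stop bits), which the thread does not state. byte_time_us is a hypothetical helper, and the gap size of 27 bytes is an approximation taken from the byte ranges reported at the top of the thread.

```python
def byte_time_us(baud, bits_per_byte=10):
    """Time to transfer one byte in microseconds.

    bits_per_byte=10 assumes 8N1: 1 start bit, 8 data bits, 1 stop bit.
    """
    return bits_per_byte * 1_000_000 / baud


LOST_BYTES = 27  # approximate size of the reported gap (assumption)

for baud in (115200, 230400):
    per_byte = byte_time_us(baud)
    gap_ms = LOST_BYTES * per_byte / 1000
    print(f"{baud} baud: {per_byte:.1f} us per byte, "
          f"{LOST_BYTES} bytes = {gap_ms:.2f} ms")
```

At 230400 baud a byte arrives roughly every 43.4 µs, half the time available per byte at 115200 baud, so a gap of this size corresponds to a little over 1 ms of incoming data that was not picked up.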

I’ll need to report this to our R&D team for further investigation and a possible fix. Thank you for bringing this to our attention. I’ll update you if we find anything.

Best Regards,
Jeremias