SPI in burst mode creates 6us to 8us random gaps between every 16 bytes of data

kaloianpenev · March 14, 2018, 11:31am

Hi,
We are using SPI on the Colibri imx6 with the SPIDEV driver. When a burst is longer than 16 bytes (for example 100 or 1000 bytes in a single transaction), we see pauses of random length, from 6us to 8us, after every 16 bytes. The CS signal remains 0, so there is no new transfer; it is still the same transfer, but there is no CLK or DATA for about 8us after every 16 bytes.

This is unacceptable because it slows down the average throughput, and because the gap varies randomly between 6us and 8us it also ruins the ADC sample rate (we are using the SPI of the Toradex Colibri board as the ADC interface).

jaski.tx · March 14, 2018, 2:46pm

hi kaloyan

What version of hardware and software are you using? How are you sending the SPI data? Can you share some example code?

Thanks and best regards
Jaski

kaloianpenev · March 15, 2018, 11:35am

We are using a Colibri iMX6 with Angstrom Toradex Linux 2.7 and the “spidev” SPI driver. The issue also appears with the standard spidev_test example application of the spidev driver when you send more than 16 bytes. We use it in our application the same way as in the spidev_test example, with "ioctl(fd, SPI_IOC_MESSAGE(1), &tr);" and the same settings.

That’s the portion of the code:

static const char *device = "/dev/spidev3.0"; 
static uint8_t mode; 
static uint8_t bits = 8; 
static uint32_t speed = 20000000; 
static uint16_t delay;

static void transfer(int fd)
{
 //unsigned long cnt = 1000;
 int ret;

 //uint8_t tx[] = {}

 uint8_t rx[ARRAY_SIZE(tx)] = {0, };
 struct spi_ioc_transfer tr = {
     .tx_buf = (unsigned long)tx,
     .rx_buf = (unsigned long)rx,
     .len = ARRAY_SIZE(tx),
     .delay_usecs = delay,
     .speed_hz = speed,
     .bits_per_word = bits,
 };
 ret = ioctl(fd, SPI_IOC_MESSAGE(1), &tr);
 if (ret < 1)
     pabort("can't send spi message");
} 

Here the commented-out //uint8_t tx[] = {} array to be transferred is actually declared externally in a .h file, simply as a long array pattern that generates dummy transfers to the ADC SPI in order to get data out of it:

const uint8_t tx[] = { 0x80, 0x00, 0x00, 0x80, 0x00, 0x00, /KPE/ 0x80, 0x00, 0x00, 0x80, 0x00, 0x00, /KPE/ 0x80, 0x00, 0x00, 0x80, 0x00, 0x00, /KPE/ 0x80, 0x00, 0x00, 0x80, 0x00, 0x00, /KPE/ 0x80, 0x00, 0x00, 0x80, 0x00, 0x00, /KPE/ 0x80, 0x00, 0x00, 0x80, 0x00, 0x00, /KPE/ 0x80, 0x00, 0x00, 0x80, 0x00, 0x00, /KPE/ 0x80, 0x00, 0x00, 0x80, 0x00, 0x00, /KPE/ 0x80, 0x00, 0x00, 0x80, 0x00, 0x00, /KPE/ 0x80, 0x00, 0x00, 0x80, 0x00, 0x00, /KPE/ 0x80, 0x00, 0x00, 0x80, 0x00, 0x00, /KPE/ 0x80, 0x00, 0x00, 0x80, 0x00, 0x00, /KPE/ 0x80, 0x00, 0x00, 0x80, 0x00, 0x00, /KPE/ 0x80, 0x00, 0x00, 0x80, 0x00, 0x00, /KPE/ 0x80, 0x00, 0x00, 0x80, 0x00, 0x00, /KPE/ 0x80, 0x00, 0x00, 0x80, 0x00, 0x00, /KPE/ 0x80, 0x00, 0x00, 0x80, 0x00, 0x00, /KPE/ 0x80, 0x00, 0x00, 0x80, 0x00, 0x00, /KPE/ 0x80, 0x00, 0x00, 0x80, 0x00, 0x00, /KPE/ 0x80, 0x00, 0x00, 0x80, 0x00, 0x00, /KPE/ ...................... // and so on... long pattern... }
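For readers who want to reproduce this without the rest of spidev_test, the snippet above can be condensed into one self-contained function. This is only a sketch: `spi_burst` is a made-up name, mode 0 is assumed because `mode` is zero-initialised in the posted code, and error handling is reduced to returning -1.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/spi/spidev.h>

/*
 * Send `len` bytes from `tx` and receive into `rx` as ONE spidev transfer.
 * Returns the ioctl result (bytes transferred) or -1 on any error.
 * Hypothetical helper, condensed from the posted transfer() function.
 */
static int spi_burst(const char *dev, const uint8_t *tx, uint8_t *rx,
                     size_t len, uint32_t speed_hz)
{
    uint8_t mode = SPI_MODE_0; /* assumption: `mode` is 0 in the post */
    uint8_t bits = 8;          /* same as `bits` in the post */
    struct spi_ioc_transfer tr;
    int fd, ret = -1;

    fd = open(dev, O_RDWR);
    if (fd < 0)
        return -1;

    /* Apply the device-wide settings before the transfer. */
    if (ioctl(fd, SPI_IOC_WR_MODE, &mode) < 0 ||
        ioctl(fd, SPI_IOC_WR_BITS_PER_WORD, &bits) < 0 ||
        ioctl(fd, SPI_IOC_WR_MAX_SPEED_HZ, &speed_hz) < 0)
        goto out;

    /* One transfer descriptor, so CS stays asserted for the whole buffer. */
    memset(&tr, 0, sizeof(tr));
    tr.tx_buf = (uintptr_t)tx;
    tr.rx_buf = (uintptr_t)rx;
    tr.len = (uint32_t)len;
    tr.speed_hz = speed_hz;
    tr.bits_per_word = bits;

    ret = ioctl(fd, SPI_IOC_MESSAGE(1), &tr);
out:
    close(fd);
    return ret;
}
```

With the settings from this post it would be called as `spi_burst("/dev/spidev3.0", tx, rx, sizeof(tx), 20000000)`.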

jaski.tx · March 16, 2018, 2:56pm

What is the exact type of your module: Colibri iMX6 or iMX6ULL? Can you tell us the exact HW version of your module (1.x?). It seems that the DMA transfer for spidev is not enabled. Could you check this?

kaloianpenev · March 16, 2018, 4:57pm

Hardware version of the module is:
Col.iMX6DL 512MB
V1.0A
04944244
How can we enable the spidev DMA functionality or configure the DMA size?

kaloianpenev · March 20, 2018, 9:48am

We updated to the new 2.8b1 Toradex Linux (Angstrom v2017.12) kernel.

Unfortunately, the situation is now even worse: there are 70us - 80us gaps between every 16 bytes of data, which makes the Toradex spidev driver completely unusable as an ADC interface.

BR,

jaski.tx · March 21, 2018, 7:57am

With Bsp 2.7 the DMA for spidev is enabled. What are your requirements for the spi transmission? What is your use case?

kaloianpenev · March 21, 2018, 9:35am

Our requirement is that, once the SPI transfer has started and until the entire tx buffer has been sent, there is one continuous data transfer. That’s all we need.

Currently, as seen on the scope images, we have a continuous 16-byte transfer, then a “pause”, then another portion of 16 bytes, and so on.

During all that time the CS signal is 0, and that’s OK, because it is simply one single long data transfer. Our problem is why we have these “pauses” after every 16 bytes instead of a continuous byte transfer without any “pause” in between.

Thanks

jaski.tx · March 22, 2018, 4:08pm

Hi,
We can reproduce the issue and we are working on it. We will get back to you soon.
best regards
Jaski

kaloianpenev · March 23, 2018, 3:21pm

Hi, just a note, in case it helps.

With the new Linux 2.8b1 there are about 64 bytes, then a 60-80us gap, and then again 64 bytes.

With the previous Linux version 2.75 there were 16 bytes, then a 6-8us gap, and then again 16 bytes.

So large byte packs->large gaps between, small byte packs->small gaps in between.

The size of the entire Tx Buffer and everything else for spidev configuration is the same.
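
To put rough numbers on the throughput loss, here is a back-of-the-envelope sketch in C. It assumes the 20 MHz clock from the spidev settings posted earlier and uses values inside the reported gap ranges (7us and 70us); `effective_bytes_per_sec` is a made-up helper name, not part of spidev or the driver.

```c
/*
 * Effective throughput of a bursty SPI stream: `burst_bytes` are clocked out
 * back to back at `clock_hz`, then the bus idles for `gap_us` microseconds
 * before the next burst starts.
 */
static double effective_bytes_per_sec(double clock_hz, unsigned burst_bytes,
                                      double gap_us)
{
    double burst_s = (burst_bytes * 8.0) / clock_hz; /* time on the wire */
    double gap_s = gap_us * 1e-6;                    /* idle time per burst */
    return burst_bytes / (burst_s + gap_s);
}
```

Without gaps, 20 MHz gives 2.5 MB/s. With 16-byte bursts and a 7us gap this drops to about 1.19 MB/s, and with 64-byte bursts and a 70us gap to about 0.67 MB/s.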

BR

jaski.tx · March 23, 2018, 3:50pm

Thanks for the update. 64 bytes seems to be as expected, since the fifo size is 64 bytes. It seems that dma transfer is not working properly. We will keep you updated.

max.tx · April 24, 2018, 8:50am

Hi

I can reproduce your issue. It has also come up on the NXP community, e.g. the following
reports the issue with an i.MX7.

With the 4.9 kernel the test for whether DMA can be used has changed. Now
you need to request a multiple of 32 FIFO words for DMA to be used;
other lengths are handled in interrupt mode, resulting in the 70-80us latencies you see.
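
The length rule can be sketched as a small user-space check. This only illustrates the rule as stated in this post (the transfer length must be a multiple of 32 FIFO words) and is not code from the driver; `DMA_WORDS`, `bytes_per_word`, `dma_len_ok` and `dma_round_up` are hypothetical names, and the sketch assumes one SPI word occupies 1, 2 or 4 bytes of the buffer depending on bits per word.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumption taken from the post: DMA needs a multiple of 32 FIFO words. */
#define DMA_WORDS 32u

/* Bytes occupied by one SPI word in the buffer (1, 2 or 4). */
static size_t bytes_per_word(unsigned bits_per_word)
{
    if (bits_per_word <= 8)
        return 1;
    if (bits_per_word <= 16)
        return 2;
    return 4;
}

/* True if a transfer of `len` bytes meets the multiple-of-32-words rule. */
static bool dma_len_ok(size_t len, unsigned bits_per_word)
{
    size_t chunk = DMA_WORDS * bytes_per_word(bits_per_word);
    return len != 0 && len % chunk == 0;
}

/* Round a non-zero `len` up to the next length that meets the rule. */
static size_t dma_round_up(size_t len, unsigned bits_per_word)
{
    size_t chunk = DMA_WORDS * bytes_per_word(bits_per_word);
    return (len + chunk - 1) / chunk * chunk;
}
```

With 8-bit words, the 100 and 1000 byte transfers from the first post would not qualify; they would have to be padded to 128 and 1024 bytes. Whether the extra bytes clocked out to the ADC are acceptable depends on the application.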

The driver currently sets the TX watermark to 0, i.e. it waits until the
TX FIFO is empty before DMA starts to refill it with additional data.
This seems to be due to the implemented workaround for the following errata:

ERR009165 eCSPI: TXFIFO empty flag glitch can cause the current FIFO transfer to be sent twice

One way to get continuous SPI transmission is to reduce the SPI clock to
a value where DMA is fast enough to refill the FIFO even with the TX watermark
at 0.

To improve the situation one could also set the SPI word width to 32 bits.
This requires you to send multiples of 32 * 4 bytes.
(Note that if you keep passing a byte array to tx/rx you will need to
byte swap each 4-byte sequence.)
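
The byte swap from the note could look like the sketch below. It assumes a little-endian CPU and that each 32-bit word is shifted out most significant byte first, so the four bytes of every word leave in reverse memory order; `swap_words_in_place` is a hypothetical helper name.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Reverse the byte order inside every 4-byte group of `buf`.
 * `len` should be a multiple of 4; trailing bytes are left untouched.
 */
static void swap_words_in_place(uint8_t *buf, size_t len)
{
    for (size_t i = 0; i + 4 <= len; i += 4) {
        uint8_t b0 = buf[i];
        uint8_t b1 = buf[i + 1];

        buf[i] = buf[i + 3];
        buf[i + 1] = buf[i + 2];
        buf[i + 2] = b1;
        buf[i + 3] = b0;
    }
}
```

The idea would be to run it over a writable copy of the tx buffer before the ioctl and over the rx buffer afterwards, with `.bits_per_word = 32` in the transfer descriptor.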

So with the current driver and in the light of the HW limitation it is
not possible to continuously send SPI data if the SPI clock is above a
certain limit.

As the NXP community post above suggests, one could revert the effects of the
errata workaround and set the TX watermark to e.g. spi_imx->wml, at the risk
that the errata triggers.

That allowed me to send continuous 8-bit data at a 10MHz SPI clock with the 4.9 kernel.

--- a/drivers/spi/spi-imx.c
+++ b/drivers/spi/spi-imx.c
@@ -428,7 +428,7 @@ static int mx51_ecspi_config(struct spi_device *spi,
                tx_wml = spi_imx->wml / 2;
 
        writel(MX51_ECSPI_DMA_RX_WML(spi_imx->wml) |
-               MX51_ECSPI_DMA_TX_WML(tx_wml) |
+               MX51_ECSPI_DMA_TX_WML(spi_imx->wml) |
                MX51_ECSPI_DMA_RXT_WML(spi_imx->wml) |
                MX51_ECSPI_DMA_TEDEN | MX51_ECSPI_DMA_RXDEN |
                MX51_ECSPI_DMA_RXTDEN, spi_imx->base + MX51_ECSPI_DMA);

Max

kaloianpenev · April 27, 2018, 12:34pm

Hi,
Thanks a lot for your answer.

Do you have any idea or observation of how errata ERR009165 actually repeats the transfer?
Does it repeat the data up to the watermark level, or does it repeat the entire FIFO buffer?

This is not quite clear from the NXP errata sheet. If it repeats up to the watermark, then we may find a workaround (at least in our use case), but if it sends the entire FIFO buffer again regardless of the watermark, then unfortunately we can’t work around it.

Thanks

max.tx · April 30, 2018, 3:24pm

Hi

Unfortunately we have no further information beyond what the NXP errata document reveals.

Max

jars121 · June 17, 2021, 12:59am

Hi @max.tx

I’m having the same issue on an iMX8QM. Can you please confirm if the iMX8QM also uses the MX51 spi-imx driver? I.e. the above watermark patch to the spi-imx.c file should also work for the iMX8?

Thanks!

alex.tx · June 17, 2021, 1:54am

Yes, the iMX8 uses the same MX51 spi-imx driver. However, its source code has been updated, so you can’t blindly apply that patch. You need to open your version of the spi-imx.c file, find the mx51_setup_wml() function there, and apply the suggested fix manually.

jars121 · June 17, 2021, 5:21am

Thanks for confirming @alex.tx

I’ve applied the change in my spi-imx.c file and performed a complete rebuild. I’m seeing a ~1us delay between consecutive transfers which is reasonable. The delay is consistent for both 8-bit and 32-bit words.

jaski.tx · June 17, 2021, 6:10am

Perfect that your issue is solved. Thanks for the feedback.