Re-configuring DMA for SPI burst on Colibri IMX7D


I have a question about achieving near-instant / real-time SPI bursts that are triggered from a GPIO pin. I know other MCUs have DMA controllers that can be triggered by an external event (GPIO, timer, etc.). Is this possible on the Colibri IMX7D?

To draw a picture of what I am doing and why: I have an external FPGA connected to the SOM. The FPGA has a small FIFO that it fills with sensor data. When the FIFO is half full, the FPGA asserts a GPIO pin to indicate that a memory read is required. Ideally I would like to service this read request as fast as possible. If the read request is not serviced fast enough, the FPGA's FIFO overflows and data is lost. Under normal conditions the FPGA asserts the pin about once per millisecond, maybe a bit faster.

I tried an interrupt-based method, but the overhead is high enough that the latency becomes a problem. By the time the SPI actually starts bursting, the FIFO has overflowed and the data is lost.

I tried a polling method, where a real-time thread (I switched to a real-time kernel as well) polls the pin every 100 us, but every once in a while there are “lags” and the request is not serviced fast enough. I am not sure why this happens, but even if I could fix it, I do not like this method. It is clunky and consumes a lot of CPU resources.
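For reference, a minimal sketch of how such a 100 us polling thread is often structured on a real-time Linux kernel: memory locked with `mlockall`, the `SCHED_FIFO` policy, and absolute deadlines with `clock_nanosleep` so that wake-up jitter does not accumulate. Whether any of this explains the lags described above is not known. The names `rt_prepare`, `poll_loop`, `demo_ctx_t` and `demo_poll` are hypothetical, and `demo_poll` is only a stand-in for the real GPIO read.

```c
#define _POSIX_C_SOURCE 200809L
#include <sched.h>
#include <stdbool.h>
#include <sys/mman.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/* Advance an absolute deadline by ns nanoseconds, keeping tv_nsec in range. */
void timespec_add_ns(struct timespec *t, long ns)
{
    t->tv_nsec += ns;
    while (t->tv_nsec >= NSEC_PER_SEC) {
        t->tv_nsec -= NSEC_PER_SEC;
        t->tv_sec++;
    }
}

/* Lock all pages in RAM and switch the caller to SCHED_FIFO.
 * Usually needs root or CAP_SYS_NICE / CAP_IPC_LOCK. Returns 0 on success. */
int rt_prepare(int priority)
{
    struct sched_param sp = { .sched_priority = priority };

    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0)
        return -1;
    if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0)
        return -1;
    return 0;
}

/* Call poll_once() every period_ns until it returns false.
 * Deadlines are absolute, so a late wake-up does not shift later ones. */
void poll_loop(long period_ns, bool (*poll_once)(void *), void *ctx)
{
    struct timespec next;

    clock_gettime(CLOCK_MONOTONIC, &next);
    while (poll_once(ctx)) {
        timespec_add_ns(&next, period_ns);
        /* If interrupted by a signal, the loop simply polls a little early. */
        clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
    }
}

/* Hypothetical stand-in for the GPIO check: stops after `limit` calls. */
typedef struct { int calls; int limit; } demo_ctx_t;

bool demo_poll(void *ctx)
{
    demo_ctx_t *c = ctx;
    return ++c->calls < c->limit;
}
```

A real thread would call `rt_prepare()` once, then `poll_loop(100000L, my_gpio_check, my_state)` with its own callback in place of `demo_poll`.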

My last options are:

  • Use the M4 Core for all real-time IO. This can be done via interrupts (real-time) and then using shared memory, I can pass data from the M4 to my application.
  • Use DMA and external events. This way is completely hardware driven.

My questions are:

  1. Is the DMA method possible?
  2. How would I set it up?
  3. Do I need to mmap the DMA configuration / control registers and manually change them?
  4. Can I do this through Linux and have it permanent / semi permanent so that Linux does not take the DMA back and break my application?
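As background to question 3, mapping a physical register or memory window into a user-space process usually goes through `/dev/mem`. The sketch below shows only that generic mechanism; it says nothing about whether the kernel's own DMA driver would tolerate such changes. The names `phys_map_t`, `phys_map_open` and `phys_map_close` are hypothetical, the physical address has to come from the SoC reference manual, and the kernel must allow `/dev/mem` access to that range (for example, `CONFIG_STRICT_DEVMEM` can restrict it).

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

typedef struct {
    void *map_base;          /* page-aligned address returned by mmap */
    size_t map_len;          /* length that was passed to mmap */
    volatile uint8_t *ptr;   /* points at the requested physical offset */
} phys_map_t;

/* Map `len` bytes starting at offset `phys` of `dev` (normally "/dev/mem").
 * mmap offsets must be page-aligned, so the start is rounded down and the
 * difference is added back to the returned pointer. Returns 0 on success. */
int phys_map_open(phys_map_t *m, const char *dev, off_t phys, size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    off_t base = phys & ~((off_t)page - 1);
    size_t delta = (size_t)(phys - base);
    int fd = open(dev, O_RDWR | O_SYNC);
    void *p;

    if (fd < 0)
        return -1;
    p = mmap(NULL, len + delta, PROT_READ | PROT_WRITE, MAP_SHARED, fd, base);
    close(fd);               /* the mapping stays valid after close */
    if (p == MAP_FAILED)
        return -1;

    m->map_base = p;
    m->map_len = len + delta;
    m->ptr = (volatile uint8_t *)p + delta;
    return 0;
}

void phys_map_close(phys_map_t *m)
{
    munmap(m->map_base, m->map_len);
}
```

Typical use would be `phys_map_open(&m, "/dev/mem", base_address, length)`, followed by reads and writes through `m.ptr`, then `phys_map_close(&m)`.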

Hello @Iosif.grigoryev and Welcome to the Toradex Community!

First of all, setting up DMA in Linux will be a challenging task. You can have a look at the Linux DMA API here. As you already noted, overhead and latency will still be high when using DMA for fast transfers in Linux.

Therefore I would suggest using the M4 without a DMA setup, as long as no other CPU-intensive tasks are running on the M4. On the M4 you can use either the polling or the interrupt method.

To help with setting up the M4, may we ask how much data you want to transfer and at what data rate? What is the data used for?

Best regards,

Thanks! That’s all I needed to get on the right track. I already started configuring the M4 Core. The data rate is about 4-8 Mbps, and it is used to store / send sensor data.

I have almost everything except for one issue dealing with shared memory (OCRAM). Let me explain:

The FPGA has only a small FIFO, which is why I have to read the data very quickly. I want to turn the OCRAM into a large FIFO so that Linux can process the data at its leisure. The data flow will look like:

Sensors → FPGA (BUFFER) → M4 → OCRAM (BUFFER) → Linux will process it.
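The OCRAM stage of this flow can be sketched as a single-producer / single-consumer ring buffer, with the M4 as the only writer and Linux as the only reader. This is a minimal sketch under assumptions, not a tested M4/A7 implementation: the names `shared_ring_t`, `ring_init`, `ring_write` and `ring_read` are hypothetical, the capacity must be a power of two, and both cores are assumed to access the region uncached so that C11 acquire/release ordering is enough to publish data.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Header placed at the start of the shared region; data bytes follow it.
 * head and tail are free-running counters, masked only when indexing. */
typedef struct {
    _Atomic uint32_t head;   /* written only by the producer (M4) */
    _Atomic uint32_t tail;   /* written only by the consumer (Linux) */
    uint32_t capacity;       /* data size in bytes, power of two */
    uint8_t data[];
} shared_ring_t;

void ring_init(shared_ring_t *r, uint32_t capacity)
{
    r->capacity = capacity;
    atomic_store_explicit(&r->head, 0, memory_order_relaxed);
    atomic_store_explicit(&r->tail, 0, memory_order_release);
}

/* Producer side: all-or-nothing write. Returns false if there is not
 * enough free space, in which case nothing is written. */
bool ring_write(shared_ring_t *r, const uint8_t *src, uint32_t len)
{
    uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);

    if (r->capacity - (head - tail) < len)
        return false;
    for (uint32_t i = 0; i < len; i++)
        r->data[(head + i) & (r->capacity - 1)] = src[i];
    /* Release: the data bytes become visible before the new head. */
    atomic_store_explicit(&r->head, head + len, memory_order_release);
    return true;
}

/* Consumer side: copies up to max bytes and returns the number copied. */
uint32_t ring_read(shared_ring_t *r, uint8_t *dst, uint32_t max)
{
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&r->head, memory_order_acquire);
    uint32_t avail = head - tail;

    if (avail > max)
        avail = max;
    for (uint32_t i = 0; i < avail; i++)
        dst[i] = r->data[(tail + i) & (r->capacity - 1)];
    /* Release: the slots are handed back only after they were copied. */
    atomic_store_explicit(&r->tail, tail + avail, memory_order_release);
    return avail;
}
```

Because each index has exactly one writer, no lock is needed. A failed `ring_write` on the M4 side corresponds to Linux falling behind, which is the condition the large buffer is meant to make rare.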

I found that when running the M4 from U-Boot everything works fine. I am using / want to use almost all of the 288 KB of shared memory (OCRAM, OCRAM_EPDC and OCRAM_PXP).
A few issues I came across are:

  1. When Linux boots, it writes to OCRAM on boot (I had the M4 print a few bytes continuously). Does Linux use OCRAM after boot?
  2. If the M4 core writes to OCRAM_PXP when Linux is running, then the M4 core just hangs (not sure why). How can I fix this? I want as large a buffer as I can. Should I just avoid using the OCRAM_PXP? Does Linux use this continuously?

Dear @Iosif.grigoryev

  1. Our default Linux BSP occupies a part of the OCRAM. Refer to the following community post:
  2. From a quick search I was not able to find out whether the OCRAM_PXP is used by Linux or not. There are two possible explanations for the system hang you observe:

    • Linux reserves the memory for internal use

    • The OCRAM_PXP might require some power domain to be turned on, or a clock to be configured correctly before using it.

Maybe you can get some more information about this by leaving the A7 in U-Boot when running the M4 code.

Anyway, if you need a large buffer, you can also use the DRAM. From a performance perspective, DDR is slower than OCRAM, but still massively faster than the 8 Mbps you require.
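To put the numbers from this thread in proportion, here is a rough calculation, assuming 1 Mbps = 10^6 bit/s, 1 KB = 1024 bytes, no protocol overhead, and one read request per millisecond:

```latex
\[
8\ \text{Mbit/s} = \frac{8 \times 10^{6}\ \text{bit/s}}{8\ \text{bit/B}} = 10^{6}\ \text{B/s}
\]
\[
\text{bytes per request} \approx 10^{6}\ \text{B/s} \times 10^{-3}\ \text{s} = 1000\ \text{B}
\quad (\text{500 B at 4 Mbit/s})
\]
\[
t_{\text{fill}} = \frac{288 \times 1024\ \text{B}}{10^{6}\ \text{B/s}} \approx 0.29\ \text{s}
\quad (\approx 0.59\ \text{s at 4 Mbit/s})
\]
```

So the full 288 KB of on-chip RAM would buffer roughly 0.3 to 0.6 seconds of sensor data before Linux has to drain it; a buffer in DRAM could extend that window.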

Regards, Andy