Verdin iMX8M-Plus: pin interrupt latency is about 300us

Hi,

We have a product that uses the Verdin iMX8M-Plus module (v1.0D) with Toradex BSP 5.6 (Dunfell) and the following Linux version:

uname -a
Linux verdin-imx8mp 5.4.193-0+git.ae0c2c1c7920 #1 SMP PREEMPT aarch64 aarch64 aarch64 GNU/Linux

On our custom board we have an FPGA connected to the SOM through the PCIe bus.

The structure is more or less as follows:

  1. The FPGA fills its FIFO with measurement data.
  2. Whenever the FIFO is not empty, the FPGA pulls the GPIO1.7 pin high; this pin is configured as a high-level interrupt source on the SOM.
  3. In the pin IRQ, the SOM disables the level interrupt and sets up the DMA to read the data in the FIFO through PCIe. (The IRQ routine takes max 10us.)
  4. Once the DMA has read the FIFO, it generates an MSI interrupt; inside the MSI interrupt I push the data to a kfifo and re-enable the GPIO1.7 high-level interrupt. (This interrupt takes about 3 to 12us.)

This structure works fine, with one small problem. When the FPGA pulls the interrupt pin high, my interrupt routine only takes action after about 300us, which is too long for this processor. Our application requires it to be 50us max.

This latency can be seen on the oscilloscope output above: the yellow trace is the FPGA pin that triggers the pin interrupt on the SOM, and the green trace is the first action taken in the pin IRQ.

Because I have two interrupts chained sequentially (first the pin interrupt, then the DMA MSI interrupt), I observe a similar latency between those two.

One interesting note: according to the scope output above, only the first 2 interrupts have this 300us latency; the rest are executed almost immediately (3us). Any idea why this could be?

There is a similar discussion I found below:

I have followed all the suggestions there, such as using the IRQF_NO_THREAD flag, or CONFIG_NO_HZ_FULL or CONFIG_HZ_1000 in the kernel config, but I see no difference with any of those settings.

We didn’t try the RT patch because, in our previous experience, it causes USB Gadget issues, so I am skipping this option.

Since I am stuck here and don’t know what to try next, your suggestions are highly appreciated.

The relevant driver code is below:

//this takes max 11us
static irqreturn_t fpga_pin_interrupt_handler(int irq, void *dev_id)
{
    gpio_set_value(GPIO4_IO03, 1);

    // disable the high-level (IRQF_TRIGGER_HIGH) interrupt until the DMA completes
    disable_irq_nosync(irq);

    spin_lock_irqsave(&io_dev.streaming_spinlock, pin_irq_flags);
    
    (void)setup_dma_to_read(); // triggers MSI interrupt at completion

    spin_unlock_irqrestore(&io_dev.streaming_spinlock, pin_irq_flags);

    return IRQ_HANDLED;
}

//executed by PCIe-MSI interrupt: this takes between 3 - 12us
static void streaming_dma_scan_done_cb(void *arg)
{
    spin_lock_irqsave(&io_dev.streaming_spinlock, msi_irq_flags);   


    //////////////////////////////////////// 
    // push data to kfifo
    ////////////////////////////////////////
    ...
    ////////////////////////////////////////

    wake_up_interruptible(&streaming_dma_fifo_poll_wait);    

    spin_unlock_irqrestore(&io_dev.streaming_spinlock, msi_irq_flags);

    gpio_set_value(GPIO4_IO03, 0);

    // re-enable the high-level (IRQF_TRIGGER_HIGH) interrupt
    enable_irq(fpga_interrupt_pin_irq);
}

int register_ioctl(struct pci_dev *pci_dev, void __iomem **bar_addrs)
{
	...
	...	
	
    if (request_irq(fpga_interrupt_pin_irq, fpga_pin_interrupt_handler,
                    IRQF_TRIGGER_HIGH | IRQF_NO_THREAD,
                    "fpga_interrupt_pin", &io_dev))
    {
        dev_err(&pci_dev->dev, "%s: FPGA_INTERRUPT_PIN %d cannot register IRQ: %d\n", __func__, FPGA_INTERRUPT_PIN, fpga_interrupt_pin_irq);
        goto err7;
    }
}
And the interrupt counters:

cat /proc/interrupts
           CPU0       CPU1       CPU2       CPU3
  1:          0          0          0          0     GICv3  25 Level     vgic
  3:      35694      30825      57117      25633     GICv3  30 Level     arch_timer
  4:          0          0          0          0     GICv3  27 Level     kvm guest vtimer
  6:       7759       5544       5140       2112     GICv3  79 Level     timer@306a0000
  7:          0          0          0          0     GICv3 130 Level     imx8_ddr_perf_pmu
  9:          0          0          0          0     GICv3  23 Level     arm-pmu
 20:          0          0          0          0     GICv3 110 Level     30280000.watchdog
 21:          0          0          0          0     GICv3  52 Level     caam-snvs
 22:          0          0          0          0     GICv3  51 Level     rtc alarm
 23:          0          0          0          0     GICv3  36 Level     30370000.snvs:snvs-powerkey
 26:          0          0          0          0     GICv3  63 Level     30820000.spi
 27:          0          0          0          0     GICv3  58 Level     30860000.serial
 28:        373          0          0          0     GICv3  60 Level     30880000.serial
 29:          0          0          0          0     GICv3  59 Level     30890000.serial
 32:      24436          0          0          0     GICv3  67 Level     30a20000.i2c
 33:        372          0          0          0     GICv3  68 Level     30a30000.i2c
 34:       5212          0          0          0     GICv3  70 Level     30a50000.i2c
 36:          0          0          0          0     GICv3 108 Level     30ad0000.i2c
 37:          0          0          0          0     GICv3  55 Level     mmc1
 38:      19233          0          0          0     GICv3  56 Level     mmc2
 39:          0          0          0          0     GICv3 139 Level     30bb0000.spi
 40:          0          0          0          0     GICv3  34 Level     sdma
 41:          0          0          0          0     GICv3 166 Level     eth0
 42:       2978          0          0          0     GICv3 167 Level     eth0
 43:          0          0          0          0     GICv3 135 Level     sdma
 48:          0          0          0          0     GICv3 180 Level     32f10100.usb
 49:          0          0          0          0     GICv3 181 Level     32f10108.usb
 50:          0          0          0          0     GICv3  39 Level     hantrodec
 51:          0          0          0          0     GICv3  40 Level     hantrodec
 52:          0          0          0          0     GICv3  62 Level     hx280enc
 53:          4          0          0          0     GICv3  72 Level     dwc3
 54:      46194          0          0          0     GICv3  73 Level     xhci-hcd:usb1
 55:        351          0          0          0     GICv3 137 Level     30901000.jr
 56:          4          0          0          0     GICv3 138 Level     30902000.jr
 57:          0          0          0          0     GICv3 146 Level     30903000.jr
 61:          0          0          0          0  gpio-mxc   3 Edge      pca9450
 64:       2595          0          0          0  gpio-mxc   6 Edge      (null)
 65:       3430          0          0          0  gpio-mxc   7 Level     fpga_interrupt_pin
 68:          3          0          0          0  gpio-mxc  10 Level     stmmac-0:07
100:          0          0          0          0  gpio-mxc  10 Edge      usb_1_id
102:          0          0          0          0  gpio-mxc  12 Edge      30b50000.mmc cd
154:          0          0          0          0  gpio-mxc   0 Edge      Wake-Up
218:          0          0          0          0   pca9450   0 Level     pca9450-pmic
220:          0          0          0          0   PCI-MSI   0 Edge      PCIe PME, aerdrv
221:       4257          0          0          0   PCI-MSI 524288 Edge      avalon_dma
IPI0:      9213      24924      25849      18176       Rescheduling interrupts
IPI1:       138         78        160        184       Function call interrupts
IPI2:         0          0          0          0       CPU stop interrupts
IPI3:         0          0          0          0       CPU stop (for crash dump) interrupts
IPI4:       540        142        193       2772       Timer broadcast interrupts
IPI5:      6560       4544       6300        941       IRQ work interrupts
IPI6:         0          0          0          0       CPU wake-up interrupts

Thanks

Hi @Fide ,

I hope you are doing well.

Thanks for the question and all the detailed information.

I am having a look at this and I will come back to you asap.

Best Regards
Kevin


Hi @Fide,

Have you already tried toggling the GPIO in the probe function? Does it still take 300us until you see the pin go to 1 in the interrupt?
Something like this:

...probe...
    gpio_set_value(GPIO4_IO03, 1);
    udelay(100);
    gpio_set_value(GPIO4_IO03, 0);

One other thing could be caching effects: the code first has to be loaded from RAM into the cache, and once it is there it executes faster. Unfortunately, I think a latency of 50us could be hard to achieve on Linux in general, independent of whether you use the RT patch or not. If you wait several seconds between two transfers, does it take 300us again, or does it stay at the good latency?

Regards,
Stefan

Hi @stefan_e.tx,

I don’t have a probe function. Do you mean in the pin interrupt handler, like below?

static irqreturn_t fpga_pin_interrupt_handler(int irq, void *dev_id)
{
    SET_LATENCY_PIN();  // gpio_set_value(GPIO4_IO03, 1);

    ...

    CLEAR_LATENCY_PIN(); // gpio_set_value(GPIO4_IO03, 0);

    return IRQ_HANDLED;
}

I used the above code to measure how long the IRQ handler takes to complete. It takes max 11us. And yes, the latency is the same as before, about 300us.

We may be hitting the cache issue, I agree. If that is the case, any idea how it could be improved?
About the second question: if I trigger the measurement every second, this 300us latency is periodic and occurs every second, at the beginning of each measurement.

If I trigger the measurement every 5ms, then sometimes I have this latency:
(the latency is the delay between the first rising edge of the yellow and the first rising edge of the green)

And sometimes I don’t:

Thank you.

Hi @Fide,

No, the probe function is the one called when the driver probes. Maybe you didn’t implement one, but then you should still have some init function where you can make the calls. The idea is to call the GPIO functions before the interrupt occurs, to be sure it is not some kind of GPIO setup issue.

However, if it is as you write and the issue happens most of the time when you trigger it every second, I think this will not work. One other idea would be to tweak the interrupt affinity. It seems you currently handle fpga_interrupt_pin (I assume this is your implementation) on core 0, where you also have all the other interrupts. You should move that specific interrupt to a core dedicated to interrupt handling, e.g. move that interrupt to core 3 and all others to cores 0-2.
Check this article regarding interrupt affinity:
https://docs.kernel.org/core-api/irq/irq-affinity.html

Then call the interrupt routine at least once in the init sequence. You would also have to make sure that no other program can run on core 3, because this might again affect caching.
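To illustrate the masks involved, here is a minimal user-space sketch in C. It only computes the hex values that would be written to /proc/irq/<n>/smp_affinity for "this IRQ on core 3, everything else on cores 0-2". The helper names (cpu_mask, all_but_cpu, format_affinity) are made up for this example, and whether the kernel accepts the write still depends on the interrupt controller driver.

```c
#include <assert.h>
#include <stdio.h>

/* Bitmask selecting a single CPU, as used by /proc/irq/<n>/smp_affinity. */
static unsigned int cpu_mask(unsigned int cpu)
{
    assert(cpu < 32);               /* sketch: a single 32-bit mask word only */
    return 1u << cpu;
}

/* Bitmask of every CPU in [0, ncpus) except `excluded`. */
static unsigned int all_but_cpu(unsigned int ncpus, unsigned int excluded)
{
    unsigned int all;

    assert(ncpus >= 1 && ncpus <= 32);
    all = (ncpus == 32) ? 0xffffffffu : (1u << ncpus) - 1u;
    return all & ~cpu_mask(excluded);
}

/* Format the mask as the lowercase hex string that smp_affinity expects. */
static int format_affinity(char *buf, size_t len, unsigned int mask)
{
    return snprintf(buf, len, "%x", mask);
}
```

On a four-core module, cpu_mask(3) gives 0x8 for the dedicated core and all_but_cpu(4, 3) gives 0x7 for everything else.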

However, in general, I don’t recommend relying on this mechanism. It will never be deterministic. If you need this to work properly, it is better to move the time-critical parts to the M7 core. There is not much we can do about this; Linux was not designed to be deterministic.

Regards,
Stefan

Hi @stefan_e.tx,

To check the GPIO setup first, rather than the latency, I added the following lines to my ioctl open function:

static int avalon_open(struct inode* inod, struct file* fil)
{
  ..
  ..
  
  SET_LATENCY_PIN();
  udelay(100);
  CLEAR_LATENCY_PIN();
  udelay(100);  
  SET_LATENCY_PIN();
  udelay(100);
  CLEAR_LATENCY_PIN();
  udelay(100);
  SET_LATENCY_PIN();
  udelay(100);
  CLEAR_LATENCY_PIN();
  udelay(100);  
  SET_LATENCY_PIN();
  udelay(100);
  CLEAR_LATENCY_PIN();
  udelay(100);

  ..
  ..
}

And here is how they look on the scope:

They look pretty clean, with a negligible deviation from the 100us delay.

On the other hand, regarding the affinity setting, you are right that somehow only the first CPU core processes the IRQs. I have verified that this is also the case for the toradex-minimal-reference-image.

I have tried to change the affinity using the irq_set_affinity(...) function in the kernel module as suggested, but it seems I hit a problem in the kernel: the function is not exported.

I also tried to change it using

echo "8" > /proc/irq/65/smp_affinity      (to process IRQ65 at CPU3)
or
echo "4" > /proc/irq/221/smp_affinity   (to process IRQ221 at CPU2)

All I got was an invalid parameter error. I’m not sure whether this is supported by NXP, or whether I did everything correctly. I will investigate further and let you know the result.

Thank you for your suggestions.
Fide.

Hi @Fide

Assigning the CPU affinity should work. I verified it on my system. Do you have root rights?
Can you run the following commands and send the exact output?

id
cd /proc/irq/65
ls -hal
cat  smp_affinity
echo 8 > smp_affinity

Regards,
Stefan

Hi @stefan_e.tx ,

Here are my outputs for the commands you asked for:

root@verdin-imx8mp:~# id
uid=0(root) gid=0(root) groups=0(root)

root@verdin-imx8mp:~# cd /proc/irq/65

root@verdin-imx8mp:/proc/irq/65# ls -hal
total 0
dr-xr-xr-x 10 root root 0 Feb 15 08:34 .
dr-xr-xr-x 50 root root 0 Feb 15 08:34 ..
-r--r--r--  1 root root 0 Feb 15 08:34 affinity_hint
-r--r--r--  1 root root 0 Feb 15 08:34 effective_affinity
-r--r--r--  1 root root 0 Feb 15 08:34 effective_affinity_list
dr-xr-xr-x  2 root root 0 Feb 15 08:34 fpga_interrupt_pin
-r--r--r--  1 root root 0 Feb 15 08:34 node
-rw-r--r--  1 root root 0 Feb 15 08:34 smp_affinity
-rw-r--r--  1 root root 0 Feb 15 08:34 smp_affinity_list
-r--r--r--  1 root root 0 Feb 15 08:34 spurious

root@verdin-imx8mp:/proc/irq/65# cat smp_affinity
f

root@verdin-imx8mp:/proc/irq/65# echo 8 > smp_affinity
-sh: echo: write error: Input/output error

I get a similar error for IRQ 221, which is the PCIe MSI interrupt:

root@verdin-imx8mp:/proc/irq/221# echo 8 > smp_affinity
-sh: echo: write error: Invalid argument

But when I try to change IRQ 42 which is for eth0 interface, it works, no error:

Before changing:

root@verdin-imx8mp:~# cat /proc/interrupts
           CPU0       CPU1       CPU2       CPU3
  1:          0          0          0          0     GICv3  25 Level     vgic
  3:     212306     166789     167270     170036     GICv3  30 Level     arch_timer
  4:          0          0          0          0     GICv3  27 Level     kvm guest vtimer
  6:      98236      50809      22163      48317     GICv3  79 Level     timer@306a0000
  7:          0          0          0          0     GICv3 130 Level     imx8_ddr_perf_pmu
  9:          0          0          0          0     GICv3  23 Level     arm-pmu
 20:          0          0          0          0     GICv3 110 Level     30280000.watchdog
 21:          0          0          0          0     GICv3  52 Level     caam-snvs
 22:          0          0          0          0     GICv3  51 Level     rtc alarm
 23:          0          0          0          0     GICv3  36 Level     30370000.snvs:snvs-powerkey
 26:          0          0          0          0     GICv3  63 Level     30820000.spi
 27:          0          0          0          0     GICv3  58 Level     30860000.serial
 28:        493          0          0          0     GICv3  60 Level     30880000.serial
 29:          0          0          0          0     GICv3  59 Level     30890000.serial
 32:     187449          0          0          0     GICv3  67 Level     30a20000.i2c
 33:       5872          0          0          0     GICv3  68 Level     30a30000.i2c
 34:         72          0          0          0     GICv3  70 Level     30a50000.i2c
 36:          0          0          0          0     GICv3 108 Level     30ad0000.i2c
 37:          0          0          0          0     GICv3  55 Level     mmc1
 38:       6740          0          0          0     GICv3  56 Level     mmc2
 39:          0          0          0          0     GICv3 139 Level     30bb0000.spi
 40:          0          0          0          0     GICv3  34 Level     sdma
 41:          0          0          0          0     GICv3 166 Level     eth0
 42:      12052          0          0          0     GICv3 167 Level     eth0
 43:          0          0          0          0     GICv3 135 Level     sdma
 48:          0          0          0          0     GICv3 180 Level     32f10100.usb
 49:          0          0          0          0     GICv3 181 Level     32f10108.usb
 50:          0          0          0          0     GICv3  39 Level     hantrodec
 51:          0          0          0          0     GICv3  40 Level     hantrodec
 52:          0          0          0          0     GICv3  62 Level     hx280enc
 53:       1519          0          0          0     GICv3  72 Level     dwc3
 54:     335147          0          0          0     GICv3  73 Level     xhci-hcd:usb1
 55:        663          0          0          0     GICv3 137 Level     30901000.jr
 56:          4          0          0          0     GICv3 138 Level     30902000.jr
 57:          0          0          0          0     GICv3 146 Level     30903000.jr
 61:          0          0          0          0  gpio-mxc   3 Edge      pca9450
 64:      20574          0          0          0  gpio-mxc   6 Edge      (null)
 65:          0          0          0          0  gpio-mxc   7 Level     fpga_interrupt_pin
 68:          3          0          0          0  gpio-mxc  10 Level     stmmac-0:07
100:          0          0          0          0  gpio-mxc  10 Edge      usb_1_id
102:          0          0          0          0  gpio-mxc  12 Edge      30b50000.mmc cd
154:          0          0          0          0  gpio-mxc   0 Edge      Wake-Up
218:          0          0          0          0   pca9450   0 Level     pca9450-pmic
220:          0          0          0          0   PCI-MSI   0 Edge      PCIe PME, aerdrv
221:          0          0          0          0   PCI-MSI 524288 Edge      avalon_dma
IPI0:     50806     198200     178174     114317       Rescheduling interrupts
IPI1:       165        226        262        249       Function call interrupts
IPI2:         0          0          0          0       CPU stop interrupts
IPI3:         0          0          0          0       CPU stop (for crash dump) interrupts
IPI4:     33328      13646      47663      11890       Timer broadcast interrupts
IPI5:    106682      39683      18124      15741       IRQ work interrupts
IPI6:         0          0          0          0       CPU wake-up interrupts
Err:          0

After changing the affinity, the eth0 interrupts started being processed by CPU3:

root@verdin-imx8mp:/proc/irq/42# cat /proc/interrupts
           CPU0       CPU1       CPU2       CPU3
  1:          0          0          0          0     GICv3  25 Level     vgic
  3:     216566     170032     168865     176209     GICv3  30 Level     arch_timer
  4:          0          0          0          0     GICv3  27 Level     kvm guest vtimer
  6:     101322      52462      22816      51752     GICv3  79 Level     timer@306a0000
  7:          0          0          0          0     GICv3 130 Level     imx8_ddr_perf_pmu
  9:          0          0          0          0     GICv3  23 Level     arm-pmu
 20:          0          0          0          0     GICv3 110 Level     30280000.watchdog
 21:          0          0          0          0     GICv3  52 Level     caam-snvs
 22:          0          0          0          0     GICv3  51 Level     rtc alarm
 23:          0          0          0          0     GICv3  36 Level     30370000.snvs:snvs-powerkey
 26:          0          0          0          0     GICv3  63 Level     30820000.spi
 27:          0          0          0          0     GICv3  58 Level     30860000.serial
 28:       1250          0          0          0     GICv3  60 Level     30880000.serial
 29:          0          0          0          0     GICv3  59 Level     30890000.serial
 32:     187687          0          0          0     GICv3  67 Level     30a20000.i2c
 33:       6400          0          0          0     GICv3  68 Level     30a30000.i2c
 34:         96          0          0          0     GICv3  70 Level     30a50000.i2c
 36:          0          0          0          0     GICv3 108 Level     30ad0000.i2c
 37:          0          0          0          0     GICv3  55 Level     mmc1
 38:       6760          0          0          0     GICv3  56 Level     mmc2
 39:          0          0          0          0     GICv3 139 Level     30bb0000.spi
 40:          0          0          0          0     GICv3  34 Level     sdma
 41:          0          0          0          0     GICv3 166 Level     eth0
 42:      12079          0          0       2591     GICv3 167 Level     eth0
 43:          0          0          0          0     GICv3 135 Level     sdma
 48:          0          0          0          0     GICv3 180 Level     32f10100.usb
 49:          0          0          0          0     GICv3 181 Level     32f10108.usb
 50:          0          0          0          0     GICv3  39 Level     hantrodec
 51:          0          0          0          0     GICv3  40 Level     hantrodec
 52:          0          0          0          0     GICv3  62 Level     hx280enc
 53:       1581          0          0          0     GICv3  72 Level     dwc3
 54:     351602          0          0          0     GICv3  73 Level     xhci-hcd:usb1
 55:        665          0          0          0     GICv3 137 Level     30901000.jr
 56:          4          0          0          0     GICv3 138 Level     30902000.jr
 57:          0          0          0          0     GICv3 146 Level     30903000.jr
 61:          0          0          0          0  gpio-mxc   3 Edge      pca9450
 64:      23480          0          0          0  gpio-mxc   6 Edge      (null)
 65:       1665          0          0          0  gpio-mxc   7 Level     fpga_interrupt_pin
 68:          3          0          0          0  gpio-mxc  10 Level     stmmac-0:07
100:          0          0          0          0  gpio-mxc  10 Edge      usb_1_id
102:          0          0          0          0  gpio-mxc  12 Edge      30b50000.mmc cd
154:          0          0          0          0  gpio-mxc   0 Edge      Wake-Up
218:          0          0          0          0   pca9450   0 Level     pca9450-pmic
220:          0          0          0          0   PCI-MSI   0 Edge      PCIe PME, aerdrv
221:       2492          0          0          0   PCI-MSI 524288 Edge      avalon_dma
IPI0:     52281     210765     181800     120047       Rescheduling interrupts
IPI1:       165        226        262        249       Function call interrupts
IPI2:         0          0          0          0       CPU stop interrupts
IPI3:         0          0          0          0       CPU stop (for crash dump) interrupts
IPI4:     34417      15068      50058      11976       Timer broadcast interrupts
IPI5:    111272      41320      18623      16469       IRQ work interrupts
IPI6:         0          0          0          0       CPU wake-up interrupts

I will look further into why the affinity setting is accepted for IRQ 42 (eth0) but not for 65 (pin interrupt) and 221 (PCIe MSI interrupt).

PS: I read somewhere that affinity can only be set for IRQs at the GIC level (those that show GICv3 as the controller in /proc/interrupts).

Thank you,
Fide.

Hi

NXP GPIO drivers don’t implement the .irq_set_affinity callback of struct irq_chip; that’s why you can’t change the affinity via /proc/irq/*/smp_affinity.

As was stated on the NXP forums, NXP isn’t going to implement it. You can, though, find a patch there for some version of the GPIO driver that sets the affinity of a whole group of GPIO pins.

As an alternative, you can try changing the affinity using devmem2, overwriting one of the GIC GICD_ITARGETSRn registers, which start at offset 0x800 of the first reg range of your GIC in the DT. If I’m correct, that should be 0x38800000+0x800 for imx8mp; I see this in the dtsi:

	gic: interrupt-controller@38800000 {
		compatible = "arm,gic-v3";
		reg = <0x38800000 0x10000>,
		      <0x38880000 0xc0000>;

For example, on iMX7 I can change the affinity of the Colibri Ethernet interrupt to 3 with
devmem2 0x31001898 w 0x01010103

^^ here 898 is 0x800 + 0x98. Ethernet IRQ is 120 as shown in /proc/interrupts:

282: 14274 4857 GPCV2 120 Level 30be0000.ethernet

120+32 = 152 or 0x98.
The 0x010101xx part is, in my case, the default affinities of IRQs 123 to 121.
Please note, though, that you won’t be able to figure out the GIC IRQ number of a GPIO from /proc/interrupts; please look it up in the RM or DT.
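The address arithmetic above can be written down as a small C sketch. It assumes the GICv2-style layout used in the iMX7 example: one byte per interrupt, four interrupts per 32-bit GICD_ITARGETSRn word, and SPI n mapped to GIC interrupt ID n + 32. The helper names are invented for this illustration, and the distributor base 0x31001000 is simply the devmem2 address above minus 0x898. As far as I know, a GICv3 running with affinity routing enabled routes SPIs through the GICD_IROUTERn registers instead, so the imx8mp address should be checked against the reference manual before writing anything.

```c
#include <assert.h>
#include <stdint.h>

#define GICD_ITARGETSR 0x800u   /* offset of the first target register */
#define GIC_SPI_BASE   32u      /* SPI n has GIC interrupt ID n + 32 */

/* Address of the 32-bit ITARGETSR word that contains the given SPI. */
static uint32_t itargetsr_word_addr(uint32_t gicd_base, uint32_t spi)
{
    uint32_t intid = spi + GIC_SPI_BASE;

    assert(intid <= 1019u);     /* highest SPI interrupt ID */
    return gicd_base + GICD_ITARGETSR + (intid & ~3u);
}

/* Byte lane (0..3) of the SPI inside that word. */
static unsigned int itargetsr_byte_lane(uint32_t spi)
{
    return (spi + GIC_SPI_BASE) & 3u;
}

/* New register value: replace only this SPI's CPU-target byte. */
static uint32_t itargetsr_set(uint32_t old_word, uint32_t spi, uint8_t cpu_mask)
{
    unsigned int shift = 8u * itargetsr_byte_lane(spi);

    return (old_word & ~(0xffu << shift)) | ((uint32_t)cpu_mask << shift);
}
```

For the Ethernet example, itargetsr_word_addr(0x31001000, 120) yields 0x31001898 and itargetsr_set(0x01010101, 120, 0x03) yields 0x01010103, matching the devmem2 command.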

BTW /proc/irq/*/smp_affinity is yet another case demonstrating how Linux (at least for iMX) is broken. It fools me into thinking that all affinities are set to all CPUs (3 for iMX7D), yet only CPU0 is enabled in the GICD_ITARGETSRn registers. Looking at the GIC and GICv3 code just confirms that these drivers are made to route an IRQ to a single CPU, the one with the lowest bit set in the affinity mask…

As told by Marc Zyngier here:

The GICv2 1:N feature is really nasty, actually. It places a overhead
on all CPUs (they will all take an interrupt and only one will
actually service it, while the others may only see a spurious
interrupt). So in practice, you don’t really gain anything, unless your
CPUs are completely idle.

On a busy system, you see an actual performance reduction of your
overall throughput, for a very small benefit in latency. That’s why I
refuse to support this feature in Linux. This may be useful on latency
sensitive systems where the software is too primitive to do an
effective balancing, but Linux is a bit better than that. Thankfully,
GICv3 got rid of this misfeature, and I’m making sure it won’t come
back.

So, according to Marc, for better latency we would have to set the affinity to all CPUs, yet we can’t do this because the Linux gurus refuse to allow it, and they actually fool us by showing the affinity as set to all CPUs while the IRQ is enabled only on CPU0.

Well, a spurious interrupt unnecessarily waking up an idle CPU will perhaps cost a few extra mWh, but is that really so bad and unacceptable? If better latency costs us spurious interrupts, then perhaps that is a decent price to pay? Are there any numbers proving the gurus know better than we do? Perhaps in the Linux documentation? Are there any Linux means to read GIC spurious interrupt events? There are spurious files in /proc/irq/*/, but they are not about this, and I think they are not usable on ARM at all.

Yes, it’s bad to interrupt a CPU for nothing while it is executing foreground code. But that CPU could be in the middle of some long instruction, it could have interrupts temporarily disabled, or it could already be servicing some other IRQ(s), so the chances of a single interrupt triggering on all CPUs should be less than 100%. How much less? Perhaps not less at all?

I patched the GIC driver a bit to show the number of spurious interrupts via the /proc/interrupts “Err:” line by incrementing irq_err_count, and ran speedtest-cli several times on iMX7D with the Ethernet IRQ affinity really set to 3 with devmem2. In my case a single speedtest-cli run generated about 9k Ethernet IRQ events. With the affinity set to 3, each CPU handled from 1/3 to 1/2 of those 9k events, and this led to a total of about 4k spurious interrupt events. So the chance of generating a spurious interrupt seems to be less than 50%. That is with a light total CPU load of ~13+13%. At higher load the spurious/real ratio should be smaller, I guess.

OT: the same policy seems to be applied by the gurus to the FIQ interrupt handler. You won’t be able to use FIQ unless you patch the GIC driver to enable interrupt grouping… There were patches in the past for the GIC to enable interrupt grouping and allow FIQ. It seems the patch authors got tired of trying to break through the wall years ago.


Hi @Edward,

I will try changing the affinity in the device-tree as you suggested. Thank you for the instructions.

In the meantime, I have enabled ftrace in the kernel and used KernelShark to see what is going on between the FPGA pin interrupt (irq=65) and the PCIe MSI interrupt (irq=221).

According to the tracer log below, there is a sched/sched_switch event that takes 316us between two consecutive interrupts, which prevents the DMA interrupt from being served in time.

Here is another case, which takes 312us. In this case it seems sched/sched_wakeup is the one responsible:

I will dig into the issue further.
Thank you for your effort and understanding.

Hi @Fide

I don’t understand why you are now checking the time difference between irq-65 and irq-221? I thought the test was toggling a gpio?

I think you should reconsider your design. I don’t think this will ever work reliably, to be honest. Linux is just not made for such use cases.

Regards,
Stefan


Hi @stefan_e.tx,

According to the scope outputs above, the latency seen at the beginning with the pin interrupt is also present between the pin interrupt and the PCIe MSI interrupt, which we use as the DMA transfer completion signal. The DMA transfer usually completes within 10us after the pin interrupt, and as soon as the pin interrupt has been handled the CPU should jump to the DMA interrupt, but we see the same 300us latency pattern there as well. Since it is also easier to trace, I decided to check what happens between those two interrupts, assuming that whatever causes the DMA latency also causes the pin interrupt latency. According to the ftrace logs, the sched_switch and sched_wakeup events complete within 30us most of the time, but sometimes they take 300us, which causes the issue for us.
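For anyone who wants to look for such gaps without KernelShark, here is a rough C sketch that measures the time between two irq_handler_entry events in ftrace text output. It assumes the usual "<seconds>.<6 digits>: event: ..." line layout, and the function names are made up for this example.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse one line of ftrace text output. Returns 1 and fills *irq and *ts_us
 * if the line is an irq_handler_entry event, 0 otherwise.
 * Assumes a "<sec>.<6 digits>: " timestamp directly before the event name.
 */
static int parse_irq_entry(const char *line, int *irq, long long *ts_us)
{
    const char *ev = strstr(line, "irq_handler_entry:");
    const char *q;
    long long sec, usec;

    if (!ev || sscanf(ev, "irq_handler_entry: irq=%d", irq) != 1)
        return 0;

    q = ev;
    while (q > line && q[-1] == ' ')    /* skip spaces before the event name */
        q--;
    if (q == line || q[-1] != ':')
        return 0;
    q--;                                /* q now points at the ':' */
    while (q > line && (isdigit((unsigned char)q[-1]) || q[-1] == '.'))
        q--;                            /* walk back to the timestamp start */

    if (sscanf(q, "%lld.%lld", &sec, &usec) != 2)
        return 0;
    *ts_us = sec * 1000000LL + usec;
    return 1;
}

/* Microseconds from the entry of irq_a (first line) to irq_b (second line), or -1. */
static long long irq_gap_us(const char *first, const char *second,
                            int irq_a, int irq_b)
{
    int a, b;
    long long ta, tb;

    assert(first && second);
    if (!parse_irq_entry(first, &a, &ta) || !parse_irq_entry(second, &b, &tb))
        return -1;
    if (a != irq_a || b != irq_b)
        return -1;
    return tb - ta;
}
```

Fed with two consecutive entry lines for irq=65 and irq=221, irq_gap_us() returns the delay between them, so anything well above the usual 30us can be flagged automatically.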

As a permanent solution, our FPGA team is working in parallel on increasing the FIFO size in the FPGA by about 1000 times, so that it will be able to store the data for about 50ms before any overflow occurs and Linux will have plenty of time to read the data.
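As a quick sanity check of these numbers, here is a small C sketch. It assumes the current FIFO holds roughly 50us of data (50ms divided by 1000, which matches the 50us requirement above), and the function names are made up for this illustration.

```c
#include <assert.h>

/* Time the FIFO can buffer after being enlarged by `scale`. */
static unsigned long fifo_hold_time_us(unsigned long base_hold_us,
                                       unsigned long scale)
{
    return base_hold_us * scale;
}

/* How many worst-case latencies fit into the buffered time. */
static unsigned long latency_headroom(unsigned long hold_us,
                                      unsigned long worst_latency_us)
{
    assert(worst_latency_us > 0);
    return hold_us / worst_latency_us;
}
```

With a 50us base and a factor of 1000, the FIFO covers 50000us, which is about 166 times the 300us latency observed on the scope.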

Thank you.

Hi @Fide,

The kernel checks whether it has to reschedule something when an interrupt occurs. Most likely the behavior would be different if you disabled CONFIG_PREEMPT and enabled CONFIG_PREEMPT_NONE. However, this configuration is not tested and it might have other side effects. Even then, there will be situations where rescheduling is going on and your interrupt is not served within your expected time window.

Regards,
Stefan