Spurious interrupts in imx8 ethernet MAC IEEE1588 events core (drivers/net/ethernet/freescale/fec_ptp.c)

We have done some testing on 1PPS generation from the Freescale Ethernet Controller (FEC) core using the IEEE 1588 event functionality. Below are the contents of a Linux kernel trace ring buffer from a modified version of the fec_ptp driver. The log shows the contents of various variables and registers in the interrupt handler (which manages 1PPS generation), in the fec_time_keep 1 Hz thread, and in the fec_ptp_adjfreq function:

          <idle>-0       [000] d.H1  4019.134805: fec_pps_interrupt: Update PPS counter 2a76ec29 3341a71a
           ptp4l-704     [001] d..1  4019.144413: fec_ptp_adjfreq: Set ATIME_INC 90a 6ac3
           ptp4l-704     [001] d..1  4019.207961: fec_ptp_adjfreq: Set ATIME_INC 90a 6ad7
           ptp4l-704     [002] d..1  4019.270849: fec_ptp_adjfreq: Set ATIME_INC 90a 6e90
           ptp4l-704     [002] d..1  4019.334350: fec_ptp_adjfreq: Set ATIME_INC 90a 6b3b
           ptp4l-704     [002] d..1  4019.397134: fec_ptp_adjfreq: Set ATIME_INC 90a 6aa6
           ptp4l-704     [000] d..1  4019.460869: fec_ptp_adjfreq: Set ATIME_INC 90a 6ab3
           ptp4l-704     [001] d..1  4019.524047: fec_ptp_adjfreq: Set ATIME_INC 90a 6ac9
     kworker/2:1-33      [002] d..1  4019.586853: fec_time_keep: Read ns 17a8da30e270b80a
           ptp4l-704     [001] d..1  4019.588091: fec_ptp_adjfreq: Set ATIME_INC 90a 6aad
           ptp4l-704     [001] d..1  4019.651044: fec_ptp_adjfreq: Set ATIME_INC 90a 6adb
           ptp4l-704     [000] d..1  4019.714924: fec_ptp_adjfreq: Set ATIME_INC 90a 6ae6
           ptp4l-704     [001] d..1  4019.778404: fec_ptp_adjfreq: Set ATIME_INC 90a 6ac6
           ptp4l-704     [001] d..1  4019.841837: fec_ptp_adjfreq: Set ATIME_INC 90a 6ad3
           ptp4l-704     [001] d..1  4019.905513: fec_ptp_adjfreq: Set ATIME_INC 90a 6abc
           ptp4l-704     [001] d..1  4019.969035: fec_ptp_adjfreq: Set ATIME_INC 90a 6ab8
           ptp4l-704     [001] d..1  4020.032870: fec_ptp_adjfreq: Set ATIME_INC 90a 6aa3
           ptp4l-704     [002] d..1  4020.096327: fec_ptp_adjfreq: Set ATIME_INC 90a 6aba
          <idle>-0       [000] d.h1  4020.135108: fec_pps_interrupt: Update PPS counter 6611b629 6ee0a8dc
           ptp4l-704     [002] d..1  4020.159852: fec_ptp_adjfreq: Set ATIME_INC 90a 6a93
           ptp4l-704     [001] d..1  4020.223303: fec_ptp_adjfreq: Set ATIME_INC 90a 6af5
           ptp4l-704     [001] d..1  4020.287059: fec_ptp_adjfreq: Set ATIME_INC 90a 6af0
           ptp4l-704     [003] d..1  4020.350293: fec_ptp_adjfreq: Set ATIME_INC 90a 6abf
           ptp4l-704     [003] d..1  4020.412766: fec_ptp_adjfreq: Set ATIME_INC 90a 6ab9
          <idle>-0       [000] d.h1  4020.422387: fec_pps_interrupt: Update PPS counter 21ac8029 1445
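For readers unfamiliar with the ATIME_INC lines, here is a minimal sketch of how the two hex fields could be decoded. It assumes the first field is the raw ATIME_INC register value and the second is the correction period; that is a guess about what the modified driver logs. The bit masks follow the FEC_T_INC_MASK and FEC_T_INC_CORR_MASK definitions in fec_ptp.c; the function name and the returned keys are made up for illustration.

```python
# Bit layout taken from the FEC_T_INC_* definitions in fec_ptp.c.
INC_MASK = 0x0000007F
INC_CORR_MASK = 0x00007F00
INC_CORR_SHIFT = 8


def decode_atime_inc(inc_reg: int, corr_period: int) -> dict:
    """Decode one 'Set ATIME_INC <inc_reg> <corr_period>' trace line.

    Assumption: once every corr_period timer ticks the counter advances
    by corr_inc nanoseconds instead of inc nanoseconds.
    """
    inc = inc_reg & INC_MASK
    corr_inc = (inc_reg & INC_CORR_MASK) >> INC_CORR_SHIFT
    # Resulting frequency offset relative to the nominal rate, in ppb.
    ppb = (corr_inc - inc) * 1_000_000_000 / (inc * corr_period)
    return {"inc_ns": inc, "corr_inc_ns": corr_inc, "ppb": ppb}


# First adjfreq line in the log above: "Set ATIME_INC 90a 6ac3"
sample = decode_atime_inc(0x90A, 0x6AC3)
print(sample)
```

Under these assumptions the logged values correspond to a frequency correction of a few parts per million, which is what one would expect from ptp4l steering a locked clock.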

The symptom that we see is spurious interrupts. In this log, we see two genuine interrupts at kernel timestamps 4019.134805 and 4020.135108 and a spurious interrupt at 4020.422387. The first two interrupts were approximately 1 second apart, but the spurious interrupt came about 287 ms after the previous interrupt. When the spurious interrupt fired, the timer counter register ATIME read 0x1445, rather than a value just after the 0x2a76ec29 that was programmed into the match register at time 4019.134805.
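The arithmetic on the three fec_pps_interrupt lines can be checked with a short script. This is only a sketch under stated assumptions: the second hex field is taken to be the ATIME counter read in the handler, the counter is assumed to advance by roughly 1 count per nanosecond, and it is assumed to be 31 bits wide (the first field does advance by 10^9 modulo 2^31 between lines, which is consistent with that). All names in the script are illustrative.

```python
# (trace timestamp, first logged field, second logged field)
EVENTS = [
    ("4019.134805", 0x2A76EC29, 0x3341A71A),
    ("4020.135108", 0x6611B629, 0x6EE0A8DC),
    ("4020.422387", 0x21AC8029, 0x00001445),
]

WRAP = 1 << 31  # assumption: 31-bit free-running counter


def to_us(ts: str) -> int:
    """Convert a 'seconds.microseconds' trace timestamp to integer microseconds."""
    sec, frac = ts.split(".")
    return int(sec) * 1_000_000 + int(frac.ljust(6, "0")[:6])


def intervals_us(events) -> list:
    """Time between consecutive interrupts, in microseconds."""
    times = [to_us(e[0]) for e in events]
    return [b - a for a, b in zip(times, times[1:])]


def projected_counter(prev_counter: int, elapsed_us: int) -> int:
    """Unwrapped counter value expected after elapsed_us at 1 count per ns."""
    return prev_counter + elapsed_us * 1000


gaps = intervals_us(EVENTS)
projected = projected_counter(EVENTS[1][2], gaps[1])

print("intervals (us):", gaps)
print("first field steps by 1e9 mod 2^31:",
      all((a[1] + 10**9) % WRAP == b[1] for a, b in zip(EVENTS, EVENTS[1:])))
print("projected counter at third interrupt:", hex(projected))
print("projected counter mod 2^31:", hex(projected % WRAP))
print("logged counter at third interrupt:", hex(EVENTS[2][2]))
```

With these inputs the projected counter at the third interrupt lands a few microseconds past 2^31, which lines up with the logged reading of 0x1445 if the counter wraps there. That is an observation from the arithmetic only, not a confirmed cause.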

The spurious interrupt also triggers the event pin, so we see a spurious pulse on an oscilloscope (with infinite persistence, triggered from GPS 1PPS) and on an SRS FS740 time and frequency system.

To me, this looks like a silicon issue.

Has anyone come across something similar?

Hi @jrsharp67, sorry for the late reply. Which OS is installed on your Verdin iMX8M Mini now?

Hi Benjamin, the Linux kernel version is 5.4.193-5.7.1-devel+git.f78299297185

Hi @jrsharp67, how often does this spurious interrupt happen? What is the Verdin iMX8M Mini SoM version? I tested it for 2 hours and did not observe any unexpected pulse.

Hi Benjamin,

It occurs on the order of once per day, sometimes once every two days.

James

Hi Benjamin, the SoM version is Verdin iMX8M Mini Quad 2GB IT (0059) v1.1C

Hi James, this issue can be reproduced on both Linux BSP 5 and Linux BSP 6. Please allow us to investigate.

Hi @jrsharp67, is this issue fixed on your side? We tried to find the root cause, but the issue was not easy to reproduce. We only captured the glitch a few times, and recently did not see it at all in a week-long test. Do you have any suggestions on how to reproduce it more frequently? Is a clean and precise PPS function critical for your application? Thanks.

Hi @benjamin.tx, I haven’t looked into the issue since I reported it. Stable 1PPS is pretty critical for our application, but we may be able to work around it. I agree it is hard to track down the root cause, and as far as I can tell it seems to be a hardware glitch.