MSELECT error detected! status=0x100

Hi,

I connected a PCIe device to my Ixora board via an adapter cable. The device exposes some config registers on a PCIe BAR, which I write to through /dev/mem. This had worked before on another brand of TK1 board (Edit: confirmed, it works with another brand of TK1 board).

When I write something to the PCIe device that makes it engage in network data transfer, the serial console is flooded with this message: “MSELECT error detected! status=0x100”, until at some point the system auto-reboots because the CPU gets stalled by it.
Ah, and sometimes in between there will be this: “mmc0: Timeout waiting for hardware interrupt” + register dump of sdhci. (see attached quote at end of message) I’m not sure that message is in there every time, though.
(Weird, I’m not doing anything with mmc on that board, there is no card inserted in any slot)

I can see, however, that the PCIe device does receive the values written via the memory mapping (checked with a debug tool of the device connected to my PC), so in principle that works (I am not mapping wrong address ranges or anything).
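
For reference, a minimal sketch of that kind of register write, assuming the BAR is reached through the device's sysfs resource file rather than a hard-coded /dev/mem address. The helper name write32 and the path in the usage comment (built from the 01:00.0 address in the lspci output further down) are illustrative assumptions, not the program actually used here. Note that Python's struct module does not guarantee the store is issued as a single 32-bit bus access, so for real register work a C or C++ program with a volatile uint32_t pointer is the safer choice.

```python
import mmap
import os
import struct


def write32(path, offset, value, length=4096):
    """Map `length` bytes of `path`, store one little-endian 32-bit value
    at `offset`, and return the value read back from the same offset."""
    fd = os.open(path, os.O_RDWR | os.O_SYNC)
    try:
        with mmap.mmap(fd, length, mmap.MAP_SHARED,
                       mmap.PROT_READ | mmap.PROT_WRITE) as mem:
            struct.pack_into("<I", mem, offset, value)
            # Read back so the caller can compare against what was written.
            return struct.unpack_from("<I", mem, offset)[0]
    finally:
        os.close(fd)


# Hypothetical usage on the target (needs root, path is an assumption):
# write32("/sys/bus/pci/devices/0000:01:00.0/resource0", 0x0, 0x1)
```

The sysfs resource file always refers to whatever address the kernel assigned to the BAR, so the program does not have to change when the address differs between boards or kernels.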

I have found this thread, where that error line is mentioned - not much else in common with the overall failure, though, at least I can’t make much of it.
nvidia forum

Any idea what this could be about?


[  691.063180] MSELECT error detected! status=0x100
[  691.067927] mmc0: Timeout waiting for hardware interrupt.
[  691.067931] sdhci: =========== REGISTER DUMP (mmc0)===========
[  691.067942] sdhci: Sys addr: 0x00000000 | Version:  0x00000303
[  691.067948] sdhci: Blk size: 0x00007200 | Blk cnt:  0x00000000
[  691.067954] sdhci: Argument: 0x00c4d3b0 | Trn mode: 0x00000023
[  691.067960] sdhci: Present:  0x01fb00f0 | Host ctl: 0x00000031
[  691.067966] sdhci: Power:    0x0000000b | Blk gap:  0x00000000
[  691.067971] sdhci: Wake-up:  0x00000000 | Clock:    0x00000107
[  691.067975] sdhci: Timeout:  0x0000000e | Int stat: 0x00000000
[  691.067980] sdhci: Int enab: 0x02ff000b | Sig enab: 0x02fc000b
[  691.067984] sdhci: AC12 err: 0x00000000 | Slot int: 0x00000000
[  691.067990] sdhci: Caps:     0x376fd080 | Caps_1:   0x10002f77
[  691.067995] sdhci: Cmd:      0x0000193a | Max curr: 0x00000000
[  691.067998] sdhci: Host ctl2: 0x00003004
[  691.068002] sdhci: ADMA Err: 0x00000000 | ADMA Ptr: 0xae840030
[  691.068004] sdhci: ===========================================

Edit:
lspci -vvv output with BSP 2.8b5 mainline:

lspci -vvv
00:01.0 PCI bridge: NVIDIA Corporation TegraK1 PCIe x4 Bridge (rev a1) (prog-if 00 [Normal decode])
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 396
        Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
        I/O behind bridge: 0000f000-00000fff [empty]
        Memory behind bridge: 13000000-13bfffff [size=12M]
        Prefetchable memory behind bridge: 0000000020000000-0000000020ffffff [size=16M]
        Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR-
        BridgeCtl: Parity+ SERR- NoISA- VGA- MAbort- >Reset- FastB2B-
                PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
        Capabilities: [40] Subsystem: NVIDIA Corporation TegraK1 PCIe x4 Bridge
        Capabilities: [48] Power Management version 3
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [50] MSI: Enable+ Count=1/2 Maskable- 64bit+
                Address: 00000000ae3ab000  Data: 0000
        Capabilities: [60] HyperTransport: MSI Mapping Enable- Fixed-
                Mapping Address Base: 00000000fee00000
        Capabilities: [80] Express (v2) Root Port (Slot+), MSI 00
                DevCap: MaxPayload 128 bytes, PhantFunc 0
                        ExtTag+ RBE+
                DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                        RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
                        MaxPayload 128 bytes, MaxReadReq 512 bytes
                DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
                LnkCap: Port #0, Speed 5GT/s, Width x2, ASPM L0s, Exit Latency L0s <512ns
                        ClockPM- Surprise- LLActRep+ BwNot+ ASPMOptComp-
                LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive+ BWMgmt- ABWMgmt-
                SltCap: AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug- Surprise-
                        Slot #0, PowerLimit 0.000W; Interlock- NoCompl-
                SltCtl: Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt- HPIrq- LinkChg-
                        Control: AttnInd Off, PwrInd On, Power- Interlock-
                SltSta: Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet+ Interlock-
                        Changed: MRL- PresDet+ LinkState+
                RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna+ CRSVisible-
                RootCap: CRSVisible-
                RootSta: PME ReqID 0000, PMEStatus- PMEPending-
                DevCap2: Completion Timeout: Range AB, TimeoutDis+, LTR-, OBFF Not Supported ARIFwd-
                         AtomicOpsCap: Routing- 32bit- 64bit- 128bitCAS-
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled ARIFwd-
                         AtomicOpsCtl: ReqEn- EgressBlck-
                LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
                         Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                         Compliance De-emphasis: -6dB
                LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
                         EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
        Kernel driver in use: pcieport

00:02.0 PCI bridge: NVIDIA Corporation TegraK1 PCIe x1 Bridge (rev a1) (prog-if 00 [Normal decode])
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 397
        Bus: primary=00, secondary=02, subordinate=02, sec-latency=0
        I/O behind bridge: 00001000-00001fff [size=4K]
        Memory behind bridge: 13c00000-13cfffff [size=1M]
        Prefetchable memory behind bridge: 00000000fff00000-00000000000fffff [empty]
        Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR-
        BridgeCtl: Parity+ SERR- NoISA- VGA- MAbort- >Reset- FastB2B-
                PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
        Capabilities: [40] Subsystem: NVIDIA Corporation TegraK1 PCIe x1 Bridge
        Capabilities: [48] Power Management version 3
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [50] MSI: Enable+ Count=1/2 Maskable- 64bit+
                Address: 00000000ae3ab000  Data: 0001
        Capabilities: [60] HyperTransport: MSI Mapping Enable- Fixed-
                Mapping Address Base: 00000000fee00000
        Capabilities: [80] Express (v2) Root Port (Slot+), MSI 00
                DevCap: MaxPayload 128 bytes, PhantFunc 0
                        ExtTag+ RBE+
                DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                        RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
                        MaxPayload 128 bytes, MaxReadReq 512 bytes
                DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
                LnkCap: Port #1, Speed 5GT/s, Width x1, ASPM L0s, Exit Latency L0s <512ns
                        ClockPM- Surprise- LLActRep+ BwNot+ ASPMOptComp-
                LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive+ BWMgmt- ABWMgmt-
                SltCap: AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug- Surprise-
                        Slot #0, PowerLimit 0.000W; Interlock- NoCompl-
                SltCtl: Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt- HPIrq- LinkChg-
                        Control: AttnInd Off, PwrInd On, Power- Interlock-
                SltSta: Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet+ Interlock-
                        Changed: MRL- PresDet+ LinkState+
                RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna+ CRSVisible-
                RootCap: CRSVisible-
                RootSta: PME ReqID 0000, PMEStatus- PMEPending-
                DevCap2: Completion Timeout: Range AB, TimeoutDis+, LTR-, OBFF Not Supported ARIFwd-
                         AtomicOpsCap: Routing- 32bit- 64bit- 128bitCAS-
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled ARIFwd-
                         AtomicOpsCtl: ReqEn- EgressBlck-
                LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
                         Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                         Compliance De-emphasis: -6dB
                LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
                         EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
        Kernel driver in use: pcieport

01:00.0 RAM memory: Xilinx Corporation Device 7011
        Subsystem: Xilinx Corporation Device 0007
        Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Interrupt: pin A routed to IRQ 255
        Region 0: Memory at 13000000 (64-bit, non-prefetchable) [disabled] [size=8M]
        Region 2: Memory at 13800000 (64-bit, non-prefetchable) [disabled] [size=64K]
        Region 4: Memory at 20000000 (64-bit, prefetchable) [disabled] [size=16M]
        Capabilities: [40] Power Management version 3
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold-)
                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [48] MSI: Enable- Count=1/1 Maskable- 64bit+
                Address: 0000000000000000  Data: 0000
        Capabilities: [60] Express (v2) Endpoint, MSI 00
                DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s <64ns, L1 unlimited
                        ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 0.000W
                DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                        RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
                        MaxPayload 128 bytes, MaxReadReq 512 bytes
                DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
                LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s, Exit Latency L0s unlimited
                        ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
                LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                DevCap2: Completion Timeout: Range B, TimeoutDis-, LTR-, OBFF Not Supported
                         AtomicOpsCap: 32bit- 64bit- 128bitCAS-
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
                         AtomicOpsCtl: ReqEn-
                LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
                         Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                         Compliance De-emphasis: -6dB
                LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
                         EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
        Capabilities: [100 v1] Device Serial Number 00-00-00-00-00-00-00-00

02:00.0 Ethernet controller: Intel Corporation I210 Gigabit Network Connection (rev 03)
        Subsystem: Intel Corporation I210 Gigabit Network Connection
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 22
        Region 0: Memory at 13c00000 (32-bit, non-prefetchable) [size=128K]
        Region 2: I/O ports at 1000 [disabled] [size=32]
        Region 3: Memory at 13c20000 (32-bit, non-prefetchable) [size=16K]
        Capabilities: [40] Power Management version 3
                Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=1 PME-
        Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
                Address: 0000000000000000  Data: 0000
                Masking: 00000000  Pending: 00000000
        Capabilities: [70] MSI-X: Enable+ Count=5 Masked-
                Vector table: BAR=3 offset=00000000
                PBA: BAR=3 offset=00002000
        Capabilities: [a0] Express (v2) Endpoint, MSI 00
                DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us
                        ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0.000W
                DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
                        RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+ FLReset-
                        MaxPayload 128 bytes, MaxReadReq 512 bytes
                DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
                LnkCap: Port #1, Speed 2.5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s unlimited, L1 <16us
                        ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
                LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR-, OBFF Not Supported
                         AtomicOpsCap: 32bit- 64bit- 128bitCAS-
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
                         AtomicOpsCtl: ReqEn-
                LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
                         Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                         Compliance De-emphasis: -6dB
                LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
                         EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
        Capabilities: [100 v2] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
                AERCap: First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
                        MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
                HeaderLog: 00000000 00000000 00000000 00000000
        Capabilities: [140 v1] Device Serial Number 00-a0-c9-ff-ff-00-00-00
        Capabilities: [1a0 v1] Transaction Processing Hints
                Device specific mode supported
                Steering table in TPH capability structure
        Kernel driver in use: igb

root@apalis-tk1-mainline:~# dmesg | grep BAR
[    3.848236] pci 0000:00:01.0: BAR 9: assigned [mem 0x20000000-0x20ffffff 64bit pref]
[    3.855986] pci 0000:00:01.0: BAR 8: assigned [mem 0x13000000-0x13bfffff]
[    3.862797] pci 0000:00:02.0: BAR 8: assigned [mem 0x13c00000-0x13cfffff]
[    3.869608] pci 0000:00:02.0: BAR 7: assigned [io  0x1000-0x1fff]
[    3.875714] pci 0000:01:00.0: BAR 4: assigned [mem 0x20000000-0x20ffffff 64bit pref]
[    3.883506] pci 0000:01:00.0: BAR 0: assigned [mem 0x13000000-0x137fffff 64bit]
[    3.890864] pci 0000:01:00.0: BAR 2: assigned [mem 0x13800000-0x1380ffff 64bit]
[    3.917797] pci 0000:02:00.0: BAR 0: assigned [mem 0x13c00000-0x13c1ffff]
[    3.924594] pci 0000:02:00.0: BAR 3: assigned [mem 0x13c20000-0x13c23fff]
[    3.931411] pci 0000:02:00.0: BAR 2: assigned [io  0x1000-0x101f]
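
The BAR addresses are assigned by the kernel at enumeration time, so they can differ between boards and kernel versions. A minimal sketch, assuming the dmesg line format shown above, that turns those lines into (device, BAR, kind, start, size) tuples so a test program can look the address up instead of hard-coding it. The name parse_bars is only illustrative.

```python
import re

# Matches lines such as:
# pci 0000:01:00.0: BAR 0: assigned [mem 0x13000000-0x137fffff 64bit]
BAR_LINE = re.compile(
    r"pci (\S+): BAR (\d+): assigned \[(mem|io)\s+0x([0-9a-f]+)-0x([0-9a-f]+)")


def parse_bars(dmesg_text):
    """Return a list of (device, bar, kind, start, size) tuples."""
    bars = []
    for match in BAR_LINE.finditer(dmesg_text):
        dev, bar, kind, start, end = match.groups()
        start, end = int(start, 16), int(end, 16)
        # The range in dmesg is inclusive, hence the +1.
        bars.append((dev, int(bar), kind, start, end - start + 1))
    return bars
```

For the lines above this gives 8 MiB at 0x13000000 for BAR 0 of 0000:01:00.0 and 64 KiB at 0x13800000 for BAR 2, which agrees with the sizes in the lspci output.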

What exact module hardware version are we talking about? I assume that your power supply is up to the task. I would suggest that you try the latest BSP 2.8b5, either the regular downstream (NVIDIA Linux for Tegra based) variant or the mainline one. The easiest way is to use the Toradex Easy Installer, which can install either of them:

https://developer.toradex.com/software/toradex-easy-installer

I updated the module info (Apalis TK1 2GB V1.2A), I also got hold of the older, other brand TK1 based board and re-tested - confirmed, it does not have this problem. It is also using a L4T based Linux image.
I will look at the newer BSP.

I installed 2.8b5 mainline. Now, whenever I have the PCIe device plugged in, the LAN connector’s LEDs are off and Ethernet does not work anymore. So I cannot repeat the test described in my original post, which requires the network.
When the PCIe device is not inserted, the LAN works… but I need both ;)

We tried it on our side with a PCIe Card and both LAN and PCIe Cards are working perfectly.

Hmm, and you also used the 2.8b5 mainline? With the 2.8b3 L4T I did not see this (just the other problems, perhaps related but popping up differently).
I talked to the HW guy and he re-checked that this PCIe card version only uses 1x lane (other board had 2), and it does.
Any ideas what could cause such behavior? (e.g. some HW design setting in the fpga on the PCIe card… but the card, which is a dev board, supports that stuff natively with out-of-the-box support, and it does work on the other board, I wonder what can go wrong)

I added lspci output to the post. Interestingly, it’s missing the BAR address I usually get for the config stuff as BAR0, 0x32800000, with another TK1-based board.
Anything suspicious there?

yeah, we tried using mainline 2.8b5 and this was working fine.

I don’t see anything suspicious. Could you provide some Information about your PCIe card? Thanks.

The PCIe card is an FPGA devkit by Xilinx, namely the AC701 board. The FPGA uses one of Xilinx’s standard blocks to expose memory-mapped areas for reading and writing (IIRC, the AXI4-Lite PCIe Bridge; the FPGA guy is not here to ask ATM).

On ARM platforms the BAR stuff is usually quite limited. I fear that this might cause issues. What exact other TK1 based board are you referring to? What exact software are you running on there?

The other TK1 board is Avionic Design’s “Meerkat” board. (They don’t have your nice system of largely pin-compatible swap-out daughter boards ;)) They also apparently took the L4T kernel (although uname -r showed the third part of the kernel version slightly different from your L4T based one) and put it together with stuff for their board into a buildroot. That’s what I used. It is a pretty bare-bones installation, based on busybox. Nothing else is on there besides my own little C++ programs to try things out.

Hey, maybe you can answer this, though: I just saw that the Apalis-TK1 on-board Ethernet uses PCIe, not USB3. And that lspci -vv shows that both the Ethernet and the FPGA card are “routed to IRQ 130”. Is that a problem? If so, how can this be changed? (that’s with the 3.10 L4T based kernel again, btw, the lspci output in the topic was with mainline and does not show this, but it’s perhaps a starting point…)
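
To check which devices end up on the same interrupt line, a small sketch that groups the "routed to IRQ" lines of an lspci -vv dump by IRQ number. The function names are made up for this example, and the parser only assumes the line format seen in the lspci output in the original post. Legacy PCI INTx lines are level-triggered and designed to be shareable, so two devices on one IRQ is not an error by itself.

```python
import re
from collections import defaultdict


def devices_by_irq(lspci_text):
    """Return {irq: [bdf, ...]} from 'routed to IRQ N' lines."""
    groups = defaultdict(list)
    bdf = None
    for line in lspci_text.splitlines():
        header = re.match(r"^([0-9a-f]{2}:[0-9a-f]{2}\.[0-7]) ", line)
        if header:
            bdf = header.group(1)
        irq = re.search(r"routed to IRQ (\d+)", line)
        if bdf and irq:
            groups[int(irq.group(1))].append(bdf)
    return dict(groups)


def shared_irqs(lspci_text):
    """Keep only the IRQs that more than one device is routed to."""
    return {irq: devs
            for irq, devs in devices_by_irq(lspci_text).items()
            if len(devs) > 1}
```

On the mainline output in the original post every device has its own IRQ number, so shared_irqs returns an empty dict there.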

hi

Sorry for the delayed answer.

Unfortunately it is difficult to look further into this issue, since you have custom hardware and software.

Best regards,
Jaski

Sorry, I don’t know the answer. Maybe you can have a look here.

Best regards,
Jaski