I’m running a mainline kernel on the Verdin iMX8M Mini.
We tried to update from 6.5.11 to 6.6.1.
On our board we have the following SPI devices connected:
TPM2 (infineon,slb9670)
SPI-RAM (microchip,mchp23lcv1024)
With 6.5.11 everything was fine, but after the update to 6.6.1 access to all SPI devices failed.
When we measured on the SPI bus, we saw that the transmitted and received data lengths did not match the expected size, and the transfer failed.
I think we have identified the commit that makes SPI fail.
When we revert the commits, SPI works again.
It seems that enabling burst mode breaks SPI on the imx8mm.
I did some measurements with the SPI RAM. When I write data to the mtd device,
we see a 4-byte write command followed by the bytes to write.
With burst mode enabled, the transmitted length is too long, and in many tests the data I tried to send is interleaved with 0x00 bytes. The pattern sent should increment from 0x01 (0x01, 0x02, 0x03, …).
Instead we see 0x00,0x00,0x01 0x00,0x00,0x00,0x02 0x00,0x00,0x00,0x03
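For anyone who wants to check their own captures the same way, here is a minimal Python sketch of the comparison. The helper names (`expected_pattern`, `first_mismatch`, `strip_zeros`) are made up for illustration, and `captured` holds only the few bytes quoted above, not a full trace.

```python
def expected_pattern(length):
    """Incrementing test pattern 0x01, 0x02, ... (wraps after 0xFF)."""
    return bytes((i + 1) & 0xFF for i in range(length))


def first_mismatch(expected, captured):
    """Index of the first differing byte, or None if content and length match."""
    for i, (e, c) in enumerate(zip(expected, captured)):
        if e != c:
            return i
    if len(expected) != len(captured):
        return min(len(expected), len(captured))
    return None


def strip_zeros(captured):
    """Drop all 0x00 bytes. Only meaningful while the pattern itself
    contains no zero byte, i.e. for lengths below 255."""
    return bytes(b for b in captured if b != 0)


# Start of the payload as seen on the bus (bytes quoted in the post above).
captured = bytes([0x00, 0x00, 0x01,
                  0x00, 0x00, 0x00, 0x02,
                  0x00, 0x00, 0x00, 0x03])
expected = expected_pattern(127)

print("first mismatch at index:", first_mismatch(expected[:3], captured))
print("payload intact after removing zeros:",
      strip_zeros(captured) == expected[:3])
```

If removing the zero bytes gives back the pattern, the payload itself is intact and only padding was inserted, which is what the short excerpt above suggests.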
Questions:
Has Toradex also seen this behavior on imx8m or even on imx6 (the change applies to both)?
Do you have any clue what is going wrong?
I tried to understand the behavior by tracing the code, but I do not see any obvious bugs.
PS: The behavior I posted here is for a 127-byte transfer; we see different errors with other lengths and other alignments.
I contacted the team here and they’re not aware of this behavior in kernel 6.6 upstream for the Verdin iMX8M Mini.
I took a look at our latest automated tests (kernel 6.7.0 and 6.6.0) related to our future BSP reference multimedia images for the Verdin iMX8M Mini and Colibri iMX6, and they don’t check SPI burst-mode specifically, so that’s probably why we didn’t reproduce it.
As for your second question, it’s hard to tell. You can try raising this issue directly with the upstream kernel community to see if the committer you referenced and the related maintainers have a better idea of why this is happening.
Thanks for your reply.
I would like to point out that you do not need any special burst testing to detect the SPI error.
Probing the TPM2 already shows the error, see below.
The failing self-test is normal behavior, but the request for a firmware upgrade shows
the problem: the capabilities of the TPM cannot be read successfully.
[ 15.182512] tpm_tis_spi spi3.1: 2.0 TPM (device-id 0x1B, rev-id 22)
[ 15.191079] tpm tpm0: A TPM error (256) occurred attempting the self test
[ 15.197928] tpm tpm0: starting up the TPM manually
[ 15.777121] tpm tpm0: TPM in field failure mode, requires firmware upgrade
I also ran the test with a colibri-imx6dl and encountered the same bad behavior.
As you suggested, I will post it on the Linux mailing list.
Regards Stefan
I discussed the issue on the mailing list, but we did not find the cause.
The regression testing for SPI is done with a loopback (MOSI → MISO).
My tests showed that sending and receiving have the same problem, so the two errors mask each other.
The data on the line is too long, but the received data looks fine again.
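To illustrate how a loopback can hide this kind of fault, here is a toy model in Python. It is not the driver's actual behavior: `tx_expand` and `rx_collapse` are hypothetical functions, and the assumption that each byte goes out as a zero-padded 4-byte word is only a guess based on the pattern seen on the bus.

```python
def tx_expand(data, word_size=4):
    """Hypothetical faulty transmit path: every payload byte is sent
    as a word of word_size bytes, padded with leading zeros."""
    out = bytearray()
    for b in data:
        out += bytes(word_size - 1) + bytes([b])
    return bytes(out)


def rx_collapse(wire, word_size=4):
    """Hypothetical receive path with the mirror-image fault: only the
    last byte of every word is kept."""
    return wire[word_size - 1::word_size]


payload = bytes(range(1, 9))   # 0x01 .. 0x08
wire = tx_expand(payload)      # what a scope would see on MOSI
looped = rx_collapse(wire)     # what a MOSI -> MISO loopback test reads

print("payload bytes:", len(payload), "bytes on the wire:", len(wire))
print("loopback matches payload:", looped == payload)
```

In this model the loopback comparison passes although four times as many bytes were clocked out, so only a real SPI device or a scope on the bus would notice the error.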
The maintainers were not able to reproduce the problem (not even with a scope).
I do not know which imx8 type they used.
Updating to the latest imx-sdma firmware 4.6 from NXP did not help either.
My analysis showed that the problem only occurs if the transmit length is >= 64 bytes and DMA is used for the transfer. The reason for the misbehavior is still unclear.
So I reverted the commits on my side, and I hope to find someone else who has the same problem.
Thank you for the update. I’m sure the information you posted here will be useful to other people.
The regression testing for SPI is done with a loopback (MOSI → MISO).
My tests showed that sending and receiving have the same problem, so the two errors mask each other.
Our automated SPI tests only do a simple loopback test with spidev_test, so this could explain why we didn’t notice it.
Given that currently our most recent BSP (6 at the time of writing this) doesn’t support kernel 6.6 yet, we cannot guarantee we’ll actively look at this issue right now.
Let us know if you need anything else on our side. Otherwise, feel free to continue posting updates about this issue.
I will post any news on this ticket.
By the way, with your new Mallow carrier board you should be able to see the problem with the TPM2.
Thanks
Stefan