Colibri iMX6ULL: strange PHY behaviour on custom carrier board

Hello to everyone at Toradex, it has been a while.

I’d like to ask about strange behavior of the Colibri iMX6ULL we use on the carrier board of our new product. To be more specific: Colibri iMX6ULL 512MB IT without Wi-Fi + Yocto 6.8.

I found similar topics here, but with different possible solutions.

On a few modules we found a problem at startup of the PHY, a Micrel KSZ8041.

Sometimes, usually on a cold start of our carrier board, the PHY is

1] not available

 dmesg | grep fec
 [ 002.817703] fec 20b4000.ethernet eth0: registered PHC device 0
 [ 0020.391388] fec 20b4000.ethernet eth0: Unable to connect to phy

2] available, but eth0 does not start

The PHY is found OK:

[21.473686] Micrel KSZ8041 20b4000.ethernet-1:02: attached PHY driver (mii_bus:phy_addr=20b4000.ethernet-1:02, irq=POLL)

But it never starts, even though we are able to test it with ethtool or phytool.
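The two cases above can be told apart automatically from the kernel log. The following is a minimal sketch, assuming the two message strings quoted in this post are the only ones of interest; the function name `classify_fec_log` and the returned labels are hypothetical and chosen only for illustration.

```python
def classify_fec_log(dmesg_text: str) -> str:
    """Classify a dmesg dump into one of the two failure cases in this post.

    Returns 'phy_missing', 'phy_attached', or 'unknown'.
    The last matching line wins, so a later successful attach
    overrides an earlier failed attempt.
    """
    state = "unknown"
    for line in dmesg_text.splitlines():
        if "Unable to connect to phy" in line:
            # Case 1: the fec driver could not reach the PHY.
            state = "phy_missing"
        elif "attached PHY driver" in line:
            # Case 2 candidate: the PHY was found; whether the link
            # actually comes up must be checked separately.
            state = "phy_attached"
    return state
```

On a target, the input could be the captured output of `dmesg`; a result of `phy_attached` only says that the PHY was detected, not that eth0 is usable.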

While it works without problems all the time on the Toradex Colibri Evaluation Board, we found a warning about the ETH_LINK and ETH_SPEED signals in the Ethernet part of the Colibri carrier board schematics.

As it turns out, those signals were not connected on our board. After connecting them to a proper load (LED + R), the PHY started to behave correctly.

So, I’d like to ask:

1] Is it possible to confirm such behavior when there is no proper load on the LED outputs? If so, the fix is on our side in BSP, we found the problem, and that’s good.

2] Is it possible to make any change in the device tree configuration (or any other software change) to fix this issue?

I presume the LED signals come from the KSZ8041 PHY and not from the iMX6ULL itself, so probably not, but I have to ask :slightly_smiling_face:

Thanks for any response or advice, it will be handy for others too.

Have a nice day.

Hi, I’m the Toradex AI Assistant. The Toradex Support team will follow up with you shortly!


Hello,

Thank you for reaching out with your question about the Colibri iMX6ULL module on your custom carrier board. The behavior you’re observing with the PHY, particularly related to the ETH_LINK and ETH_SPEED signals, is indeed likely influenced by how these signals are loaded.

  1. Hardware Considerations: The primary solution is to ensure that your hardware design aligns with the reference designs provided by Toradex. The ETH_LINK and ETH_SPEED signals should be properly loaded with the recommended resistors and LEDs, as specified in the reference schematics. This ensures that the PHY can correctly detect and indicate link status and speed.

  2. Software Adjustments: While the hardware fix is crucial, you can also verify the device tree configuration to ensure the PHY is correctly defined. This includes setting the correct PHY address and any specific configurations required by the Micrel KSZ8041. However, please note that software changes alone may not resolve issues caused by hardware design flaws.

In summary, while some software adjustments can be made, the primary solution lies in ensuring that your hardware design adheres to the recommended practices, particularly concerning the loading of the ETH_LINK and ETH_SPEED signals. This should help resolve the PHY initialization issues you’re encountering.

I hope this helps, and please feel free to reach out if you have further questions.

Best regards,
Toradex AI Assistant.

Please, also check if these links can help you solve your problem:

Colibri iMX6ULL: On module PHY not detected.

Hello @HoracekPetr,

Can you send me the schematics for your board so I can have a closer look?
If you do not want to send it publicly, please send it via email to support.eu@toradex.com.

Also, to better understand the problem, can you clarify the following topics:

  • Does the problem also manifest with the Reference Images provided by Toradex?
  • Is it dependent on the SoM or the carrier board? If you put affected module A in different custom carrier boards, does it always have the issue?
  • How frequent is the problem on affected setups?

Best Regards,
Bruno

Hello sir,

Thanks for response.

Well, sending the schematics is not a problem at all, but I’m not sure how to do it non-publicly.

•	Does the problem also manifest with the Reference Images provided by Toradex?

Yes, it does; tested on the minimal reference images 6.x and 7.x.

•	Is it dependent on the SoM or the carrier board? If you put affected module A in different custom carrier boards, does it always have the issue?

Right now we have selected three SoMs, out of circa 20 already deployed, which behave oddly. One of them can’t find the PHY at all. The second one is able to find the PHY, but the link and interface never come up, even though we are able to communicate over MDIO. The third one sometimes works, sometimes not.

The first two SoMs behave the same way on many carrier boards. These boards work well with other modules.
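Since the second SoM still answers over MDIO, the raw value of the PHY’s Basic Status Register (register 1) can show whether the PHY itself reports a link. Below is a minimal sketch of decoding that value, assuming the standard IEEE 802.3 clause 22 bit layout; the function name `decode_bmsr` is hypothetical, and how the raw value is obtained (for example with phytool) is left out.

```python
# Bit positions of the Basic Status Register (MII register 1),
# as defined by IEEE 802.3 clause 22.
BMSR_LINK_STATUS = 1 << 2       # latches low: read twice for the current state
BMSR_AUTONEG_ABLE = 1 << 3
BMSR_REMOTE_FAULT = 1 << 4
BMSR_AUTONEG_COMPLETE = 1 << 5


def decode_bmsr(value: int) -> dict:
    """Decode the link-related bits of a raw 16-bit register 1 value."""
    return {
        "link_up": bool(value & BMSR_LINK_STATUS),
        "autoneg_capable": bool(value & BMSR_AUTONEG_ABLE),
        "remote_fault": bool(value & BMSR_REMOTE_FAULT),
        "autoneg_complete": bool(value & BMSR_AUTONEG_COMPLETE),
    }
```

If `link_up` stays false with a cable and link partner attached, the problem is on the PHY side; if it is true while eth0 still does not come up, the MAC or driver side is the more likely place to look.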

•	How frequent is the problem on affected setups?

The first two modules mentioned above, tested on 4 or 5 carrier boards, cause problems on almost every cold start.

Right now everything points to problems with the KSZ8041 and the LEDs, which are not connected in our schematics. Now that we know more, we were able to find this in the Colibri imx6 errata:

If the Ethernet PHY is not accessible, try to power cycle the Ethernet PHY or the complete Colibri SoM. Disabling the RMII clock turns off the Ethernet power rails. Try waiting at least 1 second before reenabling the RMII clock and initializing the Ethernet PHY.

Is there any simple way to disable and re-enable the RMII clock from user space or at runtime?

Thanks for attention, have a nice day

PH

Hello @HoracekPetr,

You can just send them in an email to support.eu@toradex.com.


Thanks for the clarifications.
Upon further investigation, it is likely that the issue is caused by Errata #3: HAR-8982 – KSZ8041 Errata 2 can cause Ethernet not working at certain temperatures.

From your schematics snippet, I see that you have the option to use the ethernet LEDs on your board.
From our testing, this problem only occurred at very low temperatures when using the ethernet LEDs.
Therefore, if you can assemble the resistors R24 and R25, the problem should be mitigated.

If this is not an option, please let me know so we can look into the possibility of a workaround by power-cycling the Ethernet PHY.

Best Regards,
Bruno

Hello Bruno,

I am sending the original schematics, without resistors and LEDs on the PHY LED pins, to the given email address. The unfixed Ethernet part is on the second sheet.

Right now, we are successfully testing the solution on multiple boards with these connected.

Based on our testing, this occurs at normal temperatures as well.
Power cycling helps, and a software reboot of the whole SoM fixes the issue. It would be great to have another way to fix this, by power cycling via the RMII clock, because we have 15 devices for testing on sites and we need to make them reliable, because of the monitoring tests run by the customer.
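Until an RMII clock workaround exists, the software reboot mentioned above could be triggered automatically. This is a minimal sketch, assuming the standard Linux sysfs file `/sys/class/net/<iface>/carrier`; the function names, the `max_failures` threshold, and the polling scheme are hypothetical, and the reboot call itself is deliberately left out.

```python
from pathlib import Path


def link_is_up(iface: str = "eth0", sysfs_root: str = "/sys/class/net") -> bool:
    """Return True if the kernel reports carrier on the interface."""
    carrier = Path(sysfs_root) / iface / "carrier"
    try:
        return carrier.read_text().strip() == "1"
    except OSError:
        # The interface is missing, or it is administratively down
        # (reading 'carrier' fails in that case).
        return False


def should_reboot(samples: list, max_failures: int = 3) -> bool:
    """Return True once the last `max_failures` samples were all link-down."""
    recent = samples[-max_failures:]
    return len(recent) == max_failures and not any(recent)
```

A caller would append the result of `link_is_up()` to a list every few seconds and reboot the SoM when `should_reboot()` returns true. Note that an unplugged cable also reads as no carrier, so this only fits devices where a link partner is always expected.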

Once again thanks for support, have a nice day

PH

Hello @HoracekPetr,

Thanks for sending the schematics.

I had a look at them and there does not seem to be any problem with the ethernet circuit.
Therefore the issue must be related to the Colibri iMX6ULL errata mentioned previously.

When the ethernet LEDs are not connected, we have seen that as well.
However, if the LEDs are connected, this should only manifest at very low temperatures (less than -30C).
Can you confirm if you see the problem when using boards with the ethernet LEDs?

Regarding the possible workaround of disabling the RMII clock, this is not something we tested.
This would likely require writing to registers of the SoC, as it does not seem that an interface is exposed for that.
Therefore, if you want to go that way, please check the i.MX6ULL reference manual and the entry on the Colibri iMX6ULL device tree for the clocks which are used by the MAC: imx6ul.dtsi (arch/arm/boot/dts/nxp/imx/imx6ul.dtsi in the Linux kernel stable tree, kernel/git/stable/linux.git).

Best Regards,
Bruno

Hello Bruno,

Since we started fixing our boards, including the boards in field-test deployment, we have not seen this problem. Not yet, at least.

On a few modules we saw this at normal room temperatures, on our custom carrier board (a derivative of the Colibri evaluation board). After adding a resistor on one LED, it looks fine.

On the Colibri carrier board, however, we were not able to reproduce the error, even with the same problematic Colibri test module.

Disabling the RMII clock would be a nice testing hack or emergency solution. Digging into the device tree is a path I have already started down, but right now it is not possible due to lack of time.

Right now we are ramping up production, so we have to wait and see whether we detect any occurrence of a stuck MAC interface.

Either way, thanks very much for the attention and confirmation. We will have more data in a few days or weeks.

Have a nice day,

PH

Hello @HoracekPetr,

This makes sense and is coherent with our testing of this issue.


Noted, we will await further information.

Best Regards,
Bruno