Ethernet RX errors

Hello community/tx,

Occasionally I get Ethernet RX errors on my Colibri iMX6DL 512MB V1.1A running BSP v2.7 stable.
(2 of ~100 devices over time; I can’t reproduce it on purpose.)

In this state the module can still send packets but cannot receive any.
I dumped the traffic on the directly connected next Ethernet device and could see outgoing ARP requests, the corresponding ARP replies, and other traffic (ping, etc.), none of which is received by the module.

What I was able to test over the serial connection:

  • First trace: the ARP entry was 0x0 → flushing the ARP cache didn’t help
  • Taking the interface down/up didn’t help
  • Got "fec 2188000.ethernet eth0: MDIO read timeout" exactly once, but I’m not sure whether it belongs to this issue
  • Link and auto-negotiation seem to be OK
  • Soft reboot didn’t help
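
The "TX works, RX dead" state described above could also be confirmed from the serial console by sampling the kernel's per-interface packet counters. The following is a minimal sketch, assuming the standard Linux sysfs counters under /sys/class/net/<iface>/statistics; the function names and the SYSFS_NET variable are made up for illustration and are not part of the BSP.

```shell
#!/bin/sh
# Sketch only: function names and SYSFS_NET are illustrative assumptions.
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"

# Read one counter (e.g. rx_packets) for an interface from sysfs.
read_counter() {
    cat "$SYSFS_NET/$1/statistics/$2"
}

# Classify the interface from the RX and TX packet deltas of one interval.
classify_delta() {
    rx_delta=$1
    tx_delta=$2
    if [ "$tx_delta" -gt 0 ] && [ "$rx_delta" -eq 0 ]; then
        echo "rx-stalled"   # frames leave the module, none are counted as received
    elif [ "$tx_delta" -eq 0 ] && [ "$rx_delta" -eq 0 ]; then
        echo "idle"         # no traffic at all, so no verdict is possible
    else
        echo "ok"
    fi
}

# Sample the counters twice, INTERVAL seconds apart, and print the verdict.
check_rx() {
    iface=${1:-eth0}
    interval=${2:-10}
    rx0=$(read_counter "$iface" rx_packets)
    tx0=$(read_counter "$iface" tx_packets)
    sleep "$interval"
    rx1=$(read_counter "$iface" rx_packets)
    tx1=$(read_counter "$iface" tx_packets)
    classify_delta $((rx1 - rx0)) $((tx1 - tx0))
}
```

For example, run `check_rx eth0 10` while a ping to the gateway is running in the background. An "rx-stalled" verdict would match the behavior seen in the traffic dump, although it cannot say whether the frames are lost in the PHY, the MAC, or the driver.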

What I will test the next time I can connect over serial (I’d be happy for input):

  • Deactivate the eth0 hardware features: ethtool -K eth0 tx off rx off tso off sg off gso off rxvlan off gro off
  • tcpdump, compiled onto a USB stick, to debug what the OS receives
  • phytool, compiled onto a USB stick, for possible PHY debugging later
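
The first two planned steps could be wrapped in a small script so they can be run quickly over a slow serial console. This is only a sketch: the ethtool feature list is the one from the post, while the `run` helper, the DRY_RUN switch, the frame count of 50, and the "arp or icmp" filter are assumptions chosen for illustration.

```shell
#!/bin/sh
# Sketch only: IFACE, DRY_RUN and the helper names are illustrative assumptions.
IFACE="${IFACE:-eth0}"

# Print the command; execute it only when DRY_RUN is not set to 1.
run() {
    echo "+ $*"
    [ "${DRY_RUN:-0}" = "1" ] || "$@"
}

# Show the current offload settings, then switch off the features listed above.
disable_offloads() {
    run ethtool -k "$IFACE"
    run ethtool -K "$IFACE" tx off rx off tso off sg off gso off rxvlan off gro off
}

# Capture a bounded number of ARP/ICMP frames as seen by the kernel,
# without name resolution (-n) and with link-level headers (-e).
capture_rx() {
    run tcpdump -i "$IFACE" -n -e -c 50 arp or icmp
}
```

Calling `disable_offloads` followed by `capture_rx` on an affected module would show whether any incoming ARP replies reach the kernel once offloading is switched off. With `DRY_RUN=1` the commands are only printed, which is convenient for checking the script on a development host first.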

Soft rebooting the system doesn’t get it out of this state. Only an electrical reset resolves the issue.

  1. Does this behavior sound familiar, and is it perhaps already fixed?

With that in mind, I found a patch for the Micrel PHY in the -next branch of my stable kernel release (BSP 2.7).

  2. Can my issue be solved by avoiding the PHY power-down? (I know the behavior described there doesn’t quite match, since TX is still possible here.)

Thinking of a possible hardware issue, I traced the reset pin of the Ethernet transceiver on the iMX6 module, and I think this pin isn’t connected because of which components are populated.

  3. Is a PHY reset not possible on the given Colibri iMX6?
  4. Is there a layout available showing possible soldering options and functions?

Pardon me for ignoring the “one question per post” rule, but I think this helps others most when kept together.
Best Regards

  1. Yes, this probably has to do with the Micrel PHY errata, which does not allow powering down the PHY without also powering down its rail. This is fixed in subsequent BSPs.

  2. Probably.

  3. No, that’s exactly what Micrel’s errata is about.

  4. I’m not fully sure what exactly you are looking for. Are you planning to solder around on our module?

  • 1-3) I patched my kernel and we are running long-term tests against it again. I will post the results here.
  • 4) My aim was to be able to reset the PHY if the patch doesn’t help, by soldering to the reset pin and knowing where it is connected.

As per Micrel’s errata, just resetting the PHY won’t help you any.

The issue was eventually resolved with this patch :)
I no longer notice any unreliable Ethernet devices.

Thanks for your valuable input.