U-Boot Driver for Ethernet Controller

Florian_K · July 21, 2017, 3:14pm

Thanks a lot for that hint.

Florian_K · July 24, 2017, 12:07pm

I wrote the script and let it run over the weekend for the warm-start scenario: within 70 hours the issue could not be reproduced.

I am blocked right now because the DC power switch, which I need for the cold-start scenario, has not arrived yet. I will let you know about any status changes…

Florian_K · July 24, 2017, 12:36pm

FYI: The script to warm-start/reboot the SOM until it is no longer pingable …
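
The original script is not reproduced in this thread. As a rough illustration only, a warm-start loop of this kind could look like the sketch below. The target address, the boot wait time, the use of SSH to trigger the reboot, and all function names are assumptions made for the example, not details of the script that was actually used.

```shell
#!/bin/sh
# Hypothetical sketch: reboot the target until it stops answering pings.
# TARGET, BOOT_WAIT and all function names are assumptions for illustration.

TARGET="${TARGET:-192.168.10.2}"   # assumed IP address of the SOM
BOOT_WAIT="${BOOT_WAIT:-45}"       # assumed seconds to wait for the boot to finish

# Returns 0 if the target answers a single ping within 2 seconds.
is_pingable() {
    ping -c 1 -W 2 "$TARGET" >/dev/null 2>&1
}

# Triggers a warm start; assumes key-based root SSH access to the target.
reboot_target() {
    ssh "root@$TARGET" reboot >/dev/null 2>&1
}

# Reboots up to $1 times. Prints the cycle count and returns 1 as soon as
# the target is no longer pingable after a reboot, otherwise returns 0.
run_cycles() {
    max="$1"
    cycle=0
    while [ "$cycle" -lt "$max" ]; do
        cycle=$((cycle + 1))
        reboot_target
        sleep "$BOOT_WAIT"
        if ! is_pingable; then
            echo "link lost after $cycle cycles"
            return 1
        fi
    done
    echo "no failure in $cycle cycles"
    return 0
}
```

Sourcing the file and calling, for example, run_cycles 5000 would start the loop. It stops at the first failed ping so that the board stays in the faulty state for inspection.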

Florian_K · July 25, 2017, 1:07pm

I reproduced the behaviour in Linux and attached /var/log/kern.log, /var/log/dmesg, and the output of sudo ethtool eth0. The output for the good case (physical link present) is attached here.

Florian_K · July 25, 2017, 3:26pm

If one tries to establish a link in U-Boot

setenv autoload false; if env exists ethaddr; then; else setenv ethaddr 00:14:2d:00:00:00; fi; pci enum; dhcp; run setethupdate;

and if it is not established

e1000: no NVM
e1000: e1000#0: ERROR: Valid Link not detected: -8

then executing reset multiple times and trying to establish the link again fails every time… A power-off/on cycle is required to get a link again.

Florian_K · July 25, 2017, 3:32pm

If the link is not established in U-Boot, the PCI devices are still listed as usual:

Apalis TK1 # pci
Scanning PCI devices on bus 0
BusDevFun  VendorId   DeviceId   Device Class       Sub-Class
_____________________________________________________________
00.02.00   0x10de     0x0e13     Bridge device           0x04

Apalis TK1 # pci header 00.02.00
  vendor ID =                   0x10de
  device ID =                   0x0e13
  command register ID =         0x0007
  status register =             0x0010
  revision ID =                 0xa1
  class code =                  0x06 (Bridge device)
  sub class code =              0x04
  programming interface =       0x00
  cache line =                  0x08
  latency time =                0x00
  header type =                 0x01
  BIST =                        0x00
  base address 0 =              0x00000000
  base address 1 =              0x00000000
  primary bus number =          0x00
  secondary bus number =        0x01
  subordinate bus number =      0x01
  secondary latency timer =     0x00
  IO base =                     0x11
  IO limit =                    0x11
  secondary status =            0x0000
  memory base =                 0x1300
  memory limit =                0x1300
  prefetch memory base =        0x2001
  prefetch memory limit =       0x1ff1
  prefetch memory base upper =  0x00000000
  prefetch memory limit upper = 0x00000000
  IO base upper 16 bits =       0x0000
  IO limit upper 16 bits =      0x0000
  expansion ROM base address =  0x00000000
  interrupt line =              0x00
  interrupt pin =               0x01
  bridge control =              0x0000

Florian_K · July 26, 2017, 11:18am

But within Linux, in rarer cases, a colleague found the PCI device missing…

Florian_K · July 26, 2017, 2:08pm

“Energy Efficient Ethernet” (EEE) is mentioned as another possible root cause for the observed behaviour, and it is enabled by default in Ångström:

root@apalis-tk1:~# ethtool --show-eee enp1s0
EEE Settings for enp1s0:
        EEE status: enabled - active
        Tx LPI: 0 (us)
        Supported EEE link modes:  100baseT/Full
                                   1000baseT/Full
        Advertised EEE link modes:  100baseT/Full
                                    1000baseT/Full
        Link partner advertised EEE link modes:  100baseT/Full
                                                 1000baseT/Full

I temporarily disabled EEE with ethtool --set-eee enp1s0 eee off. I then disconnected and reconnected the Ethernet cable at the switch over and over again and was not able to reproduce the missing link (the behaviour my colleague observed).

Unfortunately I am not able to test the same in the warm-start and cold-start scenarios because the EEE configuration is enabled again after every reset or power cycle.

Do you know how to disable EEE persistently over reboots?
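
For reference, one conceivable approach is to re-apply the ethtool setting on every boot. This is a sketch, not a confirmed answer. It uses only the two ethtool invocations shown above; the function names and the default interface name are assumptions, and the script would still need to be hooked into the boot sequence (for instance from a systemd oneshot unit), which is not shown here.

```shell
#!/bin/sh
# Hypothetical boot-time helper; IFACE default and function names are assumptions.
IFACE="${IFACE:-enp1s0}"

# Prints the word following "EEE status:" (e.g. "enabled" or "disabled").
eee_status() {
    ethtool --show-eee "$IFACE" | awk '/EEE status:/ { print $3; exit }'
}

# Turns EEE off unless ethtool already reports it as disabled.
disable_eee() {
    if [ "$(eee_status)" = "disabled" ]; then
        return 0
    fi
    ethtool --set-eee "$IFACE" eee off
}
```

Calling disable_eee once the interface exists would apply the setting. Note that this only takes effect after Linux has booted, so it does not cover a link brought up earlier in U-Boot.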

Florian_K · July 27, 2017, 8:47am

I opened another question because that’s a different topic…

Florian_K · July 28, 2017, 4:44pm

I received the Phidgets “Digital Output” to control the DC power supply yesterday. I wrote a test script and was able to reproduce the issue with the Apalis TK1 Linux BSP v2.7b3 for the cold-start scenario as well (after 2 1/2 hours of power cycling roughly every 45 seconds). I will be able to reproduce that again. Please let me know which log files would be valuable for you, and I will provide them.

I will run the test script over the weekend with our image (which has EEE disabled). Hopefully the issue is not reproducible with that change anymore…

Florian_K · July 31, 2017, 10:55am

During a run of approx. 60 hours over the weekend, with EEE disabled during kernel boot in our image, the issue did not occur again. I can also run the same script overnight until tomorrow morning (approx. 12 hours) with your image v2.7b3 and EEE disabled in U-Boot.

Florian_K · July 31, 2017, 12:50pm

I created a new question specific to the user-space link establishment issue.

marcel.tx · August 8, 2017, 2:39pm

It turns out that the current PCIe reset implementation in the PCIe board init function does not work reliably due to PCIe reset timing violations. This is fixed by overriding the tegra_pcie_board_port_reset() function.

Please find the respective patches on our U-Boot -next branch.

Florian_K · August 10, 2017, 8:45am

Great. Thanks a lot for the patch. Does this patch fix the issue at the Linux (kernel/user space) level as well? (related forum question)

marcel.tx · August 10, 2017, 8:54am

I guess that depends. Most probably yes, if one already brings up the link in U-Boot. However, a regular boot won’t do that. I’m actually working on an improved solution for Linux as well and will update the respective thread shortly.

BTW: Please note that my -next stuff already went through multiple iterations with the latest one dating back to yesterday evening.

Florian_K · August 10, 2017, 11:48am

Ok. We will figure it out either way when we run the tests again.

marcel.tx · August 10, 2017, 11:50am

I’m in the final stages of testing and will commit the Linux kernel part soon as well.

Florian_K · August 10, 2017, 12:09pm

Great. Thanks.

Florian_K · August 10, 2017, 12:43pm

What exactly do you mean by “Also allow optionally bringing up the PCIe switch as found on the Apalis Evaluation board. Note however that the Apalis PCIe port is also left disabled in the device tree by default.” in the commit message of the U-Boot bugfix?

marcel.tx · August 10, 2017, 1:36pm

It means exactly that. One may optionally bring up the PCIe switch in U-Boot as well, if desired or required. But regular booting does not require any of that; in fact, it does not touch PCIe at all. Basically, unless one explicitly runs pci enum, PCIe won’t be touched.