iMX7 hangs at Starting kernel

I am seeing a hang at “Starting kernel …” on a Colibri iMX7 1GB running the latest commit of Torizon’s warrior branch. We use a custom-designed carrier board whose 3.3V rail follows the reference design of the evaluation board. Measured with an oscilloscope, the voltage is stable and shows nothing suspicious at switching moments.

The OS is slightly adapted via device tree GPIO modifications.
U-Boot has just one patch, for a GPIO state.

I have seen this on 3 out of 40 modules, occurring sporadically during reboots in operation. Once it has happened, the module cannot be revived.

U-Boot 2016.11-1.0b1+g07edca0 (Jan 01 1970 - 00:00:00 +0000)

CPU:   Freescale i.MX7D rev1.3 996 MHz (running at 792 MHz)
CPU:   Extended Commercial temperature grade (-20C to 105C) at 37C
Reset cause: POR
DRAM:  1 GiB
PMIC:  RN5T567 LSIVER=0x01 OTPVER=0x0d
MMC:   FSL_SDHC: 0, FSL_SDHC: 1
Video: 640x480x18
In:    serial
Out:   serial
Err:   serial
Model: Toradex Colibri iMX7 Dual 1GB (eMMC) V1.1A, Serial# 06476460
Net:   FEC0
Hit any key to stop autoboot:  0
switch to partitions #0, OK
mmc0(part 0) is current device
Scanning mmc 0:1...
Found U-Boot script /boot.scr
reading /boot.scr
2173 bytes read in 15 ms (140.6 KiB/s)
## Executing script at 87000000
445 bytes read in 97 ms (3.9 KiB/s)
44411 bytes read in 110 ms (393.6 KiB/s)
7315968 bytes read in 263 ms (26.5 MiB/s)
2810093 bytes read in 168 ms (16 MiB/s)
Kernel image @ 0x81000000 [ 0x000000 - 0x6fa200 ]
## Flattened Device Tree blob at 82000000
   Booting using the fdt blob at 0x82000000
   Loading Device Tree to 8fff2000, end 8ffffd7a ... OK

Starting kernel ...

Is anything known about this behavior?

How would I debug this further?

Is this SW or HW related?

Greetings @m.tellian,

I have a couple of questions to see if I can better understand your situation.

So just to clarify you’re seeing this problem on only 3 out of 40 devices?

If you reflash Torizon on these 3 devices does the problem still exist?

Is there any issue if you use the master branch of Torizon rather than warrior?

Yes, currently 3 out of 40 devices.

I have not tried to reflash them, in case their current state is helpful for your analysis. I have one of them available; should I try?

No. After some testing I considered the warrior branch stable for our development, and I would rather avoid problems that come with new features.

Nevertheless, I can’t see any commit that fixes such behavior. So can you guarantee it is not a hardware problem?

Just to clarify, I suggest re-flashing because I’m curious whether the issue persists on the specific problematic hardware.

If re-flashing fixes it, that might suggest some intermittent software issue. If the problem persists through re-flashing and is isolated to those 3 devices, then it is perhaps some kind of hardware issue specific to those 3 devices.

Another thing that would reveal some information is whether you can get this problem to appear on other devices outside of the 3. If so, it would point more to a SW issue than a HW one.
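The reasoning above can be written down as a small decision helper. This is only a sketch of the rule of thumb described in this post, not a diagnostic tool; the names `ModuleResult` and `triage` are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class ModuleResult:
    name: str                  # label of the module, e.g. "K4"
    hangs_after_reflash: bool  # does the hang persist after a clean re-flash?


def triage(affected, seen_on_other_modules):
    """Return a rough hint: 'software', 'hardware', or 'inconclusive'."""
    # The problem showing up on modules outside the affected set points to SW.
    if seen_on_other_modules:
        return "software"
    # Persisting through re-flashing on every affected module points to HW.
    if affected and all(m.hangs_after_reflash for m in affected):
        return "hardware"
    # Re-flashing fixing every affected module points to an intermittent SW issue.
    if affected and not any(m.hangs_after_reflash for m in affected):
        return "software"
    return "inconclusive"
```

A mixed result (some modules recover after re-flashing, others do not) is reported as inconclusive rather than forced into either category.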

I tried reflashing the 3 modules, which I named K4, W1, and W2 to keep track of them:

K4 (USB problems?):
U-Boot 2016.11-1.0b1+g07edca0 (Jan 01 1970 - 00:00:00 +0000)

CPU:   Freescale i.MX7D rev1.3 996 MHz (running at 792 MHz)
CPU:   Extended Commercial temperature grade (-20C to 105C) at 37C
Reset cause: POR
DRAM:  1 GiB
PMIC:  RN5T567 LSIVER=0x01 OTPVER=0x0d
MMC:   FSL_SDHC: 0, FSL_SDHC: 1
Video: 640x480x18
In:    serial
Out:   serial
Err:   serial
Model: Toradex Colibri iMX7 Dual 1GB (eMMC) V1.1A, Serial# 06476460
Net:   FEC0
Hit any key to stop autoboot:  0
Colibri iMX7 #
Colibri iMX7 #
Colibri iMX7 #
Colibri iMX7 #
Colibri iMX7 # ls mmc 0:0
Failed to mount ext2 filesystem...
** Unrecognized filesystem type **
Colibri iMX7 # ls mmc 0:1
     2173   boot.scr

1 file(s), 0 dir(s)

Colibri iMX7 # ls mmc 0:2
<DIR>       4096 .
<DIR>       4096 ..
<DIR>      16384 lost+found
<DIR>       4096 sys
<DIR>       4096 home
<DIR>       4096 run
<DIR>       4096 boot
<DIR>       4096 tmp
<DIR>       4096 proc
<DIR>       4096 dev
<DIR>       4096 root
<DIR>       4096 ostree
Colibri iMX7 #


U-Boot 2016.11-1.0b1+g07edca0 (Jan 01 1970 - 00:00:00 +0000)

CPU:   Freescale i.MX7D rev1.3 996 MHz (running at 792 MHz)
CPU:   Extended Commercial temperature grade (-20C to 105C) at 23C
Reset cause: POR
DRAM:  1 GiB
PMIC:  RN5T567 LSIVER=0x01 OTPVER=0x0d
MMC:   FSL_SDHC: 0, FSL_SDHC: 1
Video: 640x480x18
In:    serial
Out:   serial
Err:   serial
Model: Toradex Colibri iMX7 Dual 1GB (eMMC) V1.1A, Serial# 06476460
Net:   FEC0
Hit any key to stop autoboot:  0
switch to partitions #0, OK
mmc0(part 0) is current device
Scanning mmc 0:1...
Found U-Boot script /boot.scr
reading /boot.scr
2173 bytes read in 14 ms (151.4 KiB/s)
## Executing script at 87000000
445 bytes read in 98 ms (3.9 KiB/s)
44411 bytes read in 110 ms (393.6 KiB/s)
7315968 bytes read in 264 ms (26.4 MiB/s)
2810152 bytes read in 169 ms (15.9 MiB/s)
Kernel image @ 0x81000000 [ 0x000000 - 0x6fa200 ]
## Flattened Device Tree blob at 82000000
   Booting using the fdt blob at 0x82000000
   Loading Device Tree to 8fff2000, end 8ffffd7a ... OK

Starting kernel ...
[HANG]

### USB REFLASH
Colibri iMX7 # usb start
starting USB...
USB0:   USB EHCI 1.00
scanning bus 0 for devices... 1 USB Device(s) found
USB1:
[HANG]

Colibri iMX7 # run usb_boot
starting USB...
USB0:   USB EHCI 1.00
scanning bus 0 for devices... 1 USB Device(s) found
USB1:
[HANG]

W1 (dead?):
U-Boot 2016.11-1.0b1+g07edca0 (Jan 01 1970 - 00:00:00 +0000)

CPU:   Freescale i.MX7D rev1.3 996 MHz (running at 792 MHz)
CPU:   Extended Commercial temperature grade (-20C to 105C) at 29C
[HANG]

This output appears only sometimes; in roughly 20 out of 21 attempts there is no output at all.

W2 (USB problems and could not load the kernel; probably the same on K4):
U-Boot 2016.11-1.0b1+g07edca0 (Jan 01 1970 - 00:00:00 +0000)

CPU:   Freescale i.MX7D rev1.3 996 MHz (running at 792 MHz)
CPU:   Extended Commercial temperature grade (-20C to 105C) at 33C
Reset cause: POR
DRAM:  1 GiB
PMIC:  RN5T567 LSIVER=0x01 OTPVER=0x0d
MMC:   FSL_SDHC: 0, FSL_SDHC: 1
Video: 640x480x18
In:    serial
Out:   serial
Err:   serial
Model: Toradex Colibri iMX7 Dual 1GB (eMMC) V1.1A, Serial# 06475879
Net:   FEC0
Hit any key to stop autoboot:  0
switch to partitions #0, OK
mmc0(part 0) is current device
Scanning mmc 0:1...
Found U-Boot script /boot.scr
reading /boot.scr
2173 bytes read in 14 ms (151.4 KiB/s)
## Executing script at 87000000
445 bytes read in 98 ms (3.9 KiB/s)
44411 bytes read in 111 ms (390.6 KiB/s)
7315968 bytes read in 264 ms (26.4 MiB/s)
2810152 bytes read in 169 ms (15.9 MiB/s)
Kernel image @ 0x81000000 [ 0x000000 - 0x6fa200 ]
## Flattened Device Tree blob at 82000000
   Booting using the fdt blob at 0x82000000
   Loading Device Tree to 8fff2000, end 8ffffd7a ... OK

Starting kernel ...
[HANG]


Colibri iMX7 # ls mmc 0:1
     2173   boot.scr

1 file(s), 0 dir(s)

Colibri iMX7 # ls mmc 0:2
<DIR>       4096 .
<DIR>       4096 ..
<DIR>      16384 lost+found
<DIR>       4096 sys
<DIR>       4096 home
<DIR>       4096 run
<DIR>       4096 boot
<DIR>       4096 tmp
<DIR>       4096 proc
<DIR>       4096 dev
<DIR>       4096 root
<DIR>       4096 ostree

Colibri iMX7 # ls mmc 0:2 boot/ostree/torizon-005454ea31178f8d4c50de418f5810b4a31052eb59934fad17974eda2c38c6de/
<DIR>       4096 .
<DIR>       4096 ..
         7315968 vmlinuz
         2810152 initramfs
           44411 devicetree-imx7d-colibri-emmc-mccu-v3.dtb
           48090 devicetree-imx7d-colibri-emmc-eval-v3.dtb
           48074 devicetree-imx7d-colibri-eval-v3.dtb
           44951 devicetree-imx7s-colibri-eval-v3.dtb
Colibri iMX7 # ls mmc 0:2 boot/ostree/torizon-005454ea31178f8d4c50de418f5810b4a31052eb59934fad17974eda2c38c6de/vmlinuz
** Can not find directory. **




Colibri iMX7 # usb start
starting USB...
USB0:   Port not available.
USB1:   USB EHCI 1.00
scanning bus 1 for devices... 1 USB Device(s) found
       scanning usb for storage devices... 0 Storage Device(s) found
Colibri iMX7 #
Colibri iMX7 # usb info
1: Hub,  USB Revision 2.0
 - u-boot EHCI Host Controller
 - Class: Hub
 - PacketSize: 64  Configurations: 1
 - Vendor: 0x0000  Product 0x0000 Version 1.0
   Configuration: 1
   - Interfaces: 1 Self Powered 0mA
     Interface: 0
     - Alternate Setting 0, Endpoints: 1
     - Class Hub
     - Endpoint 1 In Interrupt MaxPacket 8 Interval 255ms

@m.tellian

Looking at the logs you provided, I’m more inclined to believe it’s a HW issue, especially on W1. I’ve also been unable to reproduce anything I’ve seen here using the warrior branch of Torizon with Toradex carrier boards.

One final test to really see whether it is a HW issue: have you tried flashing software other than Torizon on these 3 problem modules? For example, maybe try our normal Linux BSP?

I assume these 3 ran stably before, using some other software, correct?

I tried to reflash these three, but the USB controller no longer finds my USB stick, or the controller itself is not found.
So I assume a HW fault originating from USB.
I’ll investigate here.

Did you flash the regular Linux BSP? Could you provide some logs for Linux BSP 2.8b6? Thanks.

Since USB is no longer working, I have to try to flash it via LAN.
Which logs specifically from BSP 2.8b6 would be helpful for you?
Or should I just confirm whether a newly flashed OS boots on the Colibri with the faulty USB?
As you can see above, the FS seems to be fine, since U-Boot reads it successfully.

Yes, or you can use an SD card to flash.

Which logs specifically from BSP 2.8b6 would be helpful for you?

The dmesg log, the output of lsusb, and any messages that appear when you plug any USB device in and out.
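If it helps, the two command outputs can be gathered into one text report with a few lines of Python run on the module. This is a minimal sketch that assumes Python 3.7 or later is available on the image; the helper names `run` and `collect` are hypothetical. Messages from plugging a USB device in and out still need to be captured separately, for example by running the script again afterwards.

```python
import subprocess

# The two tools named in the request; both are standard Linux commands.
COMMANDS = {
    "dmesg": ["dmesg"],
    "lsusb": ["lsusb"],
}


def run(cmd):
    """Run one command and return its output, or a note if it could not be run."""
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired) as exc:
        return f"<could not run {' '.join(cmd)}: {exc}>"
    return proc.stdout + proc.stderr


def collect(commands=COMMANDS, runner=run):
    """Build one text report with a header line in front of each command's output."""
    parts = []
    for name, cmd in commands.items():
        parts.append(f"===== {name} =====")
        parts.append(runner(cmd).rstrip())
    return "\n".join(parts) + "\n"
```

Calling `print(collect(), end="")` and redirecting the output to a file gives a single attachment for the thread.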

Or should I just confirm whether a newly flashed OS boots on the Colibri with the faulty USB?

If the freshly flashed regular BSP 2.8b6 does not boot, then please share the serial boot log.
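When comparing many captured serial logs, it can help to label each one by the last boot stage that is visible in it. The sketch below assumes one boot per captured log and relies only on strings that appear in the logs in this thread; the function name `last_boot_stage` and the stage labels are made up for illustration.

```python
import re

# Kernel messages on the serial console start with a timestamp
# such as "[    1.040341]".
KERNEL_LINE = re.compile(r"^\[\s*\d+\.\d+\]", re.MULTILINE)


def last_boot_stage(log):
    """Name the last boot stage visible in one captured serial log."""
    if KERNEL_LINE.search(log):
        return "kernel output seen"
    if "Starting kernel ..." in log:
        return "silent after 'Starting kernel ...'"
    if "Hit any key to stop autoboot" in log:
        return "stopped in U-Boot after autoboot prompt"
    if "U-Boot" in log:
        return "stopped during early U-Boot init"
    return "no output"
```

The checks run from the latest stage to the earliest, so a log that reaches the kernel is not mislabeled by the U-Boot banner that precedes it.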

Thanks.

I tried to load the Easy Installer 2.0b3 from an SD card.
For comparison, these logs are from a healthy module (although the warnings and errors vary from time to time, and sometimes it gets stuck at booting the kernel…):

U-Boot 2016.11-1.0b1+g07edca0 (Jan 01 1970 - 00:00:00 +0000)

CPU:   Freescale i.MX7D rev1.3 996 MHz (running at 792 MHz)
CPU:   Extended Commercial temperature grade (-20C to 105C) at 38C
Reset cause: POR
DRAM:  1 GiB
PMIC:  RN5T567 LSIVER=0x01 OTPVER=0x0d
MMC:   FSL_SDHC: 0, FSL_SDHC: 1
Video: 640x480x18
In:    serial
Out:   serial
Err:   serial
Model: Toradex Colibri iMX7 Dual 1GB (eMMC) V1.1A, Serial# 06476989
Net:   FEC0
Hit any key to stop autoboot:  0
Colibri iMX7 #
Colibri iMX7 #
Colibri iMX7 #
Colibri iMX7 # run bootcmd_mmc1
switch to partitions #0, OK
mmc1 is current device
Scanning mmc 1:1...
Found U-Boot script /boot.scr
reading /boot.scr
471 bytes read in 10 ms (45.9 KiB/s)
## Executing script at 87000000
reading /tezi.itb
36423988 bytes read in 1585 ms (21.9 MiB/s)
## Loading kernel from FIT Image at 82100000 ...
   Using 'config@imx7d-emmc' configuration
   Trying 'kernel@1' kernel subimage
     Description:  Linux Kernel 4.9.87-2.8.3+gdef4502031f6
     Type:         Kernel Image
     Compression:  uncompressed
     Data Start:   0x821000f0
     Data Size:    5740192 Bytes = 5.5 MiB
     Architecture: ARM
     OS:           Linux
     Load Address: 0x81000000
     Entry Point:  0x81000000
     Hash algo:    md5
     Hash value:   489f42eaa8bf0282f136bc4bbdf69a83
   Verifying Hash Integrity ... md5+ OK
## Loading ramdisk from FIT Image at 82100000 ...
   Using 'config@imx7d-emmc' configuration
   Trying 'ramdisk@1' ramdisk subimage
     Description:  SquashFS RAMdisk
     Type:         RAMDisk Image
     Compression:  uncompressed
     Data Start:   0x82679878
     Data Size:    30547968 Bytes = 29.1 MiB
     Architecture: ARM
     OS:           Linux
     Load Address: unavailable
     Entry Point:  unavailable
     Hash algo:    md5
     Hash value:   882e04c342e4bd49379e57c24c84f00c
   Verifying Hash Integrity ... md5+ OK
## Loading fdt from FIT Image at 82100000 ...
   Using 'config@imx7d-emmc' configuration
   Trying 'fdt@imx7d-emmc' fdt subimage
     Description:  Colibri iMX7 Dual 1GB eMMC Device Tree
     Type:         Flat Device Tree
     Compression:  uncompressed
     Data Start:   0x843b1184
     Data Size:    45385 Bytes = 44.3 KiB
     Architecture: ARM
     Hash algo:    md5
     Hash value:   353b795deb63f90f301ebb17789e21c5
   Verifying Hash Integrity ... md5+ OK
   Booting using the fdt blob at 0x843b1184
   Loading Kernel Image ... OK
   Loading Device Tree to 8fff1000, end 8ffff148 ... OK

Starting kernel ...

[    1.040341] CPU1: failed to come online
[    1.393601] spi_imx 30840000.ecspi: dma setup error -19, use pio
[    1.625416] caam 30900000.caam: failed to acquire DECO 0
[    1.630756] caam 30900000.caam: failed to instantiate RNG
[    1.675017] coresight-etm3x 3007d000.etm: ETM arch init failed
Running /etc/rc.local...
TDX_VER_ID="Colibri-iMX7_ToradexEasyInstaller_2.0b3-20191029"
Starting udev
[    2.621360] udevd[146]: specified group 'kvm' unknown
$Starting haveged: haveged: haveged starting up
[  OK  ]
System time was Thu Jan  1 00:00:03 UTC 1970.
Setting the System Clock using the Hardware Clock as reference...
hwclock: can't open '/dev/misc/rtc': No such file or directory
System Clock set. System local time is now Thu Jan  1 00:00:03 UTC 1970.
Tue Oct 29 19:18:24 UTC 2019
Saving the System Clock time to the Hardware Clock...
hwclock: can't open '/dev/misc/rtc': No such file or directory
Hardware Clock updated to Tue Oct 29 19:18:24 UTC 2019.
Certificates for RDP not found in /var/volatile
Generating RSA private key, 2048 bit long modulus (2 primes)
.................+++++
.......................................................haveged: haveged: ver: 1.9.2; arch: generic; vend: ; build: (gcc 8.2.0 CTV); collect: 128K

haveged: haveged: cpu: (VC); data: 16K (D); inst: 16K (D); idx: 12/40; sz: 15012/57848

haveged: haveged: tot tests(BA8): A:1/1 B:1/1 continuous tests(B):  last entropy estimate 8.00451

haveged: haveged: fills: 0, generated: 0

................................................................................+++++
e is 65537 (0x010001)
Signature ok
subject=C = CH, ST = Luzern, L = Luzern, O = Toradex, CN = (none)
Getting Private key
Certificate for RDP successfully generated

Welcome to the Toradex Easy Installer

This is a Linux based installer for Toradex modules. Currently, the installer
does not have a serial console interface. You can use the Toradex Easy Installer
via any of the available display interfaces using USB mouse/keyboard or via a
network connection using RDP. Use:
  # ip addr show eth0
to display the Ethernet IP address or use USB RNDIS at IP 192.168.11.1.

Check our documentation at:
  https://developer.toradex.com/software/toradex-easy-installer
/ #

On the bad module I got:

U-Boot 2016.11-1.0b1+g07edca0 (Jan 01 1970 - 00:00:00 +0000)

CPU:   Freescale i.MX7D rev1.3 996 MHz (running at 792 MHz)
CPU:   Extended Commercial temperature grade (-20C to 105C) at 34C
Reset cause: POR
DRAM:  1 GiB
PMIC:  RN5T567 LSIVER=0x01 OTPVER=0x0d
MMC:   FSL_SDHC: 0, FSL_SDHC: 1
Video: 640x480x18
In:    serial
Out:   serial
Err:   serial
Model: Toradex Colibri iMX7 Dual 1GB (eMMC) V1.1A, Serial# 06475694
Net:   FEC0
Hit any key to stop autoboot:  0
Colibri iMX7 #
Colibri iMX7 # usb start
starting USB...
USB0:   USB EHCI 1.00
scanning bus 0 for devices... 1 USB Device(s) found
USB1: [HANGS HERE - POWER CYCLE]

U-Boot 2016.11-1.0b1+g07edca0 (Jan 01 1970 - 00:00:00 +0000)

CPU:   Freescale i.MX7D rev1.3 996 MHz (running at 792 MHz)
CPU:   Extended Commercial temperature grade (-20C to 105C) at 35C
Reset cause: POR
DRAM:  1 GiB
PMIC:  RN5T567 LSIVER=0x01 OTPVER=0x0d
MMC:   FSL_SDHC: 0, FSL_SDHC: 1
Video: 640x480x18
In:    serial
Out:   serial
Err:   serial
Model: Toradex Colibri iMX7 Dual 1GB (eMMC) V1.1A, Serial# 06475694
Net:   FEC0
Hit any key to stop autoboot:  0
Colibri iMX7 # run bootcmd_mmc1
switch to partitions #0, OK
mmc1 is current device
Scanning mmc 1:1...
Found U-Boot script /boot.scr
reading /boot.scr
471 bytes read in 10 ms (45.9 KiB/s)
## Executing script at 87000000
reading /tezi.itb
36423988 bytes read in 1584 ms (21.9 MiB/s)
## Loading kernel from FIT Image at 82100000 ...
   Using 'config@imx7d-emmc' configuration
   Trying 'kernel@1' kernel subimage
     Description:  Linux Kernel 4.9.87-2.8.3+gdef4502031f6
     Type:         Kernel Image
     Compression:  uncompressed
     Data Start:   0x821000f0
     Data Size:    5740192 Bytes = 5.5 MiB
     Architecture: ARM
     OS:           Linux
     Load Address: 0x81000000
     Entry Point:  0x81000000
     Hash algo:    md5
     Hash value:   489f42eaa8bf0282f136bc4bbdf69a83
   Verifying Hash Integrity ... md5+ OK
## Loading ramdisk from FIT Image at 82100000 ...
   Using 'config@imx7d-emmc' configuration
   Trying 'ramdisk@1' ramdisk subimage
     Description:  SquashFS RAMdisk
     Type:         RAMDisk Image
     Compression:  uncompressed
     Data Start:   0x82679878
     Data Size:    30547968 Bytes = 29.1 MiB
     Architecture: ARM
     OS:           Linux
     Load Address: unavailable
     Entry Point:  unavailable
     Hash algo:    md5
     Hash value:   882e04c342e4bd49379e57c24c84f00c
   Verifying Hash Integrity ... md5+ OK
## Loading fdt from FIT Image at 82100000 ...
   Using 'config@imx7d-emmc' configuration
   Trying 'fdt@imx7d-emmc' fdt subimage
     Description:  Colibri iMX7 Dual 1GB eMMC Device Tree
     Type:         Flat Device Tree
     Compression:  uncompressed
     Data Start:   0x843b1184
     Data Size:    45385 Bytes = 44.3 KiB
     Architecture: ARM
     Hash algo:    md5
     Hash value:   353b795deb63f90f301ebb17789e21c5
   Verifying Hash Integrity ... md5+ OK
   Booting using the fdt blob at 0x843b1184
   Loading Kernel Image ... OK
   Loading Device Tree to 8fff1000, end 8ffff148 ... OK

Starting kernel ...

[HANGS HERE - POWER CYCLE]

U-Boot 2016.11-1.0b1+g07edca0 (Jan 01 1970 - 00:00:00 +0000)

CPU:   Freescale i.MX7D rev1.3 996 MHz (running at 792 MHz)
CPU:   Extended Commercial temperature grade (-20C to 105C) at 38C
Reset cause: POR
DRAM:  1 GiB
PMIC:  RN5T567 LSIVER=0x01 OTPVER=0x0d
MMC:   FSL_SDHC: 0, FSL_SDHC: 1
Video: 640x480x18
In:    serial
Out:   serial
Err:   serial
Model: Toradex Colibri iMX7 Dual 1GB (eMMC) V1.1A, Serial# 06475694
Net:   FEC0
Hit any key to stop autoboot:  0
Colibri iMX7 # run bootcmd_mmc1
switch to partitions #0, OK
mmc1 is current device
Scanning mmc 1:1...
Found U-Boot script /boot.scr
reading /boot.scr
471 bytes read in 11 ms (41 KiB/s)
## Executing script at 87000000
reading /tezi.itb
36423988 bytes read in 1585 ms (21.9 MiB/s)
## Loading kernel from FIT Image at 82100000 ...
   Using 'config@imx7d-emmc' configuration
   Trying 'kernel@1' kernel subimage
     Description:  Linux Kernel 4.9.87-2.8.3+gdef4502031f6
     Type:         Kernel Image
     Compression:  uncompressed
     Data Start:   0x821000f0
     Data Size:    5740192 Bytes = 5.5 MiB
     Architecture: ARM
     OS:           Linux
     Load Address: 0x81000000
     Entry Point:  0x81000000
     Hash algo:    md5
     Hash value:   489f42eaa8bf0282f136bc4bbdf69a83
   Verifying Hash Integrity ... md5+ OK
## Loading ramdisk from FIT Image at 82100000 ...
   Using 'config@imx7d-emmc' configuration
   Trying 'ramdisk@1' ramdisk subimage
     Description:  SquashFS RAMdisk
     Type:         RAMDisk Image
     Compression:  uncompressed
     Data Start:   0x82679878
     Data Size:    30547968 Bytes = 29.1 MiB
     Architecture: ARM
     OS:           Linux
     Load Address: unavailable
     Entry Point:  unavailable
     Hash algo:    md5
     Hash value:   882e04c342e4bd49379e57c24c84f00c
   Verifying Hash Integrity ... md5+ OK
## Loading fdt from FIT Image at 82100000 ...
   Using 'config@imx7d-emmc' configuration
   Trying 'fdt@imx7d-emmc' fdt subimage
     Description:  Colibri iMX7 Dual 1GB eMMC Device Tree
     Type:         Flat Device Tree
     Compression:  uncompressed
     Data Start:   0x843b1184
     Data Size:    45385 Bytes = 44.3 KiB
     Architecture: ARM
     Hash algo:    md5
     Hash value:   353b795deb63f90f301ebb17789e21c5
   Verifying Hash Integrity ... md5+ OK
   Booting using the fdt blob at 0x843b1184
   Loading Kernel Image ... OK
   Loading Device Tree to 8fff1000, end 8ffff148 ... OK

Starting kernel ...

[    1.040334] CPU1: failed to come online
[    1.392941] spi_imx 30840000.ecspi: dma setup error -19, use pio
[HANGS HERE]

Since ESD damage is my best guess, I verified our ESD protection circuit with 15kV pulses. In terms of parts it is nearly the same as the reference design, and it withstood the test quite well.

Mentally, my next steps would be the following:

  1. Ask you what to do next.
  2. Apply ESD directly to the module’s USB lines to see whether it produces the same errors as on my 4 damaged modules.
  3. Send the modules to you.

Just for info:
The stable 1.8 failed, as it always does from any source:

Colibri iMX7 # run bootcmd_mmc1
switch to partitions #0, OK
mmc1 is current device
Scanning mmc 1:1...
Found U-Boot script /boot.scr
reading /boot.scr
464 bytes read in 11 ms (41 KiB/s)
## Executing script at 87000000
reading /tezi.itb
21374744 bytes read in 938 ms (21.7 MiB/s)
## Loading kernel from FIT Image at 82100000 ...
   Using 'config@imx7d-emmc' configuration
   Trying 'kernel@1' kernel subimage
     Description:  Linux Kernel 4.1
     Type:         Kernel Image
     Compression:  uncompressed
     Data Start:   0x821000dc
     Data Size:    5308472 Bytes = 5.1 MiB
     Architecture: ARM
     OS:           Linux
     Load Address: 0x81000000
     Entry Point:  0x81000000
     Hash algo:    md5
     Hash value:   910fc4bfe6325ca943e1af0824a15957
   Verifying Hash Integrity ... md5+ OK
## Loading ramdisk from FIT Image at 82100000 ...
   Using 'config@imx7d-emmc' configuration
   Trying 'ramdisk@1' ramdisk subimage
     Description:  SquashFS RAMdisk
     Type:         RAMDisk Image
     Compression:  uncompressed
     Data Start:   0x826101fc
     Data Size:    15929344 Bytes = 15.2 MiB
     Architecture: ARM
     OS:           Linux
     Load Address: unavailable
     Entry Point:  unavailable
     Hash algo:    md5
     Hash value:   bf0a406d6fa572c48f601023ca980bd2
   Verifying Hash Integrity ... md5+ OK
## Loading fdt from FIT Image at 82100000 ...
   Using 'config@imx7d-emmc' configuration
   Trying 'fdt@imx7d-emmc' fdt subimage
     Description:  Colibri iMX7 Dual 1GB eMMC Device Tree
     Type:         Flat Device Tree
     Compression:  uncompressed
     Data Start:   0x83557184
     Data Size:    45372 Bytes = 44.3 KiB
     Architecture: ARM
     Hash algo:    md5
     Hash value:   b34140d2c5400aef8309a51bb7486030
   Verifying Hash Integrity ... md5+ OK
   Booting using the fdt blob at 0x83557184
   Loading Kernel Image ... OK
   Loading Device Tree to 8fff1000, end 8ffff13b ... OK

Starting kernel ...

[    1.000306] CPU1: failed to come online
[    1.296890] imx_usb 30b10000.usb: Can't register ci_hdrc platform device, err=-517
[    1.469152] caam 30900000.caam: failed to acquire DECO 0
[    1.503763] caam 30900000.caam: failed to acquire DECO 0
[    1.536471] caam 30900000.caam: failed to acquire DECO 0
[    1.570209] caam 30900000.caam: failed to acquire DECO 0
[    1.607340] caam 30900000.caam: failed to acquire DECO 0
[    1.641477] caam 30900000.caam: failed to acquire DECO 0
[    1.676604] caam 30900000.caam: failed to acquire DECO 0
[    1.713708] caam 30900000.caam: failed to acquire DECO 0
[    1.747781] caam 30900000.caam: failed to acquire DECO 0
[    1.781859] caam 30900000.caam: failed to acquire DECO 0
[    1.815930] caam 30900000.caam: failed to acquire DECO 0
[    1.850038] caam 30900000.caam: failed to acquire DECO 0
[    1.884111] caam 30900000.caam: failed to acquire DECO 0
[    1.918184] caam 30900000.caam: failed to acquire DECO 0
[    1.952269] caam 30900000.caam: failed to acquire DECO 0
[    1.986344] caam 30900000.caam: failed to acquire DECO 0
[    2.019454] caam 30900000.caam: failed to acquire DECO 0
[    2.054561] caam 30900000.caam: failed to acquire DECO 0
[    2.089671] caam 30900000.caam: failed to acquire DECO 0
[    2.121725] caam 30900000.caam: failed to acquire DECO 0
[    2.153792] caam 30900000.caam: failed to acquire DECO 0
[    2.185863] caam 30900000.caam: failed to acquire DECO 0
[    2.217950] caam 30900000.caam: failed to acquire DECO 0
[    2.251061] caam 30900000.caam: failed to acquire DECO 0
[    2.256385] caam 30900000.caam: failed to instantiate RNG
[    2.262093] Trying to vfree() nonexistent vm area (b4063000)
[    2.269643] Unable to handle kernel NULL pointer dereference at virtual address 00000004
[    2.277745] pgd = 80004000
[    2.280520] [00000004] *pgd=00000000
[    2.284126] Internal error: Oops: 805 [#1] SMP ARM
[    2.288922] Modules linked in:
[    2.292004] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G        W       4.1.44-1.8.0+gf3637519a6aa #1
[    2.300881] Hardware name: Freescale i.MX7 Dual (Device Tree)
[    2.306634] task: b407c000 ti: b4062000 task.ti: b4062000
[    2.312043] PC is at caam_sm_startup+0x8c/0x3cc
[    2.316580] LR is at device_add+0x150/0x568
[    2.320773] pc : [<804c03dc>]    lr : [<8037504c>]    psr: a0000113
[    2.320773] sp : b4063ec8  ip : 00000000  fp : 00000000
[    2.332256] r10: 00000000  r9 : 808f4624  r8 : 00000000
[    2.337489] r7 : 808dab9c  r6 : b779d838  r5 : b4176a10  r4 : b4211840
[    2.344023] r3 : b5386400  r2 : b5386410  r1 : 00000000  r0 : b5386400
[    2.350559] Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
[    2.357875] Control: 10c5387d  Table: 8000406a  DAC: 00000015
[    2.363627] Process swapper/0 (pid: 1, stack limit = 0xb4062210)
[    2.369641] Stack: (0xb4063ec8 to 0xb4064000)
[    2.374009] 3ec0:                   804d476c 808dab9c 00000000 b5386400 b4176a10 8090b920
[    2.382198] 3ee0: 8090b920 b53718c0 808dab9c 00000000 808f4624 00000007 00000000 808dabec
[    2.390387] 3f00: 8090b920 80009678 80911c64 808f4600 00000000 80141aac 80911d00 b4113700
[    2.398575] 3f20: 00000000 809105c4 80885748 000000ac b7fffb3f 800453a4 00000000 80822284
[    2.406763] 3f40: 00000006 00000006 b7fffb4a 807d11a4 809105ac 000000ac 80951000 000000ac
[    2.414952] 3f60: 80951000 80951000 808fe0e4 808f461c 808f4624 808b4db8 00000006 00000006
[    2.423140] 3f80: 808b45a0 80654750 00000000 80654750 00000000 00000000 00000000 00000000
[    2.431327] 3fa0: 00000000 80654758 00000000 8000f3a8 00000000 00000000 00000000 00000000
[    2.439515] 3fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[    2.447703] 3fe0: 00000000 00000000 00000000 00000000 00000013 00000000 ffffffff 10080000
[    2.455899] [<804c03dc>] (caam_sm_startup) from [<808dabec>] (caam_sm_init+0x50/0x58)
[    2.463744] [<808dabec>] (caam_sm_init) from [<80009678>] (do_one_initcall+0x8c/0x1d0)
[    2.471675] [<80009678>] (do_one_initcall) from [<808b4db8>] (kernel_init_freeable+0x144/0x1d4)
[    2.480389] [<808b4db8>] (kernel_init_freeable) from [<80654758>] (kernel_init+0x8/0xe8)
[    2.488495] [<80654758>] (kernel_init) from [<8000f3a8>] (ret_from_fork+0x14/0x2c)
[    2.496076] Code: e59d300c e2832010 e5843008 e5834068 (e58a2004)
[    2.502214] ---[ end trace 5b4cfc5db27edc94 ]---
[    2.506861] mmc0: MAN_BKOPS_EN bit is not set
[    2.511321] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[    2.511321]
[    2.520468] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[    2.520468]

Hi @m.tellian

Thanks for the log.

Since ESD damage is my best guess, I verified our ESD protection circuit with 15kV pulses. In terms of parts it is nearly the same as the reference design, and it withstood the test quite well.

Mentally, my next steps would be the following:
ask you what to do next

If you are sure that the modules were damaged by ESD, then nothing can be done. These modules won’t be covered by warranty.

Best regards,
Jaski

From my point of view, I can rule out ESD damage caused by our system. The ESD testing is just a precautionary attempt to possibly reproduce this error.

I think the cause is to be found somewhere around:

Colibri iMX7 # usb start
 starting USB...
 USB0:   USB EHCI 1.00
 scanning bus 0 for devices... 1 USB Device(s) found
 USB1:
    [HANG]

I think the kernel hangs silently at the same point during boot. What can cause such a problem?

I would really appreciate your input in this situation.

Thanks

Hi @m.tellian

So it hangs in U-Boot. Could you reproduce the error on one of the Toradex boards? If yes, is the issue only on one sample or also on other modules?

Best regards,
Jaski

Yes, the behavior is tied to the Colibri module.
Once it has happened, the module hangs on every boot at “Starting kernel …”, and in the bootloader at “usb start” if that is executed (independent of the carrier board).

By now I have 5 such modules, of which 2 were documented as handled ESD-correctly through the whole production line (3 were dev modules).

Thanks for your answer.

It seems that these modules have a hardware issue.

Can you file an RMA and put the link to this thread as a comment? We apologize that you had this experience with our hardware.

Best regards,
Jaski