Debugging Linux boot failure when running an old kernel (Linux 3.14.28) on top of a new bootloader (v2.8)

gardarh · September 26, 2019, 8:18am

Hello,

I’m working on updating our manufacturing process to work with the Toradex Colibri imx6DL V1.1A board. Our manufacturing process was previously set up to support the V1.0 board. My initial idea was that updating U-Boot to the latest version (we have custom modifications to U-Boot) would be enough and that I wouldn’t have to touch the Linux part.

I have successfully updated U-Boot to the latest stable version (v2.8), but when I run my Linux image it hangs. This is the beginning of the U-Boot output:

Industrial temperature grade DDR3 timings, 64bit bus width.
Trying to boot from MMC1


U-Boot 2016.11-00019-g38333b5-dirty (Sep 25 2019 - 14:29:53 +0000)

CPU:   Freescale i.MX6DL rev1.4 at 792 MHz
Reset cause: POR
I2C:   ready
DRAM:  512 MiB
PMIC:  device id: 0x10, revision id: 0x21, programmed
MMC:   FSL_SDHC: 0, FSL_SDHC: 1
auto-detected panel vga-rgb
Display: vga-rgb (640x480)
In:    serial
Out:   serial
Err:   serial
Model: Toradex Colibri iMX6 DualLite 512MB IT V1.1A, Serial# 10586151
Net:   using PHY at 0

And this is the beginning of the output when I run my Linux image:

Colibri iMX6-Nox # run bootcmd
Booting from Manufacturing SD card...
switch to partitions #0, OK
mmc1 is current device
42605 bytes read in 121 ms (343.8 KiB/s)
4728640 bytes read in 538 ms (8.4 MiB/s)
## Booting kernel from Legacy Image at 11000000 ...
   Image Name:   Linux-3.14.28
   Image Type:   ARM Linux Kernel Image (uncompressed)
   Data Size:    4728576 Bytes = 4.5 MiB
   Load Address: 10008000
   Entry Point:  10008000
   Verifying Checksum ... OK
## Flattened Device Tree blob at 12100000
   Booting using the fdt blob at 0x12100000
   Loading Kernel Image ... OK
   Loading Device Tree to 1fff2000, end 1ffff66c ... OK

Starting kernel ...

[    0.000000] Booting Linux on physical CPU 0x0
[    0.000000] Initializing cgroup subsys cpu
[    0.000000] Initializing cgroup subsys cpuacct
[    0.000000] Linux version 3.14.28 (gardarh@yoctodev) (gcc version 4.9.3 20150311 (prerelease) (Linaro GCC 4.9-2015.03) ) #1 SMP Tue Mar 29 11:51:57 GMT 2016
[    0.000000] CPU: ARMv7 Processor [412fc09a] revision 10 (ARMv7), cr=10c5387d
[    0.000000] CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache
[    0.000000] Machine model: Toradex Colibri iMX6DL/S on Colibri Evaluation Board V3
[    0.000000] cma: CMA: reserved 256 MiB at 20000000
[    0.000000] Memory policy: Data cache writealloc
[    0.000000] PERCPU: Embedded 8 pages/cpu @8fb3c000 s8512 r8192 d16064 u32768
[    0.000000] Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 130048
[    0.000000] Kernel command line: galcore.contiguousSize=50331648 user_debug=30 ip=off root=/dev/mmcblk1p1 rw,noatime rootfstype=ext2 rootwait fec_mac=00:14:2d:a1:88:27 consoleblank=0 no_console_suspend=1 console=tty1 console=ttymxc0,115200n8 video=mxcfb0:dev=lcd,640x480M@60,if=RGB666 video=mxcfb1:off fbmem=8M

This is the last output I get in my console:

[    1.813970] ci_hdrc ci_hdrc.1: USB 2.0 started, EHCI 1.00
[    1.820317] hub 2-0:1.0: USB hub found
[    1.824156] hub 2-0:1.0: 1 port detected
[    1.829607] input: gpio-keys.24 as /devices/soc0/gpio-keys.24/input/input1
[    1.836899] snvs_rtc 20cc034.snvs-rtc-lp: setting system clock to 1970-01-01 00:00:00 UTC (0)
[    1.856208] ALSA device list:
[    1.859240]   #0: imx6-colibri-sgtl5000

And then it goes silent.

Please tell me if you can spot any red flags in the output above.

Alternatively, can you please direct me to how I go about figuring out what is breaking the boot sequence?

NOTE: I’m running this on an IT board we purchased by mistake (we generally use non-IT boards). I don’t think this matters, but I wanted to mention it just in case.

Thank you,
Gardar

gardarh · September 26, 2019, 8:33am

I managed to boot an old board; these are the lines it prints at the point where the new board goes silent:

[    1.896165] ALSA device list:
[    1.899204]   #0: imx6-colibri-sgtl5000
[    1.913708] EXT3-fs (mmcblk1p1): recovery required on readonly filesystem
[    1.920541] EXT3-fs (mmcblk1p1): write access will be enabled during recovery
[    2.614433] kjournald starting.  Commit interval 5 seconds
[    2.622357] EXT3-fs (mmcblk1p1): recovery complete
[    2.627244] EXT3-fs (mmcblk1p1): mounted filesystem with ordered data mode
[    2.634265] VFS: Mounted root (ext3 filesystem) readonly on device 179:1.
[    2.644140] devtmpfs: mounted

I guess I have my answer for the previous question: look at the boot sequence for a working device.
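
A minimal sketch of that comparison in Python, assuming both boot logs have been captured from the serial console as lists of lines. The function names are made up for illustration. It strips the kernel timestamps (which differ from boot to boot) and returns what the working board prints after the last message seen on the hanging board:

```python
import re

# Matches the "[    1.856208] " prefix the kernel puts in front of each message.
TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")


def strip_timestamp(line):
    """Remove the kernel timestamp so that lines from two boots can be compared."""
    return TIMESTAMP.sub("", line.rstrip("\n"))


def messages_after_hang(failing_log, working_log):
    """Return the working-boot messages that follow the last failing-boot message.

    Returns an empty list if the last failing message is not found in the
    working log (or if the failing log is empty).
    """
    failing = [strip_timestamp(l) for l in failing_log if l.strip()]
    working = [strip_timestamp(l) for l in working_log if l.strip()]
    if not failing:
        return []
    last_seen = failing[-1]
    # Search from the end in case the same message is printed more than once.
    for index in range(len(working) - 1, -1, -1):
        if working[index] == last_seen:
            return working[index + 1:]
    return []


# Tail of the hanging boot and part of the working boot, copied from the logs above.
FAILING_TAIL = [
    "[    1.856208] ALSA device list:",
    "[    1.859240]   #0: imx6-colibri-sgtl5000",
]
WORKING = [
    "[    1.896165] ALSA device list:",
    "[    1.899204]   #0: imx6-colibri-sgtl5000",
    "[    1.913708] EXT3-fs (mmcblk1p1): recovery required on readonly filesystem",
    "[    1.920541] EXT3-fs (mmcblk1p1): write access will be enabled during recovery",
]

for message in messages_after_hang(FAILING_TAIL, WORKING):
    print(message)
```

With the two excerpts above, the first missing message is the EXT3-fs line, which suggests the hang happens at or just before mounting the root filesystem.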

If you have an idea about what’s going on I would really appreciate any suggestions. I noticed that the new U-Boot doesn’t have ext2load anymore but has ext4load instead (so I switched to that). The hang looks related to this. I’m still surprised that a U-Boot change would affect the Linux boot in this way; I thought that once U-Boot handed over control to the OS it wouldn’t matter anymore.

gardarh · September 26, 2019, 8:45am

PS. I compiled ext2load into U-Boot and booted using that but with no success.

gardarh · September 26, 2019, 8:48am

Hmm, let me look at the following and then get back:

https://www.toradex.com/community/questions/16464/image-27b4-fails-to-mount-rootfs.html?smartspace=linux

gardarh · September 26, 2019, 10:05am

I have verified that I’m booting the correct device (I’m booting from the SD card, or “mmcblk1p1”) and I believe the boot parameters are correct (ip=off root=/dev/mmcblk1p1 rw,noatime rootfstype=ext3 rootwait) - I’m using the ext3 filesystem.
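
A minimal sketch of how the parameters the kernel actually received could be checked, assuming the "Kernel command line:" text is copied from the boot log (the helper name is made up for illustration). Keys that occur more than once, such as console= and video=, are collected into lists:

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into {key: [values]}.

    Flags without "=" (for example "rootwait") are stored with the value None.
    """
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params.setdefault(key, []).append(value if sep else None)
    return params


# Command line copied from the boot log of the hanging board in the first post.
CMDLINE = (
    "galcore.contiguousSize=50331648 user_debug=30 ip=off root=/dev/mmcblk1p1 "
    "rw,noatime rootfstype=ext2 rootwait fec_mac=00:14:2d:a1:88:27 "
    "consoleblank=0 no_console_suspend=1 console=tty1 console=ttymxc0,115200n8 "
    "video=mxcfb0:dev=lcd,640x480M@60,if=RGB666 video=mxcfb1:off fbmem=8M"
)

params = parse_cmdline(CMDLINE)
for key in ("root", "rootfstype", "rootwait", "console"):
    print(key, params.get(key))
```

Note that the command line in the first boot log shows rootfstype=ext2 rather than ext3, so it may be worth confirming which value the kernel really gets on the failing board.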

Can you think of anything I could try to fix this issue?

Btw. here is my fstab:

/dev/root            /                    auto       defaults              1  1
proc                 /proc                proc       defaults              0  0
devpts               /dev/pts             devpts     mode=0620,gid=5       0  0
usbdevfs             /proc/bus/usb        usbdevfs   noauto                0  0
tmpfs                /run                 tmpfs      mode=0755,nodev,nosuid,strictatime 0  0
tmpfs                /var/volatile        tmpfs      defaults              0  0

# uncomment this if your device has a SD/MMC/Transflash slot
#/dev/mmcblk0p1       /media/card          auto       defaults,sync,noauto  0  0

gardarh · September 26, 2019, 1:34pm

I tried updating my manufacturing image to a more recent kernel version and it worked. I guess there must’ve been some filesystem-related issue in the 3.14 kernel when running on top of the new U-Boot, or something along those lines.

I still find it strange that the U-Boot version matters when booting the kernel, but then again I’m not 100% sure that was the cause.

In any case, this is not a problem anymore.

marcel.tx · September 26, 2019, 9:35pm

Unfortunately, those downstream SoC vendor kernels are notorious for having special dependencies between the bootloader and the kernel. That said, in your case this may just be an ext3 vs. ext4 incompatibility issue. We migrated to exclusively using ext4 a while ago.
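
One way to test the ext3 vs. ext4 hypothesis is to look at the feature flags in the root filesystem’s superblock. The following is a minimal Python sketch, not a Toradex tool: it assumes the first 2 KiB of the root partition have been read into memory, and the function name is made up for illustration. It reports whether a journal is present and whether any features that only the ext4 driver understands are enabled; this list covers a few common ones and is not exhaustive.

```python
import struct

SUPERBLOCK_OFFSET = 1024      # the superblock starts 1024 bytes into the partition
EXT_MAGIC = 0xEF53            # s_magic, shared by ext2, ext3 and ext4

COMPAT_HAS_JOURNAL = 0x0004   # set on ext3 and ext4
# A few features that the ext2/ext3 drivers do not support.
INCOMPAT_EXT4 = {0x0040: "extents", 0x0080: "64bit", 0x0200: "flex_bg"}
RO_COMPAT_EXT4 = {0x0008: "huge_file", 0x0400: "metadata_csum"}


def describe_ext_superblock(image):
    """Inspect the ext superblock contained in the first 2 KiB of a partition."""
    sb = image[SUPERBLOCK_OFFSET:SUPERBLOCK_OFFSET + 1024]
    if len(sb) < 104:
        raise ValueError("need at least the first 2 KiB of the partition")
    (magic,) = struct.unpack_from("<H", sb, 56)
    if magic != EXT_MAGIC:
        raise ValueError("no ext2/ext3/ext4 superblock found")
    # s_feature_compat, s_feature_incompat, s_feature_ro_compat
    compat, incompat, ro_compat = struct.unpack_from("<III", sb, 92)
    ext4_only = [name for bit, name in INCOMPAT_EXT4.items() if incompat & bit]
    ext4_only += [name for bit, name in RO_COMPAT_EXT4.items() if ro_compat & bit]
    return {
        "has_journal": bool(compat & COMPAT_HAS_JOURNAL),
        "ext4_only_features": ext4_only,
    }


# Example use on a development host (the device path is hypothetical and
# depends on how the SD card shows up there):
#
#     with open("/dev/sdX1", "rb") as device:
#         print(describe_ext_superblock(device.read(2048)))
```

If ext4_only_features is non-empty, a kernel that mounts the partition with its ext3 driver would be expected to refuse it, which would match the symptom described above. An empty list does not rule out other causes.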