Could you give us more information so that we can reproduce this issue? You could also send the affected module back to Toradex.
Best regards,
Jaski
Hi. So far the problem only arises on V1.1B modules. Additionally, it seems that only our kernel with the additional initramfs triggers the issue. When we just remove the initramfs from that kernel, the issue is (so far) not reproducible. I have appended an additional log. It's quite clear that the eMMC has some kind of “problem”.
[ 16.093822] random: crng init done
[ 615.550096] mmc0: Card stuck in programming state! mmcblk0 card_busy_detect
[ 615.561341] mmc0: cache flush error -110
[ 618.630209] mmc0: tried to reset card, got error -110
[ 618.645262] blk_update_request: I/O error, dev mmcblk0, sector 8192
[ 618.655489] Buffer I/O error on dev mmcblk0p1, logical block 0, lost sync page write
KEZU LOG fsck on /dev/mmcblk0p2 with ext4
[ 628.960128] mmc0: Timeout waiting for hardware interrupt. retries left=0 opcode=12
[ 628.977639] mmc0: sdhci: ============ SDHCI REGISTER DUMP ===========
[ 628.993938] mmc0: sdhci: Sys addr: 0x134b6000 | Version: 0x00000002
[ 629.010220] mmc0: sdhci: Blk size: 0x00000200 | Blk cnt: 0x00000020
[ 629.026381] mmc0: sdhci: Argument: 0x00022000 | Trn mode: 0x0000003b
[ 629.042426] mmc0: sdhci: Present: 0x01fd8009 | Host ctl: 0x00000011
[ 629.058479] mmc0: sdhci: Power: 0x00000002 | Blk gap: 0x00000080
[ 629.074499] mmc0: sdhci: Wake-up: 0x00000008 | Clock: 0x000010ff
[ 629.090487] mmc0: sdhci: Timeout: 0x0000008f | Int stat: 0x00000000
[ 629.106417] mmc0: sdhci: Int enab: 0x107f100b | Sig enab: 0x107f100b
[ 629.122431] mmc0: sdhci: AC12 err: 0x00000082 | Slot int: 0x00000003
[ 629.138401] mmc0: sdhci: Caps: 0x07eb0000 | Caps_1: 0x0000a007
[ 629.154372] mmc0: sdhci: Cmd: 0x0000123a | Max curr: 0x00ffffff
[ 629.170372] mmc0: sdhci: Resp[0]: 0x00ff8080 | Resp[1]: 0xffffffff
[ 629.186339] mmc0: sdhci: Resp[2]: 0x320f5913 | Resp[3]: 0x00000900
[ 629.202274] mmc0: sdhci: Host ctl2: 0x00000000
[ 629.216163] mmc0: sdhci: ADMA Err: 0x00000003 | ADMA Ptr: 0x18078204
[ 629.232065] mmc0: sdhci: ============================================
[ 629.249229] mmcblk0: error -110 sending status command, retrying
[ 629.259449] mmcblk0: error -110 sending status command, retrying
[ 629.269657] mmcblk0: error -110 sending status command, aborting
[ 629.281056] mmc0: cache flush error -110
[ 632.350212] mmc0: tried to reset card, got error -110
[ 632.364793] blk_update_request: I/O error, dev mmcblk0, sector 139264
[ 632.375045] blk_update_request: I/O error, dev mmcblk0, sector 139272
[ 632.385203] blk_update_request: I/O error, dev mmcblk0, sector 139280
[ 632.395319] blk_update_request: I/O error, dev mmcblk0, sector 139288
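For anyone comparing logs from several boots, here is a minimal Python sketch that pulls the key facts out of output like the above: the timestamp of the first failure, the error codes, and the sectors that reported I/O errors. The function name `summarize_mmc_errors` and the regular expressions are my own assumptions, written against the line formats visible in this log only. On Linux, error -110 is ETIMEDOUT, which fits the "Timeout waiting for hardware interrupt" message.

```python
import re

# Linux errno value seen in this log; 110 is ETIMEDOUT on Linux.
LINUX_ERRNO = {110: "ETIMEDOUT"}

# Patterns are assumptions based on the line formats shown in this thread.
LINE_RE = re.compile(r"^\[\s*(?P<ts>\d+\.\d+)\]\s+(?P<msg>.*)$")
ERRNO_RE = re.compile(r"error (-\d+)")
SECTOR_RE = re.compile(r"I/O error, dev (\S+), sector (\d+)")


def summarize_mmc_errors(log_text):
    """Collect first failure time, errno values and failing sectors."""
    first_error_ts = None
    errnos = set()
    sectors = []
    for raw in log_text.splitlines():
        m = LINE_RE.match(raw.strip())
        if not m:
            continue  # skip lines without a kernel timestamp
        ts, msg = float(m.group("ts")), m.group("msg")
        err = ERRNO_RE.search(msg)
        sec = SECTOR_RE.search(msg)
        stuck = "stuck in programming state" in msg
        if err:
            errnos.add(-int(err.group(1)))  # "-110" is stored as 110
        if sec:
            sectors.append((sec.group(1), int(sec.group(2))))
        if (err or sec or stuck) and first_error_ts is None:
            first_error_ts = ts
    return {
        "first_error_ts": first_error_ts,
        "errnos": {code: LINUX_ERRNO.get(code, "unknown") for code in errnos},
        "sectors": sectors,
    }
```

Feeding it the text of a saved boot log (for example `summarize_mmc_errors(open("boot.log").read())`) gives one small dictionary per boot, which makes it easier to see whether the failing sectors are the same each time.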
Hi @qojote
Thanks for the log.
So you only include the initramfs in your kernel, or did you change something else as well?
Best regards,
Jaski
Hi @jaski.tx
Besides including an initramfs, we added some kernel configuration options to support additional hardware. The same kernel with the extended configuration runs fine without the initramfs part (>10k power cycles). Because the resulting kernel is quite big (~20M), we needed to change the RAM layout in U-Boot as well. We are trying to narrow the problem down to a minimal example and would really appreciate your support (we need to deliver some units to customers soon). One more thing: mounting the bootfs always works. The error occurs only when mounting the rootfs. What is really mysterious is that a broken module is “cured” by mounting the bootfs and writing to that partition (no special content). In that case mounting the rootfs succeeds.
Hi @qojote
Thanks for this information. It seems to be an interesting issue.
Could you please share the kernel config, the initramfs configuration and the memory layout? What are the RAM and eMMC layouts in U-Boot?
Best regards,
Jaski
We were able to reduce the problem to a minimal example.
It can be created as follows:
.
INITRAMFS_IMAGE = "initramfs-debug-image"
INITRAMFS_IMAGE_BUNDLE = "1"
To produce the error, the system is started and disconnected from the power supply as soon as it is accessible on the network. After a few hundred boots, the system freezes with the following kernel output: boot.log
Hi @patdex
Thanks for the information. How many SoMs are showing this issue?
Could you reproduce the issue on BSP 5.1?
Thanks and best regards,
Jaski
Hi @qojote / @jaski.tx, we are also seeing this issue.
Was a root cause ever identified?
We have hundreds of devices in the field, so we are quite concerned.
(We’re using Model: Toradex Apalis iMX6 Dual 1GB IT V1.1B.)
Hi @mik,
In our initramfs, all partitions were mounted in order to perform updates. To resolve our specific issue, we had to mount the rootfs statically and before every other partition.
BR
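To illustrate the ordering described above, here is a rough Python sketch of a mount plan that always puts the rootfs first. It is not the actual initramfs script from this thread. The names `Mount`, `order_mounts` and `mount_commands` are hypothetical, and the mount points and the `vfat` type for the boot partition are assumptions. Only `/dev/mmcblk0p2` with ext4 appears in the log; treating it as the rootfs is also an assumption.

```python
from typing import NamedTuple


class Mount(NamedTuple):
    device: str
    mountpoint: str
    fstype: str


def order_mounts(mounts, rootfs_device):
    """Return the mounts with the rootfs entry first, others in given order."""
    root = [m for m in mounts if m.device == rootfs_device]
    if len(root) != 1:
        raise ValueError("expected exactly one entry for " + rootfs_device)
    others = [m for m in mounts if m.device != rootfs_device]
    return root + others


def mount_commands(mounts):
    """Build argument lists for mount(8); nothing is executed here."""
    return [["mount", "-t", m.fstype, m.device, m.mountpoint] for m in mounts]


# Assumed layout: p1 = bootfs (vfat), p2 = rootfs (ext4); paths are made up.
plan = order_mounts(
    [
        Mount("/dev/mmcblk0p1", "/mnt/boot", "vfat"),
        Mount("/dev/mmcblk0p2", "/mnt/root", "ext4"),
    ],
    rootfs_device="/dev/mmcblk0p2",
)
for argv in mount_commands(plan):
    print(" ".join(argv))
```

The device is passed in as a fixed string rather than discovered by probing, which is one possible reading of mounting the rootfs "statically".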