Greetings,
We had an application based on LinuxImageV2.7. After a increasing the rate we write to the flash, it started to kernel panic (ubifs errors) upon booting. The module itself could be recovered after reflash.
Then after recommendation from Toradex we upgraded to ImageV2.8 (kernel 4.9.87 3bb6e32). Then, upon git clone our application, we got some warnings like this:
[ 385.144017] Workqueue: writeback wb_workfn (flush-ubifs_0_3)
[ 385.149734] [<8010e904>] (unwind_backtrace) from [<8010b5f4>] (show_stack+0x10/0x14)
[ 385.157493] [<8010b5f4>] (show_stack) from [<804100b0>] (dump_stack+0x78/0x8c)
[ 385.164734] [<804100b0>] (dump_stack) from [<80377d40>] (ubifs_log_start_commit+0x324/0x38c)
[ 385.173187] [<80377d40>] (ubifs_log_start_commit) from [<803789e4>] (do_commit+0x1c4/0x4e0)
[ 385.181552] [<803789e4>] (do_commit) from [<80362454>] (make_reservation+0x260/0x3a8)
[ 385.189395] [<80362454>] (make_reservation) from [<80362e94>] (ubifs_jnl_write_data+0xe8/0x26c)
[ 385.198105] [<80362e94>] (ubifs_jnl_write_data) from [<803652c8>] (do_writepage+0x94/0x200)
[ 385.206471] [<803652c8>] (do_writepage) from [<801c4184>] (__writepage+0x14/0x40)
[ 385.213969] [<801c4184>] (__writepage) from [<801c4858>] (write_cache_pages+0x180/0x3a8)
[ 385.222073] [<801c4858>] (write_cache_pages) from [<801c4ac0>] (generic_writepages+0x40/0x60)
[ 385.230613] [<801c4ac0>] (generic_writepages) from [<80229064>] (__writeback_single_inode+0x28/0x190)
[ 385.239847] [<80229064>] (__writeback_single_inode) from [<80229860>] (writeback_sb_inodes+0x1f8/0x3c8)
[ 385.249252] [<80229860>] (writeback_sb_inodes) from [<80229aa4>] (__writeback_inodes_wb+0x74/0xac)
[ 385.258222] [<80229aa4>] (__writeback_inodes_wb) from [<80229c84>] (wb_writeback+0x1a8/0x1c0)
[ 385.266758] [<80229c84>] (wb_writeback) from [<8022a1a0>] (wb_workfn+0x190/0x2b8)
[ 385.274255] [<8022a1a0>] (wb_workfn) from [<8013a774>] (process_one_work+0x120/0x3e8)
[ 385.282099] [<8013a774>] (process_one_work) from [<8013aa74>] (worker_thread+0x38/0x580)
[ 385.290202] [<8013aa74>] (worker_thread) from [<8013ff10>] (kthread+0xf8/0x110)
[ 385.297525] [<8013ff10>] (kthread) from [<801078d0>] (ret_from_fork+0x14/0x24)
Other error:
[ 166.644109] UBIFS assert failed in ubifs_log_start_commit at 412 (pid 204)
[ 166.667562] CPU: 0 PID: 204 Comm: ubifs_bgt0_4 Tainted: G O 4.9.87-dirty #1
[ 166.675666] Hardware name: Freescale i.MX7 Dual (Device Tree)
[ 166.681443] [<8010e904>] (unwind_backtrace) from [<8010b5f4>] (show_stack+0x10/0x14)
[ 166.689204] [<8010b5f4>] (show_stack) from [<804100b0>] (dump_stack+0x78/0x8c)
[ 166.696445] [<804100b0>] (dump_stack) from [<80377d40>] (ubifs_log_start_commit+0x324/0x38c)
[ 166.704899] [<80377d40>] (ubifs_log_start_commit) from [<803789e4>] (do_commit+0x1c4/0x4e0)
[ 166.713266] [<803789e4>] (do_commit) from [<80378e88>] (ubifs_bg_thread+0x188/0x18c)
[ 166.721022] [<80378e88>] (ubifs_bg_thread) from [<8013ff10>] (kthread+0xf8/0x110)
[ 166.728519] [<8013ff10>] (kthread) from [<801078d0>] (ret_from_fork+0x14/0x24)
Another one:
[ 193.872544] UBIFS assert failed in ubifs_log_start_commit at 412 (pid 279)
[ 193.901673] CPU: 0 PID: 279 Comm: kworker/u2:2 Tainted: G O 4.9.87-dirty #1
[ 193.909778] Hardware name: Freescale i.MX7 Dual (Device Tree)
[ 193.915543] Workqueue: writeback wb_workfn (flush-ubifs_0_4)
[ 193.921261] [<8010e904>] (unwind_backtrace) from [<8010b5f4>] (show_stack+0x10/0x14)
[ 193.929021] [<8010b5f4>] (show_stack) from [<804100b0>] (dump_stack+0x78/0x8c)
[ 193.936262] [<804100b0>] (dump_stack) from [<80377d40>] (ubifs_log_start_commit+0x324/0x38c)
[ 193.944715] [<80377d40>] (ubifs_log_start_commit) from [<803789e4>] (do_commit+0x1c4/0x4e0)
[ 193.953080] [<803789e4>] (do_commit) from [<80362454>] (make_reservation+0x260/0x3a8)
[ 193.960924] [<80362454>] (make_reservation) from [<80362e94>] (ubifs_jnl_write_data+0xe8/0x26c)
[ 193.969634] [<80362e94>] (ubifs_jnl_write_data) from [<803652c8>] (do_writepage+0x94/0x200)
[ 193.977998] [<803652c8>] (do_writepage) from [<801c4184>] (__writepage+0x14/0x40)
[ 193.985496] [<801c4184>] (__writepage) from [<801c4858>] (write_cache_pages+0x180/0x3a8)
[ 193.993600] [<801c4858>] (write_cache_pages) from [<801c4ac0>] (generic_writepages+0x40/0x60)
[ 194.002141] [<801c4ac0>] (generic_writepages) from [<80229064>] (__writeback_single_inode+0x28/0x190)
[ 194.011376] [<80229064>] (__writeback_single_inode) from [<80229860>] (writeback_sb_inodes+0x1f8/0x3c8)
[ 194.020781] [<80229860>] (writeback_sb_inodes) from [<80229aa4>] (__writeback_inodes_wb+0x74/0xac)
[ 194.029751] [<80229aa4>] (__writeback_inodes_wb) from [<80229c84>] (wb_writeback+0x1a8/0x1c0)
[ 194.038287] [<80229c84>] (wb_writeback) from [<8022a1a0>] (wb_workfn+0x190/0x2b8)
[ 194.045784] [<8022a1a0>] (wb_workfn) from [<8013a774>] (process_one_work+0x120/0x3e8)
[ 194.053628] [<8013a774>] (process_one_work) from [<8013aa74>] (worker_thread+0x38/0x580)
[ 194.061731] [<8013aa74>] (worker_thread) from [<8013ff10>] (kthread+0xf8/0x110)
[ 194.069054] [<8013ff10>] (kthread) from [<801078d0>] (ret_from_fork+0x14/0x24)
And another after a sync:
root@iMX7:~# sync
[ 40.931610] INFO: task sync:529 blocked for more than 10 seconds.
[ 40.938084] Not tainted 4.9.87-dirty #1
[ 40.945608] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 40.954540] sync D 0 529 520 0x00000000
[ 40.960790] [<807a3d38>] (__schedule) from [<807a4144>] (schedule+0x4c/0xac)
[ 40.969121] [<807a4144>] (schedule) from [<80229214>] (wb_wait_for_completion+0x48/0x78)
[ 40.978104] [<80229214>] (wb_wait_for_completion) from [<80229440>] (sync_inodes_sb+0x94/0x204)
[ 40.987711] [<80229440>] (sync_inodes_sb) from [<80204878>] (iterate_supers+0xc8/0x128)
[ 40.996590] [<80204878>] (iterate_supers) from [<8022d894>] (sys_sync+0x34/0xa4)
[ 41.004775] [<8022d894>] (sys_sync) from [<80107800>] (ret_fast_syscall+0x0/0x48)
Should I be worried about the warnings? This is a experimental procedure intended for testing, but should be more steps to follow or another kernel / image?
We are also using the kernel features:
Preemptible Kernel (Low-latency Desktop)
Timer Frequency (1000 Hz)
EDIT:
We are trying to achieve a more stable file system. We had several kernel panic from field where the boards have been reseted a few times. For that we changed from LinuxImageV2.7.4 to V2.8, and it seems to respond better, but these warnings are shown if we try to write to the flash the first time the module boots.
We are kind of worried from the experience from the kernel panic. A following step would be to have a read-only file system separated from the application partition, but in this step we’re going in the all-in-one.