The context is as follows:
Hardware: Toradex Colibri iMX8 QuadXPlus 2GB Wi-Fi / BT IT V1.0D
OS: TorizonCore 5.4.0+build.10
Frequency and steps to reproduce: always, when deploying an image to the board with the VS Code Torizon extension, with the device connected via network (LAN/Ethernet).
I’m not so much interested in fixing the issue itself, since I could easily recover this development board and reinstall Torizon. What I’d like to understand is whether the board can be recovered without reflashing and whether the same thing could happen in a production environment. What do you think?
Thanks for your suggestions and best regards,
ldvp
You say this always happens to you? That’s very strange; I’ve never seen such an error before with the VS Code extensions. It’s hard to say what could be going on here, though I can attest that similar errors are very rare in my experience.
I do have a couple of questions, however, that may help clarify things.
Do you have any other Colibri i.MX8X modules? Does this issue always happen to them as well, or just this specific module?
If you do recover and reflash the module, does the issue still happen every time?
At the moment I can’t seem to reproduce this issue myself. So my ability to investigate this on my side is a bit limited.
Hi @jeremias.tx,
sorry if I was not clear enough: I’m having the issue on only one board, which had been used for several months without problems. All the other boards are working correctly.
I could easily recover and reflash the board with the issue, but before doing that I’d like to better understand the problem, its root cause, and whether it could be recovered in a less invasive way.
I understand you cannot reproduce it, but I can, and I am available to run any tests or investigations that you think make sense or that you can suggest. To me, it smells like filesystem corruption or something similar.
Ahh okay, I understand now. Filesystem corruption, while rare, is possible. Let’s try to rule out more common issues before we jump to filesystem corruption.
Usually, a process being blocked for such a long period of time indicates an overwhelmed system. This can be caused either by not having enough memory or by a process producing a lot of I/O in a short time frame.
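As a rough way to check both possibilities, here is a minimal Python sketch that samples the available memory from /proc/meminfo and the write rate of one block device from /proc/diskstats. The device name mmcblk0, the sampling interval, and all function names are my own assumptions, and it assumes Python 3 is available on the module (for example inside a container), so treat it as a starting point rather than a ready-made tool.

```python
import time


def parse_meminfo(text):
    """Parse /proc/meminfo text into a dict of field name -> integer value."""
    info = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[key.strip()] = int(parts[0])  # most fields are in kB
    return info


def sectors_written(diskstats_text, device):
    """Return the 'sectors written' counter for device, or None if not listed."""
    for line in diskstats_text.splitlines():
        fields = line.split()
        # Layout: major minor name, then the I/O counters; the 10th field
        # (index 9) is the number of 512-byte sectors written.
        if len(fields) >= 10 and fields[2] == device:
            return int(fields[9])
    return None


def read_text(path):
    with open(path) as handle:
        return handle.read()


def monitor(device="mmcblk0", interval=5.0, samples=12):
    """Print available memory and the write rate of device once per interval."""
    # "mmcblk0" is an assumed device name; adjust it to the actual eMMC device.
    previous = sectors_written(read_text("/proc/diskstats"), device)
    if previous is None:
        raise ValueError(f"device {device!r} not found in /proc/diskstats")
    for _ in range(samples):
        time.sleep(interval)
        current = sectors_written(read_text("/proc/diskstats"), device)
        mem = parse_meminfo(read_text("/proc/meminfo"))
        kib_per_s = (current - previous) * 512 / 1024 / interval
        print(f"MemAvailable={mem.get('MemAvailable', 0)} kB  "
              f"{device} writes={kib_per_s:.1f} KiB/s")
        previous = current
```

Calling monitor() while a deployment is running prints one line per interval. A steadily shrinking MemAvailable or a sustained high write rate just before the hang would point towards one of the two causes.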
The particular task that got blocked is jbd2/mmcblk0p1, according to your kernel panic output. jbd2 is the “Journaling Block Device”, the layer that sits between the file system and the block device driver. As a start, I’d suggest checking whether a large volume of logs is being written anywhere on the system, to see if the blockage might be due to high I/O from logging.
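One way to look for fast-growing logs is to snapshot file sizes under the likely log directories twice and compare them. The sketch below does that in Python. The directory list is an assumption on my part (/var/log, plus /var/lib/docker/containers, which is where Docker keeps container logs when the json-file logging driver is in use), and the function names are hypothetical, so adapt both to your setup. Reading the Docker directory normally requires root.

```python
import os
import time


def snapshot(roots):
    """Return a dict mapping every file path under the given roots to its size."""
    sizes = {}
    for root in roots:
        # os.walk silently skips roots that do not exist or cannot be read.
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                try:
                    sizes[path] = os.lstat(path).st_size
                except OSError:
                    continue  # file vanished between listing and stat
    return sizes


def growth(before, after):
    """Return (bytes_grown, path) pairs for files that grew, largest first."""
    deltas = []
    for path, size in after.items():
        delta = size - before.get(path, 0)  # new files count from zero
        if delta > 0:
            deltas.append((delta, path))
    deltas.sort(reverse=True)
    return deltas


def watch_logs(roots=("/var/log", "/var/lib/docker/containers"), interval=30.0):
    """Print the files that grew the most during one interval."""
    # The default roots are assumptions; pass your own list if logs live elsewhere.
    before = snapshot(roots)
    time.sleep(interval)
    after = snapshot(roots)
    for delta, path in growth(before, after)[:10]:
        print(f"{delta:>12} bytes  {path}")
```

Running watch_logs() during a deployment should show whether any single file is growing quickly enough to keep the journal busy. If nothing stands out, logging is probably not the source of the I/O.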