I am currently working with the latest Tegra K1 module and its Apalis Evaluation Board. I have installed the Linux4Tegra JetPack Operating System R21.6 that includes support for GPU computation through the CUDA platform. I also added a cooling fan on top of the heatsink.
My plan is to use the GPU to run a deep learning model on real-time video captured by a webcam.
The required software installed without problems. However, a seemingly random time after the model is started (one or two minutes on average), the board powers down unexpectedly and the system reboots.
The hardware is under a heavy computational load, but the system resources I monitored show nothing that should cause such a critical crash:
No board component temperature ever rises above 55 °C, CPU usage is fairly low (one or two cores at 50-60%), and there should be enough free memory for the system to run smoothly (more than 1 GB).
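For anyone trying to reproduce these readings, here is a minimal sketch of a logger that samples temperature, free memory and load, and forces every sample to disk so the last one survives the sudden power-down. It assumes the standard Linux sysfs thermal zones (values in millidegrees Celsius) and /proc/meminfo. The function names and the log file name are hypothetical, not part of JetPack or the Toradex BSP.

```python
import glob
import os
import time


def read_thermal_zones(pattern="/sys/class/thermal/thermal_zone*/temp"):
    """Return {sysfs_path: degrees_C}; sysfs reports millidegrees Celsius."""
    temps = {}
    for path in sorted(glob.glob(pattern)):
        try:
            with open(path) as f:
                temps[path] = int(f.read().strip()) / 1000.0
        except (OSError, IOError, ValueError):
            continue  # some zones may be unreadable or disabled
    return temps


def read_mem_available_kb(meminfo_path="/proc/meminfo"):
    """Return MemAvailable in kB, falling back to MemFree on older kernels."""
    values = {}
    with open(meminfo_path) as f:
        for line in f:
            key, _, rest = line.partition(":")
            fields = rest.split()
            if fields:
                values[key] = int(fields[0])
    return values.get("MemAvailable", values.get("MemFree"))


def log_sample(log_file,
               thermal_pattern="/sys/class/thermal/thermal_zone*/temp",
               meminfo_path="/proc/meminfo"):
    """Append one timestamped sample and force it onto the disk."""
    temps = read_thermal_zones(thermal_pattern)
    hottest = max(temps.values()) if temps else float("nan")
    line = "%.1f max_temp_c=%.1f mem_kb=%s load1=%.2f\n" % (
        time.time(), hottest, read_mem_available_kb(meminfo_path),
        os.getloadavg()[0])
    log_file.write(line)
    log_file.flush()
    os.fsync(log_file.fileno())  # so the sample survives an abrupt power loss
    return line


def monitor(log_path, samples, interval_s=0.5):
    """Take `samples` readings, `interval_s` seconds apart."""
    with open(log_path, "a") as log_file:
        for _ in range(samples):
            log_sample(log_file)
            time.sleep(interval_s)

# Example (hypothetical file name): monitor("tk1_monitor.log", samples=600)
```

Starting this in a second terminal just before launching the model leaves a file whose last line shows the state of the board at most one sampling interval before the reset.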
This has been extremely hard to troubleshoot, as nothing is printed to the system or kernel logs before the crash (I checked /var/log/kern.log). The issue is also not caused by the software that I am running, as it has been tested on other platforms.
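To show how the log check can be repeated, here is a small sketch that prints the last few lines written to the kernel log before each boot. It assumes every boot begins with a line containing "Booting Linux". That marker is an assumption and should be checked against your own kern.log. The function name is hypothetical.

```python
from collections import deque


def lines_before_boots(log_lines, boot_marker="Booting Linux", context=5):
    """Yield the `context` lines that precede each boot marker.

    `boot_marker` is an assumed substring of the first kernel message
    printed at boot; adjust it to match your own log.
    """
    recent = deque(maxlen=context)
    for line in log_lines:
        if boot_marker in line and recent:
            yield list(recent)
            recent.clear()
        recent.append(line.rstrip("\n"))

# Example use (reading the log may require root or membership in "adm"):
# with open("/var/log/kern.log") as f:
#     for tail in lines_before_boots(f):
#         print("\n".join(tail))
#         print("----- reboot -----")
```

If the lines before each reboot are only routine messages, that supports the observation that the reset happens without the kernel getting a chance to log anything.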
Any help towards understanding or solving this issue is greatly appreciated.
How exactly did you go about installing JetPack or rather L4T R21.6? Or, asked differently, which exact version of our BSP is your installation now based on?
I installed the NVIDIA JetPack 3.1-21.6-2.7b5 image through the Toradex Easy Installer V1.4.
Could you share the binaries of your application and instructions on how to install and run it, so that we can try to reproduce this issue locally?
As alex.tx said, binaries of your application would be really useful. Could you tell us which hardware is connected to the Apalis Evaluation Board (USB hub, PCI or mini SATA card, …)?
Could you also check whether the power supply for the board (12-27V) is powerful enough (more than 30W)?
OK, sure. I created a public GitHub repository with the code that I am running and instructions to make it work from a clean install of the Apalis TK1 JetPack system.
The only hardware connected to the board is a DVI-D screen, a USB keyboard and mouse, and a 640p USB webcam (you will need a camera to make the demo work; however, the same issue happens without a webcam connected).
The development board is powered by the power supply that you provided with it, which is rated 12V 2.5A (30W).
Furthermore, we also tried powering the board with a regulated power supply that can output more than 30W, but noticed no improvement.
After installing the needed packages, you can run the code and should see a real-time object detection demo; however, it will crash the board after some time (usually 2 or 3 minutes).
I could also reproduce the issue using other GPU-heavy frameworks, such as TensorFlow.
Thanks for the update. I sent you an email.