I would like to get your feedback on a CPU overheating problem we are facing on our machine using the Colibri T30 v1.1E with a kernel based on Colibri_T30_LinuxImageV2.5Beta2_20151106.
Our GUI application alternates between video screens (video playback using the GPU) and simple screens (basic Qt widgets). On average, CPU usage is around 25%.
We have seen some Colibris suddenly crash after running the application for a few hours (some fail after just 2 hours). After further investigation, we identified the cause of the crash as overheating of the Colibri (internal temperature > 85°C).
We ran more tests on various Colibris and found that some always fail, whereas others run for a long time without any issues (CPU temperature remains stable at around 60-65°C).
We could also link the overheating of the failing CPUs to video playback. Here are the results of our investigations:
- Continuous video playback (H.264, 640x354, 30fps) using the GPU on 11 Colibris (see chart attached):
  - 2 reached a CPU temperature of 85°C in less than 2h and crashed
  - 1 reached a CPU temperature of 85°C in 2.5h and crashed
  - 3 reached a CPU temperature of > 75°C in 2h, which kept increasing slowly
  - 5 remained at an internal CPU temperature of less than 70°C for more than 8h
- Running the same application with video playback enabled/disabled on an overheating Colibri:
  - when video playback is enabled, the Colibri temperature keeps increasing
  - when video playback is disabled, the Colibri temperature remains stable at around 55°C
What we see from those tests is that different Colibris behave very differently under the same conditions. This does not seem to be linked to a particular batch of Colibris; the affected modules are spread across the various batches we have.
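In case it helps to reproduce these measurements, here is a minimal sketch of a temperature logger of the kind that could be used for such a test. It is not our exact tooling. The sensor path `DEFAULT_SENSOR` is an assumption (the node that exposes the temperature may differ on this BSP), as is the assumption that the file holds the value in millidegrees Celsius. The function names and the 85°C stop limit are only illustrative.

```python
import time
from pathlib import Path

# Assumption: hypothetical sysfs node; the real path depends on the kernel/BSP.
DEFAULT_SENSOR = "/sys/class/thermal/thermal_zone0/temp"


def read_temp_c(path=DEFAULT_SENSOR):
    """Return the temperature in degrees C, assuming the file holds millidegrees."""
    raw = Path(path).read_text().strip()
    return int(raw) / 1000.0


def log_temperature(path=DEFAULT_SENSOR, interval_s=10.0, samples=6, limit_c=85.0):
    """Sample the sensor and return a list of (elapsed_s, temp_c) tuples.

    Sampling stops early once a reading reaches limit_c.
    """
    start = time.monotonic()
    readings = []
    for i in range(samples):
        temp = read_temp_c(path)
        readings.append((time.monotonic() - start, temp))
        if temp >= limit_c:
            break
        # Do not sleep after the last sample.
        if i < samples - 1:
            time.sleep(interval_s)
    return readings
```

Calling `log_temperature(interval_s=60, samples=480)` while the video loop is running would cover about 8 hours at one sample per minute; the returned list can then be written to a file and plotted per module.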
We have read your report on Thermal Testing the T30, and your tests are much more extreme in terms of CPU/GPU usage than what we do in our application. Although I agree that using a heatsink would help, I would not expect the Colibri to reach such temperatures at this average CPU load.
Have you seen such an issue before? Could it be linked to the GPU? I would greatly appreciate your feedback on this issue.