Thermal driver issue on iMX7

I noticed an issue with the iMX7 thermal driver. I periodically read the file /sys/devices/virtual/thermal/thermal_zone0/temp to fetch the CPU temperature.

While I get the correct value in 99.9% of cases, I sometimes get an empty string when reading the file, so this seems to be a race condition in the driver. Is this a known issue? I tried applying the following patch, which didn’t help: linux-toradex.git - Linux kernel for Apalis, Colibri and Verdin modules

Best regards,
Michael

Hi @michaelg

Could you provide the hardware version of your module?

Concerning your issue:

  • How and how often are you fetching the CPU temperature?
  • Have you made any changes to the regular Toradex BSP image? If yes, can you share these changes?

Thanks and best regards, Jaski

Hi @jaski.tx

The hardware version of the CPU module is V1.1C.

I’m fetching the CPU temperature once a second (1 Hz). I made some changes to the device tree and minor changes to the kernel config. Last but not least, I added a patch for imx_thermal as described above. Please find all my changes in the attachment.


Best regards,
Michael

I recompiled the thermal driver with #define DEBUG. Then I fetched the logs:

  root@b2qt-colibri-imx7:~# dmesg | grep thermal
    [    1.696804] thermal thermal_zone0: temp measurement never finished
    [    3.691418] thermal thermal_zone0: millicelsius: 45000
    [    5.691185] thermal thermal_zone0: millicelsius: 46000
    [    9.691150] thermal thermal_zone0: millicelsius: 45000
    [   14.441764] thermal thermal_zone0: millicelsius: 46000
    [   22.504455] thermal thermal_zone0: temp measurement never finished
    [   22.504608] thermal thermal_zone0: temp measurement never finished
    [   73.491917] thermal thermal_zone0: millicelsius: 45000
    [   73.678586] thermal thermal_zone0: millicelsius: 46000
    [   74.491390] thermal thermal_zone0: millicelsius: 45000

Best regards

Hi @michaelg

I tried reading out the CPU temperature using the command watch -n 1 cat /sys/devices/virtual/thermal/thermal_zone0/temp. Most of the time the CPU temperature is shown, but on rare occasions I see a zero value returned. The reason is the following: the kernel uses sysfs to poll the thermal driver and read back the CPU temperature. Sometimes the read arrives while the thermal driver is still reading back the temperature or doing the ADC conversion, so you get back a null value. As you said, this is a race condition. Unfortunately, the driver does not save the last valid temperature, so you need to do this in your user-space application.
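
Keeping the last valid temperature in user space could look roughly like the following. This is only a minimal Python sketch, not a tested fix for the driver: the names parse_millicelsius, CachedTemperature and poll are hypothetical, and it assumes that a failed conversion shows up as an empty or non-numeric read, as reported earlier in this thread.

```python
import time
from pathlib import Path
from typing import List, Optional

# Path as used in this thread; adjust it if your thermal zone differs.
TEMP_PATH = Path("/sys/devices/virtual/thermal/thermal_zone0/temp")


def parse_millicelsius(raw: str) -> Optional[int]:
    """Return the reading in millidegrees Celsius, or None if it is empty or malformed."""
    text = raw.strip()
    if not text:
        return None
    try:
        return int(text)
    except ValueError:
        return None


class CachedTemperature:
    """Hypothetical helper that remembers the last valid reading."""

    def __init__(self, path=TEMP_PATH):
        self.path = Path(path)
        self.last_valid: Optional[int] = None

    def read(self) -> Optional[int]:
        """Return a fresh reading if one is available, else the last valid one."""
        try:
            raw = self.path.read_text()
        except OSError:
            # Assumption: a read error is handled the same way as an empty read.
            raw = ""
        value = parse_millicelsius(raw)
        if value is not None:
            self.last_valid = value
        # Stays None until the first valid reading has been seen.
        return self.last_valid


def poll(sensor: CachedTemperature, samples: int, interval_s: float = 1.0) -> List[Optional[int]]:
    """Read the sensor `samples` times, waiting `interval_s` seconds between reads."""
    readings = []
    for _ in range(samples):
        readings.append(sensor.read())
        time.sleep(interval_s)
    return readings
```

Calling poll(CachedTemperature(), samples=10) would sample at 1 Hz, as described above. If your system reports 0 instead of an empty string for a failed read, an additional check for that value would be needed; whether 0 can be safely treated as invalid depends on your setup.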

Best regards, Jaski