Temperature testing failure

Hello,

We’ve been having failures with our system during environmental testing. We are testing our system from -40C to 85C. When the chamber temperature reaches 85C the system fails. The system current draw drops from 140mA to 3mA.

Our 5V supply, which is controlled by our Verdin iMX8M Mini, drops to 0V, so we think the problem lies with the module overheating. The other power supplies on the board (not controlled by the module) do not fail.

The peak temperature of the CPU reaches 94C and the temperature of the carrier board reaches 73 degrees before the failure occurs.

Has anyone had any problems temperature testing the Verdin module?

Can anyone advise which parts on the module might be overheating and causing the problem?

Any help with this is appreciated.

Thanks
Jon

Hello Craig,

The module is designed to operate from -40C to 85C ambient.
So what is the problem here? Does it fail before the ambient reaches 85C?
And can you provide more information about the setup as well?

Best Regards,

Matthias

Hi Matthias

The testing we are performing is from -40 to 85C. We do not go higher than 85.

The module fails when the temperature of the chamber is 85C. The temperature on the carrier PCB which is inside an enclosure is 73C.

The test setup consists of our carrier board within its enclosure, placed in an environmental chamber. We use thermocouples to monitor the temperature of the power supplies.

We are logging data output via Ethernet from the module, which includes PCB temperature and CPU temperature. At 73C PCB temperature this link drops and the supply current draw drops to 3mA.

Edit: Can you share the results of when you temperature tested the Verdin board? Did you use a heatsink at all?

Thanks

@JCraig911 ,

It is not clear which SOM you are using: the IT version or the non-IT version. IT version CPUs have a critical junction temperature of 105C, non-IT ones 95C. If you do have the IT version, perhaps the thermal zone is set up for the non-IT version, so the CPU reaches 95C and shuts down. Please check the thermal trip points in /sys/class/thermal

Thermal Management - Toradex System on Modules

Do you have any active cooling enabled when the CPU junction reaches the passive cooling trip point? Though IT SOMs are rated up to 85C ambient, you still need to cool the CPU down so it doesn’t cross the critical junction T. Junction T is always higher than ambient T.

Edward

Hi @Edward,
I just checked and it looks like 85 and 95 degC:

CB20300014:/sys/class/thermal/thermal_zone0$ cat trip_point_0_hyst
2000
CB20300014:/sys/class/thermal/thermal_zone0$ cat trip_point_0_type
passive
CB20300014:/sys/class/thermal/thermal_zone0$ cat trip_point_1_temp
95000
CB20300014:/sys/class/thermal/thermal_zone0$ cat trip_point_1_hyst
2000
CB20300014:/sys/class/thermal/thermal_zone0$ cat trip_point_1_type
critical

Hi @edwaugh ,

Still no information about which SOM it is. According to the critical trip point it should be the non-IT variant, which can’t work well at 85C ambient.

iMX7/6 use a different thermal driver compared to iMX8, and iMX8 in turn seems to use different thermal drivers for different iMX8 variants. On iMX7/6 it seems that the trip point settings are taken from OTP. On iMX8 it seems the DT is where trip points are specified. If that’s the case, then I think U-Boot should determine whether the CPU is IT or non-IT and alter the trip point settings in the DTB just before launching the kernel. I don’t know how it is in reality, I haven’t tried iMX8 yet.

Edward

Yes it is the IT version, that’s why we are testing that temperature range.

Hello edwaugh,

Which module and which version of it?

Best Regards.,

Matthias Gohlke

[00591101] Verdin iMX8M Mini Quad 2GB IT V1.1B

Hi @edwaugh

Thanks for the response.

The peak temperature of the CPU reaches 94C and the temperature of the carrier board reaches 73 degrees before the failure occurs.

  • How and where exactly do you measure these temperatures?
  • What is your application?
  • Could you provide the CPU Load in your test setup?
  • Are you using a Heatsink?

Can you provide some pictures of your Test Setup?

Thanks and best regards,
Jaski

Hi @jaski.tx,
We measure the CPU temperature using the value reported by the processor

/sys/class/thermal/thermal_zone0/temp

The board temperature is just measured by an IMU mounted to the carrier.
CPU load is around 20% with a powersave governor
No heatsink, inside our plastic enclosure

Thanks

Ed

Hi @jaski.tx
Looking at the responses here it sounds like this is the expected behaviour for the board. Can you confirm the thresholds are set correctly for this part?
Can you share the results of your temperature testing? Under what conditions do you guarantee operation if it is not 85 degC 20% CPU load?

Thanks

Ed
@walter.tx @gauravks @matthias.tx

Hi @edwaugh ,

The consumer CPU junction temperature (Tj) range for the different i.MX parts is 0…95C, the industrial range is -40…105C. You can confirm it by looking at the chip marking and checking which temperature grade it is. I think the thermal zone critical limit should match the device limit. At least it is so on i.MX6ULL and i.MX7D.

Regarding 85C ambient and 20% CPU load: do you know how much power your CPU consumes at this load? It depends heavily not only on CPU load but also on GPU load. With known power consumption and the CPU datasheet, which should specify at least the junction-to-ambient thermal resistance, you can estimate Tj at a specific ambient T. Say the junction-to-ambient thermal resistance is specified as 15C/W (please check) and the CPU consumes 1.5W; then junction T may be as high as 85C + 1.5W * 15C/W = 107.5C…
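
The same estimate as a small Python sketch, in case it helps to play with the numbers. The 15 C/W and 1.5 W figures are only the placeholder values from the paragraph above, not datasheet values, and the function names are made up for this example. The second function inverts the formula to give the power budget that keeps Tj under a chosen limit.

```python
def estimate_junction_temp(ambient_c, power_w, theta_ja_c_per_w):
    """Tj = Ta + P * theta_JA, with theta_JA in degrees C per watt."""
    return ambient_c + power_w * theta_ja_c_per_w


def max_power_for_limit(ambient_c, tj_limit_c, theta_ja_c_per_w):
    """Largest dissipation (W) that keeps Tj at or below tj_limit_c."""
    return (tj_limit_c - ambient_c) / theta_ja_c_per_w


if __name__ == "__main__":
    # Placeholder values from the post; replace with datasheet and measured data.
    ambient_c = 85.0
    power_w = 1.5
    theta_ja = 15.0  # C/W, assumed, please check the datasheet

    tj = estimate_junction_temp(ambient_c, power_w, theta_ja)
    print(f"Estimated Tj: {tj:.1f} C")

    budget = max_power_for_limit(ambient_c, 105.0, theta_ja)
    print(f"Power budget for a 105 C limit: {budget:.2f} W")
```

With these placeholder inputs the estimate lands above the 105C industrial limit, which is the point of the example: at 85C ambient there is very little headroom without extra cooling or a lower power draw.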

I think the specified SOM ambient temperature limits apply to all SOM components used. With the CPU acting as a heater, you are raising the ambient temperature as well, and some components near the CPU may cross their limits… So I think you should try to keep at least CPU Tj within the allowed limits in all conditions (using active cooling / turning off some big consumers like the GPU when too hot / reducing the CPU clock / sleeping and waking periodically to let the CPU cool down, etc.; always easy to say, often hard to implement :) ).

To change critical limit, I think you should add something like this to your iMX8M DT:

&cpu_crit0 {
	temperature = <105000>;
};

Edward

Hi @edwaugh
@Edward made a good point. We can measure and guarantee the SoC’s temperature at an ambient temperature of 85C. Some other components of the SoM can reach a much higher temperature and they could fail.

For your application at an ambient temperature of 85C and a CPU load of 20%, you will need a customised cooling solution (heatsink, sleep cycle, …).

Best regards,
Jaski