Qt eglfs and spontaneous reboots

Hi, I built a qt5 image (fsl-image-qt5-validation-imx) for the IMX8QM eval board I was sent.

The build goes fine (Qt is version 5.8). However, any attempt to run a QML application, even a very simple one, causes the board to reboot.

Sometimes a few frames are rendered correctly first. In one instance the simple test application ran for a couple of minutes before rebooting. But 99.9% of the time the board reboots immediately after drawing. I also tried some of the GPU test utilities in /opt. The ones that tried to draw also caused immediate reboots.

I am using the eglfs (EGL/GLES) backend. There are no error messages on the serial console, nor anything in the logs when this happens.

A previous post here (which seems to have been deleted) mentioned a possible issue with the GPU and power starvation, but I have not been able to find any mention of that in the docs I downloaded. Pointers would be very welcome.

If there is any other information I can provide, or tests to perform, please let me know.

If there is some sort of workaround, also please let me know. If this is a known hw problem that cannot be fixed without new hw, I’d like to know that too.

In all other respects the board seems to run fine, and fast, but I need Qt eglfs for my application.

One other thing I should have mentioned - I am using a Dell ST2220T 1920x1080 HDMI monitor with builtin USB touchscreen. Touch works fine.

hi @jtrulson

As I wrote here before, could you check whether the 3.3V power supply is stable while your QML application is running?

Thanks and best regards, Jaski

Hi @jaski.tx, I do not seem to have access to that post - I get a big “Access Denied” Toradex page when I attempt to go to it…

hi @jtrulson

You are right. The post was set to private by the customer. Here is the text I wrote in that post:

This seems to be an issue of too high power consumption. Which carrier board are you using? Could you measure the 3.3V power supply and check whether you get a shutdown when something appears on the screen?

So I came back here and I see a couple of references to this GPU errata, but I cannot seem to find it in any of the docs I’ve downloaded.

Which documents have you downloaded?

Is there any workaround?

As I said before, the problem might be the 3.3V supply. Please check this.

I have tried very simple programs for this testing (TrafficLight, multipointtoucharea), but the real software I’d like to try is much more complex and demanding.

We are still working on the current BSP and alpha hardware; the list of already tested features can be found here. We would be happy if you could share your experience.

Hi @jaski.tx,

This seems to be an issue of too high power consumption. Which carrier board are you using? Could you measure the 3.3V power supply and check whether you get a shutdown when something appears on the screen?

I am using the Ixora v1.1A board that came with the module. As for measuring the 3.3V pin, yes, I will do so when I get a little more time (a couple of days).

To note, sometimes the simple QML programs actually run for a minute or 5. Mostly they die shortly after starting. I will connect a scope to the relevant pins and let you know what I see when the module reboots.

Which documents have you downloaded?

All of the ones on the “Technical Documents (Datasheets, Pinout Designer, block diagram, etc.)” page. Though I noticed that there is now an errata PDF there. Looking at that, I saw no mention of this type of problem - at least not an obvious one, as of this date.

I see a reference to this here but is it related?

Thanks for your response.

@jtrulson

Could you also check if the carrier board is already patched (sensing resistor set to 0R)?

We would be highly interested to know more about this case. The best way to verify whether the board still fails due to overcurrent is to measure the main 3.3V rail, which powers the module. Use an oscilloscope and set a falling-edge trigger at 3.1V. Check whether the voltage drops below this value and please share the scope images with us.

The reference you mentioned is not related to your issue.

I made my thread public. I didn’t mean to hide it from other customers…

Could you also check if the carrier board is already patched (sensing resistor set to 0R)?

I’m sorry, I am not familiar with the details of this hardware. How would I go about determining that?

We would be highly interested to know more about this case. The best way to verify whether the board still fails due to overcurrent is to measure the main 3.3V rail, which powers the module. Use an oscilloscope and set a falling-edge trigger at 3.1V. Check whether the voltage drops below this value and please share the scope images with us.

I assume I can measure this from the X27 header, pin 29?

Thanks!

On pin 29 of X27, you have the switched 3.3V rail (3.3V_SW). The module is sourced by the non-switched rail (3.3V). Even though a dip on the 3.3V rail would probably also show up on 3.3V_SW, it would be better to measure the 3.3V rail directly. The best place to measure the rail is at the capacitors C21, C22, C23, or C24.

@peter.tx

With some digging I was able to find the relevant caps. I first tested with the pins on X27. Seeing no issue there, I then tried measuring at the capacitors. There was no voltage drop there either.

So, it does not appear to be a voltage drop at all. I saw no difference in the traces (triggered at 3.1V) when the target rebooted due to whatever is causing this.
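If the scope can export the trace as time/voltage samples, the 3.1V check can also be repeated offline. This is only a minimal sketch: the `(seconds, volts)` sample format, the function name `find_dips`, and the example trace are assumptions for illustration, not data from this board.

```python
from typing import List, Tuple


def find_dips(samples: List[Tuple[float, float]],
              threshold_v: float = 3.1) -> List[Tuple[float, float, float]]:
    """Return (start_time, end_time, min_voltage) for each run of samples
    that stays below threshold_v. Samples are (seconds, volts) pairs."""
    dips = []
    start = None    # time of the first sample in the current dip
    last_t = None   # time of the most recent sample below the threshold
    lowest = None   # lowest voltage seen in the current dip
    for t, v in samples:
        if v < threshold_v:
            if start is None:
                start, lowest = t, v
            else:
                lowest = min(lowest, v)
            last_t = t
        elif start is not None:
            dips.append((start, last_t, lowest))
            start = None
    if start is not None:
        # The trace ended while still below the threshold.
        dips.append((start, last_t, lowest))
    return dips


# Hypothetical trace: the rail sags below 3.1V for two samples.
trace = [(0.000, 3.30), (0.001, 3.28), (0.002, 3.05),
         (0.003, 2.90), (0.004, 3.29)]
print(find_dips(trace))
```

An empty result for a trace captured across a reboot would match what the scope trigger showed here, with the caveat that a dip shorter than the sample interval would be missed.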

I have attached the simple qml file to this post in case it’s helpful. 90% of the time it will reboot within 10 seconds or so. Though there have been a few occasions where it has run for up to 5 minutes before rebooting. You can touch the screen (if you have a touchscreen) or click the mouse to advance the state faster than the normal 1 second. TrafficLight.qml

It’s about as simple an example as I have. You will need the qmlscene or qml programs installed on the target to run it.
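Since nothing shows up on the serial console or in the logs, one way to record how long the application survived is to launch it from a small wrapper that syncs a heartbeat line to disk every second. This is a rough sketch and assumes Python 3 is available on the target. The log path and the function names are hypothetical. `QT_QPA_PLATFORM` and `QT_QPA_EGLFS_DEBUG` are standard Qt environment variables.

```python
import os
import subprocess
import time


def eglfs_env(base=None):
    """Copy the environment and select the eglfs platform plugin."""
    env = dict(os.environ if base is None else base)
    env["QT_QPA_PLATFORM"] = "eglfs"
    env["QT_QPA_EGLFS_DEBUG"] = "1"  # extra eglfs start-up output
    return env


def run_and_log(qml_file, log_path="/home/root/qml_uptime.log",
                runner="qmlscene"):
    """Start the QML runner and append a synced heartbeat every second,
    so the last line written survives a sudden reboot."""
    proc = subprocess.Popen([runner, qml_file], env=eglfs_env())
    start = time.monotonic()
    with open(log_path, "a") as log:
        while proc.poll() is None:
            log.write(f"{time.monotonic() - start:.1f}s alive\n")
            log.flush()
            os.fsync(log.fileno())  # force the line onto storage
            time.sleep(1.0)
        log.write(f"exited with code {proc.returncode}\n")


# On the target (hypothetical invocation):
# run_and_log("TrafficLight.qml")
```

After a reboot, the last "alive" line in the log gives the approximate run time before the reset.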

Thanks!

One other thing I thought I should mention… Even though the target is not doing anything (after a boot, no apps started, etc) it seems to run pretty hot for a machine which should be mostly idle.

cat /sys/class/thermal/thermal_zone0/temp
68002

(68.002°C).

Is this normal? top shows 99.8% idle.
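The sysfs value is in millidegrees Celsius, so the conversion above is a division by 1000. A minimal sketch for reading it repeatedly, assuming Python 3 on the target (the function names are made up for illustration):

```python
from pathlib import Path


def millidegrees_to_celsius(raw: str) -> float:
    """Convert a sysfs thermal reading such as '68002\\n' to degrees Celsius."""
    return int(raw.strip()) / 1000.0


def read_zone_celsius(zone_dir: str = "/sys/class/thermal/thermal_zone0") -> float:
    """Read the current temperature of one thermal zone."""
    return millidegrees_to_celsius(Path(zone_dir, "temp").read_text())


print(millidegrees_to_celsius("68002\n"))  # 68.002
```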

Just FYI, in case it matters.

Thanks for doing that.

Hi @jtrulson,

Thank you for the additional information. I did not reply to you earlier because I was very busy investigating another case in which rebooting of the Apalis iMX8 was reported. The outcome of the investigation was that the patches we have implemented on the Ixora and Apalis Evaluation Board are faulty. The patches set the current limit to a low value of around 3A, which means some modules already trigger the overcurrent limit at low CPU/GPU load and restart. You can find more information on this issue, including a guide on how to patch it, here.

With this new information, I highly recommend that you modify your carrier board accordingly, even though you have not seen voltage drops on your side.

Regarding the temperature readout: it is indeed relatively high, but I doubt that the temperature monitoring is causing the reboot. In the current BSP, the temperature monitor is set to shut down the module if 127°C is reached in zone0. The monitor will also not reboot the module; it will only shut down the system.
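The configured thresholds can usually be read back from the same sysfs directory, because the Linux thermal framework exposes `trip_point_N_temp` and `trip_point_N_type` files per zone. Whether this BSP exposes them for zone0 is an assumption, and the function name is hypothetical. A minimal Python sketch:

```python
from pathlib import Path


def read_trip_points(zone_dir="/sys/class/thermal/thermal_zone0"):
    """Return a list of (type, degrees_celsius) for every trip point the
    kernel exposes in the given thermal zone directory."""
    zone = Path(zone_dir)
    trips = []
    for temp_file in sorted(zone.glob("trip_point_*_temp")):
        # trip_point_0_temp pairs with trip_point_0_type, and so on.
        type_file = temp_file.with_name(temp_file.name.replace("_temp", "_type"))
        kind = type_file.read_text().strip() if type_file.exists() else "unknown"
        trips.append((kind, int(temp_file.read_text().strip()) / 1000.0))
    return trips


# On the target (hypothetical invocation):
# for kind, celsius in read_trip_points():
#     print(kind, celsius)
```

Comparing the current reading against the "critical" entry shows how much headroom is left before a thermal shutdown.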

Hi @peter.tx

Sorry for the long delay – my boss told me to stop and work on other things for a bit. However, I’ve come back to this issue.

With this new information, I highly recommend that you modify your carrier board accordingly, even though you have not seen voltage drops on your side.

This does seem to solve the problem for me. I’m sure you already knew it would, but I just wanted to close the issue and add a confirmation. Thank you very much for the help and information.

Perfect that it works. Thanks for the feedback.