I was able to reproduce this issue on BSP 6 by toggling the HPD line, but on BSP 7 I no longer can. I agree with your concern about possible failures and with wanting the app to be robust in any situation. However, I think we must always balance the effort needed to reach a solution against the likelihood of a bad outcome. I’m not saying you have to agree with me on this specific case, but from my perspective we’re dealing with a mostly theoretical failure at this point.
On another point, would it be possible to restart your application to cover the now unlikely possibility of weston crashing again?
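To illustrate what I mean by restarting the application, here is a minimal sketch of a supervisor loop in Python. It assumes the application exits with a non-zero code when it loses its connection to weston, which you would need to confirm for your app. The function name and its parameters are made up for illustration.

```python
import subprocess
import sys
import time


def run_with_restart(cmd, max_restarts=5, delay_s=2.0):
    """Run cmd, restarting it after abnormal exits; return the restart count."""
    restarts = 0
    while True:
        # Blocks until the child process exits.
        # Assumption: the app exits non-zero when the compositor goes away.
        result = subprocess.run(cmd)
        if result.returncode == 0:
            return restarts  # clean exit: stop supervising
        if restarts >= max_restarts:
            return restarts  # give up rather than loop forever
        restarts += 1
        time.sleep(delay_s)  # give the compositor time to come back
```

You would call it with your own command line, for example run_with_restart(["/path/to/your-app"]), where the path is a placeholder. On a systemd-based image, a service unit with Restart=on-failure is the more usual way to get the same behavior.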
I ran some tests on my setup with your script and reproduced the issue 2 or 3 times, each time after the script had been running for a long while, usually more than 1 hour. I configured the system to capture the core dump from weston, and according to it the crash seems to be related to the g2d renderer:
Using host libthread_db library "/usr/lib/libthread_db.so.1".
Core was generated by `weston --backend=drm-backend.so'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x0000007fa3344c90 in ?? () from /usr/lib/libweston-12/g2d-renderer.so
[Current thread is 1 (Thread 0x7fa445d020 (LWP 10786))]
(gdb) bt
#0 0x0000007fa3344c90 in ?? () from /usr/lib/libweston-12/g2d-renderer.so
#1 0x0000007fa383dc78 in ?? () from /usr/lib/libweston-12/drm-backend.so
#2 0x0000007fa383dee0 in ?? () from /usr/lib/libweston-12/drm-backend.so
#3 0x0000007fa41e5d84 in weston_output_disable () from /usr/lib/libweston-12.so.0
#4 0x0000007fa440b4c8 in ?? () from /usr/lib/weston/libexec_weston.so.0
#5 0x0000007fa41dd064 in ?? () from /usr/lib/libweston-12.so.0
#6 0x0000007fa415baa4 in wl_event_loop_dispatch_idle () from /usr/lib/libwayland-server.so.0
#7 0x0000007fa415bc00 in wl_event_loop_dispatch () from /usr/lib/libwayland-server.so.0
#8 0x0000007fa41590f4 in wl_display_run () from /usr/lib/libwayland-server.so.0
#9 0x0000007fa440d174 in wet_main () from /usr/lib/weston/libexec_weston.so.0
#10 0x0000007fa42784f4 in ?? () from /usr/lib/libc.so.6
#11 0x0000007fa42785cc in __libc_start_main () from /usr/lib/libc.so.6
#12 0x0000005580ad07b0 in ?? ()
I remember we had some issues in the past with the g2d renderer in weston (on older BSP versions), and at some point NXP even recommended turning it off. I decided to try disabling it to see whether the issue was still reproducible. I just commented out use-g2d=true in /etc/xdg/weston/weston.ini:
[core]
#use-g2d=true
After that, I couldn’t reproduce the issue anymore, even with the script (or at least it became much harder to hit, since I left the script running for 2 or 3 hours). Could you try this on your side and see if it makes a difference?
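If you want to script the change instead of editing the file by hand, here is a minimal sketch in Python. The helper name disable_g2d is made up for illustration, and the sketch assumes the option appears on a line of its own, as in the default file.

```python
def disable_g2d(ini_text):
    """Comment out any active 'use-g2d=true' line; leave other lines untouched."""
    out = []
    for line in ini_text.splitlines(keepends=True):
        # Normalize spacing so 'use-g2d = true' is matched as well.
        stripped = line.strip().replace(" ", "")
        if stripped.lower() == "use-g2d=true":
            out.append("#" + line.lstrip())  # prefix with the ini comment marker
        else:
            out.append(line)  # already commented or unrelated: keep as-is
    return "".join(out)
```

You would read /etc/xdg/weston/weston.ini, pass its contents through this function, and write the result back. Weston then has to be restarted for the change to take effect.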
Hope you don’t mind me asking, but your FAEs mentioned a potential fix for BSP6 a few times.
Not at all, and that’s true. We had one team member testing a possible solution, but we found that it also reverted the fix for another potential crash related to HDMI connections. I went through it and noticed it was a collection of 3 or 4 patches backported from newer NXP kernel versions.
I can try to get more details at a later point.
Best regards,
Rafael