RTSP server crash on i.MX8MP Verdin after 2 streams disconnect (EGL / libdrm / gstreamer issue)

Hello,

I am working on an RTSP streaming application running on a Toradex Verdin i.MX8MP module using the NXP Yocto BSP (imx-gpu-viv for EGL).
The application uses GStreamer with the NXP hardware-accelerated pipeline (vpuenc_h264) to serve multiple H.264 streams via gst-rtsp-server.

Issue:
When two or more RTSP clients are connected and two disconnect (one after the other), the application crashes inside eglTerminate() while tearing down the pipeline.
The backtrace shows that the crash happens inside drmFreeDevice() in libdrm, which is called indirectly from eglTerminate().
In GDB, the device pointer in drmFreeDevice sometimes contains a small integer value (e.g., 0x21) instead of a valid pointer, which triggers an invalid memory access.

The crash only occurs when using gldeinterlace (OpenGL-based deinterlacing) in the pipeline, together with glupload and gldownload; this GL path uses DMABuf buffers. If I switch to CPU-based deinterlacing (the deinterlace element), the crash does not happen.

The application serves 4 cameras, one stream per camera; each client connects to a single stream, and two clients never connect to the same camera stream.
The crash seems related to GstGLDisplayEGL: it looks like eglTerminate is being called with an already-freed or corrupted DRM device pointer, possibly due to improper cleanup of GPU resources in multi-stream scenarios.

In the images, you can see the backtrace for 2 clients disconnecting:

  • First client disconnects: one call to drmFreeDevice

  • Second client disconnects: two calls to drmFreeDevice

Summary of output from tdx-info:
Software summary

Bootloader: U-Boot
Kernel version: 6.6.84-0 #1 SMP PREEMPT Mon Mar 31 08:47:48 UTC 2025

Hardware info

HW model: Toradex Verdin iMX8M Plus on ICOR QuadSD Carrier
Toradex version: 0063 V1.1B
Processor arch: aarch64

Hi, I’m the Toradex AI Assistant.

I might make mistakes. Our support team will step in if needed. If I answered your question, please tick the “Solution” checkbox. For follow-up questions to me, please use @ToradexAI in your post.


EGL/GPU Resource Management Issue with Multi-Stream RTSP

Based on your description, this appears to be a complex issue with EGL/GPU resource management in multi-stream scenarios when using OpenGL-based deinterlacing. While the specific crash you’re experiencing isn’t directly covered in the available sources, I can provide some relevant guidance based on Toradex documentation.

Hardware Acceleration Context

Your Verdin iMX8M Plus module features hardware acceleration capabilities (1). The i.MX 8M Plus SoC includes a Video Processing Unit (VPU) for accelerating video decoding and encoding, and the GC7000 UltraLite 3D Graphics Processing Unit (GPU) from Vivante® (1).

GStreamer Pipeline Recommendations

For i.MX 8M Plus modules, the recommended hardware-accelerated H.264 encoding pipeline is (2):

# gst-launch-1.0 videotestsrc ! videoconvert ! vpuenc_h264 ! rtph264pay config-interval=1 pt=96 ! udpsink host=<host-machine-ip> port=5000

(2)

Debugging Approach

To investigate GStreamer pipeline issues, you can use debugging tools (3):

export GST_DEBUG=2

(3)

This will enable error and warning messages that might provide more insight into the GPU resource management issues you’re experiencing (3).

Alternative Pipeline Considerations

Since you mentioned the crash only occurs with gldeinterlace and not with CPU-based deinterlacing, you might want to explore the available encoders and elements (2):

# gst-inspect-1.0 | grep -e "encoder"
# gst-inspect-1.0 | grep -e "rtp" -e " payloader"

(2)

Known Issues Context

There are documented issues with VPU encoders in certain BSP versions. For example, there have been reported issues with vpuenc_h264 in BSP 7 that were traced to NXP Downstream patches (4). While your issue is different (related to EGL/GPU rather than VPU encoding), it suggests that GPU/VPU resource management can be problematic in multi-stream scenarios.

Recommendations

  1. Pipeline Modification: Consider testing with alternative deinterlacing methods or simplified pipelines to isolate the GPU resource management issue
  2. Debug Output: Enable detailed GStreamer debugging to capture more information about resource allocation/deallocation
  3. Resource Sequencing: Investigate if the order of client disconnections affects resource cleanup

Unfortunately, the specific EGL termination and DRM device pointer corruption issue you’re experiencing isn’t directly addressed in the available documentation. This appears to be a lower-level GPU driver interaction problem that may require investigation at the NXP BSP level.

Best regards, Toradex AI Assistant.

Hi @SeroshD

My (extremely limited) understanding here is that you may need to implement a shared media factory. As it was explained to me, the factory has a property that allows its media to be shared between clients. You can see a Python example in our samples repo, but the same applies to any C code doing something similar.

Hi @drew.tx
Thank you for your answer.
It’s already the case:

static void rtsp_add_endpoint(GstRTSPMountPoints *mounts,
                                   const char *launch_str,
                                   const char *rtsp_ep) {
  GstRTSPMediaFactory *factory = gst_rtsp_media_factory_new();
  gst_rtsp_media_factory_set_launch(factory, launch_str);
  gst_rtsp_media_factory_set_shared(factory, TRUE);
  gst_rtsp_media_factory_set_enable_rtcp(factory, !disable_rtcp);
  gst_rtsp_mount_points_add_factory(mounts, rtsp_ep, factory);
}

With:

GstRTSPServer *server;
GstRTSPMountPoints *mounts;

/* create a server instance */
server = gst_rtsp_server_new();
g_object_set(server, "service", port, NULL);

/* get the mount points for this server, every server has a default object
* that can be used to map uri mount points to media factories */
mounts = gst_rtsp_server_get_mount_points(server);

char launch_str[1024];
char rtsp_ep[32];

snprintf(launch_str, sizeof(launch_str),
         "v4l2src device=/dev/video%i "
         "! glupload ! gldeinterlace method=vfir ! gldownload "
         "! queue2 max-size-buffers=0 "
         "! vpuenc_h264 bitrate=5000 profile=11 level=31 gop-size=30 force-idr=60 "
         "! h264parse config-interval=1 "
         "! rtph264pay name=pay0 pt=96 ",
         cam_id);

snprintf(rtsp_ep, sizeof(rtsp_ep), "/single%i", cam_id);

rtsp_add_endpoint(mounts, launch_str, rtsp_ep);

Where cam_id can be 1, 2, 3 or 4.

Can you try to run these pipelines locally on the device but end them in a fakesink?

This way we can understand if it is an implementation issue or a limitation of the possible concurrency of one of the resources used.

This is the pipeline I used:

gst-launch-1.0 v4l2src device=/dev/video<cam_id> ! glupload ! gldeinterlace method=vfir ! gldownload ! queue2 max-size-buffers=0 ! vpuenc_h264 bitrate=5000 profile=11 level=31 gop-size=30 force-idr=60 ! h264parse config-interval=1 ! rtph264pay name=pay0 pt=96 ! fakesink

With cam_id = 1, 2, 3 or 4

It still works when I stop 2 pipelines.

I think the crash comes from the fact that, within a single RTSP server, the medias may share some resources between them.

I found a workaround:
I set media reusable and shared:

static void on_media_configure(GstRTSPMediaFactory *factory,
                               GstRTSPMedia *media,
                               gpointer user_data)
{
  gst_rtsp_media_set_shared(media, TRUE);
  gst_rtsp_media_set_reusable(media, TRUE);
}

static void rtsp_add_endpoint(GstRTSPMountPoints *mounts,
                              const char *launch_str,
                              const char *rtsp_ep) {
  GstRTSPMediaFactory *factory = gst_rtsp_media_factory_new();
  gst_rtsp_media_factory_set_launch(factory, launch_str);
  gst_rtsp_media_factory_set_shared(factory, TRUE);
  gst_rtsp_media_factory_set_enable_rtcp(factory, !disable_rtcp);
  /* The endpoint string is currently unused in the callback; connect
   * with a destroy notify so the g_strdup() copy is not leaked. */
  g_signal_connect_data(factory, "media-configure",
                        G_CALLBACK(on_media_configure),
                        g_strdup(rtsp_ep), (GClosureNotify) g_free, 0);
  gst_rtsp_mount_points_add_factory(mounts, rtsp_ep, factory);
}

I also applied this commit, which is not yet in the 1.24.0 release of gstreamer1.0.

That’s great to hear. Thanks for reporting back. I’ll pass this on to the dev team in case it makes sense to add that commit to our containers.

1 Like