What is the fastest way to capture 4K video?

We are developing an image processing system using a Verdin iMX8M Plus. The kernel device tree is based on imx8mp-verdin-wifi-dev.dts.

The system takes input from a 4K camera over MIPI-CSI and captures frames in a Linux app.
Currently the delay is about 600 msec and the capture rate is 5 fps.
Our goal is a delay of less than 100 msec at 15 fps.

The app uses OpenCV’s VideoCapture to get framebuffer images from /dev/video0. The number of framebuffers is set to 2.

What is the best way to have the Verdin iMX8M Plus capture large images such as 4K over MIPI-CSI?
(Faster methods without opencv are also acceptable.)

The V4L2 status is shown below for reference.

  • v4l2-ctl --all
    root@verdin-imx8mp-06965676:/home/app# v4l2-ctl --all
    Driver Info:
        Driver name      : mxc-isi-cap
        Card type        : mxc-isi-cap
        Bus info         : platform:32e00000.isi:cap_devic
        Driver version   : 5.4.154
        Capabilities     : 0x84201000
            Video Capture Multiplanar
            Extended Pix Format
            Device Capabilities
        Device Caps      : 0x04201000
            Video Capture Multiplanar
            Extended Pix Format
    Media Driver Info:
        Driver name      : mxc-md
        Model            : FSL Capture Media Device
        Serial           :
        Bus info         :
        Media version    : 5.4.154
        Hardware revision: 0x00000000 (0)
        Driver version   : 5.4.154
    Interface Info:
        ID               : 0x03000014
        Type             : V4L Video
    Entity Info:
        ID               : 0x00000012 (18)
        Name             : mxc_isi.0.capture
        Function         : V4L2 I/O
        Pad 0x01000013   : 0: Sink
          Link 0x02000041: from remote pad 0x100000e of entity 'mxc_isi.0': Data, Enabled
    Priority: 2
    Format Video Capture Multiplanar:
        Width/Height      : 3840/2160
        Pixel Format      : 'BGR3' (24-bit BGR 8-8-8)
        Field             : None
        Number of planes  : 1
        Flags             :
        Colorspace        : JPEG
        Transfer Function : Default
        YCbCr/HSV Encoding: Default
        Quantization      : Default
        Plane 0           :
           Bytes per Line : 11520
           Size Image     : 24883200

According to NXP documentation, the i.MX 8 CSI interface can handle a maximum throughput of 1.5 Gbps per lane. For a frame size of 24,883,200 bytes (about 199 Mbit), this translates to a theoretical maximum of approximately 7.5 FPS on a single lane (1.5 Gbit/s ÷ 199 Mbit per frame). However, you haven’t specified how many lanes your camera connection uses.
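To make that estimate easy to repeat for other lane counts or pixel formats, here is a small sketch of the arithmetic. The function name `max_fps` is made up for illustration, and the result ignores CSI-2 protocol overhead and blanking, so treat it as an upper bound rather than an achievable rate.

```cpp
#include <cstdint>
#include <cstdio>

// Theoretical upper bound on frames per second for a CSI link:
// (lanes * per-lane bit rate) / (bits per frame).
// Hypothetical helper; ignores protocol overhead and blanking intervals.
double max_fps(double lane_gbps, int lanes, std::uint64_t frame_bytes) {
    const double link_bits_per_second = lane_gbps * 1e9 * lanes;
    const double frame_bits = static_cast<double>(frame_bytes) * 8.0;
    return link_bits_per_second / frame_bits;
}

// Prints the estimate for the frame size reported by v4l2-ctl (24883200 bytes).
void print_estimates() {
    const std::uint64_t frame_bytes = 24883200;  // "Size Image" from v4l2-ctl --all
    for (int lanes = 1; lanes <= 4; ++lanes) {
        std::printf("%d lane(s): %.2f fps\n", lanes,
                    max_fps(1.5, lanes, frame_bytes));
    }
}
```

With one lane this gives roughly 7.5 fps and with four lanes roughly 30 fps for the same frame size. Note that the data on the CSI bus is in the sensor's output format, which may use fewer bytes per pixel than the BGR3 buffer in RAM.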

Additionally, it’s unclear what is meant by “capturing” in this context. Does it refer to simply acquiring an image from the camera and loading it into RAM, or does it include processing time? The capabilities of your camera regarding image compression also remain unspecified. While compressed images can be transferred more quickly from the camera, decompression and processing might introduce additional delays.

To increase throughput, you could use more CSI lanes if your hardware supports it. Accessing the V4L2 (Video for Linux 2) API directly may also perform better than OpenCV’s VideoCapture, which could add unnecessary overhead, although it requires more code.
If full 4K resolution isn’t absolutely necessary for every task, consider capturing at a lower resolution or downsampling before processing.
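As a rough sketch of what direct V4L2 capture could look like for a multiplanar device such as the one in the output above: the code below uses the standard streaming ioctls with mmap'ed buffers. It assumes a single plane and that the format is already configured on the device. `MappedBuffer`, `xioctl` and `capture_frames` are illustrative names, not part of any library, and this has not been tuned for the Verdin.

```cpp
#include <fcntl.h>
#include <linux/videodev2.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>

#include <cerrno>
#include <cstddef>
#include <vector>

struct MappedBuffer {
    void* start;
    std::size_t length;
};

// ioctl() wrapper that retries when interrupted by a signal.
static int xioctl(int fd, unsigned long request, void* arg) {
    int r;
    do {
        r = ioctl(fd, request, arg);
    } while (r == -1 && errno == EINTR);
    return r;
}

// Dequeues up to `frame_count` frames from `device` using `buffer_count`
// mmap'ed buffers. Returns the number of frames dequeued, or -1 if setup fails.
// Assumption: single-plane multiplanar format, already set on the device.
int capture_frames(const char* device, unsigned buffer_count, int frame_count) {
    int fd = open(device, O_RDWR);
    if (fd < 0) return -1;

    int type = V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE;
    v4l2_requestbuffers req{};
    req.count = buffer_count;
    req.type = type;
    req.memory = V4L2_MEMORY_MMAP;

    std::vector<MappedBuffer> buffers;
    bool ok = xioctl(fd, VIDIOC_REQBUFS, &req) == 0;
    for (unsigned i = 0; ok && i < req.count; ++i) {
        v4l2_plane plane{};
        v4l2_buffer buf{};
        buf.type = type;
        buf.memory = V4L2_MEMORY_MMAP;
        buf.index = i;
        buf.m.planes = &plane;
        buf.length = 1;  // number of planes
        ok = xioctl(fd, VIDIOC_QUERYBUF, &buf) == 0;
        if (!ok) break;
        void* p = mmap(nullptr, plane.length, PROT_READ | PROT_WRITE,
                       MAP_SHARED, fd, plane.m.mem_offset);
        ok = p != MAP_FAILED;
        if (!ok) break;
        buffers.push_back({p, plane.length});
        ok = xioctl(fd, VIDIOC_QBUF, &buf) == 0;  // hand the buffer to the driver
    }
    ok = ok && xioctl(fd, VIDIOC_STREAMON, &type) == 0;

    int captured = 0;
    while (ok && captured < frame_count) {
        v4l2_plane plane{};
        v4l2_buffer buf{};
        buf.type = type;
        buf.memory = V4L2_MEMORY_MMAP;
        buf.m.planes = &plane;
        buf.length = 1;
        if (xioctl(fd, VIDIOC_DQBUF, &buf) < 0) break;  // blocks until a frame is ready
        // Frame data is at buffers[buf.index].start, plane.bytesused bytes long.
        ++captured;
        if (xioctl(fd, VIDIOC_QBUF, &buf) < 0) break;  // return the buffer for reuse
    }

    xioctl(fd, VIDIOC_STREAMOFF, &type);
    for (const MappedBuffer& b : buffers) munmap(b.start, b.length);
    close(fd);
    return ok ? captured : -1;
}
```

A call such as `capture_frames("/dev/video0", 4, 100)` would dequeue 100 frames. Timing the DQBUF call and any copy out of the mapped buffer separately should show where the time goes.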

Thank you, @alex.tx, for your response! I’ll try out the V4L2 API and report back with the results.

Thanks for the support the other day.
I’ve made some progress, so here is an update.

I implemented video capture with the V4L2 API, but the results were almost the same as with OpenCV, with no improvement.

However, implementing it with the V4L2 API made the bottleneck clear.

The bottleneck appears to be copying the data to user space, which is slow because the V4L2 buffer is kept in a non-cached memory region.
In fact, it takes 130 msec to memcpy() image data of size 3840x2160x2 from the mmapped kernel buffer.

The following link discusses the same issue and shows the solution, so I’ll try that one.

After applying the kernel patch here, the delay improved from 130 ms to 10 ms. The same improvement shows up with OpenCV:

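    #include <opencv2/opencv.hpp>
    using namespace cv;
    int main() {
        VideoCapture cap(0);
        if (!cap.isOpened()) return -1;  // camera could not be opened
        Mat frame;
        for (;;) {
            cap >> frame;  // This improved from 130msec to about 10msec.
            if (waitKey(30) >= 0) break;
        }
        return 0;
    }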

Thanks for the confirmation, @p-uchi!