How to Support Multi-Planar Format in Python V4L2 Applications on i.MX8M Plus

Python V4L2 is a library module that exposes V4L2 ioctls to Python applications. By default, the library contains only the definitions, functions, and V4L2 macros for the single-planar capture method, that is, V4L2 buffers with a single plane. To support image capture and streaming with the V4L2 multi-planar buffer format, the basic definitions related to buffer, type, and memory need to be defined explicitly.
In this blog, you’ll learn how to implement the basic definitions that are missing from the default library module and capture images in the V4L2 multi-planar format.

What is the Multi-Planar Format?

In the Video4Linux2 (V4L2) API, multi-planar formats are used to handle image data where different components (such as color channels) are stored in separate memory planes. This approach contrasts with single-planar formats, where all image data resides in one contiguous memory buffer. Platforms such as the i.MX8M Plus use the multi-planar API to handle video frames via V4L2. Hence, the application must use the multi-planar format to retrieve frames from the camera correctly.

A plane refers to a sub-buffer of the current frame, storing specific components of the image data. For example, in YUV formats, separate planes might store the Y (luminance), U (chrominance blue), and V (chrominance red) components.

The schematic below compares single-planar and multi-planar buffer layouts.

• Single-planar: All image data (e.g., YUV) stored in one contiguous buffer.
• Multi-planar: Image data split across multiple buffers (planes), such as separate buffers for Y, U, and V components.
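To make the two layouts concrete, here is a minimal sketch that slices one contiguous YUV 4:2:0 (I420) buffer into its Y, U, and V components and then holds them as separate buffers, the way a multi-planar device would deliver them. The helper name `split_i420` and the tiny 4x2 frame are illustrative assumptions, not part of V4L2 or of the sample application.

```python
def split_i420(frame: bytes, width: int, height: int):
    """Return (y, u, v) views into one contiguous I420 (YUV 4:2:0) buffer."""
    y_size = width * height
    c_size = (width // 2) * (height // 2)  # each chroma plane is quarter size
    if len(frame) != y_size + 2 * c_size:
        raise ValueError("buffer size does not match an I420 frame")
    view = memoryview(frame)
    y = view[:y_size]
    u = view[y_size:y_size + c_size]
    v = view[y_size + c_size:]
    return y, u, v


# Single-planar layout: one buffer holding Y, then U, then V.
width, height = 4, 2
single = bytes(range(8)) + bytes([100, 101]) + bytes([200, 201])
y, u, v = split_i420(single, width, height)

# Multi-planar layout: the same components held as separate buffers (planes).
planes = [bytes(y), bytes(u), bytes(v)]
print([len(p) for p in planes])
```

Note that a driver may also use the multi-planar API with a single plane per frame, which is what the RAW10 sample in this blog does.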

To support devices that require discontiguous memory buffers for each video frame, V4L2 introduced the multi-planar API. This extension allows applications to manage multiple planes per frame, providing flexibility in handling complex image formats.

The V4L2 API defines specific structures to describe multi-planar formats:

struct v4l2_pix_format_mplane

struct v4l2_pix_format_mplane {
    __u32 width;
    __u32 height;
    __u32 pixelformat;
    __u32 field;
    __u32 colorspace;
    struct v4l2_plane_pix_format plane_fmt[VIDEO_MAX_PLANES];
    __u8 num_planes;
    __u8 reserved[11];
} __attribute__ ((packed));

struct v4l2_plane_pix_format

struct v4l2_plane_pix_format {
    __u32 sizeimage;
    __u16 bytesperline;
    __u16 reserved[7];
} __attribute__ ((packed));

struct v4l2_buffer

struct v4l2_buffer {
    __u32 index;
    __u32 type;
    __u32 bytesused;
    __u32 flags;
    __u32 field;
    struct timeval timestamp;
    struct v4l2_timecode timecode;
    __u32 sequence;

    /* memory location */
    __u32 memory;
    union {
        __u32 offset;
        unsigned long userptr;
        struct v4l2_plane *planes;
    } m;

    __u32 length;
    __u32 reserved2;
    __u32 reserved;
};

These listings follow the original multi-planar API documentation. Newer kernel headers assign some of the reserved bytes to additional fields, so check linux/videodev2.h for the exact layout on your kernel.
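As a rough illustration of how these C structures map to Python, the sketch below mirrors the two format structures with ctypes and checks their sizes. The class names `PlanePixFormat` and `PixFormatMplane` are hypothetical, and the field layout assumes the older documented form with an 11-byte reserved tail (newer kernels widen `bytesperline` and name some reserved bytes, but the total sizes should stay the same).

```python
import ctypes

VIDEO_MAX_PLANES = 8


class PlanePixFormat(ctypes.Structure):
    # Natural alignment already matches the packed C layout here,
    # so no explicit packing is needed.
    _fields_ = [
        ("sizeimage", ctypes.c_uint32),
        ("bytesperline", ctypes.c_uint16),
        ("reserved", ctypes.c_uint16 * 7),
    ]


class PixFormatMplane(ctypes.Structure):
    _fields_ = [
        ("width", ctypes.c_uint32),
        ("height", ctypes.c_uint32),
        ("pixelformat", ctypes.c_uint32),
        ("field", ctypes.c_uint32),
        ("colorspace", ctypes.c_uint32),
        ("plane_fmt", PlanePixFormat * VIDEO_MAX_PLANES),
        ("num_planes", ctypes.c_uint8),
        ("reserved", ctypes.c_uint8 * 11),
    ]


pix_mp = PixFormatMplane(width=4208, height=3124, num_planes=1)
print(ctypes.sizeof(PlanePixFormat), ctypes.sizeof(PixFormatMplane))
print(PixFormatMplane.width.offset, PixFormatMplane.height.offset,
      PixFormatMplane.pixelformat.offset, PixFormatMplane.field.offset)
```

The first four fields sit at offsets 0, 4, 8, and 12, the same positions they occupy at the start of the single-planar `v4l2_pix_format`. That overlap is why code that only sets width, height, pixel format, and field can get away with writing through the single-planar view of the format union.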

Understanding the Python V4L2 Module

In Python, the v4l2 module provides bindings to the Video4Linux2 (V4L2) API, enabling direct interaction with video devices on Linux systems. This allows developers to control and capture video streams from cameras and other video sources programmatically. Frames captured through these V4L2 bindings can then be passed to OpenCV APIs for further processing, such as saving images or streaming. This can improve performance compared with capturing directly through OpenCV APIs.

The standard Python v4l2 module implements only single-planar format support by default. We added multi-planar support to the module and used it to capture frames through the V4L2 interface on the i.MX8M Plus platform.
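Before choosing a buffer type, it can help to check which capture API the driver advertises. The sketch below decodes the capability bitmask returned by VIDIOC_QUERYCAP; the flag values are taken from linux/videodev2.h, while the helper name `capture_api` is an illustrative assumption. It works on plain integers because the Python module's `v4l2_capability` may not expose a `device_caps` field.

```python
# Capability flags from linux/videodev2.h
V4L2_CAP_VIDEO_CAPTURE = 0x00000001
V4L2_CAP_VIDEO_CAPTURE_MPLANE = 0x00001000
V4L2_CAP_STREAMING = 0x04000000
V4L2_CAP_DEVICE_CAPS = 0x80000000


def capture_api(capabilities: int, device_caps: int = 0) -> str:
    """Classify a device as multi-planar, single-planar, or non-capture."""
    # When DEVICE_CAPS is set, device_caps describes this specific node.
    caps = device_caps if capabilities & V4L2_CAP_DEVICE_CAPS else capabilities
    if caps & V4L2_CAP_VIDEO_CAPTURE_MPLANE:
        return "multi-planar"
    if caps & V4L2_CAP_VIDEO_CAPTURE:
        return "single-planar"
    return "none"


print(capture_api(V4L2_CAP_VIDEO_CAPTURE_MPLANE | V4L2_CAP_STREAMING))
print(capture_api(V4L2_CAP_VIDEO_CAPTURE | V4L2_CAP_STREAMING))
```

With the module used in this blog, the first argument would typically be `cp.capabilities` after the QUERYCAP ioctl.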

The block diagram below illustrates the V4L2 multi-planar data flow on the i.MX8M Plus.

Sample Application for RAW10 Capture

Python V4L2 application

import ctypes
import fcntl
import mmap
import os
import struct

import cv2
import numpy as np
import v4l2

VERSION = 2

# Manually define V4L2 constants that may be missing from the module
V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE = 9  # Multi-planar video capture buffer type
VIDEO_MAX_PLANES = 8  # Maximum number of planes per buffer (as in V4L2)
WIDTH = 4208
HEIGHT = 3124

# Set these as attributes of v4l2 if they are missing
if not hasattr(v4l2, 'V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE'):
    setattr(v4l2, 'V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE', V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE)
if not hasattr(v4l2, 'VIDEO_MAX_PLANES'):
    setattr(v4l2, 'VIDEO_MAX_PLANES', VIDEO_MAX_PLANES)

# Define the v4l2_plane structure using ctypes
class v4l2_plane(ctypes.Structure):
    _fields_ = [
        ("bytesused", ctypes.c_uint32),
        ("length", ctypes.c_uint32),
        # union m (8 bytes on 64-bit): m[0] is mem_offset for MMAP buffers;
        # a userptr value would span both words
        ("m", ctypes.c_uint32 * 2),
        ("data_offset", ctypes.c_uint32),
        ("reserved", ctypes.c_uint32 * 11),
    ]

# Define the union inside v4l2_buffer (offset/userptr or planes pointer)
class v4l2_buffer_union(ctypes.Union):
    _fields_ = [
        ("offset", ctypes.c_uint32 * 2),  # Single-planar view (offset/userptr)
        ("planes", ctypes.POINTER(v4l2_plane)),  # Pointer to v4l2_plane array
    ]

# Define the v4l2_buffer structure to include the planes pointer.
# Field sizes assume a 64-bit system, where the structure is 88 bytes.
class v4l2_buffer(ctypes.Structure):
    _fields_ = [
        ("index", ctypes.c_uint32),
        ("type", ctypes.c_uint32),
        ("bytesused", ctypes.c_uint32),
        ("flags", ctypes.c_uint32),
        ("field", ctypes.c_uint32),
        ("timestamp", ctypes.c_long * 2),  # struct timeval: tv_sec, tv_usec
        ("timecode", ctypes.c_uint32 * 4),  # struct v4l2_timecode (16 bytes), simplified
        ("sequence", ctypes.c_uint32),
        ("memory", ctypes.c_uint32),
        ("m", v4l2_buffer_union),
        ("length", ctypes.c_uint32),  # Number of planes for multi-planar buffers
        ("reserved2", ctypes.c_uint32),
        ("reserved", ctypes.c_uint32),
    ]

class V4L2Capture:
    def __init__(self, device='/dev/video3'):
        self.device = device
        self.fd = os.open(device, os.O_RDWR | os.O_NONBLOCK)

        # Initialize the device
        self.init_device()

    def init_device(self):
        cp = v4l2.v4l2_capability()
        fcntl.ioctl(self.fd, v4l2.VIDIOC_QUERYCAP, cp)
        print('Driver:', cp.driver.decode())
        print('Card:', cp.card.decode())
        print('Bus:', cp.bus_info.decode())

        # width, height, pixelformat and field sit at the same offsets in the
        # single-planar (pix) and multi-planar (pix_mp) format structures, so
        # pix is used here in case the module does not expose pix_mp
        fmt = v4l2.v4l2_format()
        fmt.type = v4l2.V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE
        fmt.fmt.pix.width = WIDTH
        fmt.fmt.pix.height = HEIGHT
        fmt.fmt.pix.pixelformat = v4l2.V4L2_PIX_FMT_SGRBG10
        fmt.fmt.pix.field = v4l2.V4L2_FIELD_NONE
        fcntl.ioctl(self.fd, v4l2.VIDIOC_S_FMT, fmt)

        # RAW10 samples are stored in 16-bit words: 2 bytes per pixel
        self.expected_frame_size = WIDTH * HEIGHT * 2

        req = v4l2.v4l2_requestbuffers()
        req.count = 4
        req.type = v4l2.V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE
        req.memory = v4l2.V4L2_MEMORY_MMAP
        fcntl.ioctl(self.fd, v4l2.VIDIOC_REQBUFS, req)
        if req.count == 0:
            raise RuntimeError("Buffer request failed, no buffers allocated.")

        self.buffers = []
        for i in range(req.count):
            buf = v4l2_buffer()
            buf.type = v4l2.V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE
            buf.memory = v4l2.V4L2_MEMORY_MMAP
            buf.index = i
            buf.length = 1  # Number of planes in this buffer

            self.planes = (v4l2_plane * 1)()  # Multi-plane buffer setup
            buf.m.planes = self.planes

            fcntl.ioctl(self.fd, v4l2.VIDIOC_QUERYBUF, buf)

            plane_length = buf.m.planes[0].length
            plane_offset = buf.m.planes[0].m[0]  # mem_offset
            # Pass flags and prot by keyword: positionally, flags comes first
            mapped_plane = mmap.mmap(self.fd, plane_length,
                                     flags=mmap.MAP_SHARED,
                                     prot=mmap.PROT_READ | mmap.PROT_WRITE,
                                     offset=plane_offset)

            self.buffers.append((buf, mapped_plane))

            fcntl.ioctl(self.fd, v4l2.VIDIOC_QBUF, buf)

        buf_type = v4l2.V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE
        try:
            fcntl.ioctl(self.fd, v4l2.VIDIOC_STREAMON, struct.pack('I', buf_type))
        except OSError as e:
            print(f"Error starting stream: {e}")
            self.release()
            raise

    def capture_frame(self):
        buf = v4l2_buffer()
        buf.type = v4l2.V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE
        buf.memory = v4l2.V4L2_MEMORY_MMAP
        buf.m.planes = self.planes
        buf.length = 1

        while True:
            try:
                fcntl.ioctl(self.fd, v4l2.VIDIOC_DQBUF, buf)
                break  # Exit loop if DQBUF succeeds
            except BlockingIOError:
                continue  # No buffer ready yet (non-blocking fd), retry

        if buf.m.planes[0].bytesused == self.expected_frame_size:
            self.buffers[buf.index][1].seek(0)
            plane_data = self.buffers[buf.index][1].read(buf.m.planes[0].bytesused)

            print(f"bytes used {buf.m.planes[0].bytesused} expected frame size:{self.expected_frame_size}. plane_data len:{len(plane_data)}")
            if plane_data and len(plane_data) == self.expected_frame_size:
                frame = np.frombuffer(plane_data, dtype=np.uint8).reshape(HEIGHT, WIDTH, 2)
            else:
                print("Error: Received empty or improper frame data")
                frame = None
        else:
            print("Error: Received framesize does not match the expected size")
            frame = None

        # Hand the buffer back to the driver for the next capture
        fcntl.ioctl(self.fd, v4l2.VIDIOC_QBUF, buf)
        return frame

    def release(self):
        buf_type = v4l2.V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE
        fcntl.ioctl(self.fd, v4l2.VIDIOC_STREAMOFF, struct.pack('I', buf_type))
        for _, mapped_plane in self.buffers:
            mapped_plane.close()
        os.close(self.fd)

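def yuyv_to_rgb(yuyv_frame):
    # Not used by the RAW10 capture below. The conversion code must match
    # the byte order of the source (UYVY here); the result is in BGR order.
    return cv2.cvtColor(yuyv_frame, cv2.COLOR_YUV2BGR_UYVY)

def main():
    cap = V4L2Capture('/dev/video3')

    # Capture two frames and save each as a raw file
    for i in range(1, 3):
        frame = cap.capture_frame()
        if frame is not None:
            with open(f"test_frame-{i:02d}.raw", 'wb') as f:
                f.write(frame.tobytes())
        else:
            print("Failed to capture frame")

    cap.release()

if __name__ == "__main__":
    main()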
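Because the device is opened with O_NONBLOCK, the DQBUF retry loop in capture_frame spins until a buffer is ready. One possible alternative, sketched here under the assumption that the driver supports select() on the video node (as V4L2 capture drivers normally do), is to sleep until the descriptor becomes readable and only then call DQBUF. The helper name `wait_readable` is hypothetical, and a pipe stands in for the camera so the snippet runs without hardware.

```python
import os
import select


def wait_readable(fd: int, timeout_s: float) -> bool:
    """Block until fd is readable or the timeout expires."""
    readable, _, _ = select.select([fd], [], [], timeout_s)
    return bool(readable)


# Demonstration with a pipe standing in for the video device.
read_fd, write_fd = os.pipe()
before = wait_readable(read_fd, 0.05)   # nothing written yet
os.write(write_fd, b"x")
after = wait_readable(read_fd, 0.05)    # data is now pending
os.close(read_fd)
os.close(write_fd)
print(before, after)
```

In the capture class, a call such as `wait_readable(self.fd, 2.0)` placed before the DQBUF ioctl would replace the busy loop and also give a natural place to report a timeout.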

The application defines attributes such as V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE and VIDEO_MAX_PLANES when the module does not provide them. The buffer type is then passed to the S_FMT ioctl and to the buffer and streaming ioctls that follow.

# Set these as attributes of v4l2 if they are missing
if not hasattr(v4l2, 'V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE'):
    setattr(v4l2, 'V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE', V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE)
if not hasattr(v4l2, 'VIDEO_MAX_PLANES'):
    setattr(v4l2, 'VIDEO_MAX_PLANES', VIDEO_MAX_PLANES)
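The same pattern can be wrapped in a small helper when more than a couple of names are missing. This is only a sketch: `add_missing` and `MPLANE_DEFAULTS` are hypothetical names, and a `SimpleNamespace` stands in for the real v4l2 module so the example runs anywhere. The numeric values are the ones defined in linux/videodev2.h.

```python
from types import SimpleNamespace

# Values as defined in linux/videodev2.h
MPLANE_DEFAULTS = {
    "V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE": 9,
    "V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE": 10,
    "VIDEO_MAX_PLANES": 8,
}


def add_missing(module, defaults):
    """Set each default on module unless it already exists; return added names."""
    added = []
    for name, value in defaults.items():
        if not hasattr(module, name):
            setattr(module, name, value)
            added.append(name)
    return added


# Stand-in for a v4l2 module that already knows VIDEO_MAX_PLANES.
fake_v4l2 = SimpleNamespace(VIDEO_MAX_PLANES=8)
added = add_missing(fake_v4l2, MPLANE_DEFAULTS)
print(added)
```

Existing attributes are never overwritten, so the helper stays safe if a newer version of the module starts shipping these definitions.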

The V4L2 buffer structures for the multi-planar format are implemented as well. They are used for querying, mapping, and queuing frame buffers for multi-planar capture.

# Define the v4l2_plane structure using ctypes
class v4l2_plane(ctypes.Structure):
    _fields_ = [
        ("bytesused", ctypes.c_uint32),
        ("length", ctypes.c_uint32),
        # union m (8 bytes on 64-bit): m[0] is mem_offset for MMAP buffers;
        # a userptr value would span both words
        ("m", ctypes.c_uint32 * 2),
        ("data_offset", ctypes.c_uint32),
        ("reserved", ctypes.c_uint32 * 11),
    ]
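The two-word array above is a shortcut for the union inside the C `struct v4l2_plane`. A more literal ctypes mirror, shown here as a sketch with the hypothetical class names `PlaneMem` and `Plane`, names each union member explicitly and lets ctypes work out the size for the host platform.

```python
import ctypes


class PlaneMem(ctypes.Union):
    # Mirrors the union m in struct v4l2_plane
    _fields_ = [
        ("mem_offset", ctypes.c_uint32),  # used with V4L2_MEMORY_MMAP
        ("userptr", ctypes.c_ulong),      # used with V4L2_MEMORY_USERPTR
        ("fd", ctypes.c_int32),           # used with V4L2_MEMORY_DMABUF
    ]


class Plane(ctypes.Structure):
    _fields_ = [
        ("bytesused", ctypes.c_uint32),
        ("length", ctypes.c_uint32),
        ("m", PlaneMem),
        ("data_offset", ctypes.c_uint32),
        ("reserved", ctypes.c_uint32 * 11),
    ]


plane = Plane()
plane.m.mem_offset = 0x1000
print(ctypes.sizeof(Plane), Plane.m.offset, hex(plane.m.mem_offset))
```

On a platform where `unsigned long` is 8 bytes, such as 64-bit Linux on the i.MX8M Plus, the structure comes out at 64 bytes, the same size as the simplified definition used in the sample.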

In the application code, the V4L2 multi-planar structures and attributes that the module does not implement are defined manually. These definitions are used while setting the format, requesting buffers, mapping memory, and capturing the frame. The application dequeues each frame with the DQBUF ioctl and reshapes the data into a frame before saving it as a raw file.
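To inspect a saved raw file, the byte pairs usually need to be reinterpreted as 16-bit samples. The following sketch assumes each 10-bit sample is stored right-aligned in a little-endian 16-bit word, which matches the two-bytes-per-pixel frame size used above but should be confirmed for your sensor driver. The function name `raw10_to_uint8` is hypothetical.

```python
import numpy as np


def raw10_to_uint8(plane_data: bytes, width: int, height: int) -> np.ndarray:
    """Convert unpacked RAW10 (one 16-bit word per pixel) to an 8-bit image."""
    pixels = np.frombuffer(plane_data, dtype="<u2").reshape(height, width)
    # Drop the two least significant bits: 0..1023 becomes 0..255
    return (pixels >> 2).astype(np.uint8)


# Tiny 2x2 synthetic frame instead of a real 4208x3124 capture.
samples = np.array([[0, 512], [1023, 4]], dtype="<u2")
preview = raw10_to_uint8(samples.tobytes(), 2, 2)
print(preview)
```

The result is still a Bayer mosaic, so a demosaicing step (for example with OpenCV) would be needed to obtain a color image.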

e-con Systems Offers i.MX8 Processor Series-Based Cameras

Since 2003, e-con Systems has been designing, developing, and manufacturing OEM camera solutions. We have a powerful ecosystem of partners for i.MX8 System on Module and carrier boards, including Toradex and Variscite. Over the years, we have empowered clients to accelerate their time-to-market with world-class cameras that are powered by the i.MX8 processor series.

See our complete camera portfolio on e-con Systems’ Camera Selector Page.

So, if you need help finding and integrating the right camera into your embedded vision application, please write to us at camerasolutions@e-consystems.com.

Prabu Kumar

Prabu is the Chief Technology Officer and Head of Camera Products at e-con Systems, and comes with a rich experience of more than 15 years in the embedded vision space. He brings to the table a deep knowledge in USB cameras, embedded vision cameras, vision algorithms and FPGAs. He has built 50+ camera solutions spanning various domains such as medical, industrial, agriculture, retail, biometrics, and more. He also comes with expertise in device driver development and BSP development. Currently, Prabu’s focus is to build smart camera solutions that power new age AI based applications.
