Video class#

This page documents the pyneon.Video class, which provides an interface for working with video data in PyNeon. It also documents the pyneon.find_homographies() function, which computes per-frame homography transformations from marker or contour detections and a known layout.

class pyneon.Video(video_file: Path, timestamps: ndarray, info: dict | None = None)#

Bases: object

OpenCV VideoCapture wrapper that pairs a video with frame timestamps and camera metadata.

The video is accessed via the cap property, frame timestamps via timestamps, and camera metadata via info.

Parameters:
video_file : pathlib.Path

Path to the video file on disk.

timestamps : numpy.ndarray

Frame timestamps in nanoseconds. Must match the number of frames.

info : dict or None

Camera metadata, typically including camera_matrix and distortion_coefficients.

Attributes:
info : dict

Camera metadata dictionary.

Methods

close()

Close and release the video handle.

compute_frame_brightness([step, ...])

Compute per-frame mean grayscale brightness.

detect_contour([step, processing_window, ...])

Detect bright rectangular regions (e.g., projected surfaces or monitors) in video frames using luminance-based contour detection.

detect_markers([marker_family, step, ...])

Detect fiducial markers (AprilTag or ArUco) in the video frames.

get(propId)

Query a video property value.

grab()

Grabs the next frame from the video file or capturing device.

isOpened()

Check if the video file is open.

overlay_detections(detections[, show_ids, ...])

Overlay detections on the video frames.

overlay_scanpath(scanpath[, circle_radius, ...])

Overlay scanpath fixations on the video frames.

plot_detections(detections, frame_index[, ...])

Visualize detections on a frame.

plot_frame([frame_index, ax, show])

Plot a frame from the video on a matplotlib axis.

read()

Grabs, decodes and returns the next video frame.

read_frame_at(frame_index)

Read a single frame by index.

release()

Close and release the video handle (alias for close()).

reset()

Reopen the video file and reset to the first frame.

retrieve()

Decodes and returns the grabbed video frame.

set(propId, value)

Set a video property.

timestamp_to_frame_index(timestamp)

Map timestamps to frame indices (nearest frame).

undistort_frame(frame)

Undistort a single frame (color or grayscale).

undistort_video([show, output_path])

Undistort a video using the known camera matrix and distortion coefficients.

property timestamps: ndarray#

Timestamps of the video frames in nanoseconds.

Returns:
numpy.ndarray

Array of timestamps in nanoseconds (Unix time).

property ts: ndarray#

Alias for timestamps for convenience.

Returns:
numpy.ndarray

Array of timestamps in nanoseconds (Unix time).

isOpened() bool#

Check if the video file is open.

Returns:
bool

True if open, False otherwise.

grab() bool#

Grabs the next frame from the video file or capturing device.

Returns:
bool

True if successful, False otherwise.

retrieve() tuple[bool, ndarray | None]#

Decodes and returns the grabbed video frame.

Returns:
success : bool

True if successful, False otherwise.

frame : numpy.ndarray or None

Frame as a 3D BGR array if successful, None otherwise.

read() tuple[bool, ndarray | None]#

Grabs, decodes and returns the next video frame (equivalent to grab() followed by retrieve()).

Returns:
success : bool

True if successful, False otherwise.

frame : numpy.ndarray or None

Frame as a 3D BGR array if successful, None otherwise.

set(propId: int, value: Number) bool#

Set a video property.

Parameters:
propId : int

OpenCV property ID (e.g., cv2.CAP_PROP_POS_FRAMES).

value : Number

New value for the property.

Returns:
bool

True if successful, False otherwise.

Warning

Seeking frames with cv2.CAP_PROP_POS_FRAMES is unreliable for variable frame rate (VFR) videos. Neon recordings typically have non-constant FPS, making direct seeking unpredictable. See opencv/opencv#9053.

For safe random frame access, use read_frame_at() instead, which maintains internal frame counting and handles VFR timing correctly:

# Good: Safe frame reading in VFR videos
frame = video.read_frame_at(100)

# Risky: May not work reliably on VFR videos
video.set(cv2.CAP_PROP_POS_FRAMES, 100)
ret, frame = video.read()
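To see why, consider synthetic VFR timestamps in which every tenth inter-frame gap is doubled (numbers are illustrative, not from a real recording): a constant-FPS seek and a timestamp-based lookup land on different frames.

```python
import numpy as np

# Illustrative VFR timestamps: nominal 30 FPS, but every 10th
# inter-frame gap is twice as long.
rng = np.arange(300)
gaps = np.where(rng % 10 == 9, 2 / 30, 1 / 30)          # seconds per gap
ts = np.concatenate([[0.0], np.cumsum(gaps[:-1])]) * 1e9  # timestamps in ns

t_target = 5.0  # seconds into the video

# Naive constant-FPS seek (what CAP_PROP_POS_FRAMES effectively assumes):
naive_index = int(round(t_target * 30))

# Timestamp-based lookup (what read_frame_at relies on):
true_index = int(np.argmin(np.abs(ts - t_target * 1e9)))

print(naive_index, true_index)  # the two indices disagree
```

The longer the video and the more irregular the frame intervals, the further the two indices drift apart.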
get(propId: int) float#

Query a video property value.

Parameters:
propId : int

OpenCV property ID (e.g., cv2.CAP_PROP_POS_FRAMES).

Returns:
float

Current value of the property.

property first_ts: int#

First frame timestamp in nanoseconds.

property last_ts: int#

Last frame timestamp in nanoseconds.

property ts_diff: ndarray#

Difference between consecutive timestamps.

property times: ndarray#

Timestamps converted to seconds relative to video start.

property duration: float#

Duration of the video in seconds.

property fps: float#

Frames per second of the video.
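As a sketch of how these timing properties relate (illustrative only; the class derives them from the stored timestamps and video metadata):

```python
import numpy as np

# Synthetic frame timestamps in nanoseconds (Unix-like), ~30 FPS over 3 s
ts = np.arange(1_700_000_000_000_000_000, 1_700_000_003_000_000_001, 33_333_333)

ts_diff = np.diff(ts)            # nanoseconds between consecutive frames
times = (ts - ts[0]) / 1e9       # seconds relative to video start
duration = times[-1]             # duration in seconds
fps = (len(ts) - 1) / duration   # average frames per second

print(len(ts), round(duration, 3), round(fps, 2))
```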

property width: int#

Width of the video frames in pixels.

property height: int#

Height of the video frames in pixels.

property camera_matrix: ndarray#

Camera matrix of the video camera.

property distortion_coefficients: ndarray#

Distortion coefficients of the video camera.

property undistort_cache: tuple[ndarray, ndarray, ndarray]#

Cached undistortion maps and the optimal new camera matrix.

property map1: ndarray#

First undistortion map for use with cv2.remap.

property map2: ndarray#

Second undistortion map for use with cv2.remap.

property undistortion_matrix: ndarray#

Optimal new camera matrix used for undistortion.

property current_frame_index: int#

Current frame index based on the video position.

property cap: VideoCapture#

The underlying OpenCV VideoCapture object.

Access this property for OpenCV operations not covered by PyNeon convenience methods.

For documentation of the cv.VideoCapture API, see: https://docs.opencv.org/master/d8/dfe/classcv_1_1VideoCapture.html

Returns:
cv2.VideoCapture

The underlying video capture object.

Warning

Direct manipulation of this object may interfere with PyNeon’s frame synchronization and state management. Use with caution.

See also

read_frame_at

Recommended method for safe frame reading

reset

Reset video state after direct cap manipulation

Notes

PyNeon provides convenience methods for most common video operations: read_frame_at(), detect_markers(), undistort_frame(). These handle frame indexing, timestamp mapping, and state management automatically. For standard use cases, prefer these over direct cap access.

For advanced OpenCV operations, common access patterns include:

  • video.cap.get(cv2.CAP_PROP_*): Query video properties

  • video.cap.get(cv2.CAP_PROP_FRAME_WIDTH): Get frame width

  • video.cap.grab() and video.cap.retrieve(): Read frames directly

Direct frame seeking via video.cap.set(cv2.CAP_PROP_POS_FRAMES, ...) may break VFR timestamp alignment. Use read_frame_at() instead. After direct cap manipulation, consider calling reset() to resynchronize internal state.

timestamp_to_frame_index(timestamp: int | int64 | ndarray) ndarray#

Map timestamps to frame indices (nearest frame).

Parameters:
timestamp : int or numpy.ndarray

Single timestamp or array of timestamps in nanoseconds.

Returns:
numpy.ndarray

Frame index or array of frame indices (always returned as array).

Raises:
ValueError

If any timestamp is earlier than the first video timestamp or later than the last video timestamp.
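The nearest-frame mapping can be sketched in plain NumPy (an illustration of the documented behavior, not the actual implementation):

```python
import numpy as np

# Illustrative frame timestamps in ns (VFR: gaps are not constant)
ts = np.array([0, 40, 75, 120, 160]) * 1_000_000 + 1_700_000_000_000_000_000

def nearest_frame_index(ts: np.ndarray, query) -> np.ndarray:
    query = np.atleast_1d(np.asarray(query, dtype=np.int64))
    if query.min() < ts[0] or query.max() > ts[-1]:
        raise ValueError("timestamp outside video range")
    # index of the closest frame timestamp for each query value
    return np.argmin(np.abs(ts[None, :] - query[:, None]), axis=1)

# 70 ms after start is closest to the frame at 75 ms (index 2)
print(nearest_frame_index(ts, ts[0] + 70_000_000))
```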

read_frame_at(frame_index: int) ndarray | None#

Read a single frame by index.

For forward jumps, grabs intermediate frames to maintain timestamp alignment in VFR videos. For backward jumps, resets to start.

Parameters:
frame_index : int

Zero-based frame index to read.

Returns:
numpy.ndarray or None

Frame as a 3D array (BGR), or None if the frame cannot be read.

Raises:
ValueError

If frame_index is out of bounds.

Notes

Recommended for all frame access in Neon videos with VFR. Automatically handles frame positioning and timestamp alignment, managing internal state correctly.
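The seek policy can be sketched as pure bookkeeping (illustrative; plan_seek is a hypothetical helper, and the real method performs the grabs via OpenCV):

```python
def plan_seek(current_index: int, target_index: int) -> dict:
    """Sketch of read_frame_at's VFR-safe seek policy.

    Forward jumps grab (and discard) the intermediate frames so the
    internal position stays aligned with the timestamps array; backward
    jumps reset to frame 0 and grab forward from there.
    """
    if target_index >= current_index:
        return {"reset": False, "grabs": target_index - current_index}
    return {"reset": True, "grabs": target_index}  # position is 0 after reset

print(plan_seek(10, 15))  # forward: 5 grabs, no reset
print(plan_seek(15, 10))  # backward: reset, then 10 grabs
```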

reset()#

Reopen the video file and reset to the first frame.

Recreates the internal VideoCapture object. Use this after direct cap manipulation or to restart from the beginning.

Notes

This discards any pending frame operations and reloads from disk. Useful after cap manipulation to resynchronize state.

release()#

Close and release the video handle (alias for close()).

close() None#

Close and release the video handle.

plot_frame(frame_index: int = 0, ax: Axes | None = None, show: bool = True)#

Plot a frame from the video on a matplotlib axis.

Parameters:
frame_index : int

Index of the frame to plot. Defaults to 0.

ax : matplotlib.axes.Axes or None

Axis to plot on. If None, a new figure is created. Defaults to None.

show : bool

Show the figure if True. Defaults to True.

Returns:
fig : matplotlib.figure.Figure

Figure instance containing the plot.

ax : matplotlib.axes.Axes

Axis instance containing the plot.

undistort_video(show: bool = False, output_path: Path | str | None = None) None#

Undistort a video using the known camera matrix and distortion coefficients.

Parameters:
show : bool, optional

Whether to display the undistorted frames while processing. Defaults to False.

output_path : pathlib.Path or str or None, optional

Path to save the undistorted output video. If “default”, saves undistorted_video.mp4 to the derivatives folder under the recording directory. If None, no output video is written.

Returns:
None
undistort_frame(frame: ndarray) ndarray#

Undistort a single frame (color or grayscale).

Parameters:
frame : numpy.ndarray

Input frame as a 2D (grayscale) or 3D (color) array.

Returns:
numpy.ndarray

Undistorted frame with the same shape as the input.

compute_frame_brightness(step: int = 1, processing_window: tuple[int | float, int | float] | None = None, processing_window_unit: Literal['frame', 'time', 'timestamp'] = 'frame')#

Compute per-frame mean grayscale brightness.

Each frame is converted to grayscale and averaged to yield a single brightness value per frame.

Parameters:
step : int, optional

Process every Nth frame. For example, step=5 processes frames 0, 5, 10, 15, …. Defaults to 1 (process all frames).

processing_window : tuple[int | float, int | float] or None

Start and end of the processing window. Interpretation depends on processing_window_unit. Defaults to None (full duration).

processing_window_unit : {“frame”, “time”, “timestamp”}, optional

Unit for values in processing_window. Possible values are:

  • “timestamp”: Unix timestamps in nanoseconds

  • “time”: Seconds relative to video start

  • “frame”: video frame indices (0-based)

Defaults to “frame”.

Returns:
Stream

Stream indexed by timestamp [ns] with a single column brightness containing mean grayscale brightness values.
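The computation itself is a grayscale conversion followed by a global mean. A minimal sketch on a synthetic frame, using OpenCV's standard BGR-to-gray weights in place of cv2.cvtColor:

```python
import numpy as np

def frame_brightness(frame_bgr: np.ndarray) -> float:
    # Grayscale via OpenCV's BGR weights (0.114 B, 0.587 G, 0.299 R),
    # then the global mean as the per-frame brightness value
    weights = np.array([0.114, 0.587, 0.299])
    gray = frame_bgr @ weights
    return float(gray.mean())

# A uniform mid-gray frame: brightness equals the pixel value
frame = np.full((480, 640, 3), 128, dtype=np.uint8)
print(frame_brightness(frame))  # ~128.0 (the weights sum to 1)
```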

detect_markers(marker_family: str | list[str] = '36h11', step: int = 1, processing_window: tuple[int | float, int | float] | None = None, processing_window_unit: Literal['frame', 'time', 'timestamp'] = 'frame', detector_parameters: DetectorParameters | None = None, undistort: bool = False) Stream#

Detect fiducial markers (AprilTag or ArUco) in the video frames.

Parameters:
marker_family : str or list[str], optional

AprilTag family/ArUco dictionary to detect. Accepts a single family string (e.g., ‘36h11’) or a list of families (e.g., [‘36h11’, ‘6x6_250’]). Defaults to ‘36h11’.

step : int, optional

Process every Nth frame. For example, step=5 processes frames 0, 5, 10, 15, …. Defaults to 1 (process all frames).

processing_window : tuple[int | float, int | float] or None

Start and end of the processing window. Interpretation depends on processing_window_unit. Defaults to None (full duration).

processing_window_unit : {“frame”, “time”, “timestamp”}, optional

Unit for values in processing_window. Possible values are:

  • “timestamp”: Unix timestamps in nanoseconds

  • “time”: Seconds relative to video start

  • “frame”: video frame indices (0-based)

Defaults to “frame”.

detector_parameters : cv2.aruco.DetectorParameters, optional

Detector parameters to use for all marker families. If None, a default DetectorParameters instance is created. Defaults to None.

undistort : bool, optional

If True, undistorts frames before detection, which can improve detection performance, then redistorts detected points. Returned coordinates remain in the original (distorted) video frame. Defaults to False.

Returns:
Stream

Stream of detected markers. Each row corresponds to a detected marker in a video frame indexed by “timestamp [ns]” and contains the following columns:

  • frame index: Frame number of the marker detection.

  • marker family: AprilTag family or ArUco dictionary of the detected marker (e.g., “36h11”, “6x6_250”).

  • marker id: ID of the detected marker within its family (e.g., 0, 1, 2).

  • marker name: Full identifier combining family and id (e.g., “36h11_0”, “36h11_1”).

  • top left x [px], top left y [px]: Coordinates of the top-left corner of the detected marker.

  • top right x [px], top right y [px]: Coordinates of the top-right corner of the detected marker.

  • bottom right x [px], bottom right y [px]: Coordinates of the bottom-right corner of the detected marker.

  • bottom left x [px], bottom left y [px]: Coordinates of the bottom-left corner of the detected marker.

  • center x [px], center y [px]: Coordinates of the center of the detected marker.
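How the marker name and center columns relate to the family, id, and corner columns can be sketched as follows (illustrative; the center is taken here as the corner average):

```python
import numpy as np

family, marker_id = "36h11", 3
corners = np.array([  # top-left, top-right, bottom-right, bottom-left [px]
    [100.0, 100.0], [180.0, 102.0], [182.0, 178.0], [98.0, 176.0]
])

marker_name = f"{family}_{marker_id}"      # full identifier, e.g. "36h11_3"
center_x, center_y = corners.mean(axis=0)  # center as the corner average

print(marker_name, center_x, center_y)
```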

See also

detect_contour()

Alternative method to detect rectangular contours instead of fiducial markers.

plot_detections()

Visualize marker detections on video frames.

overlay_detections()

Create a video with detected markers overlaid.

pyneon.find_homographies()

Compute homographies from detections.

detect_contour(step: int = 1, processing_window: tuple[int | float, int | float] | None = None, processing_window_unit: Literal['frame', 'time', 'timestamp'] = 'frame', min_area_ratio: float = 0.01, max_area_ratio: float = 0.98, brightness_threshold: int = 180, adaptive: bool = True, morph_kernel: int = 5, decimate: float = 1.0, mode: str = 'largest', report_diagnostics: bool = False, undistort: bool = False) Stream#

Detect bright rectangular regions (e.g., projected surfaces or monitors) in video frames using luminance-based contour detection.

Parameters:
step : int, optional

Process every Nth frame. For example, step=5 processes frames 0, 5, 10, 15, …. Defaults to 1 (process all frames).

processing_window : tuple[int | float, int | float] or None

Start and end of the processing window. Interpretation depends on processing_window_unit. Defaults to None (full duration).

processing_window_unit : {“frame”, “time”, “timestamp”}, optional

Unit for values in processing_window. Possible values are:

  • “timestamp”: Unix timestamps in nanoseconds

  • “time”: Seconds relative to video start

  • “frame”: video frame indices (0-based)

Defaults to “frame”.

min_area_ratio : float, optional

Minimum contour area relative to frame area. Contours smaller than this ratio are ignored. Default is 0.01 (1 percent of frame area).

max_area_ratio : float, optional

Maximum contour area relative to frame area. Contours larger than this ratio are ignored. Default is 0.98.

brightness_threshold : int, optional

Fixed threshold for binarization when adaptive=False. Default is 180.

adaptive : bool, optional

If True (default), use adaptive thresholding to handle varying illumination across frames.

morph_kernel : int, optional

Kernel size for morphological closing (default 5). Use 0 to disable morphological operations.

decimate : float, optional

Downsampling factor for faster processing (e.g., 0.5 halves resolution). Detected coordinates are automatically rescaled back. Default is 1.0.

mode : {“largest”, “best”, “all”}, optional

Selection mode determining which contours to return per frame:

  • “largest”: Return only the largest valid rectangular contour. Useful when the surface is the outermost bright region. (Default)

  • “best”: Return the contour that most closely resembles a perfect rectangle (lowest corner-angle variance and balanced aspect ratio).

  • “all”: Return all valid rectangular contours (outer and inner overlapping rectangles). Useful when both surface and inner projected content need to be distinguished.

report_diagnostics : bool, optional

If True, includes “area_ratio” and “score” columns in the output. Defaults to False.

undistort : bool, optional

If True, undistorts frames before detection, which can improve detection performance, then redistorts detected points. Returned coordinates remain in the original (distorted) video frame. Defaults to False.

Returns:
Stream

Stream of detected contour coordinates. Each row corresponds to a detected contour in a video frame indexed by “timestamp [ns]” and contains the following columns:

  • frame index: Frame number of the contour detection.

  • contour name: Identifier for the detected contour (e.g., “contour_0”).

  • top left x [px], top left y [px]: Coordinates of the top-left corner of the detected contour.

  • top right x [px], top right y [px]: Coordinates of the top-right corner of the detected contour.

  • bottom right x [px], bottom right y [px]: Coordinates of the bottom-right corner of the detected contour.

  • bottom left x [px], bottom left y [px]: Coordinates of the bottom-left corner of the detected contour.

  • center x [px], center y [px]: Coordinates of the center of the detected contour.

  • area_ratio: Area of the detected contour relative to frame area (if report_diagnostics is True).

  • score: Diagnostic score indicating how closely the detected contour resembles a perfect rectangle (if report_diagnostics is True).
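The area filter can be illustrated with the shoelace formula (a sketch; the actual detection computes contour areas via OpenCV):

```python
import numpy as np

def area_ratio(corners: np.ndarray, frame_w: int, frame_h: int) -> float:
    # Shoelace formula for the polygon area spanned by the 4 corners
    x, y = corners[:, 0], corners[:, 1]
    area = 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))
    return area / (frame_w * frame_h)

# A centered rectangle covering a quarter of a 640x480 frame
corners = np.array([[160, 120], [480, 120], [480, 360], [160, 360]], float)
ratio = area_ratio(corners, 640, 480)

keep = 0.01 <= ratio <= 0.98  # min_area_ratio / max_area_ratio filter
print(round(ratio, 3), keep)
```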

See also

detect_markers()

Alternative method to detect fiducial markers.

plot_detections()

Visualize contour on a video frame.

overlay_detections()

Create a video with the detected contour overlaid.

pyneon.find_homographies()

Compute homographies from detections.

plot_detections(detections: Stream, frame_index: int, show_ids: bool = True, color: str = 'magenta', ax: Axes | None = None, show: bool = True)#

Visualize detections on a frame.

Parameters:
detections : Stream

Stream containing marker or contour detections.

frame_index : int

Frame index to plot.

show_ids : bool

Display detection IDs at their centers. Defaults to True.

color : str

Matplotlib color for overlay. Defaults to “magenta”.

ax : matplotlib.axes.Axes or None

Axis to plot on. If None, a new figure is created. Defaults to None.

show : bool

Show the figure if True. Defaults to True.

Returns:
fig : matplotlib.figure.Figure

Figure instance containing the plot.

ax : matplotlib.axes.Axes

Axis instance containing the plot.

overlay_detections(detections: Stream, show_ids: bool = True, color: tuple[int, int, int] = (255, 0, 255), show_video: bool = False, output_path: Path | str | None = None) None#

Overlay detections on the video frames. The resulting video can be displayed and/or saved.

Parameters:
detections : Stream

Stream containing marker or contour detections.

show_ids : bool

Whether to overlay IDs at their centers when available. Defaults to True.

color : tuple[int, int, int]

BGR color tuple for overlays. Defaults to (255, 0, 255), which is magenta.

show_video : bool, optional

Whether to display the video with overlays in real-time. Press ‘q’ to quit early. Defaults to False.

output_path : pathlib.Path or str or None, optional

Path to save the output video with overlays. If “default”, saves detections.mp4 to the derivatives folder under the recording directory. If None, no output video is written; in that case show_video=True must be set.

Returns:
None
overlay_scanpath(scanpath: DataFrame, circle_radius: int = 10, line_thickness: int = 2, max_fixations: int = 10, show_video: bool = False, output_path: Path | str | None = None) None#

Overlay scanpath fixations on the video frames.

The resulting video can be displayed and/or saved.

Parameters:
scanpath : pandas.DataFrame

DataFrame containing the fixations and gaze data.

circle_radius : int

Radius of the fixation circles in pixels. Defaults to 10.

line_thickness : int or None

Thickness of the lines connecting fixations. If None, no lines are drawn. Defaults to 2.

max_fixations : int

Maximum number of fixations to plot per frame. Defaults to 10.

show_video : bool

Whether to display the video with fixations overlaid. Defaults to False.

output_path : pathlib.Path or str or None

Path to save the video with fixations overlaid. If “default”, saves scanpath.mp4 to the derivatives folder under the recording directory. If None, the video is not saved.

Returns:
None
pyneon.find_homographies(detections: Stream, layout: DataFrame | ndarray, min_markers: int = 2, method: int = 4, ransacReprojThreshold: float = 3.0, maxIters: int = 2000, confidence: float = 0.995) Stream#

Compute a per-frame homography (3x3 matrix) from detections to a surface coordinate system.

Parameters:
detections : Stream

Stream containing per-detection marker/contour coordinates returned by Video.detect_markers() or Video.detect_contour().

layout : pd.DataFrame or np.ndarray

Layout of markers/contour to provide reference surface coordinates for homography computation. The expected format depends on the type of detections:

Marker detections: provide a DataFrame (can be visually checked with pyneon.plot_marker_layout()) with the following columns:

  • marker name: Full marker name (family_id, e.g., “36h11_1”).

  • size: Size of the marker in surface units.

  • center x: x-coordinate of the marker center in surface coordinates.

  • center y: y-coordinate of the marker center in surface coordinates.

Contour detections: provide a 2D numpy array of shape (4, 2) containing the surface coordinates of the contour corners in the following order: top-left, top-right, bottom-right, bottom-left.

min_markers : int, optional

Minimum number of marker detections required in a frame to compute a homography when using marker detections. Frames with fewer detections are skipped. Defaults to 2.

method : int, optional

Method used to compute a homography matrix. The following methods are possible:

  • 0 - a regular method using all the points, i.e., the least squares method

  • cv2.RANSAC - RANSAC-based robust method

  • cv2.LMEDS - Least-Median robust method

  • cv2.RHO - PROSAC-based robust method

Defaults to cv2.LMEDS.

ransacReprojThreshold : float, optional

Maximum allowed reprojection error to treat a point pair as an inlier (used in the RANSAC and RHO methods only). Defaults to 3.0.

maxIters : int, optional

The maximum number of RANSAC iterations. Defaults to 2000.

confidence : float, optional

Confidence level, between 0 and 1. Defaults to 0.995.

Returns:
homographies : Stream

Stream indexed by “timestamp [ns]” with columns “homography (0,0)” through “homography (2,2)”, corresponding to the 9 elements of the estimated 3x3 homography matrix for each retained frame.

Examples

Compute homographies from marker detections:

>>> detections = video.detect_markers("36h11")
>>> layout = pd.DataFrame({
...     "marker name": ["36h11_0", "36h11_1"],
...     "size": [100, 100],
...     "center x": [200, 400],
...     "center y": [200, 200],
... })
>>> homographies = find_homographies(detections, layout)

Compute homographies from contour detections:

>>> detections = video.detect_contour()
>>> layout = np.array([[0, 0], [1, 0], [1, 1], [0, 1]])
>>> homographies = find_homographies(detections, layout)
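A row of the returned Stream can be reshaped back into a 3x3 matrix and applied to a pixel coordinate like this (illustrative values; column order assumed row-major as the “(row,col)” names suggest):

```python
import numpy as np

# One hypothetical output row: the 9 homography columns
row = {
    "homography (0,0)": 2.0, "homography (0,1)": 0.0, "homography (0,2)": 10.0,
    "homography (1,0)": 0.0, "homography (1,1)": 2.0, "homography (1,2)": 20.0,
    "homography (2,0)": 0.0, "homography (2,1)": 0.0, "homography (2,2)": 1.0,
}
H = np.array(
    [row[f"homography ({i},{j})"] for i in range(3) for j in range(3)]
).reshape(3, 3)

# Map a pixel coordinate into surface coordinates (homogeneous divide)
px = np.array([100.0, 50.0, 1.0])
sx, sy, w = H @ px
print(sx / w, sy / w)  # surface coordinates of the pixel
```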