VideoFile

VideoFile extends File and provides additional methods for working with video files.

VideoFile instances are created when a DataChain is initialized from storage with the type="video" parameter:

import datachain as dc

chain = dc.read_storage("s3://bucket-name/", type="video")

There are additional models for working with video files:

VideoFrame - represents a single frame of a video file, including its video stream index, frame index, and timestamp.
VideoFragment - represents a fragment of a video file.

video_stream_index arguments are zero-based indexes among video streams, matching FFmpeg v:N and PyAV container.streams.video[N] selectors.

VideoFile.get_frame() returns a single frame reference with a timestamp estimated from FPS metadata. Pixel access may seek to a nearby keyframe and decode forward; use VideoFile.get_frames() for sequential access and decoded frame timestamps when available.

These are virtual models that do not create physical files. Instead, they reference data inside the underlying VideoFile. To persist that data, call the model's save method, which writes it locally or uploads it to a storage service.

VideoFile

VideoFile(**kwargs)

Bases: File

A data model for handling video files.

This model inherits from the File model and provides additional functionality for reading video files, extracting video frames, and splitting videos into fragments.

The video_stream_index argument used by video methods is the zero-based index among video streams, matching FFmpeg v:N and PyAV container.streams.video[N] selectors.

Source code in datachain/lib/file.py

def __init__(self, **kwargs):
    super().__init__(**kwargs)
    self._video_info_cache: dict[int, Any] = {}

get_fragment

get_fragment(start: float, end: float) -> VideoFragment

Returns a video fragment from the specified time range.

Parameters:

start (float) –

The start time of the fragment in seconds.
end (float) –

The end time of the fragment in seconds.

Returns:

VideoFragment ( VideoFragment ) –

A Model representing the video fragment.

Source code in datachain/lib/file.py

def get_fragment(self, start: float, end: float) -> "VideoFragment":
    """
    Returns a video fragment from the specified time range.

    Args:
        start (float): The start time of the fragment in seconds.
        end (float): The end time of the fragment in seconds.

    Returns:
        VideoFragment: A Model representing the video fragment.
    """
    if start < 0 or end < 0 or start >= end:
        raise ValueError(
            f"Can't get video fragment for '{self.path}', "
            f"invalid time range: ({start:.3f}, {end:.3f})"
        )

    return VideoFragment(video=self, start=start, end=end)

get_fragments

get_fragments(
    duration: float,
    start: float = 0,
    end: float | None = None,
) -> Iterator[VideoFragment]

Splits the video into multiple fragments of a specified duration.

Parameters:

duration (float) –

The duration of each video fragment in seconds.
start (float, default: 0 ) –

The starting time in seconds (default: 0).
end (float, default: None ) –

The ending time in seconds. If None, the entire remaining video is processed (default: None).

Returns:

Iterator[VideoFragment] –

An iterator yielding video fragments.

Note

If end is not specified, duration will be taken from the video file, which means video metadata needs to be read.

Source code in datachain/lib/file.py

def get_fragments(
    self,
    duration: float,
    start: float = 0,
    end: float | None = None,
) -> "Iterator[VideoFragment]":
    """
    Splits the video into multiple fragments of a specified duration.

    Args:
        duration (float): The duration of each video fragment in seconds.
        start (float): The starting time in seconds (default: 0).
        end (float, optional): The ending time in seconds. If None, the entire
                               remaining video is processed (default: None).

    Returns:
        Iterator[VideoFragment]: An iterator yielding video fragments.

    Note:
        If end is not specified, duration will be taken from the video file,
        which means video metadata needs to be read.
    """
    if duration <= 0:
        raise ValueError("duration must be a positive float")
    if start < 0:
        raise ValueError("start must be a non-negative float")

    if end is None:
        end = self.get_info().duration

    if end < 0:
        raise ValueError("end must be a non-negative float")
    if start >= end:
        raise ValueError("start must be less than end")

    while start < end:
        yield self.get_fragment(start, min(start + duration, end))
        start += duration
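
The stepping logic in the loop above can be illustrated without a real video. The following is a minimal sketch, not part of DataChain: fragment_bounds is a hypothetical helper that reproduces only the boundary arithmetic, so it yields plain (start, end) tuples rather than VideoFragment models.

```python
from typing import Iterator


def fragment_bounds(
    duration: float, start: float, end: float
) -> Iterator[tuple[float, float]]:
    """Yield (start, end) pairs using the same stepping as get_fragments().

    Hypothetical helper for illustration only; it is not a DataChain API.
    """
    if duration <= 0:
        raise ValueError("duration must be a positive float")
    if start < 0 or start >= end:
        raise ValueError("start must be non-negative and less than end")
    while start < end:
        # The final fragment is clipped to `end`, so it may be shorter
        # than `duration`.
        yield (start, min(start + duration, end))
        start += duration


bounds = list(fragment_bounds(duration=4.0, start=0.0, end=10.0))
print(bounds)  # [(0.0, 4.0), (4.0, 8.0), (8.0, 10.0)]
```

Note that the last fragment covers only 2 seconds here, because the range is clipped at end rather than padded.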

get_frame

get_frame(
    frame: int, video_stream_index: int = 0
) -> VideoFrame

Returns a specific video frame by its frame number.

This returns a frame reference without decoding or validating that the frame exists. The returned timestamp is estimated from FPS metadata. Pixel access methods decode the requested frame; use get_frames() for sequential access and decoded frame timestamps when available.

Parameters:

frame (int) –

The frame number to read.
video_stream_index (int, default: 0 ) –

Zero-based index among video streams to read. Defaults to 0.

Returns:

VideoFrame ( VideoFrame ) –

Video frame model.

Source code in datachain/lib/file.py

def get_frame(self, frame: int, video_stream_index: int = 0) -> "VideoFrame":
    """
    Returns a specific video frame by its frame number.

    This returns a frame reference without decoding or validating that the
    frame exists. The returned timestamp is estimated from FPS metadata.
    Pixel access methods decode the requested frame; use ``get_frames()``
    for sequential access and decoded frame timestamps when available.

    Args:
        frame (int): The frame number to read.
        video_stream_index: Zero-based index among video streams to read.
            Defaults to 0.

    Returns:
        VideoFrame: Video frame model.
    """
    if frame < 0:
        raise ValueError("frame must be a non-negative integer")

    from .video import video_frame

    return video_frame(self, frame, video_stream_index=video_stream_index)
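
The text above says the timestamp is estimated from FPS metadata but does not give the formula. A plausible sketch, assuming a constant frame rate and the simple relation timestamp = frame / fps, is shown below. Both the helper name estimate_timestamp and the formula are assumptions for illustration; the actual logic lives in the video_frame helper, which is not shown here.

```python
def estimate_timestamp(frame: int, fps: float) -> float:
    """Estimate a frame's timestamp in seconds, assuming constant FPS.

    Hypothetical helper; the real estimate is computed inside DataChain
    and may differ, for example for variable-frame-rate streams.
    """
    if frame < 0:
        raise ValueError("frame must be a non-negative integer")
    if fps <= 0:
        # Video.fps defaults to -1.0 when unknown, so no estimate is possible.
        raise ValueError("fps must be positive to estimate a timestamp")
    return frame / fps


print(estimate_timestamp(frame=60, fps=30.0))  # 2.0
```

Because such an estimate ignores variable frame rates, the decoded timestamps from get_frames() are the better choice when accuracy matters.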

get_frames

get_frames(
    start: int = 0,
    end: int | None = None,
    step: int = 1,
    video_stream_index: int = 0,
) -> Iterator[VideoFrame]

Returns video frames from the specified range in the video.

Parameters:

start (int, default: 0 ) –

The starting frame number (default: 0).
end (int, default: None ) –

The ending frame number (exclusive). If None, frames are read until the end of the video (default: None).
step (int, default: 1 ) –

The interval between frames to read (default: 1).
video_stream_index (int, default: 0 ) –

Zero-based index among video streams to read. Defaults to 0.

Returns:

Iterator[VideoFrame] –

An iterator yielding video frames.

Note

If end is not specified, the number of frames is taken from the video file, which means video metadata needs to be read.

Source code in datachain/lib/file.py

def get_frames(
    self,
    start: int = 0,
    end: int | None = None,
    step: int = 1,
    video_stream_index: int = 0,
) -> "Iterator[VideoFrame]":
    """
    Returns video frames from the specified range in the video.

    Args:
        start (int): The starting frame number (default: 0).
        end (int, optional): The ending frame number (exclusive). If None,
                             frames are read until the end of the video
                             (default: None).
        step (int): The interval between frames to read (default: 1).
        video_stream_index: Zero-based index among video streams to read.
            Defaults to 0.

    Returns:
        Iterator[VideoFrame]: An iterator yielding video frames.

    Note:
        If end is not specified, number of frames will be taken from the video file,
        this means video metadata needs to be read.
    """
    from .video import validate_frame_range, video_frames

    start, end, step = validate_frame_range(
        self, start, end, step, video_stream_index=video_stream_index
    )

    yield from video_frames(
        self, start, end, step, video_stream_index=video_stream_index
    )
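
The start, end (exclusive), and step parameters select frame numbers much like Python's range(). The sketch below shows which frame numbers a call would cover under that assumption. frame_indices and its total_frames argument are hypothetical; the real validation happens in validate_frame_range, whose body is not shown here.

```python
from typing import Optional


def frame_indices(
    start: int, end: Optional[int], step: int, total_frames: int
) -> list[int]:
    """List the frame numbers selected by (start, end, step).

    Hypothetical helper: assumes `end=None` falls back to the frame count
    from metadata, passed in here as `total_frames`.
    """
    if start < 0:
        raise ValueError("start must be a non-negative integer")
    if step < 1:
        raise ValueError("step must be a positive integer")
    if end is None:
        end = total_frames
    # `end` is exclusive, matching the parameter description above.
    return list(range(start, end, step))


print(frame_indices(0, None, 5, total_frames=12))  # [0, 5, 10]
print(frame_indices(2, 8, 2, total_frames=12))  # [2, 4, 6]
```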

get_info

get_info(video_stream_index: int = 0) -> Video

Retrieves metadata and information about the video file.

Metadata is read through File.open(), so it can stream when caching is disabled. When caching is enabled, opening the file may populate the local cache first.

Parameters:

video_stream_index (int, default: 0 ) –

Zero-based index among video streams to inspect. Defaults to 0.

Returns:

Video ( Video ) –

A Model containing video metadata such as duration, resolution, frame rate, and codec details.

Source code in datachain/lib/file.py

def get_info(self, video_stream_index: int = 0) -> "Video":
    """
    Retrieves metadata and information about the video file.

    Metadata is read through ``File.open()``, so it can stream when caching
    is disabled. When caching is enabled, opening the file may populate the
    local cache first.

    Args:
        video_stream_index: Zero-based index among video streams to inspect.
            Defaults to 0.

    Returns:
        Video: A Model containing video metadata such as duration,
               resolution, frame rate, and codec details.
    """
    from .video import video_info

    if video_stream_index not in self._video_info_cache:
        self._video_info_cache[video_stream_index] = video_info(
            self, video_stream_index=video_stream_index
        )
    return self._video_info_cache[video_stream_index]
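
The source above caches one metadata result per stream index, so repeated get_info() calls for the same stream do not reread the file. The same pattern is sketched below in isolation. InfoCache and fake_loader are hypothetical stand-ins for the VideoFile instance and the video_info helper.

```python
from typing import Any, Callable


class InfoCache:
    """Cache one result per stream index, mirroring get_info() above."""

    def __init__(self, loader: Callable[[int], Any]) -> None:
        self._loader = loader
        self._cache: dict[int, Any] = {}

    def get(self, video_stream_index: int = 0) -> Any:
        # Only call the loader on the first request for a given index.
        if video_stream_index not in self._cache:
            self._cache[video_stream_index] = self._loader(video_stream_index)
        return self._cache[video_stream_index]


calls: list[int] = []


def fake_loader(index: int) -> dict:
    """Stand-in for reading metadata; records each real load."""
    calls.append(index)
    return {"stream": index}


cache = InfoCache(fake_loader)
cache.get(0)
cache.get(0)  # served from the cache, no second load
cache.get(1)
print(calls)  # [0, 1]
```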

save

save(
    destination: str, client_config: dict | None = None
) -> VideoFile

Writes the video file's content to destination and returns a VideoFile for the saved copy.

Source code in datachain/lib/file.py

def save(  # type: ignore[override]
    self,
    destination: str,
    client_config: dict | None = None,
) -> "VideoFile":
    """Writes its content to destination"""
    result = super().save(destination, client_config=client_config)
    vf = VideoFile(**result.model_dump())
    vf._set_stream(self._catalog)
    return vf

VideoFrame

Bases: DataModel

A data model for representing a video frame.

This model holds a reference to a VideoFile and adds a frame attribute that identifies a specific frame within that video. It provides methods for reading the frame as a NumPy array or as image bytes, and for saving it as an image file.

Attributes:

video (VideoFile) –

The video file containing the video frame.
frame (int) –

The frame number referencing a specific frame in the video file.
video_stream_index (int) –

Zero-based index among video streams containing the frame.
timestamp (float) –

Frame timestamp in seconds. Frames returned by VideoFile.get_frame() use FPS metadata. Frames yielded by VideoFile.get_frames() use decoded frame timestamps when available.

get_np

get_np() -> ndarray

Returns a video frame from the video file as a NumPy array.

For seekable constant-FPS streams, this seeks near the requested frame and decodes forward from the previous keyframe. Otherwise, it may decode from the start. Use VideoFile.get_frames() when reading many frames sequentially.

Returns:

ndarray ( ndarray ) –

A NumPy array representing the video frame, in the shape (height, width, channels).

Source code in datachain/lib/file.py

def get_np(self) -> "ndarray":
    """
    Returns a video frame from the video file as a NumPy array.

    For seekable constant-FPS streams, this seeks near the requested frame
    and decodes forward from the previous keyframe. Otherwise, it may decode
    from the start. Use ``VideoFile.get_frames()`` when reading many frames
    sequentially.

    Returns:
        ndarray: A NumPy array representing the video frame,
                 in the shape (height, width, channels).
    """
    from .video import video_frame_np

    return video_frame_np(
        self.video,
        self.frame,
        video_stream_index=self.video_stream_index,
    )

read_bytes

read_bytes(format: str = 'jpg') -> bytes

Returns a video frame from the video file as image bytes.

For seekable constant-FPS streams, this seeks near the requested frame and decodes forward from the previous keyframe. Otherwise, it may decode from the start. Use VideoFile.get_frames() when reading many frames sequentially.

Parameters:

format (str, default: 'jpg' ) –

The desired image format (e.g., 'jpg', 'png'). Defaults to 'jpg'.

Returns:

bytes ( bytes ) –

The encoded video frame as image bytes.

Source code in datachain/lib/file.py

def read_bytes(self, format: str = "jpg") -> bytes:
    """
    Returns a video frame from the video file as image bytes.

    For seekable constant-FPS streams, this seeks near the requested frame
    and decodes forward from the previous keyframe. Otherwise, it may decode
    from the start. Use ``VideoFile.get_frames()`` when reading many frames
    sequentially.

    Args:
        format (str): The desired image format (e.g., 'jpg', 'png').
                      Defaults to 'jpg'.

    Returns:
        bytes: The encoded video frame as image bytes.
    """
    from .video import video_frame_bytes

    return video_frame_bytes(
        self.video,
        self.frame,
        format,
        video_stream_index=self.video_stream_index,
    )

save

save(
    destination: str,
    format: str = "jpg",
    client_config: dict | None = None,
) -> ImageFile

Saves the current video frame as an image file.

For seekable constant-FPS streams, this seeks near the requested frame and decodes forward from the previous keyframe. Otherwise, it may decode from the start. Use VideoFile.get_frames() when reading many frames sequentially.

If destination is a remote path, the image file will be uploaded to remote storage.

Parameters:

destination (str) –

Output directory path or URI (e.g. s3://…, gs://…).
format (str, default: 'jpg' ) –

Image format (e.g., 'jpg', 'png'). Defaults to 'jpg'.
client_config (dict | None, default: None ) –

Optional client configuration (e.g. credentials).

Returns:

ImageFile ( ImageFile ) –

A Model representing the saved image file.

Source code in datachain/lib/file.py

def save(
    self,
    destination: str,
    format: str = "jpg",
    client_config: dict | None = None,
) -> "ImageFile":
    """
    Saves the current video frame as an image file.

    For seekable constant-FPS streams, this seeks near the requested frame
    and decodes forward from the previous keyframe. Otherwise, it may decode
    from the start. Use ``VideoFile.get_frames()`` when reading many frames
    sequentially.

    If ``destination`` is a remote path, the image file will be uploaded
    to remote storage.

    Args:
        destination: Output directory path or URI (e.g. ``s3://…``, ``gs://…``).
        format: Image format (e.g., 'jpg', 'png'). Defaults to 'jpg'.
        client_config: Optional client configuration (e.g. credentials).

    Returns:
        ImageFile: A Model representing the saved image file.
    """
    from .video import save_video_frame

    return save_video_frame(
        self.video,
        self.frame,
        destination,
        format,
        client_config=client_config,
        video_stream_index=self.video_stream_index,
    )

VideoFragment

Bases: DataModel

A data model for representing a video fragment.

This model holds a reference to a VideoFile and adds start and end attributes that delimit a specific fragment within that video. It provides a method for saving the fragment as a separate video file.

Attributes:

video (VideoFile) –

The video file containing the video fragment.
start (float) –

The starting time of the video fragment in seconds.
end (float) –

The ending time of the video fragment in seconds.

save

save(
    destination: str,
    format: str | None = None,
    client_config: dict | None = None,
    timeout: float | None = None,
) -> VideoFile

Saves the video fragment as a new video file.

If destination is a remote path, the video file will be uploaded to remote storage.

Parameters:

destination (str) –

Output directory path or URI (e.g. s3://…, gs://…).
format (str | None, default: None ) –

Output video format (e.g., 'mp4', 'avi'). If None, inferred from the file extension.
client_config (dict | None, default: None ) –

Optional client configuration (e.g. credentials).
timeout (float | None, default: None ) –

FFmpeg subprocess timeout in seconds. If None, a timeout is computed from the fragment duration. Set to 0 to disable.

Returns:

VideoFile ( VideoFile ) –

A Model representing the saved video file.

Source code in datachain/lib/file.py

def save(
    self,
    destination: str,
    format: str | None = None,
    client_config: dict | None = None,
    timeout: float | None = None,
) -> "VideoFile":
    """
    Saves the video fragment as a new video file.

    If ``destination`` is a remote path, the video file will be uploaded
    to remote storage.

    Args:
        destination: Output directory path or URI (e.g. ``s3://…``, ``gs://…``).
        format: Output video format (e.g., 'mp4', 'avi').
                If None, inferred from the file extension.
        client_config: Optional client configuration (e.g. credentials).
        timeout: FFmpeg subprocess timeout in seconds. If None, a timeout is
            computed from the fragment duration. Set to 0 to disable.

    Returns:
        VideoFile: A Model representing the saved video file.
    """
    from .video import save_video_fragment

    return save_video_fragment(
        self.video,
        self.start,
        self.end,
        destination,
        format,
        client_config=client_config,
        timeout=timeout,
    )

Video

Bases: DataModel

A data model representing metadata for a video file.

Attributes:

width (int) –

The width of the video in pixels. Defaults to -1 if unknown.
height (int) –

The height of the video in pixels. Defaults to -1 if unknown.
fps (float) –

The frame rate of the video (frames per second). Defaults to -1.0 if unknown.
duration (float) –

The total duration of the video in seconds. Defaults to -1.0 if unknown.
frames (int) –

The total number of frames in the video. Defaults to -1 if unknown.
format (str) –

The format of the video file (e.g., 'mp4', 'avi'). Defaults to an empty string.
codec (str) –

The codec used for encoding the video. Defaults to an empty string.
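
Because unknown numeric fields use -1 sentinels rather than None, callers may want to check them before doing arithmetic. The sketch below uses a stand-in dataclass, VideoMeta, that only mirrors the documented field names and defaults. It is not the real Video model, and known_duration is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VideoMeta:
    """Stand-in mirroring the documented Video fields and defaults."""

    width: int = -1
    height: int = -1
    fps: float = -1.0
    duration: float = -1.0
    frames: int = -1
    format: str = ""
    codec: str = ""


def known_duration(meta: VideoMeta) -> Optional[float]:
    """Return the duration in seconds, or None for the -1.0 sentinel."""
    return meta.duration if meta.duration >= 0 else None


print(known_duration(VideoMeta()))  # None
print(known_duration(VideoMeta(duration=12.5)))  # 12.5
```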