Transform raw video into structured, timestamped data. Define the types of segments you want to detect and the fields you want to extract, such as editorial narratives, sports plays, speaker changes, or brand appearances, and the platform automatically identifies segment boundaries and returns custom metadata for each segment in JSON format.
Key features:
Use cases:
On the Free plan, video segmentation hours count toward a shared limit that also covers indexing. The number of segment definitions does not affect this limit. On paid plans, you pay based on how much video you process and how many segment definitions you include. See the Frequently asked questions page for examples.
For details on how your usage is measured and billed, see the Pricing page.
string, boolean, number, integer, or array), and a description that specifies what to extract. You can define up to 20 fields per segment definition.
media_sources array, then reference them by name in your description using angle brackets (Example: <@product_logo>). You can attach up to 4 media sources per segment definition.
This guide shows how to define segment definitions, create an asynchronous analysis task with Pegasus 1.5, and parse the timestamped metadata from the results.
The examples in this section show different segment definitions for common use cases.
Detect scene changes and extract descriptive metadata for each scene, including camera angles and activities.
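As an illustration of the scene-change use case, the following sketch builds a segment definition as a plain Python dictionary and checks it against the limits stated in this guide (allowed field types and at most 20 fields). The key names `name` and `fields`, the identifier `scene_changes`, the field contents, and the helper `check_definition` are assumptions made for this example, not confirmed request syntax; consult the API reference for the exact schema.

```python
# Limits taken from this guide; everything else below is illustrative.
ALLOWED_TYPES = {"string", "boolean", "number", "integer", "array"}
MAX_FIELDS = 20

# Hypothetical segment definition for scene changes (key names are assumed).
scene_definition = {
    "id": "scene_changes",
    "description": "A continuous scene. Start a new segment when the "
                   "location or the camera setup changes.",
    "fields": [
        {
            "name": "camera_angle",
            "type": "string",
            "description": "The dominant camera angle in the scene.",
            "enum": ["wide", "medium", "close_up"],
        },
        {
            "name": "activities",
            "type": "array",
            "description": "Short phrases naming the activities shown.",
        },
    ],
}


def check_definition(definition):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    fields = definition.get("fields", [])
    if len(fields) > MAX_FIELDS:
        problems.append(f"{len(fields)} fields exceeds the limit of {MAX_FIELDS}")
    for field in fields:
        if field.get("type") not in ALLOWED_TYPES:
            problems.append(
                f"field {field.get('name')!r} has unsupported type {field.get('type')!r}"
            )
    return problems


print(check_definition(scene_definition))
```

Running a local check like this before submitting a task can catch simple mistakes without spending processing time on a request that would be rejected.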
To provide visual context for segment detection, add a media_sources array with up to 4 images to your segment definition and reference them by name in your description using angle brackets (Example: <@product_logo>).
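The sketch below shows one way to check, before sending a request, that every `<@name>` reference in a description matches an attached media source and that no more than 4 sources are attached. The `name` and `url` keys inside each media source, the example URL, the characters allowed in a reference name, and the helper `unresolved_references` are assumptions for illustration only.

```python
import re

MAX_MEDIA_SOURCES = 4  # limit stated in this guide

# Assumes reference names consist of letters, digits, and underscores.
REFERENCE_PATTERN = re.compile(r"<@([A-Za-z0-9_]+)>")

# Hypothetical segment definition with one attached image.
brand_definition = {
    "id": "brand_appearances",
    "description": "Segments where the logo shown in <@product_logo> is visible.",
    "media_sources": [
        {"name": "product_logo", "url": "https://example.com/logo.png"},
    ],
}


def unresolved_references(definition):
    """Return referenced names that have no matching media source."""
    sources = definition.get("media_sources", [])
    if len(sources) > MAX_MEDIA_SOURCES:
        raise ValueError(f"at most {MAX_MEDIA_SOURCES} media sources are allowed")
    known = {source.get("name") for source in sources}
    referenced = REFERENCE_PATTERN.findall(definition.get("description", ""))
    return [name for name in referenced if name not in known]


print(unresolved_references(brand_definition))
```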
The result.data field is a JSON-encoded string. Parse it with json.loads() in Python or JSON.parse() in Node.js before accessing the data. Every response follows this general structure:
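A minimal parsing sketch in Python follows. It assumes that each top-level key is a segment definition `id` that maps to a list of segments, and that each segment carries `start_time`, `end_time`, and `metadata`. The sample string, the `scene_changes` identifier, and the helper `iter_segments` are made up for illustration and are not actual platform output.

```python
import json

# Illustrative stand-in for the string found in result.data.
sample_data = json.dumps({
    "scene_changes": [
        {"start_time": 0.0, "end_time": 45.0,
         "metadata": {"camera_angle": "wide"}},
        {"start_time": 45.0, "end_time": 72.5,
         "metadata": {"camera_angle": "close_up"}},
    ]
})


def iter_segments(result_data):
    """Yield (definition_id, start, end, metadata) for every segment."""
    parsed = json.loads(result_data)  # result.data is a JSON-encoded string
    for definition_id, segments in parsed.items():
        for segment in segments:
            yield (
                definition_id,
                segment["start_time"],
                segment["end_time"],
                segment.get("metadata", {}),
            )


for definition_id, start, end, metadata in iter_segments(sample_data):
    print(f"{definition_id}: {start:.1f}s to {end:.1f}s {metadata}")
```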
Note the following about the general structure:
id field from your segment definitions.
start_time and end_time fields are in seconds (Example: 45.0).
metadata object for each segment.
These responses correspond to the examples above, trimmed to the first two segments for brevity:
description field controls what the platform extracts. Be specific about the expected content and format.
enum for categorical fields: When a field has a known set of values, list them with enum so that responses use only those values.
min_segment_duration to avoid very short segments and max_segment_duration to limit how long a single segment can be.
id value becomes the top-level key in your response JSON, so use descriptive identifiers.
finish_reason field in the response is length instead of stop, the JSON may be incomplete. Increase max_tokens or reduce the number of fields and definitions.
Problem: You receive a 400 error when including the prompt parameter.
Cause: The prompt parameter is not supported when the analysis_mode parameter is set to time_based_metadata. Segment definitions and field descriptions serve as instructions instead.
Solution: Remove the prompt parameter from your request. Use the description field in your segment definitions and on individual fields to specify what to extract.
Problem: You receive a validation error about a missing field description.
Cause: Every field in a segment definition requires a description field that specifies what to extract.
Solution: Add a description field to every entry in your fields array:
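The following sketch shows a `fields` array in which every entry has a description, together with a small local check that lists entries lacking one. The `name` key, the field contents, and the helper `fields_missing_description` are hypothetical and serve only to illustrate the fix.

```python
# Hypothetical fields array; every entry carries a description.
fields = [
    {"name": "speaker", "type": "string",
     "description": "The name or role of the person speaking."},
    {"name": "is_interview", "type": "boolean",
     "description": "True if the segment shows an interview."},
]


def fields_missing_description(field_list):
    """Return the names (or positions) of fields without a usable description."""
    missing = []
    for index, field in enumerate(field_list):
        description = field.get("description")
        # Treat absent, non-string, and whitespace-only descriptions as missing.
        if not isinstance(description, str) or not description.strip():
            missing.append(field.get("name", f"<field {index}>"))
    return missing


print(fields_missing_description(fields))
```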
Problem: Calls to the json.loads() function in Python or the JSON.parse() function in Node.js fail when parsing the result.data field.
Cause: The response exceeded max_tokens or the context window. When finish_reason is length, the returned JSON may be incomplete. A warning appears in the error field.
Solution: Increase max_tokens if your response needs more room. You can also reduce the number of segment definitions or fields, clip the video with start_time, end_time, or time_ranges, or split your analysis across multiple requests.
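One defensive pattern is to check `finish_reason` before parsing and to treat a decoding failure as a signal to retry with a larger limit or a smaller request. The sketch below assumes the result is available as a dictionary with `finish_reason` and `data` keys; that shape and the helper `parse_result` are assumptions for illustration, not the SDK's actual object model.

```python
import json


def parse_result(result):
    """Return the parsed data, or None if the output is truncated or invalid."""
    if result.get("finish_reason") == "length":
        # The output hit the length limit, so the JSON may be incomplete.
        return None
    try:
        return json.loads(result["data"])
    except json.JSONDecodeError:
        return None


# Illustrative inputs: one complete result and one cut off mid-object.
complete = {"finish_reason": "stop", "data": '{"scene_changes": []}'}
truncated = {"finish_reason": "length", "data": '{"scene_changes": [{"start_'}

print(parse_result(complete))
print(parse_result(truncated))
```

When `parse_result` returns `None`, apply one of the remedies above before retrying.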