Create video embeddings
The following table lists the available models for generating video embeddings and their key characteristics:
Model | Description | Dimensions | Clip length | Similarity metric |
---|---|---|---|---|
Marengo-retrieval-2.6 | Use this model to create embeddings that you can use in various downstream tasks | 1024 | 2 to 10 seconds | Cosine similarity |
The “Marengo-retrieval-2.6” video understanding engine generates embeddings for all modalities in the same latent space. This shared space enables any-to-any searches across different types of content.
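Since the table lists cosine similarity as the metric for this model, comparing two embeddings can be sketched as follows. This is a minimal, dependency-free illustration rather than part of the SDK: the function name and the short sample vectors are hypothetical, and real embeddings from this model have 1024 dimensions.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("Vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("Cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical 3-dimensional vectors, used only to keep the example short.
clip_embedding = [0.1, 0.2, 0.3]
query_embedding = [0.2, 0.4, 0.6]
print(cosine_similarity(clip_embedding, query_embedding))
```

Values close to 1 indicate vectors pointing in nearly the same direction, values near 0 indicate unrelated vectors, and values near -1 indicate opposite directions.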
Create video embeddings
To create video embeddings, you must first upload your videos, and the platform must finish processing them. Uploading and processing take time, so creating embeddings is an asynchronous process consisting of three steps:
- Upload and process a video: Create a video embedding task that uploads and processes your video. The platform returns the unique identifier of your task.
- Monitor the status of your video embedding task: Use the unique identifier of your task to check its status periodically until it's completed.
- Retrieve the embeddings: After the video embedding task is completed, retrieve the video embeddings by providing the task identifier.
You can create a single embedding for the entire video, multiple embeddings for specific segments, or both. See the Customize your embeddings section for the parameters that control this behavior and examples of using them.
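The three steps can be chained in one helper, sketched below with the same SDK calls that the procedure on this page uses (create, wait_for_done, and retrieve). The helper name embed_video_from_url is hypothetical, it expects an already-constructed client, and it omits error handling, so treat it as an outline rather than a reference implementation.

```python
from typing import Any, List


def embed_video_from_url(client: Any, video_url: str, poll_seconds: int = 2) -> List[Any]:
    """Run the three steps in order and return the list of segment embeddings.

    `client` is expected to be a TwelveLabs client, for example
    TwelveLabs(api_key="<YOUR_API_KEY>"). This helper is illustrative only.
    """
    # Step 1: create the task, which uploads and processes the video.
    task = client.embed.task.create(
        engine_name="Marengo-retrieval-2.6",
        video_url=video_url,
    )

    # Step 2: wait until processing finishes, printing the status on each check.
    task.wait_for_done(
        sleep_interval=poll_seconds,
        callback=lambda t: print(f"  Status={t.status}"),
    )

    # Step 3: retrieve the embeddings for the completed task.
    task = task.retrieve()
    if task.video_embedding is None or task.video_embedding.segments is None:
        return []
    return list(task.video_embedding.segments)
```

Passing the client in as an argument keeps the API key handling outside the helper.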
Prerequisites
- You’re familiar with the concepts that are described on the Platform overview page.
- You have an API key. To retrieve your API key, navigate to the API Key page and log in with your credentials. Then, select the Copy icon to the right of your API key to copy it to your clipboard.
- The videos for which you wish to generate embeddings must meet the following requirements:
  - Video resolution: Must be at least 480x360 or 360x480, and not exceed 4K (3840x2160).
  - Video and audio formats: The video files must be encoded in the video and audio formats listed on the FFmpeg Formats Documentation page. For videos in other formats, contact us at [email protected].
  - Duration: Must be between 4 seconds and 2 hours (7,200s).
  - File size: Must not exceed 2 GB.
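You may want to check a video against these limits before uploading it. The sketch below assumes you have already extracted the metadata with a tool of your choice. The VideoInfo class and check_requirements function are hypothetical, the 2 GB limit is read as 2 GiB, and the 4K limit is assumed to apply in either orientation. The format requirement is not checked.

```python
from dataclasses import dataclass
from typing import List

MAX_FILE_SIZE_BYTES = 2 * 1024**3  # Assumption: "2 GB" is read as 2 GiB.


@dataclass
class VideoInfo:
    """Hypothetical container for metadata you have already extracted."""
    width: int
    height: int
    duration_sec: float
    size_bytes: int


def check_requirements(video: VideoInfo) -> List[str]:
    """Return a list of problems; an empty list means none were found."""
    problems = []
    long_side = max(video.width, video.height)
    short_side = min(video.width, video.height)
    # At least 480x360 (landscape) or 360x480 (portrait).
    if long_side < 480 or short_side < 360:
        problems.append("resolution is below 480x360 / 360x480")
    # No larger than 4K (3840x2160); assumed to apply in either orientation.
    if long_side > 3840 or short_side > 2160:
        problems.append("resolution exceeds 4K (3840x2160)")
    if not 4 <= video.duration_sec <= 7200:
        problems.append("duration is outside 4 seconds to 2 hours")
    if video.size_bytes > MAX_FILE_SIZE_BYTES:
        problems.append("file size exceeds 2 GB")
    return problems


print(check_requirements(VideoInfo(1280, 720, 95.0, 50_000_000)))  # prints []
print(check_requirements(VideoInfo(320, 240, 2.0, 50_000_000)))
```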
Procedure
Follow the steps in the sections below to create video embeddings.
1. Upload and process a video
Use the create method of the embed.task object to create a video embedding task. This method processes a single video and generates embeddings based on the specified parameters. It takes the following parameters:
- The name of the video understanding engine to be used. The examples in this section use "Marengo-retrieval-2.6".
- The video you want to upload, provided as a local file or a publicly accessible URL.
- (Optional) Any additional parameters for customizing the timing and length of your embeddings. See the Customize your embeddings section below for details.
Upload a video from a publicly accessible URL
Notes:
- The platform supports uploading video files that can play without additional user interaction or custom video players. Ensure your URL points to the raw video file, not a web page containing the video. Links to third-party hosting sites, cloud storage services, or videos requiring extra steps to play are not supported.
- YouTube URLs are not supported by the Embed API at this time.
The following example code uploads a video from a publicly accessible URL. Ensure you replace the placeholders surrounded by <> with your values.
from twelvelabs import TwelveLabs
from typing import List
from twelvelabs.models.embed import EmbeddingsTask, SegmentEmbedding

client = TwelveLabs(api_key="<YOUR_API_KEY>")

task = client.embed.task.create(
    engine_name="Marengo-retrieval-2.6",
    video_url="<YOUR_VIDEO_URL>",  # Example: https://sample-videos.com/video321/mp4/720/big_buck_bunny_720p_2mb.mp4
)
print(
    f"Created task: id={task.id} engine_name={task.engine_name} status={task.status}"
)

import { TwelveLabs, EmbeddingsTask, SegmentEmbedding } from 'twelvelabs-js';

const client = new TwelveLabs({ apiKey: '<YOUR_API_KEY>' });

let task = await client.embed.task.create(
  'Marengo-retrieval-2.6',
  {
    url: '<YOUR_VIDEO_URL>' // Example: https://sample-videos.com/video321/mp4/720/big_buck_bunny_720p_2mb.mp4
  }
);
console.log(`Created task: id=${task.id} engineName=${task.engineName} status=${task.status}`);
The output should look similar to the following one:
Created task: id=6659784ff24ade84c6f50e8f engine_name=Marengo-retrieval-2.6 status=processing
Note that the response contains a field named id, which represents the unique identifier of your video embedding task. For a description of each field in the request and response, see the API Reference > Create a video embedding task page.
Upload a video from the local file system
To upload a video from the local file system, provide the video_file parameter, as shown below:

task = client.embed.task.create(
    engine_name="Marengo-retrieval-2.6",
    video_file="<YOUR_VIDEO_PATH>"
)

let task = await client.embed.task.create(
  'Marengo-retrieval-2.6',
  {
    file: '<YOUR_VIDEO_PATH>'
  }
);
console.log(`Created task: id=${task.id} engineName=${task.engineName} status=${task.status}`);
2. Monitor the status of your video embedding task
The Twelve Labs Video Understanding Platform requires some time to process videos. You can retrieve the video embeddings only after the processing is complete. To monitor the status of your video embedding task, call the wait_for_done method of the task object with the following parameters:
- sleep_interval: A number specifying the time interval, in seconds, between successive status checks. In this example, the method checks the status every two seconds. Adjust this value to control how frequently the method checks the status.
- callback: A callback function that the SDK executes each time it checks the status. In this example, on_task_update is the callback function. Note that the callback function takes a parameter of type EmbeddingsTask. Use this parameter to display the status of your video processing task.

def on_task_update(task: EmbeddingsTask):
    print(f"  Status={task.status}")

status = task.wait_for_done(
    sleep_interval=2,
    callback=on_task_update
)
print(f"Embedding done: {status}")

const status = await task.waitForDone(2000, (task: EmbeddingsTask) => {
  console.log(`  Status=${task.status}`);
});
console.log(`Embedding done: ${status}`);
The output should look similar to the following one:
Status=processing
Status=processing
Status=ready
After a video has been successfully uploaded and processed, the task object contains, among other information, a field named id, representing the unique identifier of your video embedding task. For a description of each field in the response, see the API Reference > Retrieve the status of a video embedding task page.
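Conceptually, waiting for a task is a polling loop. The sketch below shows the general pattern and is not the SDK's implementation. The fetch_status callable is a hypothetical stand-in for a status request, "ready" is taken from the sample output above, and the "failed" status and the timeout are assumptions added for illustration.

```python
import time
from typing import Callable, Optional, Tuple


def poll_until_done(
    fetch_status: Callable[[], str],
    sleep_interval: float = 2.0,
    on_update: Optional[Callable[[str], None]] = None,
    timeout: float = 600.0,
    terminal_statuses: Tuple[str, ...] = ("ready", "failed"),
) -> str:
    """Call fetch_status repeatedly until it returns a terminal status."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if on_update is not None:
            on_update(status)  # Report every check, including the last one.
        if status in terminal_statuses:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Task still '{status}' after {timeout} seconds")
        time.sleep(sleep_interval)


# Simulated status source standing in for a real status request.
statuses = iter(["processing", "processing", "ready"])
result = poll_until_done(
    lambda: next(statuses),
    sleep_interval=0.01,
    on_update=lambda s: print(f"  Status={s}"),
)
print(f"Embedding done: {result}")
```

A longer interval reduces the number of status requests at the cost of noticing completion later.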
3. Retrieve the embeddings
Once the platform has finished processing your video, you can retrieve the embeddings by invoking the retrieve method of the task object:
def print_segments(segments: List[SegmentEmbedding]):
    for segment in segments:
        print(
            f"  embedding_scope={segment.embedding_scope} start_offset_sec={segment.start_offset_sec} end_offset_sec={segment.end_offset_sec}"
        )
        # Convert each float to a string before joining; joining str(list) would split it into single characters.
        print(f"  embeddings: {', '.join(str(value) for value in segment.embeddings_float)}")

task = task.retrieve()
if task.video_embedding is not None and task.video_embedding.segments is not None:
    print_segments(task.video_embedding.segments)

const printSegments = (segments: SegmentEmbedding[]) => {
  segments.forEach((segment) => {
    console.log(
      `embeddingScope=${segment.embeddingScope} startOffsetSec=${segment.startOffsetSec} endOffsetSec=${segment.endOffsetSec}`
    );
    console.log("embeddings: ", segment.embeddingsFloat);
  });
};

task = await task.retrieve();
if (task.videoEmbedding) {
  if (task.videoEmbedding.segments) {
    printSegments(task.videoEmbedding.segments);
  }
}
Note the following about the response:
- When you use the default behavior of the platform, and no additional parameters are specified, the response should look similar to the following one:
  embedding_scope=clip start_offset_sec=0.0 end_offset_sec=6.0
  embeddings: -0.06261667, -0.012716668, 0.024836386 ...
  embedding_scope=clip start_offset_sec=6.0 end_offset_sec=12.0
  embeddings: -0.050863456, -0.014198959, 0.038503144 ...
  embedding_scope=clip start_offset_sec=156.0 end_offset_sec=160.52
  embeddings: -0.00094736926, -0.010648306, 0.054438476 ...
  In this example response, each object of the video_embeddings array corresponds to a segment and includes the following fields:
  - start_offset_sec: Start time of the segment, in seconds.
  - end_offset_sec: End time of the segment, in seconds.
  - embedding_scope: The value of the embedding_scope field is set to clip. This specifies that the embedding is for a clip.
  - values: An array of floats that represents the embedding.
- When you create embeddings for specific video clips and the entire video simultaneously by setting the value of the video_embedding_scopes parameter to ["clip", "video"], the response should look similar to the following one:
  embedding_scope=clip start_offset_sec=0.0 end_offset_sec=6.0
  embeddings: -0.06261667, -0.012716668, 0.024836386 ...
  embedding_scope=clip start_offset_sec=6.0 end_offset_sec=12.0
  embeddings: -0.050863456, -0.014198959, 0.038503144 ...
  embedding_scope=clip start_offset_sec=156.0 end_offset_sec=160.52
  embeddings: -0.00094736926, -0.010648306, 0.054438476 ...
  embedding_scope=video start_offset_sec=0.0 end_offset_sec=160.52
  embeddings: -0.023929736, -0.012013472, 0.043946236
  Note the following about this example response:
  - The first three embeddings have the embedding_scope field set to clip. Each corresponds to a specific segment of the video you provided.
  - The fourth embedding has the embedding_scope field set to video. This embedding corresponds to the entire video.
For a description of each field in the request and response, see the API Reference > Retrieve video embeddings page.
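When a response mixes clip-level and video-level embeddings, you may want to separate them before further processing. The sketch below assumes only that each segment exposes an embedding_scope attribute, as in the examples above. The split_by_scope helper is hypothetical, and the SimpleNamespace objects stand in for real segments.

```python
from types import SimpleNamespace
from typing import Any, Iterable, List, Optional, Tuple


def split_by_scope(segments: Iterable[Any]) -> Tuple[List[Any], Optional[Any]]:
    """Return (clip_segments, video_segment); video_segment is None if absent."""
    clips = []
    whole_video = None
    for segment in segments:
        if segment.embedding_scope == "video":
            whole_video = segment
        else:
            clips.append(segment)
    return clips, whole_video


# Stand-in objects that mimic the attributes printed in the examples above.
segments = [
    SimpleNamespace(embedding_scope="clip", start_offset_sec=0.0, end_offset_sec=6.0),
    SimpleNamespace(embedding_scope="clip", start_offset_sec=6.0, end_offset_sec=12.0),
    SimpleNamespace(embedding_scope="video", start_offset_sec=0.0, end_offset_sec=160.52),
]
clips, whole_video = split_by_scope(segments)
print(len(clips), whole_video.end_offset_sec)
```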
Customize your embeddings
The default behavior is to create multiple embeddings, each 6 seconds long, for each video. You can modify the default behavior as follows:
- Embedding scope: The optional video_embedding_scopes parameter determines the scope of the generated embeddings. It is an array of strings, and valid values are the following:
  - ["clip"]: Creates embeddings for multiple clips, as specified by the video_start_offset_sec, video_end_offset_sec, and video_clip_length parameters described below. This is the default value.
  - ["clip", "video"]: Creates embeddings for specific video segments and the entire video in a single request.
- Embedding settings: The following optional parameters customize the timing and length of the embeddings:
  - video_start_offset_sec: Specifies the start offset in seconds from the beginning of the video where processing should begin.
  - video_end_offset_sec: Specifies the end offset in seconds from the beginning of the video where processing should end.
  - video_clip_length: Specifies the desired duration in seconds for each clip for which the platform generates an embedding. It can be between 2 and 10 seconds.
Note that the platform automatically truncates video segments shorter than 2 seconds. For example, when a 31-second video is divided into 6-second segments, the final 1-second segment is truncated. This truncation applies only to the last segment, and only if it does not meet the minimum length requirement of 2 seconds.
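The segmentation rule can be sketched as simple arithmetic. The clip_boundaries function below is hypothetical, and it reads "truncated" as "left out of the results", which is an assumption. The boundaries the platform actually returns may differ.

```python
from typing import List, Tuple

MIN_CLIP_SEC = 2.0  # Minimum segment length stated in the note above.


def clip_boundaries(
    start_sec: float, end_sec: float, clip_length: float = 6.0
) -> List[Tuple[float, float]]:
    """Split [start_sec, end_sec] into clips, leaving out a too-short final clip."""
    if not 2 <= clip_length <= 10:
        raise ValueError("clip_length must be between 2 and 10 seconds")
    if end_sec <= start_sec:
        raise ValueError("end_sec must be greater than start_sec")
    clips = []
    index = 0
    while True:
        # Compute each start from the index to avoid accumulating float error.
        clip_start = start_sec + index * clip_length
        if clip_start >= end_sec:
            break
        clip_end = min(clip_start + clip_length, end_sec)
        if clip_end - clip_start >= MIN_CLIP_SEC:
            clips.append((clip_start, clip_end))
        index += 1
    return clips


# A 31-second video with 6-second clips: five full clips, and the
# trailing 1-second remainder is left out.
print(clip_boundaries(0.0, 31.0))
```

The same function models the start and end offsets: clip_boundaries(30.0, 60.0, 5.0) yields six 5-second clips.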
Below are examples of how you can customize the timing and length of your embeddings:
- To split the video into multiple 5-second segments and create an embedding for each:

  task = client.embed.task.create(
      engine_name="Marengo-retrieval-2.6",
      video_url="<YOUR_VIDEO_URL>",
      video_clip_length=5
  )

  let task = await client.embed.task.create(
    'Marengo-retrieval-2.6',
    {
      url: '<YOUR_VIDEO_URL>',
      clipLength: 5
    }
  );

- To split a video into multiple 5-second segments from the 30-second mark to the 60-second mark and create an embedding for each:

  task = client.embed.task.create(
      engine_name="Marengo-retrieval-2.6",
      video_url="<YOUR_VIDEO_URL>",
      video_clip_length=5,
      video_start_offset_sec=30,
      video_end_offset_sec=60,
  )

  let task = await client.embed.task.create(
    'Marengo-retrieval-2.6',
    {
      url: '<YOUR_VIDEO_URL>',
      clipLength: 5,
      startOffsetSec: 30,
      endOffsetSec: 60,
    }
  );

- To create a single embedding for a video segment from the 2-second mark to the 12-second mark:

  task = client.embed.task.create(
      engine_name="Marengo-retrieval-2.6",
      video_url="<YOUR_VIDEO_URL>",
      video_start_offset_sec=2,
      video_end_offset_sec=12,
      video_embedding_scopes=["video"]
  )

  let task = await client.embed.task.create(
    'Marengo-retrieval-2.6',
    {
      url: '<YOUR_VIDEO_URL>',
      startOffsetSec: 2,
      endOffsetSec: 12,
      scopes: ['video']
    }
  );

- To split the video into multiple 6-second segments and create embeddings for each segment as well as the entire video, set the value of the video_embedding_scopes parameter to ["clip", "video"]:

  task = client.embed.task.create(
      engine_name="Marengo-retrieval-2.6",
      video_url="<YOUR_VIDEO_URL>",
      video_embedding_scopes=["clip", "video"]
  )

  let task = await client.embed.task.create(
    'Marengo-retrieval-2.6',
    {
      url: '<YOUR_VIDEO_URL>',
      scopes: ['clip', 'video']
    }
  );
Retrieve embeddings for indexed videos
The platform allows you to retrieve embeddings for videos you've already uploaded and indexed. The embeddings are generated using video scene detection, which segments videos into semantically meaningful parts by identifying the boundaries between scenes. A scene is defined as a series of frames depicting a continuous action or theme. Each segment is between 2 and 10 seconds long.
Prerequisites
Your video must be indexed with the Marengo video understanding engine version 2.6 or later. For details on enabling this engine for an index, see the Create indexes page.
Procedure
Call the retrieve method of the index.video object with the following parameters:
- index_id: The unique identifier of your index.
- video_id: The unique identifier of your video (the Python example in this section passes it as id).
- embed: Set this parameter to True to retrieve the embeddings.
from twelvelabs import TwelveLabs
from typing import List
from twelvelabs.models.embed import SegmentEmbedding

def print_segments(segments: List[SegmentEmbedding], max_elements: int = 5):
    for segment in segments:
        print(
            f"  embedding_scope={segment.embedding_scope} start_offset_sec={segment.start_offset_sec} end_offset_sec={segment.end_offset_sec}"
        )
        print(f"  embeddings: {segment.embeddings_float[:max_elements]}")

client = TwelveLabs(api_key="<YOUR_API_KEY>")

video = client.index.video.retrieve(
    index_id="<YOUR_INDEX_ID>", id="<YOUR_VIDEO_ID>", embed=True)
if video.embedding:
    print(f"Engine_name={video.embedding.engine_name}")
    print("Embeddings:")
    print_segments(video.embedding.video_embedding.segments)

import { TwelveLabs, SegmentEmbedding } from "twelvelabs-js";

const printSegments = (segments: SegmentEmbedding[], maxElements = 5) => {
  segments.forEach((segment) => {
    console.log(
      `  embedding_scope=${segment.embeddingScope} start_offset_sec=${segment.startOffsetSec} end_offset_sec=${segment.endOffsetSec}`
    );
    console.log(
      "  embeddings: ",
      segment.embeddingsFloat.slice(0, maxElements)
    );
  });
};

const client = new TwelveLabs({ apiKey: "<YOUR_API_KEY>" });

const video = await client.index.video.retrieve(
  "<YOUR_INDEX_ID>",
  "<YOUR_VIDEO_ID>",
  { embed: true }
);
if (video.embedding) {
  console.log(`Engine name: ${video.embedding.engineName}`);
  console.log("Embeddings:");
  printSegments(video.embedding.videoEmbedding.segments);
}
For details about each field in the response, see the Retrieve video information page.