This guide shows how you can create video embeddings.

The following table lists the available models for generating video embeddings and their key characteristics:

Model	Description	Dimensions	Clip length	Similarity metric
Marengo-retrieval-2.7	Use this model to create embeddings that you can use in various downstream tasks	1024	2 to 10 seconds	Cosine similarity

The “Marengo-retrieval-2.7” video understanding model generates embeddings for all modalities in the same latent space. This shared space enables any-to-any searches across different types of content.
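Since the table lists cosine similarity as the metric, comparing embeddings from this shared space can be sketched in plain Python. The helper names `cosine_similarity` and `rank_by_similarity` are illustrative and not part of the TwelveLabs SDK, and the toy 3-dimensional vectors stand in for real 1024-dimensional embeddings.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_by_similarity(
    query: Sequence[float], candidates: Sequence[Sequence[float]]
) -> List[Tuple[int, float]]:
    """Return (candidate index, score) pairs, most similar first."""
    scores = [(i, cosine_similarity(query, c)) for i, c in enumerate(candidates)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy data: a query embedding and three hypothetical clip embeddings.
query = [1.0, 0.0, 0.0]
clips = [[0.0, 1.0, 0.0], [0.9, 0.1, 0.0], [-1.0, 0.0, 0.0]]
ranking = rank_by_similarity(query, clips)
print("Closest clip index:", ranking[0][0])
```

The query can be an embedding of any modality, because all modalities share the same latent space. For large collections, a vector database or a numerical library would likely be a better fit than this pure-Python loop.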

The platform can create a single embedding for the entire video, multiple embeddings for specific segments, or both. By default, it creates multiple embeddings for each video, each covering 6 seconds. You can change this behavior when you upload a video. For details, see the Upload a video section below.

Prerequisites

To use the platform, you need an API key:

1. If you don’t have an account, sign up for a free account.
2. Go to the API Key page.
3. Select the Copy icon next to your key.

Ensure the TwelveLabs SDK is installed on your computer:

$ pip install twelvelabs

The videos you wish to use must meet the following requirements:
- Video resolution: Must be at least 360x360 and must not exceed 3840x2160.
- Aspect ratio: Must be one of 1:1, 4:3, 4:5, 5:4, 16:9, 9:16, or 17:9.
- Video and audio formats: Your video files must be encoded in the video and audio formats listed on the FFmpeg Formats Documentation page. For videos in other formats, contact us at support@twelvelabs.io.
- Duration: Must be between 4 seconds and 2 hours (7,200s).
- File size: Must not exceed 2 GB.
  If you require different options, contact us at support@twelvelabs.io.
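A local pre-flight check against these limits might look like the sketch below. It is not part of the SDK: `VideoInfo` and `check_video` are hypothetical names, the caller is assumed to obtain the dimensions, duration, and file size separately (for example with a media probing tool), 2 GB is read as 2 GiB, the resolution limit is read literally as width and height bounds, and the 1% aspect-ratio tolerance is an assumption. The format requirement is not checked.

```python
from dataclasses import dataclass
from typing import List

MIN_SIDE = 360
MAX_WIDTH, MAX_HEIGHT = 3840, 2160
MIN_DURATION_SEC, MAX_DURATION_SEC = 4, 7200
MAX_FILE_SIZE_BYTES = 2 * 1024 ** 3  # assumption: "2 GB" read as 2 GiB
ALLOWED_RATIOS = [(1, 1), (4, 3), (4, 5), (5, 4), (16, 9), (9, 16), (17, 9)]
RATIO_TOLERANCE = 0.01  # assumption: accept ratios within 1% of a listed one


@dataclass
class VideoInfo:
    """Hypothetical container for properties measured by the caller."""
    width: int
    height: int
    duration_sec: float
    size_bytes: int


def check_video(info: VideoInfo) -> List[str]:
    """Return a list of human-readable problems; empty means no issue found."""
    problems = []
    if info.width < MIN_SIDE or info.height < MIN_SIDE:
        problems.append("resolution is below 360x360")
    if info.width > MAX_WIDTH or info.height > MAX_HEIGHT:
        problems.append("resolution exceeds 3840x2160")
    ratio = info.width / info.height
    if not any(
        abs(ratio - w / h) / (w / h) <= RATIO_TOLERANCE for w, h in ALLOWED_RATIOS
    ):
        problems.append("aspect ratio is not one of the supported values")
    if not MIN_DURATION_SEC <= info.duration_sec <= MAX_DURATION_SEC:
        problems.append("duration is outside 4 seconds to 2 hours")
    if info.size_bytes > MAX_FILE_SIZE_BYTES:
        problems.append("file size exceeds 2 GB")
    return problems


print(check_video(VideoInfo(width=1920, height=1080, duration_sec=90, size_bytes=50_000_000)))
```

The platform remains the authority on whether a video is accepted; a check like this only catches obvious problems before an upload.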

Complete example

This complete example illustrates creating video embeddings. Because the platform needs time to process uploaded videos, embedding creation is asynchronous: you upload a video, wait for the platform to finish processing it, and then retrieve the embeddings. Replace the placeholders surrounded by <> with your values.

from twelvelabs import TwelveLabs
from typing import List
from twelvelabs.models.embed import EmbeddingsTask, SegmentEmbedding

# 1. Initialize the client
client = TwelveLabs(api_key="<YOUR_API_KEY>")

# 2. Upload a video
task = client.embed.task.create(
    model_name="Marengo-retrieval-2.7",
    video_url="<YOUR_VIDEO_URL>",
    # video_clip_length=5,
    # video_start_offset_sec=30,
    # video_end_offset_sec=60,
    # video_embedding_scopes=["clip", "video"]
)
print(
    f"Created task: id={task.id} model_name={task.model_name} status={task.status}")

# 3. Monitor the status
def on_task_update(task: EmbeddingsTask):
    print(f"  Status={task.status}")

status = task.wait_for_done(sleep_interval=5, callback=on_task_update)
print(f"Embedding done: {status}")

# 4. Retrieve the embeddings
task = task.retrieve(embedding_option=["visual-text", "audio"])

# 5. Process the results
def print_segments(segments: List[SegmentEmbedding], max_elements: int = 5):
    for segment in segments:
        print(
            f"  embedding_scope={segment.embedding_scope} embedding_option={segment.embedding_option} start_offset_sec={segment.start_offset_sec} end_offset_sec={segment.end_offset_sec}"
        )
        print(f"  embeddings: {segment.embeddings_float[:max_elements]}")

if task.video_embedding is not None and task.video_embedding.segments is not None:
    print_segments(task.video_embedding.segments)

Step-by-step guide

Import the SDK and initialize the client

Create a client instance to interact with the TwelveLabs Video Understanding Platform.
Function call: You call the constructor of the TwelveLabs class.
Parameters:

api_key: The API key to authenticate your requests to the platform.

Return value: An object of type TwelveLabs configured for making API calls.
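To avoid hard-coding the key in source files, one option is to read it from an environment variable before constructing the client. The sketch below uses only the standard library; the helper `load_api_key` and the variable name `TWELVELABS_API_KEY` are assumptions chosen for illustration, not names defined by the platform.

```python
import os


def load_api_key(var_name: str = "TWELVELABS_API_KEY") -> str:
    """Read the API key from an environment variable (hypothetical name)."""
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable to your API key")
    return key


# Intended usage, assuming the SDK is installed and the variable is set:
#   client = TwelveLabs(api_key=load_api_key())
```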

Upload a video

To create video embeddings, you must first upload your videos, and the platform must finish processing them.
Function call: You call the embed.task.create function.
Parameters:

model_name: The name of the model you want to use (“Marengo-retrieval-2.7”).
video_url or video_file: The publicly accessible URL or the path of your video file.
Notes
- The platform supports uploading video files that can play without additional user interaction or custom video players. Ensure your URL points to the raw video file, not a web page containing the video. Links to third-party hosting sites, cloud storage services, or videos requiring extra steps to play are not supported.
- YouTube URLs are not currently supported by the Embed API.
(Optional) video_start_offset_sec: The start offset in seconds from the beginning of the video where processing should begin.
(Optional) video_end_offset_sec: The end offset in seconds from the beginning of the video where processing should end.
(Optional) video_clip_length: The desired duration in seconds for each clip for which the platform generates an embedding. It can be between 2 and 10 seconds. Note that the platform automatically truncates video segments shorter than 2 seconds. This truncation only applies to the last segment if it does not meet the minimum length requirement of 2 seconds. Example: for a 31-second video divided into 6-second segments, the final 1-second segment will be truncated.
(Optional) video_embedding_scopes: The scope of the generated embeddings. Valid values are the following:
- ["clip"]: Creates embeddings for multiple clips, as specified by the video_start_offset_sec, video_end_offset_sec, and video_clip_length parameters described above. This is the default value.
- ["clip", "video"]: Creates embeddings for specific video segments and the entire video in a single request.

Return value: An object containing, among other information, a field named id, which represents the unique identifier of your video embedding task. You can use this object to track the status of your video embedding task.
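The interaction between the offsets and the clip length can be illustrated with a small calculation. This sketch only approximates the documented behavior: `clip_windows` is a hypothetical helper, and it assumes that a "truncated" final segment shorter than 2 seconds produces no embedding.

```python
from typing import List, Tuple

MIN_CLIP_SEC = 2  # minimum segment length stated in the parameter description


def clip_windows(
    start_sec: float, end_sec: float, clip_length: float = 6
) -> List[Tuple[float, float]]:
    """List the (start, end) windows that would each receive an embedding."""
    if not 2 <= clip_length <= 10:
        raise ValueError("clip_length must be between 2 and 10 seconds")
    if end_sec <= start_sec:
        raise ValueError("end_sec must be greater than start_sec")
    windows = []
    cursor = start_sec
    while cursor < end_sec:
        window_end = min(cursor + clip_length, end_sec)
        # Assumption: a final window shorter than 2 seconds is dropped.
        if window_end - cursor >= MIN_CLIP_SEC:
            windows.append((cursor, window_end))
        cursor += clip_length
    return windows


# The 31-second example: five 6-second windows, the last second is dropped.
for window in clip_windows(0, 31, 6):
    print(window)
```

With video_start_offset_sec=30, video_end_offset_sec=60, and video_clip_length=5, the same calculation yields six windows.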

Monitor the status

The platform requires some time to process videos. Check the status of the video embedding task until it’s completed.
Function call: You call the embed.task.wait_for_done function.
Parameters:

sleep_interval: The time interval, in seconds, between successive status checks. In this example, the method checks the status every five seconds. Adjust this value to control how frequently the method checks the status.
callback: A callback function that the SDK executes each time it checks the status. In this example, on_task_update is the callback function.
Return value:
An object containing, among other information, a field named status representing the status of your task. Wait until the value of this field is ready.
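Conceptually, waiting for a task is a polling loop: check the status, invoke the callback, sleep, and repeat. The sketch below shows that pattern in isolation and is not the SDK's implementation. `wait_until_ready`, `get_status`, and `max_checks` are hypothetical names, and treating "failed" as a terminal status is an assumption, since only "ready" is described above.

```python
import time
from typing import Callable, Optional

TERMINAL_STATUSES = ("ready", "failed")  # "failed" is an assumed status name


def wait_until_ready(
    get_status: Callable[[], str],
    sleep_interval: float = 5.0,
    callback: Optional[Callable[[str], None]] = None,
    max_checks: Optional[int] = None,
) -> str:
    """Poll get_status until it returns a terminal status, then return it."""
    checks = 0
    while True:
        status = get_status()
        if callback is not None:
            callback(status)  # runs on every check, like on_task_update
        if status in TERMINAL_STATUSES:
            return status
        checks += 1
        if max_checks is not None and checks >= max_checks:
            raise TimeoutError(f"status still {status!r} after {checks} checks")
        time.sleep(sleep_interval)


# Demo with a fake status source instead of a real task.
fake_statuses = iter(["processing", "processing", "ready"])
final = wait_until_ready(
    lambda: next(fake_statuses),
    sleep_interval=0,
    callback=lambda s: print(f"  Status={s}"),
)
print(f"Final status: {final}")
```

A longer interval reduces the number of status requests at the cost of noticing completion later.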

Retrieve the embeddings

Once the platform has finished processing your video, you can retrieve the embeddings.
Function call: You call the embed.task.retrieve function.
Parameters:

(Optional) embedding_option: An array of strings that can contain one or more of the following values:
- visual-text: Returns visual embeddings optimized for text search.
- audio: Returns audio embeddings.
  The default value is embedding_option=["visual-text", "audio"].

Return value: The response contains, among other information, an object named video_embedding that contains the embedding data for your video. This object includes the following fields:

segments: An array of objects, each representing a segment of the video with its embedding data. Each item contains:
- embeddings_float: An array of numbers representing the embedding vector for the segment.
- start_offset_sec: The start time of the segment in seconds.
- end_offset_sec: The end time of the segment in seconds.
- embedding_scope: The scope of the embedding.
- embedding_option: The type of the embedding.
metadata: An object containing metadata associated with the embedding.
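Because one response can mix scopes and embedding options, it may help to group the segments before using them. The sketch below uses a stand-in `Segment` dataclass that mirrors only the fields listed above, so it runs without the SDK; `group_segments` is a hypothetical helper, and the sample values are made up.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Segment:
    """Stand-in for the SDK's segment object, limited to the documented fields."""
    embeddings_float: List[float]
    start_offset_sec: float
    end_offset_sec: float
    embedding_scope: str
    embedding_option: str


def group_segments(segments: List[Segment]) -> Dict[Tuple[str, str], List[Segment]]:
    """Group segments by (scope, option) and sort each group by start time."""
    groups: Dict[Tuple[str, str], List[Segment]] = defaultdict(list)
    for segment in segments:
        groups[(segment.embedding_scope, segment.embedding_option)].append(segment)
    for group in groups.values():
        group.sort(key=lambda s: s.start_offset_sec)
    return dict(groups)


# Made-up sample data with short vectors in place of 1024-dimensional ones.
sample = [
    Segment([0.1, 0.2], 6.0, 12.0, "clip", "visual-text"),
    Segment([0.3, 0.4], 0.0, 6.0, "clip", "visual-text"),
    Segment([0.5, 0.6], 0.0, 6.0, "clip", "audio"),
    Segment([0.7, 0.8], 0.0, 12.0, "video", "visual-text"),
]
grouped = group_segments(sample)
for key, group in grouped.items():
    print(key, [(s.start_offset_sec, s.end_offset_sec) for s in group])
```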

Process the results

This example iterates over the results and prints the key properties and a portion of the embedding vectors for each segment.