Create video embeddings

Note

The private beta version only processes visual information.

To create video embeddings, you must first upload your videos, and the platform must finish processing them. Uploading and processing videos take some time. Consequently, creating embeddings is an asynchronous process consisting of three steps:

  1. Upload and process a video: When you start uploading a video, the platform creates a video embedding task and returns its unique task identifier.
  2. Monitor the status of your video embedding task: Use the unique identifier of your task to check its status periodically until it's completed.
  3. Retrieve the embeddings: After the video embedding task is completed, retrieve the video embeddings by providing the task identifier.

The platform allows the creation of a single embedding for the entire video and multiple embeddings for specific segments. The default behavior is to create multiple embeddings, each 6 seconds long, for each video. You can modify the default behavior as follows:

  • Embedding scope: The optional video_embedding_scopes parameter determines the scope of the generated embeddings. It is an array of strings and it can have one or both of the following values:

    • video: Use this value to create an embedding of the entire video.
    • clip: Use this value to create embeddings for multiple clips, as specified by the video_start_offset_sec, video_end_offset_sec, and video_clip_length parameters described below.
      If you include both the video and clip values, the platform creates embeddings for specific video segments and the entire video in a single request.
  • Embedding settings: The following optional parameters customize the timing and length of the embeddings:

    • video_start_offset_sec: Specifies the start offset in seconds from the beginning of the video where processing should begin.
    • video_end_offset_sec: Specifies the end offset in seconds from the beginning of the video where processing should end.
    • video_clip_length: Specifies the desired duration in seconds for each clip for which the platform generates an embedding.

    See the Customize your embeddings section for examples of using these parameters.

Note that the platform automatically truncates video segments shorter than 2 seconds. For a 31-second video divided into 6-second segments, the final 1-second segment will be truncated. This truncation only applies to the last segment if it does not meet the minimum length requirement of 2 seconds.
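The segmentation arithmetic described above can be sketched in a few lines of Python. This is only an illustration of the rules stated in this section, not the platform's implementation; the function name clip_boundaries and its arguments are hypothetical, and the real service may round or align clips differently.

```python
def clip_boundaries(duration_sec, clip_length=6.0, start_offset_sec=0.0,
                    end_offset_sec=None, min_clip_sec=2.0):
    """Return (start, end) pairs for the clips a video would be split into.

    Hypothetical helper: it mirrors the documented defaults (6-second clips,
    final clip dropped when shorter than 2 seconds) and nothing more.
    """
    end = duration_sec if end_offset_sec is None else min(end_offset_sec, duration_sec)
    clips = []
    index = 0
    while True:
        clip_start = start_offset_sec + index * clip_length
        if clip_start >= end:
            break
        clip_end = min(clip_start + clip_length, end)
        # Only the last clip can come up short; skip it if it is under the minimum.
        if clip_end - clip_start >= min_clip_sec:
            clips.append((clip_start, clip_end))
        index += 1
    return clips


# A 31-second video yields five 6-second clips; the trailing 1-second clip is dropped.
print(clip_boundaries(31.0))
```

Calling clip_boundaries(160.52) ends with the pair (156.0, 160.52), which matches the last clip in the sample responses later on this page.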

You can interact with the platform using one of the available SDKs or an HTTP client like requests or axios. This guide demonstrates how to use the SDKs, the recommended approach for most scenarios. If you need to make direct HTTP requests, refer to the API Reference > Video embeddings section for details.

Prerequisites

  • You’re familiar with the concepts that are described on the Platform overview page.
  • You have an API key. To retrieve your API key, navigate to the API Key page and log in with your credentials. Then, select the Copy icon to the right of your API key to copy it to your clipboard.
  • The videos for which you wish to generate embeddings must meet the following requirements:
    • Duration: Must be between 4 seconds and 2 hours (7,200s).
    • File size: Must not exceed 2 GB.
    • Video resolution: Must be at least 360p and at most 4K.
    • Video and audio formats: The video files must be encoded in the video and audio formats listed on the FFmpeg Formats Documentation page. For videos in other formats, contact us at [email protected].
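If you want to catch unsuitable files before uploading, the limits above can be checked locally. The sketch below is a hypothetical pre-flight check, not part of the SDK. It assumes that you already know the duration, file size, and frame height of the video (for example, from a tool such as ffprobe), that "2 GB" means 2 × 1024³ bytes, and that "360p" and "4K" correspond to frame heights of 360 and 2160 pixels. These interpretations are assumptions; the platform may measure the limits differently.

```python
MIN_DURATION_SEC = 4
MAX_DURATION_SEC = 7_200
MAX_FILE_SIZE_BYTES = 2 * 1024**3  # assumption: "2 GB" read as 2 GiB
MIN_HEIGHT_PX = 360                # assumption: "360p" read as 360 pixels high
MAX_HEIGHT_PX = 2_160              # assumption: "4K" read as 2160 pixels high


def requirement_problems(duration_sec, size_bytes, height_px):
    """Return a list of human-readable problems; an empty list means the video passes."""
    problems = []
    if not MIN_DURATION_SEC <= duration_sec <= MAX_DURATION_SEC:
        problems.append(f"duration {duration_sec}s is outside 4s to 7,200s")
    if size_bytes > MAX_FILE_SIZE_BYTES:
        problems.append(f"file size {size_bytes} bytes exceeds 2 GB")
    if not MIN_HEIGHT_PX <= height_px <= MAX_HEIGHT_PX:
        problems.append(f"height {height_px}px is outside 360p to 4K")
    return problems


# A 160.52-second, 50 MB, 720p video passes all three checks.
print(requirement_problems(160.52, 50_000_000, 720))
```

The check does not cover the format requirement, which depends on the codecs that FFmpeg supports.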

Procedure

Follow the steps in the sections below to create video embeddings.

1. Upload and process a video

Use the create method of the embed.task object to create a video embedding task. This method processes a single video and generates embeddings based on the specified parameters. It takes the following parameters:

  • The name of the video understanding engine to be used. The examples in this section use "Marengo-retrieval-2.6".
  • The video you want to upload, provided as a local file or a publicly accessible URL.
  • (Optional) Any additional parameters for customizing the timing and length of your embeddings. See the Customize your embeddings section below for details.

Upload a video from a publicly accessible URL

The following example code uploads a video from a publicly accessible URL. Ensure you replace the placeholders surrounded by <> with your values.

from twelvelabs import TwelveLabs
from twelvelabs.models.embed import EmbeddingsTask

client = TwelveLabs(api_key="<YOUR_API_KEY>")

task = client.embed.task.create(
    engine_name="Marengo-retrieval-2.6",
    video_url="<YOUR_VIDEO_URL>", # Example: https://sample-videos.com/video321/mp4/720/big_buck_bunny_720p_2mb.mp4
)
print(
    f"Created task: id={task.id} engine_name={task.engine_name} status={task.status}"
)

import { TwelveLabs, EmbeddingsTask } from 'twelvelabs-js';

const client = new TwelveLabs({ apiKey: '<YOUR_API_KEY>' });

let task = await client.embed.task.create(
  'Marengo-retrieval-2.6',
  {
    url: '<YOUR_VIDEO_URL>' // Example: https://sample-videos.com/video321/mp4/720/big_buck_bunny_720p_2mb.mp4
  }
);
console.log(`Created task: id=${task.id} engineName=${task.engineName} status=${task.status}`);

The output should look similar to the following one:

Created task: id=6659784ff24ade84c6f50e8f engine_name=Marengo-retrieval-2.6 status=processing

Note that the response contains a field named id, which represents the unique identifier of your video embedding task. For a description of each field in the request and response, see the API Reference > Create a video embedding task page.

Upload a video from the local file system

To upload a video from the local file system, provide the video_file parameter, as shown below:

task = client.embed.task.create(
    engine_name="Marengo-retrieval-2.6",
    video_file="<YOUR_VIDEO_PATH>"
)

import { TwelveLabs, EmbeddingsTask } from 'twelvelabs-js';

let task = await client.embed.task.create(
  'Marengo-retrieval-2.6',
  {
    file: '<YOUR_VIDEO_PATH>'
  }
);
console.log(`Created task: id=${task.id} engineName=${task.engineName} status=${task.status}`);

2. Monitor the status of your video embedding task

The Twelve Labs Video Understanding Platform requires some time to process videos. You can retrieve the video embeddings only after the processing is complete. To monitor the status of your video embedding task, call the wait_for_done method of the task object with the following parameters:

  • sleep_interval: A number specifying the time interval, in seconds, between successive status checks. In this example, the method checks the status every two seconds. Adjust this value to control how frequently the method checks the status.

  • callback: A callback function that the SDK executes each time it checks the status. In this example, on_task_update is the callback function. Note that the callback function takes a parameter of type EmbeddingsTask. Use this parameter to display the status of your video processing task.

    def on_task_update(task: EmbeddingsTask):
        print(f"  Status={task.status}")
    
    status = task.wait_for_done(
        sleep_interval=2,
        callback=on_task_update
    )
    print(f"Embedding done: {status}")

    const status = await task.waitForDone(2000, (task: EmbeddingsTask) => {
      console.log(`  Status=${task.status}`);
    });
    console.log(`Embedding done: ${status}`);
    

The output should look similar to the following one:

  Status=processing
  Status=processing
  Status=ready

After a video has been successfully uploaded and processed, the task object contains, among other information, a field named id, representing the unique identifier of your video embedding task. For a description of each field in the response, see the API Reference > Retrieve the status of a video embedding task page.
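To make the polling pattern concrete, here is a minimal, SDK-independent sketch of what a wait loop of this kind does: check the status, report it through the callback, and sleep between checks. It is not the SDK's implementation. The function wait_until_done and its get_status and sleep arguments are hypothetical, and treating "failed" as a terminal status alongside "ready" is an assumption.

```python
import time


def wait_until_done(get_status, sleep_interval=2.0, callback=None,
                    done_statuses=("ready", "failed"), sleep=time.sleep):
    """Poll get_status() until it returns a terminal status, then return it.

    done_statuses is an assumption: only "ready" appears in the sample output.
    """
    while True:
        status = get_status()
        if callback is not None:
            callback(status)  # invoked on every check, including the last one
        if status in done_statuses:
            return status
        sleep(sleep_interval)


# Demo with canned statuses and a no-op sleep so the example finishes instantly.
fake_statuses = iter(["processing", "processing", "ready"])
seen = []
final_status = wait_until_done(
    lambda: next(fake_statuses),
    sleep_interval=2,
    callback=seen.append,
    sleep=lambda seconds: None,
)
print(seen, final_status)
```

Injecting the sleep function keeps the loop easy to test. In real use you would leave the default time.sleep and pass a get_status function that queries the task.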

3. Retrieve the embeddings

Once the platform has finished processing your video, you can retrieve the embeddings by invoking the retrieve method of the embed.task object with the unique identifier of your video embedding task as a parameter:

task = client.embed.task.retrieve(task.id)
if task.video_embeddings is not None:
    for v in task.video_embeddings:
        print(
            f"embedding_scope={v.embedding_scope} start_offset_sec={v.start_offset_sec} end_offset_sec={v.end_offset_sec}"
        )
        print(f"embeddings: {', '.join([str(x) for x in v.embedding.float])}")

task = await client.embed.task.retrieve(task.id);
if (task.videoEmbeddings) {
  for (const v of task.videoEmbeddings) {
    console.log(
      `embeddingScope=${v.embeddingScope} startOffsetSec=${v.startOffsetSec} endOffsetSec=${v.endOffsetSec}`,
    );
    console.log(`embeddings: ${v.embedding.float.join(', ')}`);
  }
}

Note the following about the response:

  • When you use the default behavior of the platform, and no additional parameters are specified, the response should look similar to the following one:

    embedding_scope=clip start_offset_sec=0.0 end_offset_sec=6.0
    embeddings: -0.06261667, -0.012716668, 0.024836386
    ...
    embedding_scope=clip start_offset_sec=6.0 end_offset_sec=12.0
    embeddings: -0.050863456, -0.014198959, 0.038503144
    ...
    embedding_scope=clip start_offset_sec=156.0 end_offset_sec=160.52
    embeddings: -0.00094736926, -0.010648306, 0.054438476
    ...
    

    In this example response, each object of the video_embeddings array corresponds to a segment and includes the following fields:

    • start_offset_sec: Start time of the segment.
    • end_offset_sec: End time of the segment.
    • embedding_scope: The value of the embedding_scope field is set to clip. This specifies that the embedding is for a clip.
    • embedding: An array of floats that represents the embedding.
  • When you create a single embedding for the entire video by setting the value of the video_embedding_scopes parameter to ["video"], the response should look similar to the following one:

    embedding_scope=video start_offset_sec=0.0 end_offset_sec=160.52
    embeddings: -0.023929736, -0.012013472, 0.043946236
    ...
    

    Note the following about this example response:

    • The video_embeddings array contains a single embedding that corresponds to the entire video.
    • The value of the embedding_scope field is set to video. This specifies that the embedding is for the entire video.
  • When you create embeddings for specific video clips and the entire video simultaneously by setting the value of the video_embedding_scopes parameter to ["clip", "video"], the response should look similar to the following one:

    embedding_scope=clip start_offset_sec=0.0 end_offset_sec=6.0
    embeddings: -0.06261667, -0.012716668, 0.024836386
    ...
    embedding_scope=clip start_offset_sec=6.0 end_offset_sec=12.0
    embeddings: -0.050863456, -0.014198959, 0.038503144
    ...
    embedding_scope=clip start_offset_sec=156.0 end_offset_sec=160.52
    embeddings: -0.00094736926, -0.010648306, 0.054438476
    ...
    embedding_scope=video start_offset_sec=0.0 end_offset_sec=160.52
    embeddings: -0.023929736, -0.012013472, 0.043946236
    

    Note the following about this example response:

    • The first three embeddings have the embedding_scope field set to clip. Each corresponds to a specific segment of the video you provided.
    • The fourth embedding has the embedding_scope field set to video. This embedding corresponds to the entire video.

For a description of each field in the request and response, see the API Reference > Retrieve video embeddings page.
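Once you have the response, a common next step is to pick out the embedding you need by scope or by timestamp. The sketch below shows one way to do that using the fields described above. The Segment dataclass is a hypothetical stand-in for the SDK's response objects, used here so the example runs without network access. Its values field holds only the first three numbers from the sample output, and the helper names clip_at and whole_video are illustrative.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Segment:
    """Hypothetical stand-in for one entry of the video_embeddings array."""
    embedding_scope: str
    start_offset_sec: float
    end_offset_sec: float
    values: List[float]


def clip_at(segments: List[Segment], timestamp_sec: float) -> Optional[Segment]:
    """Return the clip-scoped segment that covers timestamp_sec, if any."""
    for segment in segments:
        if (segment.embedding_scope == "clip"
                and segment.start_offset_sec <= timestamp_sec < segment.end_offset_sec):
            return segment
    return None


def whole_video(segments: List[Segment]) -> Optional[Segment]:
    """Return the video-scoped segment, if the response contains one."""
    for segment in segments:
        if segment.embedding_scope == "video":
            return segment
    return None


# Truncated sample data copied from the example responses on this page.
segments = [
    Segment("clip", 0.0, 6.0, [-0.06261667, -0.012716668, 0.024836386]),
    Segment("clip", 6.0, 12.0, [-0.050863456, -0.014198959, 0.038503144]),
    Segment("clip", 156.0, 160.52, [-0.00094736926, -0.010648306, 0.054438476]),
    Segment("video", 0.0, 160.52, [-0.023929736, -0.012013472, 0.043946236]),
]

print(clip_at(segments, 7.5))
print(whole_video(segments))
```

Because the sample data omits the clips between 12 and 156 seconds, clip_at returns None for timestamps in that range.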

Customize your embeddings

This section provides examples of how you can customize the timing and length of your embeddings.

  • To split a video into multiple 6-second segments and create an embedding for each, do not provide the video_embedding_scopes, video_start_offset_sec, video_end_offset_sec, or video_clip_length parameters:

    task = client.embed.task.create(
        engine_name="Marengo-retrieval-2.6",
        video_url="<YOUR_VIDEO_URL>",
    )
    
    let task = await client.embed.task.create(
      'Marengo-retrieval-2.6',
      {
        url: '<YOUR_VIDEO_URL>'
      }
    );
    
  • To split the video into multiple 5-second segments and create an embedding for each:

    task = client.embed.task.create(
      engine_name="Marengo-retrieval-2.6",
      video_url="<YOUR_VIDEO_URL>",
      video_clip_length=5
    )
    
    let task = await client.embed.task.create(
      'Marengo-retrieval-2.6',
      {
        url: '<YOUR_VIDEO_URL>',
        clipLength: 5
      }
    );
    
  • To split a video into multiple 5-second segments from the 30-second mark to the 60-second mark and create an embedding for each:

    task = client.embed.task.create(
      engine_name="Marengo-retrieval-2.6",
      video_url="<YOUR_VIDEO_URL>",
      video_clip_length=5,
      video_start_offset_sec=30,
      video_end_offset_sec=60,
    )
    
    let task = await client.embed.task.create(
      'Marengo-retrieval-2.6',
      {
        url: '<YOUR_VIDEO_URL>',
        clipLength: 5,
        startOffsetSec: 30,
        endOffsetSec: 60
      }
    );
    
  • To create a single embedding for the entire video, set the value of the video_embedding_scopes parameter to ["video"]:

    task = client.embed.task.create(
      engine_name="Marengo-retrieval-2.6",
      video_url="<YOUR_VIDEO_URL>",
      video_embedding_scopes=["video"]
    )
    
    let task = await client.embed.task.create(
      'Marengo-retrieval-2.6',
      {
      	url: '<YOUR_VIDEO_URL>',
        scopes: ['video']
      }
    );
    
  • To create a single embedding for a video segment from the 2-second mark to the 12-second mark:

    task = client.embed.task.create(
        engine_name="Marengo-retrieval-2.6",
        video_url="<YOUR_VIDEO_URL>",
        video_start_offset_sec=2,
        video_end_offset_sec=12,
        video_embedding_scopes=["video"]
    )
    
    let task = await client.embed.task.create(
      'Marengo-retrieval-2.6',
      {
        url: '<YOUR_VIDEO_URL>',
        startOffsetSec: 2,
        endOffsetSec: 12,
        scopes: ['video']
      }
    );
    
  • To split the video into multiple 6-second segments and create embeddings for each segment as well as the entire video, set the value of the video_embedding_scopes parameter to ["clip", "video"]:

    task = client.embed.task.create(
      engine_name="Marengo-retrieval-2.6",
      video_url="<YOUR_VIDEO_URL>",
      video_embedding_scopes=["clip", "video"]
    )
    
    let task = await client.embed.task.create(
      'Marengo-retrieval-2.6',
      {
        url: '<YOUR_VIDEO_URL>',
        scopes: ['clip', 'video']
      }
    );
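If you build these requests programmatically, it can help to assemble the optional parameters in one place and omit any that are not set, so that the platform's defaults apply. The following sketch shows one possible approach. The helper embedding_options is hypothetical and not part of the SDK, and its validation rules (non-negative start, end after start, positive clip length) are common-sense assumptions rather than documented limits.

```python
ALLOWED_SCOPES = {"clip", "video"}


def embedding_options(scopes=None, start_offset_sec=None,
                      end_offset_sec=None, clip_length=None):
    """Build keyword arguments for the optional parameters, skipping unset ones."""
    options = {}
    if scopes is not None:
        scopes = list(scopes)
        if not scopes or not set(scopes) <= ALLOWED_SCOPES:
            raise ValueError("scopes must contain 'clip', 'video', or both")
        options["video_embedding_scopes"] = scopes
    if start_offset_sec is not None:
        if start_offset_sec < 0:
            raise ValueError("start_offset_sec must not be negative")
        options["video_start_offset_sec"] = start_offset_sec
    if end_offset_sec is not None:
        if start_offset_sec is not None and end_offset_sec <= start_offset_sec:
            raise ValueError("end_offset_sec must be greater than start_offset_sec")
        options["video_end_offset_sec"] = end_offset_sec
    if clip_length is not None:
        if clip_length <= 0:
            raise ValueError("clip_length must be positive")
        options["video_clip_length"] = clip_length
    return options


# Single embedding for the segment from the 2-second mark to the 12-second mark.
options = embedding_options(scopes=["video"], start_offset_sec=2, end_offset_sec=12)
print(options)

# The result can then be unpacked into the create call shown above, for example:
# client.embed.task.create(engine_name="Marengo-retrieval-2.6",
#                          video_url="<YOUR_VIDEO_URL>", **options)
```

Calling embedding_options() with no arguments returns an empty dictionary, which corresponds to the default behavior of 6-second clips.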