Create embeddings

This quickstart guide provides a simplified introduction to creating embeddings using the TwelveLabs Video Understanding Platform. It includes:

  • A basic working example
  • Minimal implementation details
  • Core parameters for common use cases

For a comprehensive guide, see the Create embeddings section.

Prerequisites

  • To use the platform, you need an API key:

    1. If you don’t have an account, sign up for a free account.
    2. Go to the API Key page.
    3. Select the Copy icon next to your key.

  • Ensure the TwelveLabs SDK is installed on your computer:

    $ pip install twelvelabs
  • The videos you wish to use must meet the following requirements:

    • Video resolution: Must be at least 360x360 and must not exceed 3840x2160.
    • Aspect ratio: Must be one of 1:1, 4:3, 4:5, 5:4, 16:9, or 9:16.
    • Video and audio formats: Your video files must be encoded in the video and audio formats listed on the FFmpeg Formats Documentation page. For videos in other formats, contact us at support@twelvelabs.io.
    • Duration: Must be between 4 seconds and 2 hours (7,200s).
    • File size: Must not exceed 2 GB.
      If you require different options, contact us at support@twelvelabs.io.
  • The audio files you wish to use must meet the following requirements:

    • Format: WAV (uncompressed), MP3 (lossy), and FLAC (lossless).
    • File size: Must not exceed 10 MB.
  • The images you wish to use must meet the following requirements:

    • Format: JPEG and PNG.
    • Dimensions: Must be at least 128 x 128 pixels.
    • File size: Must not exceed 5 MB.

Starter code

You can copy and paste the code below to create embeddings. Replace the placeholders surrounded by <> with your values.

from typing import List

from twelvelabs import TwelveLabs
from twelvelabs.models.embed import EmbeddingsTask, SegmentEmbedding

client = TwelveLabs(api_key="<YOUR_API_KEY>")

# Create a video embedding task
task = client.embed.task.create(model_name="Marengo-retrieval-2.7", video_url="<YOUR_VIDEO_URL>")
print(f"Created task: id={task.id} model_name={task.model_name} status={task.status}")

# Monitor the status of the task until processing is complete
def on_task_update(task: EmbeddingsTask):
    print(f"  Status={task.status}")

status = task.wait_for_done(sleep_interval=2, callback=on_task_update)
print(f"Embedding done: {status}")

# Retrieve the visual-text and audio embeddings
task = task.retrieve(embedding_option=["visual-text", "audio"])

# Print the key properties and the first few values of each segment's embedding vector
def print_segments(segments: List[SegmentEmbedding], max_elements: int = 5):
    for segment in segments:
        print(
            f"  embedding_scope={segment.embedding_scope}"
            f" embedding_option={segment.embedding_option}"
            f" start_offset_sec={segment.start_offset_sec}"
            f" end_offset_sec={segment.end_offset_sec}"
        )
        print(f"  embeddings: {', '.join(str(x) for x in segment.embeddings_float[:max_elements])}")

if task.video_embedding is not None and task.video_embedding.segments is not None:
    print_segments(task.video_embedding.segments)

Step-by-step guide

1. Import the SDK and initialize the client

Create a client instance to interact with the TwelveLabs Video Understanding platform.
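
For example, as in the starter code above:

from twelvelabs import TwelveLabs

client = TwelveLabs(api_key="<YOUR_API_KEY>")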

2. Upload videos

To perform any downstream tasks, you must first upload your videos, and the platform must finish processing them.
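
With the Python SDK, the upload happens when you create a video embedding task that points at your video, as in the starter code:

task = client.embed.task.create(model_name="Marengo-retrieval-2.7", video_url="<YOUR_VIDEO_URL>")
print(f"Created task: id={task.id} model_name={task.model_name} status={task.status}")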

3. Monitor the status

The platform requires some time to process videos. Check the status of the video embedding task until it’s completed.
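
The SDK's wait_for_done method polls the task at the given interval and invokes an optional callback on each update, as in the starter code:

def on_task_update(task: EmbeddingsTask):
    print(f"  Status={task.status}")

status = task.wait_for_done(sleep_interval=2, callback=on_task_update)
print(f"Embedding done: {status}")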

4. Retrieve the embeddings

Once the platform has finished processing your video, you can retrieve the embeddings.
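
The embedding_option parameter specifies which embeddings to return; the starter code requests the visual-text and audio embeddings:

task = task.retrieve(embedding_option=["visual-text", "audio"])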

5. Process the results

This example iterates over the results and prints the key properties and a portion of the embedding vectors for each segment to the standard output.
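
In the starter code, this is done by the print_segments helper:

def print_segments(segments: List[SegmentEmbedding], max_elements: int = 5):
    for segment in segments:
        # Key properties of each segment
        print(
            f"  embedding_scope={segment.embedding_scope}"
            f" embedding_option={segment.embedding_option}"
            f" start_offset_sec={segment.start_offset_sec}"
            f" end_offset_sec={segment.end_offset_sec}"
        )
        # First max_elements values of the embedding vector
        print(f"  embeddings: {', '.join(str(x) for x in segment.embeddings_float[:max_elements])}")

if task.video_embedding is not None and task.video_embedding.segments is not None:
    print_segments(task.video_embedding.segments)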