Use the TwelveLabs Video Understanding Platform to find specific moments in your video content using natural language queries or reference images. The platform analyzes videos by integrating images, audio, speech, and text, which gives it a deeper understanding than single-modal methods. It captures the relationships between these elements and detects subtle details, so that text and image queries return precise results.

Key features:

Improved accuracy: Multimodal integration enhances accuracy.
Easy interaction: Natural language queries simplify searches.
Advanced search: Enables image-based queries for precise results.
Fewer errors: Multi-faceted analysis reduces misinterpretation.
Time savings: Quickly finds relevant clips without manual review.

Use cases:

Spoken word search: Find video segments where specific words or phrases are spoken.
Visual element search: Locate video segments that match descriptions of visual elements or scenes.
Action or event search: Identify video segments that depict specific actions or events.
Image similarity search: Find video segments that visually resemble a provided image.
Entity search: Locate video segments containing specific people, car models, animal species, or branded objects with improved accuracy (Marengo 3.0 only).

The platform supports the following types of search queries:

Text queries: Search using natural language descriptions of visual elements, actions, sounds, or spoken words.
Image queries: Search using images to find visually similar content in your videos.
Composed queries: Combine text descriptions with images for more precise results (Marengo 3.0 only).

For guidance on choosing the correct type of query for your use case, see Search with text, image, and composed queries.

To understand how your usage is measured and billed, see the Pricing page.

Note

You can search only at the index level. Each request searches within a single index; you cannot search at the video level or across multiple indexes in one request.
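
Because each request targets a single index, covering several indexes means issuing one request per index and combining the results in your own code. The sketch below illustrates that pattern. The `search_many_indexes` helper, the `search_index` callable, and the dictionary result shape are all assumptions made for illustration and are not part of the SDK. Ranks are assigned per request, so the sketch keeps results grouped by index and does not try to merge rankings.

```python
from typing import Callable, Dict, Iterable, List


def search_many_indexes(
    index_ids: Iterable[str],
    search_index: Callable[[str], List[dict]],
) -> Dict[str, List[dict]]:
    """Run one search per index and return the results keyed by index ID.

    `search_index` is a hypothetical callable that performs a single-index
    search (for example, a thin wrapper around client.search.query) and
    returns a list of result dictionaries.
    """
    results: Dict[str, List[dict]] = {}
    for index_id in index_ids:
        # One request per index: a single request cannot span indexes.
        results[index_id] = list(search_index(index_id))
    return results


# Stand-in search function so the sketch runs without network access.
def fake_search(index_id: str) -> List[dict]:
    return [{"video_id": f"{index_id}-video", "rank": 1}]


combined = search_many_indexes(["index_a", "index_b"], fake_search)
```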

Prerequisites

To use the platform, you need an API key:

1. If you don’t have an account, sign up for a free account.
2. Go to the API Keys page.
3. Select the Copy icon next to your key.

Install the TwelveLabs SDK for Python by entering the following command:

$ pip install twelvelabs

Your video files must meet the format requirements.
If you wish to use images as queries, ensure that your image files meet the format requirements.

Complete examples

These complete examples show how to create an index, upload a video, and perform search requests using text, image, and composed queries. Replace the placeholders surrounded by <> with your values.

Text queries

from twelvelabs import TwelveLabs
from twelvelabs.indexes import IndexesCreateRequestModelsItem
from twelvelabs.tasks import TasksRetrieveResponse

# 1. Initialize the client
client = TwelveLabs(api_key="<YOUR_API_KEY>")

# 2. Create an index
# An index is a container for organizing your video content
index = client.indexes.create(
    index_name="<YOUR_INDEX_NAME>",
    models=[
        IndexesCreateRequestModelsItem(
            model_name="marengo3.0",
            model_options=["visual", "audio"]
        )
    ]
)
print(f"Created index: id={index.id}")

# 3. Upload a video
task = client.tasks.create(
    index_id=index.id,
    video_url="<YOUR_VIDEO_URL>",
    # Or for a local file: video_file=open("<PATH_TO_VIDEO_FILE>", "rb")
)
print(f"Created task: id={task.id}")

# 4. Monitor the indexing process
def on_task_update(task: TasksRetrieveResponse):
    print(f"  Status={task.status}")

task = client.tasks.wait_for_done(sleep_interval=5, task_id=task.id, callback=on_task_update)
if task.status != "ready":
    raise RuntimeError(f"Indexing failed with status {task.status}")
print(
    f"Upload complete. The unique identifier of your video is {task.video_id}.")

# 5. Perform a search request
search_pager = client.search.query(
    index_id=index.id,
    query_text="<YOUR_QUERY>",
    search_options=["visual", "audio"],
    # operator="or" # Optional: Use "and" to find segments matching all modalities
    # transcription_options=["lexical", "semantic"]  # Optional: Control transcription matching (Marengo 3.0 only, requires "transcription" in search_options)
)

# 6. Process the search results
print("Search results:")
for clip in search_pager:
    print(
        f" video_id {clip.video_id} rank={clip.rank} start={clip.start} end={clip.end}"
    )

Step-by-step guide

Python

Import the SDK and initialize the client

Create a client instance to interact with the TwelveLabs Video Understanding Platform.
Function call: You call the constructor of the TwelveLabs class.
Parameters:

api_key: The API key to authenticate your requests to the platform.

Return value: An object of type TwelveLabs configured for making API calls.
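
A common way to avoid hardcoding the key in source files is to read it from an environment variable before constructing the client. The following is a minimal sketch of that approach. The `load_api_key` helper and the variable name `TWELVE_LABS_API_KEY` are assumptions chosen for this example, not names defined by the SDK.

```python
import os
from typing import Mapping, Optional


def load_api_key(
    env: Optional[Mapping[str, str]] = None,
    var_name: str = "TWELVE_LABS_API_KEY",  # assumed name; pick any you like
) -> str:
    """Return the API key stored in an environment variable."""
    source = os.environ if env is None else env
    api_key = source.get(var_name, "").strip()
    if not api_key:
        raise RuntimeError(f"Set the {var_name} environment variable first.")
    return api_key


# Usage (requires the SDK and a real key):
# client = TwelveLabs(api_key=load_api_key())
```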

Create an index

Indexes store and organize your video data, allowing you to group related videos. Create one before uploading videos.
Function call: You call the indexes.create function.
Parameters:

index_name: The name of the index.
models: An array specifying your model configuration.

See the Indexes page for more details on creating an index and specifying the model configuration.

Return value: An object containing, among other information, a field named id representing the unique identifier of the newly created index.

Upload videos

To perform any downstream tasks, you must first upload your videos, and the platform must finish indexing them.
Function call: You call the tasks.create function.
Parameters:

index_id: The unique identifier of your index.
video_url or video_file:
- video_url: The publicly accessible URL of your video file (string)
- video_file: An opened file object in binary read mode. Use open(path, 'rb') to open your local file

Return value: An object of type TasksCreateResponse that you can use to track the status of your video upload and indexing process. This object contains, among other information, the following fields:

id: The unique identifier of your video indexing task.
video_id: The unique identifier of your video.
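
If your code accepts either a URL or a local path, a small helper can choose between video_url and video_file for you. The sketch below is one possible way to do this. The `video_source_kwargs` helper is hypothetical, and it assumes that any source without an http or https scheme is a local file path.

```python
from typing import Any, Dict
from urllib.parse import urlparse


def video_source_kwargs(source: str) -> Dict[str, Any]:
    """Map a URL or local path to the matching tasks.create keyword argument."""
    scheme = urlparse(source).scheme.lower()
    if scheme in ("http", "https"):
        return {"video_url": source}
    # Anything else is treated as a local path and opened in binary read mode.
    # The caller is responsible for closing the file after the upload.
    return {"video_file": open(source, "rb")}


# Usage (requires the SDK, a client, and an index):
# task = client.tasks.create(index_id=index.id, **video_source_kwargs("<YOUR_VIDEO_URL>"))
```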

Monitor the indexing process

The platform requires some time to index videos. Check the status of the video indexing task until it’s completed.
Function call: You call the tasks.wait_for_done function.
Parameters:

sleep_interval: The time interval, in seconds, between successive status checks. In this example, the method checks the status every five seconds.
task_id: The unique identifier of your video indexing task.
callback: A callback function that the SDK executes each time it checks the status.

Return value: An object of type TasksRetrieveResponse containing, among other information, a field named status representing the status of your task. Wait until the value of this field is ready.
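
To make the roles of sleep_interval and callback concrete, the sketch below shows a generic polling loop of the kind such a method typically wraps. It is not the SDK's implementation. The `wait_until_done` function, the injected `get_status` and `sleep` callables, the "failed" terminal status, and the intermediate status names in the demo are assumptions for illustration; only the ready status comes from this guide.

```python
import time
from typing import Callable, Optional, Tuple


def wait_until_done(
    get_status: Callable[[], str],
    sleep_interval: float = 5.0,
    on_update: Optional[Callable[[str], None]] = None,
    terminal_statuses: Tuple[str, ...] = ("ready", "failed"),  # "failed" is assumed
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status() until it returns a terminal status, then return it."""
    while True:
        status = get_status()
        if on_update is not None:
            on_update(status)  # invoked on every status check
        if status in terminal_statuses:
            return status
        sleep(sleep_interval)


# Simulated task that becomes ready on the third check.
statuses = iter(["validating", "indexing", "ready"])
seen = []
final_status = wait_until_done(
    get_status=lambda: next(statuses),
    on_update=seen.append,
    sleep=lambda seconds: None,  # skip real waiting in this demo
)
```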

Perform a search request

Perform a search within your index using a text or image query or a combination of both.

Text queries

Function call: You call the search.query method.
Parameters:

index_id: The unique identifier of the index.
query_text: Your search query. Note that the platform supports full natural language-based search. The maximum query length varies by model. Marengo 3.0 supports up to 500 tokens per query, while Marengo 2.7 supports up to 77 tokens per query.
search_options: The modalities the platform uses when performing a search. This example searches using visual and audio cues. For details, see the Search options section.
(Optional) operator: Combines multiple search options using or (default) or and. Use and to find segments matching all search options. Use or to find segments matching any search option.
(Optional) transcription_options: Specifies how the platform matches your query against spoken words. This parameter applies only when transcription is included in search_options and when Marengo 3.0 is enabled for your index. Available options are lexical, semantic, or both (default). For details, see the Transcription options section.
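
The rules above can be checked on the client side before a request is sent. The sketch below assembles the keyword arguments for a text search and raises an error when the optional parameters are inconsistent. The `build_search_kwargs` helper is hypothetical, and the platform may apply additional validation that this sketch does not cover.

```python
from typing import Any, Dict, List, Optional


def build_search_kwargs(
    index_id: str,
    query_text: str,
    search_options: List[str],
    operator: Optional[str] = None,
    transcription_options: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Assemble keyword arguments for a text search, checking the rules above."""
    if operator is not None and operator not in ("or", "and"):
        raise ValueError('operator must be "or" or "and"')
    if transcription_options is not None:
        if "transcription" not in search_options:
            raise ValueError(
                'transcription_options requires "transcription" in search_options'
            )
        unknown = set(transcription_options) - {"lexical", "semantic"}
        if unknown:
            raise ValueError(f"unknown transcription options: {sorted(unknown)}")

    kwargs: Dict[str, Any] = {
        "index_id": index_id,
        "query_text": query_text,
        "search_options": search_options,
    }
    # Optional parameters are only included when set, so defaults still apply.
    if operator is not None:
        kwargs["operator"] = operator
    if transcription_options is not None:
        kwargs["transcription_options"] = transcription_options
    return kwargs


# Usage (requires the SDK and a client):
# search_pager = client.search.query(**build_search_kwargs(index.id, "<YOUR_QUERY>", ["visual", "audio"]))
```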

Return value: An object of type SyncPager[SearchItem] that can be iterated to access search results. Each item contains the following fields, among other information:

video_id: The unique identifier of the video that matched your search terms.
start: The start time of the matching video clip, expressed in seconds.
end: The end time of the matching video clip, expressed in seconds.
rank: The relevance ranking assigned by the model. Lower numbers indicate higher relevance, starting with 1 for the most relevant result. Only Marengo 3.0 and newer versions return this field; earlier versions return score and confidence instead.
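
Code that must handle results from both newer and older model versions can check which relevance fields are present. The sketch below shows one way to do that. The `describe_relevance` helper is hypothetical, and the `SimpleNamespace` objects, including their score and confidence values, are stand-ins invented for the demo and are not real search results.

```python
from types import SimpleNamespace
from typing import Any


def describe_relevance(item: Any) -> str:
    """Return a short relevance label that works for either result shape."""
    rank = getattr(item, "rank", None)
    if rank is not None:
        return f"rank={rank}"
    # Fall back to the fields returned by earlier model versions.
    score = getattr(item, "score", None)
    confidence = getattr(item, "confidence", None)
    return f"score={score} confidence={confidence}"


# Stand-in objects that mimic the two result shapes described above.
newer_item = SimpleNamespace(video_id="v1", start=0.0, end=4.5, rank=1)
older_item = SimpleNamespace(
    video_id="v2", start=10.0, end=12.0, score=83.2, confidence="high"
)
```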

Process the search results

This example iterates over the results with a for loop and prints the video ID, rank, start time, and end time of each matching clip to standard output.
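
Because start and end are expressed in seconds, you may want to convert them to a clock-style format before displaying them. The sketch below is one way to do that. The `format_timestamp` and `format_clip` helpers are hypothetical, and the sample object is a stand-in with made-up values, not a real search result.

```python
from types import SimpleNamespace


def format_timestamp(seconds: float) -> str:
    """Convert a time in seconds to HH:MM:SS, dropping fractional seconds."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def format_clip(clip) -> str:
    """Build a one-line summary from the video_id, rank, start, and end fields."""
    return (
        f"#{clip.rank} video_id={clip.video_id} "
        f"{format_timestamp(clip.start)}-{format_timestamp(clip.end)}"
    )


# Stand-in for one search result; real items come from the search pager.
sample = SimpleNamespace(video_id="abc123", rank=1, start=75.4, end=3671.9)
line = format_clip(sample)
```

With a real pager, the same helper could be applied inside the loop, for example print(format_clip(clip)).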

Next steps

Learn more about searching with text, image, and composed queries for best practices and advanced techniques.
Explore entity search to find specific people in your videos.
Learn query engineering techniques to refine your search queries.
Use sorting to organize your search results.
Apply grouping to cluster search results from the same video together.
Implement filtering to narrow down results based on specific criteria.