Search

The TwelveLabs Video Understanding Platform analyzes videos by integrating images, audio, speech, and text, offering a deeper understanding than single-modal methods. It captures complex relationships between these elements, detects subtle details, and supports natural language queries and images for intuitive and precise use.

Key features:

  • Improved accuracy: Multimodal integration enhances accuracy.
  • Easy interaction: Natural language queries simplify searches.
  • Advanced search: Enables image-based queries for precise results.
  • Fewer errors: Multi-faceted analysis reduces misinterpretation.
  • Time savings: Quickly finds relevant clips without manual review.

Use cases:

  • Spoken word search: Find video segments where specific words or phrases are spoken.
  • Visual element search: Locate video segments that match descriptions of visual elements or scenes.
  • Action or event search: Identify video segments that depict specific actions or events.
  • Image similarity search: Find video segments that visually resemble a provided image.

To understand how your usage is measured and billed, see the Pricing page.

Note

Search operates at the individual index level: each request searches within a single index. You cannot search at the video level or across multiple indexes simultaneously.

Prerequisites

  • To use the platform, you need an API key:

    1. If you don’t have an account, sign up for a free account.
    2. Go to the API Key page.
    3. Select the Copy icon next to your key.

  • Ensure the TwelveLabs SDK is installed on your computer:

    $ pip install twelvelabs
  • The videos you wish to use must meet the following requirements:

    • Video resolution: Must be at least 360x360 and must not exceed 3840x2160.
    • Aspect ratio: Must be one of 1:1, 4:3, 4:5, 5:4, 16:9, or 9:16.
    • Video and audio formats: Your video files must be encoded in the video and audio formats listed on the FFmpeg Formats Documentation page. For videos in other formats, contact us at support@twelvelabs.io.
    • Duration: Must be between 4 seconds and 2 hours (7,200s).
    • File size: Must not exceed 2 GB.
      If you require different options, contact us at support@twelvelabs.io.
  • If you wish to use images as queries, ensure that your images meet the following requirements:

    • Format: JPEG and PNG.
    • Dimension: Must be at least 64 x 64 pixels.
    • Size: Must not exceed 5MB.
    • Object visibility: Ensure that the objects of interest are visible and occupy at least 50% of the image. This helps the platform accurately identify and match them.
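
You can pre-check a video's specs against the limits above before uploading. Below is a minimal sketch in pure Python; the width, height, duration, and size values are assumed to come from an external probe (for example ffprobe), which is not shown here:

```python
from fractions import Fraction

# Thresholds mirror the requirements listed above.
ALLOWED_RATIOS = {Fraction(1, 1), Fraction(4, 3), Fraction(4, 5),
                  Fraction(5, 4), Fraction(16, 9), Fraction(9, 16)}
MAX_SIZE_BYTES = 2 * 1024**3          # 2 GB
MIN_DURATION_S, MAX_DURATION_S = 4, 7200

def check_video_specs(width, height, duration_s, size_bytes):
    """Return a list of violated requirements (empty if the video passes)."""
    problems = []
    if min(width, height) < 360:
        problems.append("resolution below 360x360")
    # Treat the 3840x2160 ceiling orientation-agnostically (assumption),
    # so a 2160x3840 portrait video also passes.
    if max(width, height) > 3840 or min(width, height) > 2160:
        problems.append("resolution above 3840x2160")
    if Fraction(width, height) not in ALLOWED_RATIOS:
        problems.append("unsupported aspect ratio")
    if not MIN_DURATION_S <= duration_s <= MAX_DURATION_S:
        problems.append("duration outside 4-7200 seconds")
    if size_bytes > MAX_SIZE_BYTES:
        problems.append("file larger than 2 GB")
    return problems

print(check_video_specs(1920, 1080, 60, 10_000_000))  # [] (passes)
```

Running the check before upload avoids waiting for an indexing task that is bound to fail validation.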

Complete example

This complete example shows how to create an index, upload a video, and perform search requests using text and image queries. Ensure you replace the placeholders surrounded by <> with your values.

from twelvelabs import TwelveLabs
from twelvelabs.models.task import Task

# 1. Initialize the client
client = TwelveLabs(api_key="<YOUR_API_KEY>")

# 2. Create an index
models = [{"name": "marengo2.7", "options": ["visual", "audio"]}]
index = client.index.create(name="<YOUR_INDEX_NAME>", models=models)
print(f"Index created: id={index.id}, name={index.name}")

# 3. Upload a video
task = client.task.create(index_id=index.id, file="<YOUR_VIDEO_FILE>")
print(f"Task id={task.id}, Video id={task.video_id}")

# 4. Monitor the indexing process
def on_task_update(task: Task):
    print(f"  Status={task.status}")

task.wait_for_done(sleep_interval=5, callback=on_task_update)
if task.status != "ready":
    raise RuntimeError(f"Indexing failed with status {task.status}")
print(f"The unique identifier of your video is {task.video_id}.")

# 5. Perform a search request
search_results = client.search.query(
    index_id=index.id,
    query_text="<YOUR_QUERY>",
    options=["visual", "audio"],
    operator="or",
)

# 6. Process the search results
def print_page(page):
    for clip in page:
        print(
            f"  video_id={clip.video_id} score={clip.score} "
            f"start={clip.start} end={clip.end} confidence={clip.confidence}"
        )

print_page(search_results.data)
while True:
    try:
        print_page(next(search_results))
    except StopIteration:
        break

Step-by-step guide

1. Import the SDK and initialize the client

Create a client instance to interact with the TwelveLabs Video Understanding platform.
Function call: You call the constructor of the TwelveLabs class.
Parameters:

  • api_key: The API key to authenticate your requests to the platform.

Return value: An object of type TwelveLabs configured for making API calls.

2. Specify the index containing your videos

Indexes help you organize and search through related videos efficiently. This example creates a new index, but you can also use an existing index by specifying its unique identifier. See the Indexes page for more details on creating an index.
Function call: You call the index.create function.
Parameters:

  • name: The name of the index.
  • models: An object specifying your model configuration. This example enables the Marengo video understanding model and the visual and audio model options.

Return value: An object containing, among other information, a field named id representing the unique identifier of the newly created index.

3. Upload videos

To perform any downstream tasks, you must first upload your videos, and the platform must finish processing them.
Function call: You call the task.create function. This starts a video indexing task, which is an object of type Task that tracks the status of your video upload and indexing process.
Parameters:

  • index_id: The unique identifier of your index.
  • file or url: The path or the publicly accessible URL of your video file.

Return value: An object of type Task containing, among other information, the following fields:

  • video_id: The unique identifier of your video.
  • status: The status of your video indexing task.

Note

You can also upload multiple videos in a single API call. For details, see the Cloud-to-cloud integrations page.

4. Monitor the indexing process

The platform requires some time to index videos. Check the status of the video indexing task until it’s completed.
Function call: You call the task.wait_for_done function.
Parameters:

  • sleep_interval: The time interval, in seconds, between successive status checks. In this example, the method checks the status every five seconds.
  • callback: A callback function that the SDK executes each time it checks the status. Note that the callback function takes a parameter of type Task representing the video indexing task you’ve created in the previous step. Use it to display the status of your video indexing task.

Return value: An object containing, among other information, a field named status representing the status of your task. Wait until the value of this field is ready.
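
Conceptually, this step is a poll-and-sleep loop that invokes your callback on every check. The sketch below is not the SDK's implementation, only an illustration of the pattern; StubTask and its refresh method are made-up stand-ins so the example runs without network calls:

```python
import time

def wait_until_done(task, sleep_interval=5.0, callback=None):
    """Poll a task until it reaches a terminal status, calling callback on each check."""
    while task.status not in ("ready", "failed"):
        if callback:
            callback(task)
        time.sleep(sleep_interval)
        task.refresh()  # stand-in for re-fetching the status from the API
    return task

class StubTask:
    """Fake task that steps through a predefined sequence of statuses."""
    def __init__(self, statuses):
        self._statuses = iter(statuses)
        self.status = next(self._statuses)
    def refresh(self):
        self.status = next(self._statuses, self.status)

seen = []
task = StubTask(["indexing", "indexing", "ready"])
wait_until_done(task, sleep_interval=0, callback=lambda t: seen.append(t.status))
print(seen, task.status)  # ['indexing', 'indexing'] ready
```

In real code, checking for a "failed" terminal status (as the complete example does after waiting) prevents an infinite loop when indexing cannot succeed.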

5. Perform a search request

Perform a search within your index using a text or image query.

Function call: You call the search.query method.
Parameters:

  • index_id: The unique identifier of the index.
  • query_text: Your search query. Note that the platform supports full natural language-based search.
  • options: The modalities the platform uses when performing a search. This example searches using both visual and audio cues.
  • (Optional) operator: The logical operator that specifies how the platform combines the sources of information listed in the options parameter. It defaults to “or”, which returns video segments matching any of the specified sources; set it to “and” to return only video segments matching all of them. This parameter has an effect only when options lists more than one source of information.

Return value: An object which contains the following fields:

  • data: A list of objects representing the search results, each of which contains the following fields:
    • video_id: The unique identifier of the video that matched your search terms.
    • start: The start time of the matching video clip, expressed in seconds.
    • end: The end time of the matching video clip, expressed in seconds.
    • score: A quantitative value determined by the platform representing the level of confidence that the results match your search terms.
  • page_info: An object that provides information about pagination. The platform returns the results one page at a time, with a default limit of 10 results per page.
  • pool: An object that contains the total number of videos within the index, the total duration of the videos, and the unique identifier of the index that you’ve searched.

For details about all fields, see the API Reference > Make any-to-video search requests page.
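
The effect of the operator parameter can be pictured as set logic over the clips each modality matches independently. A hypothetical illustration (the clip identifiers are made up, not real API output):

```python
# Hypothetical clips matched independently by each modality
visual_matches = {"clip_a", "clip_b", "clip_c"}
audio_matches = {"clip_b", "clip_d"}

# operator="or" (default): a clip matching ANY listed modality is returned
or_results = visual_matches | audio_matches

# operator="and": only clips matching ALL listed modalities are returned
and_results = visual_matches & audio_matches

print(sorted(or_results))   # ['clip_a', 'clip_b', 'clip_c', 'clip_d']
print(sorted(and_results))  # ['clip_b']
```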

6. Process the search results

Display the search results, including handling pagination to retrieve all pages.
Function call: Iterate over the results by calling the next function.
Parameters: The search results retrieved in the previous step.

Return value: The next page of results. When the iterator reaches the last page, it raises a StopIteration exception.
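
This pagination pattern works with any iterator that yields pages and raises StopIteration at the end. The self-contained sketch below uses dummy pages in place of the SDK's result object to show the same first-page-plus-remaining-pages flow:

```python
def collect_all_clips(first_page, result_iterator):
    """Gather clips from the first page plus every subsequent page."""
    clips = list(first_page)
    while True:
        try:
            clips.extend(next(result_iterator))
        except StopIteration:
            break
    return clips

# Dummy stand-ins for search_results.data and the page iterator
first_page = ["clip_1", "clip_2"]
later_pages = iter([["clip_3"], ["clip_4", "clip_5"]])
print(collect_all_clips(first_page, later_pages))
# ['clip_1', 'clip_2', 'clip_3', 'clip_4', 'clip_5']
```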