Open-ended analysis

This guide shows how to use the Analyze API to perform open-ended analysis on video content, generating text tailored to your prompts. This feature offers more customization than the summarization feature and can produce many content types, including, but not limited to, tables of contents, action items, memos, reports, and comprehensive analyses.

The platform provides two distinct methods for retrieving the results of the open-ended analysis:

Streaming responses deliver text fragments in real-time as they are generated, enabling immediate processing and feedback. This method is the default behavior of the platform and is ideal for applications requiring incremental updates.

  • Response format: A stream of JSON objects in NDJSON format, with three event types:
    • stream_start: Marks the beginning of the stream.
    • text_generation: Delivers a fragment of the generated text.
    • stream_end: Signals the end of the stream.
  • Response handling:
    • Iterate over the stream to process text fragments as they arrive.
    • Use text_stream.aggregated_text to access the complete text after streaming ends.
  • Advantages:
    • Real-time processing of partial results.
    • Reduced perceived latency.
  • Use case: Live transcription, real-time analysis, or applications needing instant updates.
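
To make the event flow above concrete, here is a minimal sketch of consuming such an NDJSON stream by hand. The SDK normally does this for you. The field names `event_type` and `text`, as well as the sample payload, are assumptions made for illustration and are not taken from this guide.

```python
import json
from typing import Iterable, Iterator


def iter_text_fragments(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from an NDJSON stream (one JSON object per line)."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines defensively
        event = json.loads(line)
        kind = event.get("event_type")  # assumed field name
        if kind == "text_generation":
            yield event.get("text", "")  # assumed field name
        elif kind == "stream_end":
            break  # nothing is read after the end marker
        # stream_start carries no text, so it is ignored


# Hypothetical sample payload, for illustration only
sample = [
    '{"event_type": "stream_start"}',
    '{"event_type": "text_generation", "text": "Hello, "}',
    '{"event_type": "text_generation", "text": "world"}',
    '{"event_type": "stream_end"}',
]
full_text = "".join(iter_text_fragments(sample))
```

Joining the fragments in arrival order reproduces the complete text, which is what the aggregated text of the stream represents.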

Non-streaming responses deliver the complete generated text in a single response, simplifying processing when the full result is needed.

  • Response format: A single string containing the full generated text.
  • Response handling:
    • Access the complete text directly from the response.
  • Advantages:
    • Simplicity in handling the full result.
    • Immediate access to the entire text.
  • Use case: Generating reports, summaries, or any scenario where the whole text is required at once.
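
As a rough sketch of the non-streaming path, the helper below returns the whole text from a single call. It assumes the SDK client exposes a non-streaming `analyze` method and that the result carries the text in a `data` attribute. Neither name is confirmed by this guide, so check the SDK reference before relying on them.

```python
def generate_text(client, video_id: str, prompt: str) -> str:
    """Return the full generated text in one call (non-streaming)."""
    # Assumption: `client.analyze` and the `data` attribute are illustrative
    # names for the non-streaming call and its result.
    result = client.analyze(video_id=video_id, prompt=prompt)
    return result.data
```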

This guide provides a complete example. For a simplified introduction with just the essentials, see the Analyze videos quickstart guide.

Prerequisites

  • To use the platform, you need an API key:

    1. If you don’t have an account, sign up for a free account.
    2. Go to the API Key page.
    3. Select the Copy icon next to your key.

  • Ensure the TwelveLabs SDK is installed on your computer:

    $ pip install twelvelabs
  • The videos you wish to use must meet the following requirements:

    • Video resolution: Must be at least 360x360 and must not exceed 3840x2160.
    • Aspect ratio: Must be one of 1:1, 4:3, 4:5, 5:4, 16:9, or 9:16.
    • Video and audio formats: Your video files must be encoded in the video and audio formats listed on the FFmpeg Formats Documentation page. For videos in other formats, contact us at support@twelvelabs.io.
    • Duration: Must be between 4 seconds and 60 minutes (3,600 seconds). In a future release, the maximum duration will be 2 hours (7,200 seconds).
    • File size: Must not exceed 2 GB.
      If you require different options, contact us at support@twelvelabs.io.
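
The numeric requirements above can be checked before you upload. The following is a minimal sketch that takes metadata you have already obtained (for example, from a media inspection tool) and reports violations. It assumes the resolution limits apply regardless of orientation, that the aspect ratio must match exactly, and that "2 GB" means 2 GiB. None of these interpretations is stated in this guide. It does not check the video or audio format.

```python
from fractions import Fraction

ALLOWED_ASPECT_RATIOS = {
    Fraction(1, 1), Fraction(4, 3), Fraction(4, 5),
    Fraction(5, 4), Fraction(16, 9), Fraction(9, 16),
}
MAX_FILE_SIZE_BYTES = 2 * 1024**3  # assumption: "2 GB" is read as 2 GiB


def check_video(width: int, height: int, duration_s: float, size_bytes: int) -> list[str]:
    """Return a list of requirement violations; an empty list means the video passes."""
    if width <= 0 or height <= 0:
        return ["invalid dimensions"]
    problems = []
    short_side, long_side = sorted((width, height))
    # Assumption: limits are applied to the short and long side, so portrait
    # videos are treated the same as landscape ones.
    if short_side < 360:
        problems.append("resolution below 360x360")
    if long_side > 3840 or short_side > 2160:
        problems.append("resolution above 3840x2160")
    # Assumption: exact ratio match, with no tolerance for near misses.
    if Fraction(width, height) not in ALLOWED_ASPECT_RATIOS:
        problems.append("unsupported aspect ratio")
    if not 4 <= duration_s <= 3600:
        problems.append("duration outside 4 seconds to 60 minutes")
    if size_bytes > MAX_FILE_SIZE_BYTES:
        problems.append("file larger than 2 GB")
    return problems
```

For example, `check_video(1920, 1080, 120, 500_000_000)` returns an empty list, while a 320x240 clip lasting 2 seconds is reported for both its resolution and its duration.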

Complete example

This complete example shows how to create an index, upload a video, and perform open-ended analysis to generate text from your video. Replace the placeholders surrounded by <> with your values.

from twelvelabs import TwelveLabs
from twelvelabs.models.task import Task

# 1. Initialize the client
client = TwelveLabs(api_key="<YOUR_API_KEY>")

# 2. Create an index
models = [
    {
        "name": "pegasus1.2",
        "options": ["visual", "audio"]
    }
]
index = client.index.create(name="<YOUR_INDEX_NAME>", models=models)
print(f"Index created: id={index.id}, name={index.name}")

# 3. Upload a video
task = client.task.create(index_id=index.id, file="<YOUR_VIDEO_FILE>")
print(f"Task id={task.id}, Video id={task.video_id}")

# 4. Monitor the indexing process
def on_task_update(task: Task):
    print(f"  Status={task.status}")

task.wait_for_done(sleep_interval=5, callback=on_task_update)

if task.status != "ready":
    raise RuntimeError(f"Indexing failed with status {task.status}")
print(f"The unique identifier of your video is {task.video_id}.")

# 5. Generate open-ended text
text_stream = client.analyze_stream(
    video_id=task.video_id, prompt="<YOUR_PROMPT>", temperature=0.2)

# 6. Process the results
for text in text_stream:
    print(text)
print(f"Aggregated text: {text_stream.aggregated_text}")

Step-by-step guide

1. Import the SDK and initialize the client

Create a client instance to interact with the TwelveLabs Video Understanding platform.
Function call: You call the constructor of the TwelveLabs class.
Parameters:

  • api_key: The API key to authenticate your requests to the platform.

Return value: An object of type TwelveLabs configured for making API calls.
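
Rather than pasting the key into your source file, you may prefer to read it from the environment. This is a minimal sketch of that approach. The variable name `TWELVELABS_API_KEY` and both helper functions are hypothetical and chosen for illustration. Only the `TwelveLabs(api_key=...)` constructor comes from this guide.

```python
import os


def load_api_key(env=os.environ, var_name: str = "TWELVELABS_API_KEY") -> str:
    """Read the API key from an environment variable instead of hard-coding it."""
    api_key = env.get(var_name, "").strip()
    if not api_key:
        raise RuntimeError(f"Set the {var_name} environment variable to your API key")
    return api_key


def make_client():
    """Build the client; the SDK import is local so this module loads without it."""
    from twelvelabs import TwelveLabs

    return TwelveLabs(api_key=load_api_key())
```

Calling `make_client()` then replaces the hard-coded constructor call from the complete example.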

2. Specify the index containing your videos

Indexes help you organize and search through related videos efficiently. This example creates a new index, but you can also use an existing index by specifying its unique identifier. See the Indexes page for more details on creating an index.
Function call: You call the index.create function.
Parameters:

  • name: The name of the index.
  • models: A list of objects specifying your model configuration. This example enables the Pegasus video understanding model and the visual and audio model options.

Return value: An object containing, among other information, a field named id representing the unique identifier of the newly created index.
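
The choice between reusing an index and creating one can be folded into a small helper. The sketch below is illustrative: `resolve_index_id` and its parameters are hypothetical names, while the `index.create` call and the model configuration mirror the complete example.

```python
PEGASUS_MODELS = [{"name": "pegasus1.2", "options": ["visual", "audio"]}]


def resolve_index_id(client, existing_index_id=None, new_index_name="<YOUR_INDEX_NAME>"):
    """Return the ID of an existing index, or create a new index and return its ID."""
    if existing_index_id:
        return existing_index_id  # reuse the index: no API call is needed
    index = client.index.create(name=new_index_name, models=PEGASUS_MODELS)
    return index.id
```

Passing `existing_index_id` skips index creation entirely, which helps when you run the script repeatedly against the same index.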

3. Upload videos

To perform any downstream tasks, you must first upload your videos, and the platform must finish processing them.
Function call: You call the task.create function. This starts a video indexing task, which is an object of type Task that tracks the status of your video upload and indexing process.
Parameters:

  • index_id: The unique identifier of your index.
  • file or url: The path or the publicly accessible URL of your video file.

Return value: An object of type Task containing, among other information, the following fields:

  • video_id: The unique identifier of your video.
  • status: The status of your video indexing task.

Note: You can also upload multiple videos in a single API call. For details, see the Cloud-to-cloud integrations page.
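
Because task.create accepts either a file or a url, a thin wrapper can pick the right parameter for you. This is a sketch under the assumption that any source starting with `http://` or `https://` is a publicly accessible URL and everything else is a local path. The helper name `upload_video` is hypothetical.

```python
def upload_video(client, index_id: str, source: str):
    """Start a video indexing task from a local path or a publicly accessible URL."""
    if source.startswith(("http://", "https://")):
        return client.task.create(index_id=index_id, url=source)
    return client.task.create(index_id=index_id, file=source)
```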

4. Monitor the indexing process

The platform requires some time to index videos. Check the status of the video indexing task until it’s completed.
Function call: You call the task.wait_for_done function.
Parameters:

  • sleep_interval: The time interval, in seconds, between successive status checks. In this example, the method checks the status every five seconds.
  • callback: A callback function that the SDK executes each time it checks the status. Note that the callback function takes a parameter of type Task representing the video indexing task you created in the previous step. Use it to display the status of your video indexing task.

Return value: An object containing, among other information, a field named status representing the status of your task. Wait until the value of this field is ready.
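
With a short sleep interval, the callback can print the same status many times. As one possible refinement, the sketch below builds a callback that reports the status only when it changes. The factory `make_status_logger` is a hypothetical helper. It relies only on the status field of the Task object passed to the callback.

```python
def make_status_logger(log=print):
    """Return a callback that reports a task's status only when it changes."""
    last_status = None

    def on_task_update(task):
        nonlocal last_status
        if task.status != last_status:
            log(f"  Status={task.status}")
            last_status = task.status

    return on_task_update
```

You would pass it in place of the plain callback, for example `task.wait_for_done(sleep_interval=5, callback=make_status_logger())`.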

5. Perform open-ended analysis

Function call: You call the analyze_stream method.
Parameters:

  • video_id: The unique identifier of the video for which you want to generate text.
  • (Optional) prompt: A string that guides the model on the desired format or content. The maximum length of a prompt is 2,000 tokens.
  • (Optional) temperature: A number that controls the randomness of the text. A higher value generates more creative text, while a lower value produces more deterministic text.

Return value: An object that handles streaming HTTP responses and provides an iterator interface allowing you to process text fragments as they arrive. It contains the following fields, among other information:

  • texts: A list accumulating individual text fragments received from the stream.
  • aggregated_text: A concatenated string of all text fragments received so far.

6. Process the results

Use a loop to iterate over the stream. Inside the loop, handle each text fragment as it arrives. This example prints each fragment to the standard output. After the stream ends, use the aggregated_text field if you need the full generated text.
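
The loop described above can be wrapped in a small helper that lets you plug in your own per-fragment handling. This is a minimal sketch. `consume_stream` and `on_fragment` are hypothetical names, and the helper relies only on the stream being iterable and exposing aggregated_text, as described in the previous step.

```python
def consume_stream(text_stream, on_fragment=lambda fragment: print(fragment, end="", flush=True)):
    """Handle each fragment as it arrives, then return the full generated text."""
    for fragment in text_stream:
        on_fragment(fragment)
    # Once the loop finishes, the stream object holds the concatenated text.
    return text_stream.aggregated_text
```

The default handler prints fragments without line breaks so the output reads as continuous text. Pass your own function to forward fragments elsewhere, such as to a user interface.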