Generate text from video

This quickstart guide provides a simplified introduction to generating text from video using the TwelveLabs Video Understanding Platform. It includes:

  • A basic working example
  • Minimal implementation details
  • Core parameters for common use cases

For a comprehensive guide, see the Generate text from video section.

Prerequisites

  • To use the platform, you need an API key:

    1. If you don’t have an account, sign up for a free account.
    2. Go to the API Key page.
    3. Select the Copy icon next to your key.

  • Ensure the TwelveLabs SDK is installed on your computer:

    $ pip install twelvelabs
  • The videos you wish to use must meet the following requirements:

    • Video resolution: Must be at least 360x360 and must not exceed 3840x2160.
    • Aspect ratio: Must be one of 1:1, 4:3, 4:5, 5:4, 16:9, or 9:16.
    • Video and audio formats: Your video files must be encoded in the video and audio formats listed on the FFmpeg Formats Documentation page. For videos in other formats, contact us at support@twelvelabs.io.
    • Duration: Must be between 4 seconds and 60 minutes (3,600 seconds). In a future release, the maximum duration will be 2 hours (7,200 seconds).
    • File size: Must not exceed 2 GB.
      If you require different options, contact us at support@twelvelabs.io.
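
Before uploading, you can pre-check a file's metadata against the constraints above. The helper below is a hypothetical local sketch, not part of the SDK; the platform performs the authoritative validation at upload time, and the resolution bound is interpreted orientation-agnostically here.

```python
from fractions import Fraction

# Hypothetical local pre-check of the upload requirements listed above.
# The platform performs the authoritative validation at upload time.
ALLOWED_RATIOS = {Fraction(1, 1), Fraction(4, 3), Fraction(4, 5),
                  Fraction(5, 4), Fraction(16, 9), Fraction(9, 16)}

def check_video(width: int, height: int, duration_s: float, size_bytes: int) -> list:
    """Return a list of requirement violations; an empty list means the video passes."""
    problems = []
    # "At least 360x360, at most 3840x2160" -- checked regardless of orientation.
    if min(width, height) < 360 or max(width, height) > 3840 or min(width, height) > 2160:
        problems.append("resolution must be between 360x360 and 3840x2160")
    if Fraction(width, height) not in ALLOWED_RATIOS:
        problems.append("aspect ratio must be 1:1, 4:3, 4:5, 5:4, 16:9, or 9:16")
    if not 4 <= duration_s <= 3600:
        problems.append("duration must be between 4 seconds and 60 minutes")
    if size_bytes > 2 * 1024**3:
        problems.append("file size must not exceed 2 GB")
    return problems
```

For example, `check_video(1920, 1080, 300, 500 * 1024**2)` returns an empty list, while a 2-second clip fails the duration check.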

Starter code

You can copy and paste the code below to generate text from video. Replace the placeholders surrounded by <> with your values.

from twelvelabs import TwelveLabs

client = TwelveLabs(api_key="<YOUR_API_KEY>")

index = client.index.create(name="<YOUR_INDEX_NAME>", models=[{"name": "pegasus1.2", "options": ["visual", "audio"]}])
print(f"Created index: id={index.id} name={index.name}")

task = client.task.create(index_id=index.id, url="<YOUR_VIDEO_URL>")
print(f"Created task: id={task.id}")
task.wait_for_done(sleep_interval=50, callback=lambda t: print(f"  Status={t.status}"))
if task.status != "ready":
    raise RuntimeError(f"Indexing failed with status {task.status}")
print(f"Upload complete. The unique identifier of your video is {task.video_id}.")

gist = client.generate.gist(video_id=task.video_id, types=["title", "topic", "hashtag"])
print(f"Title={gist.title}\nTopics={gist.topics}\nHashtags={gist.hashtags}")

Step-by-step guide

1. Import the SDK and initialize the client

Create a client instance to interact with the TwelveLabs Video Understanding platform.

2. Create an index

Indexes help you organize and search through related videos efficiently. To create an index, you must provide its name and the video understanding models you wish to enable. This example uses Pegasus as the model and specifies that the model should analyze the visual and audio modalities. See the Indexes page for more details on creating an index.

3. Upload videos

To perform any downstream tasks, you must first upload your videos, and the platform must finish processing them.

4. Monitor the indexing process

The platform requires some time to index videos. Check the status of the video indexing task until it’s completed.
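
The starter code delegates this polling to the SDK's `wait_for_done` helper. The loop it performs can be sketched generically as below; `fetch_status` is a stand-in for re-fetching the task from the API, shown here with a stubbed status sequence so no network access is assumed.

```python
import time

def wait_until_done(fetch_status, sleep_interval=5.0, done_states=("ready", "failed")):
    """Poll fetch_status() until it returns a terminal state, then return that state.

    fetch_status is any zero-argument callable returning the current status
    string -- a stand-in for re-fetching the indexing task from the API.
    """
    while True:
        status = fetch_status()
        print(f"  Status={status}")
        if status in done_states:
            return status
        time.sleep(sleep_interval)

# Usage with a stubbed status sequence (no network involved):
statuses = iter(["validating", "indexing", "ready"])
final = wait_until_done(lambda: next(statuses), sleep_interval=0)
# final == "ready"
```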

5. Generate titles, topics, and hashtags

Generate one or more of the following types of text: titles, topics, and hashtags.

6. Process the results

This example prints the generated text to the standard output.
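
Beyond printing, you may want to persist the results. The helper below is a hypothetical sketch that serializes the generated fields (available in the starter code as `gist.title`, `gist.topics`, and `gist.hashtags`) to JSON for storage or downstream use.

```python
import json

def gist_to_json(title, topics, hashtags):
    """Serialize the generated fields so they can be stored or passed along."""
    return json.dumps({"title": title, "topics": topics, "hashtags": hashtags})

# Example with placeholder values in place of a real gist object:
record = gist_to_json("Funny dog compilation", ["pets", "humor"], ["#dogs", "#funny"])
```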