MindsDB - The TwelveLabs handler

Summary: The TwelveLabs handler for MindsDB allows you to search and summarize video content directly within MindsDB, streamlining the integration of these features into your applications.

Description: This guide outlines how you can use the handler and how the handler interfaces with the TwelveLabs Video Understanding Platform to combine TwelveLabs’ state-of-the-art foundation models for video understanding with MindsDB’s platform for building customized AI solutions.

Step-by-step guide: Our blog post, Build a Powerful Video Summarization Tool with Twelve Labs, MindsDB, and Slack, walks you through the steps required to configure the TwelveLabs integration in MindsDB, deploy the TwelveLabs model for summarization within MindsDB, and automate the whole flow through a Slack bot that periodically posts the video summaries as announcements.

GitHub: TwelveLabs Handler.

Use the handler

This section assumes the following:

  • You have an API key. To retrieve your API key, go to the API Key page and select the Copy icon to the right of the key to copy it to your clipboard.
  • You have a running MindsDB database inside a Docker container. If not, see the Docker for MindsDB section of the MindsDB documentation.

Typically, the steps for using the handler are as follows:

  1. Install the required dependencies inside the Docker container:

    $ pip install mindsdb[twelve_labs]
  2. Open the MindsDB SQL Editor.

  3. Create an ML engine. Use the CREATE ML_ENGINE statement, replacing the placeholders surrounded by <> with your values:


    CREATE ML_ENGINE <YOUR_ENGINE_NAME>
    FROM twelve_labs
    USING
      twelve_labs_api_key = '<YOUR_API_KEY>';

    The example below creates an ML engine named twelve_labs_engine:

    CREATE ML_ENGINE twelve_labs_engine
    FROM twelve_labs
    USING
      twelve_labs_api_key = 'tlk_111';
  4. Create a model. Use the CREATE MODEL statement. The PREDICT clause names the column that will contain the results of the task, and the USING clause specifies the parameters for the model. The parameters depend on the task you want to perform. The available tasks are search and summarization, and the parameters for each task are described in the Creating Models section of the handler’s GitHub Readme file.
    The example below creates a model for the search task:

    CREATE MODEL mindsdb.twelve_labs_search
    PREDICT search_results
    USING
      engine = 'twelve_labs_engine',
      task = 'search',
      engine_id = 'marengo2.6',
      index_name = 'index_1',
      index_options = ['visual', 'conversation', 'text_in_video', 'logo'],
      video_urls = ['https://.../video_1.mp4', 'https://.../video_2.mp4'],
      search_options = ['visual', 'conversation', 'text_in_video', 'logo'],
      search_query_column = 'query';

    The example below creates a model for the summarization task:

    CREATE MODEL mindsdb.twelve_labs_summarization
    PREDICT summarization_results
    USING
      engine = 'twelve_labs_engine',
      task = 'summarization',
      engine_id = 'pegasus1',
      index_name = 'index_1',
      index_options = ['visual', 'conversation'],
      video_urls = ['https://.../video_1.mp4', 'https://.../video_2.mp4'],
      summarization_type = 'summary';
  5. (Optional) Check the status of the video indexing process. The TwelveLabs Video Understanding Platform requires some time to index videos. You can search or summarize your videos only after the indexing process is complete. Use the DESCRIBE statement to check the status of the indexing process, replacing the placeholder surrounded by <> with the name of your model:

    DESCRIBE mindsdb.<YOUR_MODEL_NAME>;

    The example below checks the status of a model named twelve_labs_summarization:

    DESCRIBE mindsdb.twelve_labs_summarization;

    You should see the status as complete in the STATUS column. In case of an error, check the ERROR column, which contains detailed information about the error.

  6. Retrieve the identifiers of the indexed videos. Perform this step if you want to summarize a video. To retrieve the identifiers, use the DESCRIBE statement on the indexed_videos table of your model, replacing the placeholder surrounded by <> with the name of your model:

    DESCRIBE mindsdb.<YOUR_MODEL_NAME>.indexed_videos;

    The example below retrieves the identifiers of the videos uploaded to a model named twelve_labs_summarization:

    DESCRIBE mindsdb.twelve_labs_summarization.indexed_videos;
  7. Make predictions. Use the SELECT statement to make predictions using the model you created. The WHERE clause specifies the condition for the prediction, which depends on the task you want to perform. The available tasks are search and summarization, and the conditions for each task are described in the Making Predictions section of the handler’s GitHub Readme file.

    Search:
    In the SQL query below, ensure you replace the placeholders surrounded by <> with your values:

    SELECT *
    FROM mindsdb.<YOUR_MODEL_NAME>
    WHERE query = '<YOUR_QUERY>';

    The example below makes predictions for the search task using a model named twelve_labs_search:

    SELECT *
    FROM mindsdb.twelve_labs_search
    WHERE query = 'Soccer player scoring a goal';

    Summarize:

    In the SQL query below, ensure you replace the placeholders surrounded by <> with your values:

    SELECT *
    FROM mindsdb.<YOUR_MODEL_NAME>
    WHERE video_id = '<YOUR_VIDEO_ID>';

    The example below makes predictions for the summarization task using a model named twelve_labs_summarization:

    SELECT *
    FROM mindsdb.twelve_labs_summarization
    WHERE video_id = '660bfa6766995fbd9fd662ee';
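
If your application generates these statements instead of typing them into the SQL Editor, the sketch below shows one way to render them from Python values. The function names are hypothetical, and doubling single quotes inside string literals follows standard SQL; whether your MindsDB version handles it the same way is an assumption to verify. Model and column names are inserted as-is, so pass only trusted identifiers.

```python
def sql_literal(value):
    """Render a string or a list of strings as a SQL literal."""
    if isinstance(value, (list, tuple)):
        return "[" + ", ".join(sql_literal(item) for item in value) + "]"
    # Assumption: a single quote is escaped by doubling it, as in standard SQL.
    return "'" + str(value).replace("'", "''") + "'"


def render_create_model(model_name, predict_column, params):
    """Render a CREATE MODEL statement with one USING parameter per line."""
    using = ",\n".join(f"  {key} = {sql_literal(val)}" for key, val in params.items())
    return f"CREATE MODEL mindsdb.{model_name}\nPREDICT {predict_column}\nUSING\n{using};"


def render_prediction_query(model_name, column, value):
    """Render the SELECT statement used to make a prediction."""
    return f"SELECT *\nFROM mindsdb.{model_name}\nWHERE {column} = {sql_literal(value)};"


create_statement = render_create_model(
    "twelve_labs_summarization",
    "summarization_results",
    {
        "engine": "twelve_labs_engine",
        "task": "summarization",
        "index_options": ["visual", "conversation"],
    },
)
search_statement = render_prediction_query(
    "twelve_labs_search", "query", "Soccer player scoring a goal"
)
print(create_statement)
print(search_statement)
```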

Integration with TwelveLabs

For brevity, the sections below outline the key components for integrating MindsDB with the TwelveLabs Video Understanding Platform:

  • Initialize a client
  • Create indexes
  • Upload videos
  • Perform downstream tasks such as search or summarization

For all the components, refer to the TwelveLabs Handler page on GitHub.

Initialize a client

The constructor sets up a new TwelveLabsAPIClient object that establishes a connection to the TwelveLabs Video Understanding Platform:

def __init__(self, api_key: str, base_url: str = None):
    """
    The initializer for the TwelveLabsAPIClient.

    Parameters
    ----------
    api_key : str
        The TwelveLabs API key.
    base_url : str, Optional
        The base URL for the TwelveLabs API. Defaults to the base URL in the TwelveLabs handler settings.
    """

    self.api_key = api_key
    self.headers = {
        'Content-Type': 'application/json',
        'x-api-key': self.api_key
    }
    self.base_url = base_url if base_url else twelve_labs_handler_config.BASE_URL
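
The request helper that uses these attributes is not shown on this page. As a minimal sketch of how the headers and base URL set by the constructor could be combined into an HTTP request, the example below builds (but does not send) a request with Python's standard library. The build_request function and the https://api.example.com/v1 base URL are hypothetical and serve only as illustration.

```python
import json
import urllib.request


def build_request(api_key: str, base_url: str, endpoint: str, body: dict) -> urllib.request.Request:
    """Build, without sending, a JSON POST request (hypothetical helper)."""
    headers = {
        "Content-Type": "application/json",
        "x-api-key": api_key,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + endpoint,
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


request = build_request(
    api_key="tlk_111",
    base_url="https://api.example.com/v1",  # placeholder, not the real base URL
    endpoint="/indexes",
    body={"index_name": "index_1"},
)
print(request.get_method(), request.full_url)
```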

Create indexes

To create indexes, the create_index method invokes the POST method of the /indexes endpoint:

def create_index(self, index_name: str, index_options: List[str], engine_id: Optional[str] = None, addons: Optional[List[str]] = None) -> str:
    """
    Create an index.

    Parameters
    ----------
    index_name : str
        Name of the index to be created.

    index_options : List[str]
        List of options that specify how the platform will process the videos uploaded to this index.

    engine_id : str, Optional
        ID of the engine. If not provided, the default engine is used.

    addons : List[str], Optional
        List of addons that should be enabled for the index.

    Returns
    -------
    str
        ID of the created index.
    """

    # TODO: change index_options to engine_options?
    # TODO: support multiple engines per index?
    body = {
        "index_name": index_name,
        "engines": [{
            "engine_name": engine_id if engine_id else twelve_labs_handler_config.DEFAULT_ENGINE_ID,
            "engine_options": index_options
        }],
        "addons": addons,
    }

    result = self._submit_request(
        method="POST",
        endpoint="/indexes",
        data=body,
    )

    logger.info(f"Index {index_name} successfully created.")
    return result['_id']
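
To make the mapping from the CREATE MODEL parameters to the request body concrete, the sketch below rebuilds the body as a standalone function and applies it to the values from the summarization example earlier on this page. The build_index_body function is hypothetical, and the DEFAULT_ENGINE_ID value is an assumption; the real default lives in the handler's configuration.

```python
from typing import List, Optional

# Assumed default for illustration; the handler reads it from its own settings.
DEFAULT_ENGINE_ID = "marengo2.6"


def build_index_body(
    index_name: str,
    index_options: List[str],
    engine_id: Optional[str] = None,
    addons: Optional[List[str]] = None,
) -> dict:
    """Build the body sent to the /indexes endpoint (hypothetical helper)."""
    return {
        "index_name": index_name,
        "engines": [{
            # Fall back to the default engine when none is given.
            "engine_name": engine_id if engine_id else DEFAULT_ENGINE_ID,
            "engine_options": index_options,
        }],
        "addons": addons,
    }


body = build_index_body("index_1", ["visual", "conversation"], engine_id="pegasus1")
print(body)
```

Note that addons stays None in the body when no addons are passed, which mirrors the method above.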

Upload videos

To upload videos to the TwelveLabs Video Understanding Platform and index them, the handler invokes the POST method of the /tasks endpoint:

def create_video_indexing_tasks(self, index_id: str, video_urls: List[str] = None, video_files: List[str] = None) -> List[str]:
    """
    Create video indexing tasks.

    Parameters
    ----------
    index_id : str
        ID of the index.

    video_urls : List[str], Optional
        List of video urls to be indexed. Either video_urls or video_files should be provided. This validation is handled by TwelveLabsHandlerModel.

    video_files : List[str], Optional
        List of video files to be indexed. Either video_urls or video_files should be provided. This validation is handled by TwelveLabsHandlerModel.

    Returns
    -------
    List[str]
        List of task IDs created.
    """

    task_ids = []

    if video_urls:
        logger.info("video_urls has been set, therefore, it will be given precedence.")
        logger.info("Creating video indexing tasks for video urls.")

        for video_url in video_urls:
            task_ids.append(
                self._create_video_indexing_task(
                    index_id=index_id,
                    video_url=video_url
                )
            )

    elif video_files:
        logger.info("video_urls has not been set, therefore, video_files will be used.")
        logger.info("Creating video indexing tasks for video files.")
        for video_file in video_files:
            task_ids.append(
                self._create_video_indexing_task(
                    index_id=index_id,
                    video_file=video_file
                )
            )

    return task_ids

def _create_video_indexing_task(self, index_id: str, video_url: str = None, video_file: str = None) -> str:
    """
    Create a video indexing task.

    Parameters
    ----------
    index_id : str
        ID of the index.

    video_url : str, Optional
        URL of the video to be indexed. Either video_url or video_file should be provided. This validation is handled by TwelveLabsHandlerModel.

    video_file : str, Optional
        Path to the video file to be indexed. Either video_url or video_file should be provided. This validation is handled by TwelveLabsHandlerModel.

    Returns
    -------
    str
        ID of the created task.
    """

    body = {
        "index_id": index_id,
    }

    file_to_close = None
    if video_url:
        body['video_url'] = video_url

    elif video_file:
        import mimetypes
        # We need the file open for the duration of the request. Maybe simplify it with a context manager later, but that needs _create_video_indexing_task rewritten.
        file_to_close = open(video_file, 'rb')
        mime_type, _ = mimetypes.guess_type(video_file)
        body['video_file'] = (file_to_close.name, file_to_close, mime_type)

    result = self._submit_multi_part_request(
        method="POST",
        endpoint="/tasks",
        data=body,
    )

    if file_to_close:
        file_to_close.close()

    task_id = result['_id']
    logger.info(f"Created video indexing task {task_id} for {video_url if video_url else video_file} successfully.")

    # update the video title
    video_reference = video_url if video_url else video_file
    task = self._get_video_indexing_task(task_id=task_id)
    self._update_video_metadata(
        index_id=index_id,
        video_id=task['video_id'],
        metadata={
            "video_reference": video_reference
        }
    )

    return task_id
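
The comment in the file-upload branch mentions a possible context-manager rewrite. The sketch below shows one shape that rewrite could take, under stated assumptions: submit_video_file and fake_submit are hypothetical names, and fake_submit stands in for the handler's multipart request so the example runs without network access. The point is that the file stays open while the request is made and is closed afterwards, even if the request raises.

```python
import mimetypes
import os
import tempfile


def submit_video_file(index_id, video_file, submit):
    """Keep the file open only for the duration of the request (hypothetical helper)."""
    mime_type, _ = mimetypes.guess_type(video_file)
    with open(video_file, "rb") as handle:
        body = {
            "index_id": index_id,
            "video_file": (handle.name, handle, mime_type),
        }
        # The request runs inside the with block, so the handle is still open here.
        return submit(method="POST", endpoint="/tasks", data=body)


def fake_submit(method, endpoint, data):
    """Stand-in for the real multipart request; reports what it received."""
    _, handle, _ = data["video_file"]
    return {"_id": "task_1", "endpoint": endpoint, "open_during_request": not handle.closed}


with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "clip.mp4")
    with open(path, "wb") as sample:
        sample.write(b"not a real video")
    result = submit_video_file("index_1", path, fake_submit)

print(result)
```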

Once the video has been uploaded to the platform, the handler monitors the indexing process using the GET method of the /tasks/{task_id} endpoint:

def poll_for_video_indexing_tasks(self, task_ids: List[str]) -> None:
    """
    Poll for video indexing tasks to complete.

    Parameters
    ----------
    task_ids : List[str]
        List of task IDs to be polled.

    Returns
    -------
    None
    """

    for task_id in task_ids:
        logger.info(f"Polling status of video indexing task {task_id}.")
        is_task_running = True

        while is_task_running:
            task = self._get_video_indexing_task(task_id=task_id)
            status = task['status']
            logger.info(f"Task {task_id} is in the {status} state.")

            wait_duration = task['process']['remain_seconds'] if 'process' in task else twelve_labs_handler_config.DEFAULT_WAIT_DURATION

            if status in ('pending', 'indexing', 'validating'):
                logger.info(f"Task {task_id} will be polled again in {wait_duration} seconds.")
                time.sleep(wait_duration)

            elif status == 'ready':
                logger.info(f"Task {task_id} completed successfully.")
                is_task_running = False

            else:
                logger.error(f"Task {task_id} failed with status {task['status']}.")
                # TODO: update Exception to be more specific
                raise Exception(f"Task {task_id} failed with status {task['status']}.")

    logger.info("All videos indexed successfully.")
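
The polling logic can be exercised without the platform by injecting the task lookup and the sleep function. The sketch below is a simplified, single-task version under stated assumptions: poll_until_ready is a hypothetical name, DEFAULT_WAIT_DURATION is an assumed value, the scripted responses are made up, and RuntimeError replaces the generic exception purely for illustration.

```python
from typing import Callable, Dict

DEFAULT_WAIT_DURATION = 5  # assumed value, in seconds


def poll_until_ready(get_task: Callable[[str], Dict], task_id: str, sleep: Callable[[float], None]) -> int:
    """Poll one task until it is ready and return the number of status checks."""
    checks = 0
    while True:
        task = get_task(task_id)
        checks += 1
        status = task["status"]
        if status in ("pending", "indexing", "validating"):
            # Prefer the platform's estimate when present, else the default wait.
            wait = task["process"]["remain_seconds"] if "process" in task else DEFAULT_WAIT_DURATION
            sleep(wait)
        elif status == "ready":
            return checks
        else:
            raise RuntimeError(f"Task {task_id} failed with status {status}.")


# Scripted responses standing in for the platform.
responses = iter([
    {"status": "validating"},
    {"status": "indexing", "process": {"remain_seconds": 12}},
    {"status": "ready"},
])
waits = []  # records requested waits instead of actually sleeping
checks = poll_until_ready(lambda _task_id: next(responses), "task_1", waits.append)
print(checks, waits)
```

In real use you would pass time.sleep as the sleep argument; recording the waits in a list keeps the example instantaneous.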

Perform downstream tasks

The handler supports two downstream tasks: searching and summarizing videos. See the sections below for details.

Search videos

To perform search requests, the handler invokes the POST method of the /search endpoint:

def search_index(self, index_id: str, query: str, search_options: List[str]) -> Dict:
    """
    Search an index.

    Parameters
    ----------
    index_id : str
        ID of the index.

    query : str
        Query to be searched.

    search_options : List[str]
        List of search options to be used.

    Returns
    -------
    Dict
        Search results.
    """

    body = {
        "index_id": index_id,
        "query": query,
        "search_options": search_options
    }

    data = []
    result = self._submit_request(
        method="POST",
        endpoint="/search",
        data=body,
    )
    data.extend(result['data'])

    while 'next_page_token' in result['page_info']:
        result = self._submit_request(
            method="GET",
            endpoint=f"/search/{result['page_info']['next_page_token']}"
        )
        data.extend(result['data'])

    logger.info(f"Search for index {index_id} completed successfully.")
    return data
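
The method above follows next_page_token until the platform stops returning one and accumulates the results of every page. The sketch below isolates that pagination loop and runs it against made-up pages held in a dictionary. The collect_all_pages function, the tokens, and the result shapes are hypothetical stand-ins for real API responses.

```python
from typing import Callable, Dict, List


def collect_all_pages(first_page: Dict, fetch_page: Callable[[str], Dict]) -> List[Dict]:
    """Accumulate the data of the first page and of every following page."""
    data = list(first_page["data"])
    page = first_page
    while "next_page_token" in page["page_info"]:
        page = fetch_page(page["page_info"]["next_page_token"])
        data.extend(page["data"])
    return data


# Made-up pages keyed by token, standing in for GET /search/{token} responses.
pages = {
    "tok_2": {"data": [{"video_id": "v2"}], "page_info": {"next_page_token": "tok_3"}},
    "tok_3": {"data": [{"video_id": "v3"}], "page_info": {}},
}
first = {"data": [{"video_id": "v1"}], "page_info": {"next_page_token": "tok_2"}}

results = collect_all_pages(first, pages.__getitem__)
print([item["video_id"] for item in results])
```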

Summarize videos

To summarize videos, the handler invokes the POST method of the /summarize endpoint once for each video:

def summarize_videos(self, video_ids: List[str], summarization_type: str, prompt: str) -> Dict:
    """
    Summarize videos.

    Parameters
    ----------
    video_ids : List[str]
        List of video IDs.

    summarization_type : str
        Type of the summary to be generated. Supported types are 'summary', 'chapter' and 'highlight'.

    prompt: str
        Prompt to be used for the Summarize task

    Returns
    -------
    Dict
        Summary of the videos.
    """

    results = []
    results = [self.summarize_video(video_id, summarization_type, prompt) for video_id in video_ids]

    logger.info(f"Summarized videos {video_ids} successfully.")
    return results

def summarize_video(self, video_id: str, summarization_type: str, prompt: str) -> Dict:
    """
    Summarize a video.

    Parameters
    ----------
    video_id : str
        ID of the video.

    summarization_type : str
        Type of the summary to be generated. Supported types are 'summary', 'chapter' and 'highlight'.

    prompt: str
        Prompt to be used for the Summarize task

    Returns
    -------
    Dict
        Summary of the video.
    """
    body = {
        "video_id": video_id,
        "type": summarization_type,
        "prompt": prompt
    }

    result = self._submit_request(
        method="POST",
        endpoint="/summarize",
        data=body,
    )

    logger.info(f"Video {video_id} summarized successfully.")
    return result
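
The docstrings above list three supported summary types. As a minimal sketch, the example below builds the same request body and rejects any other type before a request would be sent. The client-side check and the build_summarize_body name are additions made for illustration; they are not part of the handler code shown above.

```python
SUPPORTED_TYPES = ("summary", "chapter", "highlight")


def build_summarize_body(video_id: str, summarization_type: str, prompt: str) -> dict:
    """Build the body for the /summarize endpoint after checking the type."""
    if summarization_type not in SUPPORTED_TYPES:
        raise ValueError(
            f"Unsupported summarization type {summarization_type!r}; "
            f"expected one of {SUPPORTED_TYPES}."
        )
    return {
        "video_id": video_id,
        "type": summarization_type,
        "prompt": prompt,
    }


body = build_summarize_body("660bfa6766995fbd9fd662ee", "summary", "Focus on the key events.")
print(body)
```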

Next steps

After reading this page, you have several options:

  • Use the handler: Inspect the TwelveLabs Handler page on GitHub to better understand its features and start using it in your applications.
  • Explore further: Try the applications built by the community or our sample applications to get more insights into the TwelveLabs Video Understanding Platform’s diverse capabilities and learn more about integrating the platform into your applications.