Databricks - Advanced video understanding

Summary: This integration combines TwelveLabs’ Embed API with Databricks Mosaic AI Vector Search to create advanced video understanding applications. It transforms video content into multimodal embeddings that capture the relationships between visual expressions, body language, spoken words, and overall context, enabling powerful similarity search and recommendation systems.

Description: Integrating TwelveLabs with Databricks Mosaic AI addresses key challenges in video AI, such as efficient processing of large-scale video datasets and accurate multimodal content representation. The process involves these main steps:

  1. Generate multimodal embeddings from video content using TwelveLabs’ Embed API
  2. Store these embeddings along with video metadata in a Delta Table
  3. Configure Mosaic AI Vector Search with a Delta Sync Index to access the embeddings (see the setup sketch after this list)
  4. Generate text embeddings for search queries
  5. Perform similarity searches between text queries and video content
  6. Build a video recommendation system that suggests videos similar to a given video based on embedding similarities
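
For steps 2 and 3, a minimal setup sketch using the databricks-vectorsearch client might look like the following. The endpoint name (twelvelabs_endpoint), index name, and main.default schema are illustrative placeholders; videos_source_embeddings is the Delta table used throughout this guide, and the 1024-dimension value assumes Marengo-retrieval-2.6 embeddings.

Python
from databricks.vector_search.client import VectorSearchClient

mosaic_client = VectorSearchClient()

# Create (or reuse) a Vector Search endpoint
mosaic_client.create_endpoint(
    name="twelvelabs_endpoint",
    endpoint_type="STANDARD"
)

# Create a Delta Sync Index over the Delta table that holds the
# self-managed TwelveLabs embeddings (the "embedding" column)
index = mosaic_client.create_delta_sync_index(
    endpoint_name="twelvelabs_endpoint",
    index_name="main.default.video_embeddings_index",
    source_table_name="main.default.videos_source_embeddings",
    pipeline_type="TRIGGERED",
    primary_key="id",
    embedding_dimension=1024,  # assumes Marengo-retrieval-2.6 embedding size
    embedding_vector_column="embedding"
)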

Step-by-step guide: Our blog post, Mastering Multimodal AI: Advanced Video Understanding with TwelveLabs + Databricks Mosaic AI, guides you through setting up the environment, generating embeddings, and implementing the similarity search and recommendation functionalities.

Integration with TwelveLabs

This section describes how you can use the TwelveLabs Python SDK to create embeddings. The integration involves creating two types of embeddings:

  • Video embeddings from your video content
  • Text embeddings from queries

Video embeddings

The get_video_embeddings function is a Pandas UDF that generates multimodal embeddings using the TwelveLabs Embed API:

Python
from pyspark.sql.functions import pandas_udf
from pyspark.sql.types import ArrayType, FloatType
from twelvelabs import TwelveLabs
from twelvelabs.models.embed import EmbeddingsTask
import pandas as pd

@pandas_udf(ArrayType(FloatType()))
def get_video_embeddings(urls: pd.Series) -> pd.Series:
    def generate_embedding(video_url):
        twelvelabs_client = TwelveLabs(api_key=TWELVE_LABS_API_KEY)
        task = twelvelabs_client.embed.task.create(
            engine_name="Marengo-retrieval-2.6",
            video_url=video_url
        )
        task.wait_for_done()
        task_result = twelvelabs_client.embed.task.retrieve(task.id)
        embeddings = []
        for v in task_result.video_embeddings:
            embeddings.append({
                'embedding': v.embedding.float,
                'start_offset_sec': v.start_offset_sec,
                'end_offset_sec': v.end_offset_sec,
                'embedding_scope': v.embedding_scope
            })
        return embeddings

    def process_url(url):
        embeddings = generate_embedding(url)
        return embeddings[0]['embedding'] if embeddings else None

    return urls.apply(process_url)

For details on creating video embeddings, see the Create video embeddings page.
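
Once the UDF is defined, you can apply it to a DataFrame of video URLs and persist the results as the Delta table used in the rest of this guide. The sample rows below are placeholders; note that a Delta Sync Index requires Change Data Feed to be enabled on the source table.

Python
# Hypothetical source data; replace with your own video metadata
videos_df = spark.createDataFrame(
    [(1, "https://example.com/video1.mp4", "Video 1"),
     (2, "https://example.com/video2.mp4", "Video 2")],
    ["id", "url", "title"]
)

# Apply the Pandas UDF to generate one embedding per video
embeddings_df = videos_df.withColumn("embedding", get_video_embeddings("url"))

# Persist embeddings and metadata together in a Delta table
embeddings_df.write.format("delta").mode("overwrite").saveAsTable("videos_source_embeddings")

# Enable Change Data Feed so the Delta Sync Index can track updates
spark.sql("ALTER TABLE videos_source_embeddings SET TBLPROPERTIES (delta.enableChangeDataFeed = true)")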

Text embeddings

The get_text_embedding function generates text embeddings:

Python
def get_text_embedding(text_query):
    # TwelveLabs Embed API supports text-to-embedding
    text_embedding = twelvelabs_client.embed.create(
        engine_name="Marengo-retrieval-2.6",
        text=text_query,
        text_truncate="start"
    )

    return text_embedding.text_embedding.float

For details on creating text embeddings, see the Create text embeddings page.
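
As a quick sanity check, you can embed an arbitrary query and inspect its dimensionality (the query string below is illustrative, and twelvelabs_client must already be initialized):

Python
query_embedding = get_text_embedding("A person riding a bicycle through a city")
print(f"Embedding dimensions: {len(query_embedding)}")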

The similarity_search function generates an embedding for a text query and uses the Mosaic AI Vector Search index to find similar videos:

Python
def similarity_search(query_text, num_results=5):
    # Initialize the Vector Search client and get the query embedding
    mosaic_client = VectorSearchClient()
    query_embedding = get_text_embedding(query_text)

    print(f"Query embedding generated: {len(query_embedding)} dimensions")

    # Perform the similarity search against the Delta Sync Index
    # ("index" is the index object obtained when setting up Vector Search)
    results = index.similarity_search(
        query_vector=query_embedding,
        num_results=num_results,
        columns=["id", "url", "title"]
    )
    return results
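
The recommendation code in the next section relies on a parse_search_results helper that isn't defined in this excerpt. A minimal sketch, assuming the standard Vector Search response shape (column names in manifest.columns, rows plus a trailing similarity score in result.data_array), might look like this:

Python
def parse_search_results(results):
    # Each row in data_array holds the requested columns in order,
    # followed by the similarity score as the final "score" column
    columns = [col["name"] for col in results["manifest"]["columns"]]
    rows = results.get("result", {}).get("data_array", [])
    return [dict(zip(columns, row)) for row in rows]

# Example usage
results = similarity_search("A person riding a bicycle through a city", num_results=3)
for video in parse_search_results(results):
    print(video.get("title"), video.get("score"))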

Video recommendation

The get_video_recommendations function takes a video ID and the number of recommendations to return, then performs a similarity search to find the most similar videos:

Python
def get_video_recommendations(video_id, num_recommendations=5):
    # Initialize the Vector Search client
    mosaic_client = VectorSearchClient()

    # First, retrieve the embedding for the given video_id
    source_df = spark.table("videos_source_embeddings")
    video_embedding = source_df.filter(f"id = {video_id}").select("embedding").first()

    if not video_embedding:
        print(f"No video found with id: {video_id}")
        return []

    # Perform similarity search using the video's embedding
    try:
        results = index.similarity_search(
            query_vector=video_embedding["embedding"],
            num_results=num_recommendations + 1,  # +1 to account for the input video
            columns=["id", "url", "title"]
        )

        # Parse the results
        recommendations = parse_search_results(results)

        # Remove the input video from recommendations if present
        recommendations = [r for r in recommendations if r.get('id') != video_id]

        return recommendations[:num_recommendations]
    except Exception as e:
        print(f"Error during recommendation: {e}")
        return []

# Helper function to display recommendations
def display_recommendations(recommendations):
    if recommendations:
        print(f"Top {len(recommendations)} recommended videos:")
        for i, video in enumerate(recommendations, 1):
            print(f"{i}. Title: {video.get('title', 'N/A')}")
            print(f"   URL: {video.get('url', 'N/A')}")
            print(f"   Similarity Score: {video.get('score', 'N/A')}")
            print()
    else:
        print("No recommendations found.")

# Example usage
video_id = 1  # Assuming this is a valid video ID in your dataset
recommendations = get_video_recommendations(video_id)
display_recommendations(recommendations)

Next steps

After reading this page, you have the following options:

  • Customize and use the example: After implementing the basic integration, consider these improvements:
    • Update and synchronize the index: Implement efficient incremental updates and scheduled synchronization jobs using Delta Lake features.
    • Optimize performance and scaling: Leverage distributed processing, intelligent caching, and index partitioning for larger video libraries.
    • Monitor and analyze: Track key performance metrics, implement feedback loops, and correlate capabilities with business metrics.