Voxel51 - Semantic video search plugin

Summary: The Semantic Video Search plugin integrates FiftyOne, Voxel51's open-source tool for building and enhancing machine learning datasets, with the TwelveLabs Video Understanding Platform, enabling you to perform semantic searches across multiple modalities.

Description: The plugin lets you search your videos for movements, actions, objects, people, sounds, on-screen text, and speech. This is helpful when you need to quickly locate and analyze specific scenes based on actions or spoken words, which speeds up categorizing and analyzing video data.

Code explanation: Our blog post, Search Your Videos Semantically with TwelveLabs and FiftyOne Plugin, walks you through the steps required to create this plugin from scratch.

GitHub: Semantic Video Search

Integration with TwelveLabs

The integration with the TwelveLabs Video Understanding Platform consists of three steps:

  • Create an index
  • Upload videos
  • Perform semantic searches

Create an index

The plugin invokes the POST method of the /indexes endpoint to create an index and enable the Marengo video understanding engine with the engine options that the user has selected:

Python
import requests

# API_URL and API_KEY are defined elsewhere in the plugin.
INDEX_NAME = ctx.params.get("index_name")

INDEXES_URL = f"{API_URL}/indexes"

headers = {
    "x-api-key": API_KEY
}

# Collect the engine options that the user has selected.
so = []

if ctx.params.get("visual"):
    so.append("visual")
if ctx.params.get("logo"):
    so.append("logo")
if ctx.params.get("text_in_video"):
    so.append("text_in_video")
if ctx.params.get("conversation"):
    so.append("conversation")

data = {
    "engine_id": "marengo2.7",
    "index_options": so,
    "index_name": INDEX_NAME,
}

# Create the index.
response = requests.post(INDEXES_URL, headers=headers, json=data)
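
The four conditionals above can be collapsed into a single pass over the known option names. The following is a minimal sketch, assuming `params` is a plain dictionary standing in for `ctx.params`. The names `OPTION_NAMES`, `selected_options`, and `build_index_payload` are hypothetical helpers introduced for illustration; they are not part of the plugin or the TwelveLabs API.

```python
# Hypothetical helpers; the option names and payload keys are taken from the snippet above.
OPTION_NAMES = ("visual", "logo", "text_in_video", "conversation")


def selected_options(params):
    """Return the option names whose value in `params` is truthy, in a fixed order."""
    return [name for name in OPTION_NAMES if params.get(name)]


def build_index_payload(params):
    """Build the JSON body for the index-creation request from the user's selections."""
    return {
        "engine_id": "marengo2.7",
        "index_options": selected_options(params),
        "index_name": params.get("index_name"),
    }


# `params` stands in for ctx.params (assumption: it behaves like a dict).
params = {"index_name": "demo", "visual": True, "conversation": True, "logo": False}
payload = build_index_payload(params)
print(payload)
```

The same `selected_options` helper could also produce the `search_options` list used when searching, since both steps read the same four parameters.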

Upload videos

The plugin invokes the POST method of the /tasks endpoint. Then, it monitors the indexing process using the GET method of the /tasks/{task_id} endpoint:

Python
import os
import time
from pprint import pprint

import requests

TASKS_URL = f"{API_URL}/tasks"

headers = {
    "x-api-key": API_KEY
}

videos = target_view
for sample in videos:
    # Skip videos shorter than 4 seconds.
    if sample.metadata.duration < 4:
        continue

    file_path = sample.filepath
    file_name = os.path.basename(file_path)

    data = {
        "index_id": INDEX_ID,
        "language": "en"
    }

    # Open the file in a context manager so the handle is closed after the upload.
    with open(file_path, "rb") as file_stream:
        file_param = [
            ("video_file", (file_name, file_stream, "application/octet-stream")),
        ]
        response = requests.post(TASKS_URL, headers=headers, data=data, files=file_param)

    TASK_ID = response.json().get("_id")
    print(f"Status code: {response.status_code}")
    pprint(response.json())

    # Poll the task until indexing finishes.
    # Assumes the task reports a "failed" status on error; stop polling in that
    # case instead of looping forever.
    TASK_STATUS_URL = f"{API_URL}/tasks/{TASK_ID}"
    while True:
        response = requests.get(TASK_STATUS_URL, headers=headers)
        STATUS = response.json().get("status")
        if STATUS in ("ready", "failed"):
            break
        time.sleep(10)

    # Do not record a video ID for a task that did not complete.
    if STATUS != "ready":
        continue

    # Store the video ID on the sample so search results can be mapped back to it.
    VIDEO_ID = response.json().get("video_id")
    sample["TwelveLabs " + INDEX_NAME] = VIDEO_ID
    sample.save()
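
The polling loop above has no upper bound on how long it waits. The following is a minimal sketch of a bounded variant, not the plugin's implementation. `wait_for_task` is a hypothetical helper, `get_status` is a caller-supplied function that returns the current task status, and the `"failed"` terminal status and the one-hour default timeout are assumptions. The example drives the helper with a fake status source so that it runs without network access.

```python
import time


def wait_for_task(get_status, poll_seconds=10, timeout_seconds=3600, sleep=time.sleep):
    """Poll get_status() until it returns a terminal status or the timeout elapses.

    Assumption: "ready" and "failed" are the terminal statuses.
    """
    waited = 0
    while True:
        status = get_status()
        if status in ("ready", "failed"):
            return status
        if waited >= timeout_seconds:
            raise TimeoutError(f"task still '{status}' after {waited} seconds")
        sleep(poll_seconds)
        waited += poll_seconds


# Fake status source for demonstration; no real sleeping is done here.
statuses = iter(["pending", "indexing", "ready"])
final_status = wait_for_task(lambda: next(statuses), sleep=lambda seconds: None)
print(final_status)
```

In the plugin, `get_status` could wrap the GET request to the /tasks/{task_id} endpoint, for example `lambda: requests.get(TASK_STATUS_URL, headers=headers).json().get("status")`.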

Perform semantic searches

The plugin invokes the POST method of the /search endpoint to search across the sources of information that the user has selected:

Python
import fiftyone as fo
import requests

# This excerpt runs inside a function in the plugin (note the final return);
# ctx, prompt, target_view, INDEX_ID, and INDEX_NAME come from the surrounding code.
SEARCH_URL = f"{API_URL}/search"

headers = {
    "x-api-key": API_KEY
}

# Collect the sources of information that the user has selected.
so = []

if ctx.params.get("visual"):
    so.append("visual")
if ctx.params.get("logo"):
    so.append("logo")
if ctx.params.get("text_in_video"):
    so.append("text_in_video")
if ctx.params.get("conversation"):
    so.append("conversation")

data = {
    "query": prompt,
    "index_id": INDEX_ID,
    "search_options": so,
}

response = requests.post(SEARCH_URL, headers=headers, json=data)
print(response.json())
results = response.json()["data"]

# Keep the first returned clip for each video, keyed by video ID, so that each
# sample is paired with its own start and end times.
clips = {}
for entry in results:
    clips.setdefault(entry["video_id"], (entry["start"], entry["end"]))

field = "TwelveLabs " + INDEX_NAME
view1 = target_view.select_by(field, list(clips), ordered=True)

# Remove results left over from a previous search.
if "results" in ctx.dataset.get_field_schema():
    ctx.dataset.delete_sample_field("results")

for sample in view1:
    start, end = clips[sample[field]]
    frame_rate = sample.metadata.frame_rate
    # Convert seconds to 1-based frame numbers.
    support = (int(start * frame_rate) + 1, int(end * frame_rate) + 1)
    sample["results"] = fo.TemporalDetection(label=prompt, support=support)
    sample.save()

# Show only the matching clips in the FiftyOne App.
view2 = view1.to_clips("results")
ctx.trigger("set_view", {"view": view2._serialize()})

return {}
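
As a worked example of the frame arithmetic above: the search results give clip boundaries in seconds, while the `support` of a temporal detection in FiftyOne is a pair of 1-based frame numbers. The sketch below isolates that conversion; `seconds_to_support` is a hypothetical helper name, and the sample values are made up for illustration.

```python
def seconds_to_support(start_s, end_s, frame_rate):
    """Convert a clip's start and end times in seconds to 1-based frame numbers."""
    first_frame = int(start_s * frame_rate) + 1
    last_frame = int(end_s * frame_rate) + 1
    return (first_frame, last_frame)


# A clip from 2.0 s to 5.5 s in a 30 fps video.
support = seconds_to_support(2.0, 5.5, 30.0)
print(support)  # (61, 166)
```

The `+ 1` is needed because frame numbering starts at 1, so time 0.0 maps to frame 1 rather than frame 0.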

Next steps

After reading this page, you have several options:

  • Use the plugin as-is: Inspect the source code to better understand the platform’s features and start using the plugin immediately.
  • Customize and enhance the plugin: Feel free to modify the code to meet your specific requirements.
  • Explore further: Try the applications built by the community or our sample applications to get more insights into the TwelveLabs Video Understanding Platform’s diverse capabilities and learn more about integrating the platform into your applications.