Social and public goods
The example projects on this page use the Twelve Labs Video Understanding Platform to build applications for social and public good. They show how multimodal AI can be applied to problems such as misinformation, civic participation, environmental monitoring, and dementia care.
Israel Palestine Video Understanding
Summary: The "Israel Palestine Video Understanding" application addresses misinformation and promotes empathy regarding the Israel-Palestine conflict.
Description: The application aggregates and summarizes content from YouTube and Reddit, presenting diverse viewpoints on the issue. These summaries, covering a range of opinions, are then visualized using an algorithm similar to t-SNE, offering a comprehensive understanding of the conflict's various perspectives. The application was developed by Sasha Sheng.
GitHub repo: Israel Palestine Video Understanding.
Integration with Twelve Labs
This application invokes the /summarize endpoint to create summaries for videos based on their content, specifically focusing on their stance regarding the Israel-Palestine conflict and the level of violence depicted:
import csv
import requests
# API_URL, API_KEY, and filename (the path of the output TSV file) are
# assumed to be defined at module level.
def generate_summary(videoID, videoID_to_filename):
    SUMMARIZE_URL = f"{API_URL}/summarize"
    headers = {
        "x-api-key": API_KEY
    }
    data = {
        "video_id": videoID,
        "type": "summary",
        "prompt": "Summarize if this video is pro-israel or pro-palestine or else and how violent it is."
    }
    response = requests.post(SUMMARIZE_URL, headers=headers, json=data)
    print(f"{videoID}: status code - {response.status_code}")
    summary_data = response.json()
    print(summary_data)
    # Append one tab-separated row per video
    with open(filename, 'a') as f:
        writer = csv.writer(f, delimiter='\t')
        writer.writerow([videoID, videoID_to_filename[videoID][0], videoID_to_filename[videoID][1], summary_data.get('summary')])
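The function above writes a row even when the request fails, in which case the summary column is empty. The following is a minimal sketch of a more defensive variant that separates payload construction and row formatting from the network call. The helper names `build_summarize_payload` and `summary_row` are hypothetical and do not come from the original repository. The sketch assumes that a 200 status code signals success and that the response carries a top-level `summary` field, as the original code implies.

```python
from typing import Optional


def build_summarize_payload(video_id: str, prompt: str) -> dict:
    """Build the JSON body for a /summarize request of type "summary"."""
    return {"video_id": video_id, "type": "summary", "prompt": prompt}


def summary_row(video_id: str, status_code: int, body: dict,
                video_info: tuple) -> Optional[list]:
    """Return a TSV row for a successful response, or None on failure."""
    summary = body.get("summary")
    if status_code != 200 or not summary:
        return None
    # Collapse newlines and tabs so the summary stays in one TSV cell.
    clean_summary = " ".join(summary.split())
    return [video_id, video_info[0], video_info[1], clean_summary]


# Hypothetical sample values, used only to exercise the helpers.
payload = build_summarize_payload("abc123", "Summarize the stance of this video.")
row = summary_row("abc123", 200, {"summary": "Line one.\nLine two."},
                  ("clip.mp4", "youtube"))
failed = summary_row("abc123", 429, {"message": "Too many requests"},
                     ("clip.mp4", "youtube"))
print(row)
print(failed)
```

Returning `None` on failure lets the caller decide whether to skip the video or retry, instead of writing an incomplete row.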
Accelerate SF Notifications
Summary: The "Accelerate SF Notifications" application simplifies public hearings for residents and special interest groups, particularly those focused on San Francisco housing developments.
Description: The application addresses the challenge of keeping up with numerous and lengthy public hearings, where the critical issue is identifying relevant discussions without watching entire meetings. The application was developed by Rahul Pal, Lloyd Chang, and Haonan Chen.
Key features include:
- Data scraping: Extract information from public agendas, live-streamed hearings, and sources like San Francisco Gov TV.
- Issue tracking: Utilize algorithms to pinpoint and extract discussions about housing projects and specific issues within hearings.
- Automated notifications: Implement a system that sends real-time alerts.
GitHub repo: Accelerate SF Notifications
Integration with Twelve Labs
The application uses the /summarize endpoint for two tasks: summarizing videos and generating lists of chapters.
- A summary encapsulates the key points of a video clearly. The code below shows how the application generates summaries:
data = {
    "video_id": "6545f931195730422cc38329",
    "type": "summary"
}
# Send request
response = requests.post(f"{BASE_URL}/summarize", json=data, headers={"x-api-key": api_key})
- A list of chapters provides a chronological breakdown of all the parts in a video. The following code shows how the application generates lists of chapters:
data = {
    "video_id": "6545f931195730422cc38329",
    "type": "chapter"
}
# Send request
response = requests.post(f"{BASE_URL}/summarize", json=data, headers={"x-api-key": api_key})
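To connect the chapter list to the issue tracking and notification features, the sketch below filters chapters by a set of watched keywords and formats a one-line alert for each match. It assumes each chapter is a dictionary with `start`, `end`, `chapter_title`, and `chapter_summary` keys; verify these names against an actual response before relying on them. The functions `find_relevant_chapters` and `format_alert`, and the sample data, are hypothetical and not taken from the repository.

```python
def find_relevant_chapters(chapters, keywords):
    """Return chapters whose title or summary mentions any keyword."""
    lowered = [k.lower() for k in keywords]
    matches = []
    for chapter in chapters:
        title = chapter.get("chapter_title", "")
        summary = chapter.get("chapter_summary", "")
        text = f"{title} {summary}".lower()
        if any(k in text for k in lowered):
            matches.append(chapter)
    return matches


def format_alert(chapter):
    """Render a one-line alert with the chapter's time range in seconds."""
    return f"[{chapter['start']}s-{chapter['end']}s] {chapter['chapter_title']}"


# Hypothetical chapters standing in for a parsed /summarize response.
sample_chapters = [
    {"start": 0, "end": 300, "chapter_title": "Roll call",
     "chapter_summary": "Attendance is taken."},
    {"start": 300, "end": 1500, "chapter_title": "Zoning appeal",
     "chapter_summary": "Commissioners debate a housing project."},
]
alerts = [format_alert(c) for c in find_relevant_chapters(sample_chapters, ["housing"])]
print(alerts)
```

Plain substring matching is the simplest possible filter; the actual application may use a different matching strategy.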
The /gist endpoint generates swift breakdowns of the essence of your videos in the form of titles, topics, and hashtags. The following code shows how the application invokes this endpoint:
data = {
    "video_id": "6545f931195730422cc38329",
    "types": [
        "title",
        "hashtag",
        "topic"
    ]
}
# Send request
response = requests.post(f"{BASE_URL}/gist", json=data, headers={"x-api-key": api_key})
Deep Green
Summary: The "Deep Green" application uses the Twelve Labs Video Understanding Platform to accurately detect and map ocean trash using aerial and satellite imagery.
Description: The application offers a solution to the problem of plastic pollution in the oceans. It detects different types of ocean trash with over 90% accuracy and can scan over 500 hours of video daily. Trash is timestamped and geographically pinpointed, allowing easy data analysis and export. The application was developed by Shalini Ananda and Hans Walker.
GitHub: Deep Green
Integration with Twelve Labs
The search_trash function searches for videos containing specific types of trash, returning a list of such videos with key information about each:
def search_trash(query, API_KEY):
    data = {
        "query": query,
        "index_id": INDEX_ID,
        "search_options": ["visual"]
    }
    response = requests.post(f"{API_URL}/search", headers={"x-api-key": API_KEY}, json=data)
    response = response.json()
    results = []
    # Collect the score, location, and thumbnail of each match
    for item in response['data']:
        video_id = item['video_id']
        # Get_Video_Metadata is a helper defined elsewhere in the application
        video_location = Get_Video_Metadata(video_id, API_KEY)['Location Type']
        results.append({"score": item['score'], "video_location": video_location, "video_id": video_id, "thumbnail_url": item['thumbnail_url']})
    return results
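A search can return several clips from the same video, so a caller that wants one entry per video needs to collapse the list. The sketch below keeps the highest-scoring hit for each `video_id`, using the same dictionary keys that `search_trash` returns. The function `best_hit_per_video` and the sample data are hypothetical and not part of the repository.

```python
def best_hit_per_video(results):
    """Keep only the highest-scoring search hit for each video_id."""
    best = {}
    for hit in results:
        current = best.get(hit["video_id"])
        if current is None or hit["score"] > current["score"]:
            best[hit["video_id"]] = hit
    # Highest score first.
    return sorted(best.values(), key=lambda h: h["score"], reverse=True)


# Hypothetical hits shaped like the dictionaries built by search_trash.
sample_hits = [
    {"score": 71.2, "video_id": "v1", "video_location": "Rural", "thumbnail_url": "t1"},
    {"score": 88.5, "video_id": "v2", "video_location": "Urban", "thumbnail_url": "t2"},
    {"score": 90.1, "video_id": "v1", "video_location": "Rural", "thumbnail_url": "t3"},
]
top_hits = best_hit_per_video(sample_hits)
print([(h["video_id"], h["score"]) for h in top_hits])
```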
The search_video_single function finds specific content within a single video:
def search_video_single(video_id, query, API_KEY):
    headers = {
        "accept": "application/json",
        "x-api-key": API_KEY,
        "Content-Type": "application/json"
    }
    data = {
        "query": query,
        "search_options": ["visual", "conversation", "text_in_video", "logo"],
        "threshold": "high",
        # Restrict the search to the given video
        "filter": {"id": [video_id]},
        "index_id": INDEX_ID
    }
    response = requests.post(f"{API_URL}/search", headers=headers, json=data)
    # Parse the response body once instead of on every access
    clips = response.json()['data']
    results = []
    for clip in clips:
        results.append({"score": clip['score'], 'start_time': clip['start'],
                        'end_time': clip['end']})
    # The original snippet ends without a return statement; returning the
    # list is assumed to be the intent.
    return results
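Searching within a single video often yields clips that overlap in time. As a minimal sketch of one way to post-process them, the function below merges overlapping or nearly adjacent clips into continuous segments, using the `score`, `start_time`, and `end_time` keys that `search_video_single` builds. The function `merge_segments` and its `max_gap` parameter are hypothetical and not part of the repository.

```python
def merge_segments(results, max_gap=0.0):
    """Merge overlapping or near-adjacent clips into continuous segments.

    Each merged segment keeps the highest score of the clips it absorbs.
    """
    ordered = sorted(results, key=lambda r: r["start_time"])
    merged = []
    for clip in ordered:
        if merged and clip["start_time"] <= merged[-1]["end_time"] + max_gap:
            last = merged[-1]
            last["end_time"] = max(last["end_time"], clip["end_time"])
            last["score"] = max(last["score"], clip["score"])
        else:
            merged.append(dict(clip))  # copy so the input is not mutated
    return merged


# Hypothetical clips, times in seconds.
sample_clips = [
    {"score": 80, "start_time": 0, "end_time": 5},
    {"score": 70, "start_time": 20, "end_time": 25},
    {"score": 90, "start_time": 4, "end_time": 9},
]
segments = merge_segments(sample_clips)
print(segments)
```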
The classify_latest_video function classifies videos into specific environmental categories:
# API_URL, INDEX_ID, and get_video_list are assumed to be defined elsewhere
# in the application; requests, sys, and time must be imported.
def classify_latest_video(id, file_name, API_KEY):
    classify_url = f"{API_URL}/classify"
    file_name = file_name.split('.')[0]
    video_list = get_video_list(API_KEY)
    time_initiated = time.time()
    # True while the uploaded video has not yet appeared in the index
    video_uploaded=True
    video_index = 0
    for i, next_video in enumerate(video_list):
        if(next_video['metadata']['filename']==file_name):
            video_uploaded=False
            video_index = i
            break
    # Poll the index once a minute until the video appears
    while(video_uploaded):
        time.sleep(60)
        video_list = get_video_list(API_KEY)
        for i, next_video in enumerate(video_list):
            print(next_video['metadata']['filename']," ",file_name)
            if(next_video['metadata']['filename']==file_name):
                video_uploaded=False
                video_index = i
                break
    id = video_list[video_index]["_id"]
    meta_url = f"{API_URL}/indexes/{INDEX_ID}/videos/{id}"
    print("\n\nStarting Metadata",time.time()-time_initiated,"\n\n", file=sys.stderr)
    payload = {
        "page_limit": 10,
        "include_clips": False,
        "threshold": {
            "min_video_score": 15,
            "min_clip_score": 15,
            "min_duration_ratio": 0.5
        },
        "show_detailed_score": False,
        "options": ["conversation"],
        "conversation_option": "semantic",
        "classes": [
            {
                "prompts": ["This video is taken in an urban environment", "This means a dense environment", "Lots of people, cars and buildings"],
                "options": ["visual"],
                "conversation_option": "semantic",
                "name": "Urban"
            },
            {
                "prompts": ["This video is taken in a suburban environment", "There should be buildings, roads", "Everything should be a lot more spread out", "The majority of the space should be developed"],
                "options": ["visual"],
                "conversation_option": "semantic",
                "name": "Suburban"
            },
            {
                "prompts": ["This video was taken in a rural environment", "There shouldn't be a ton of human development", "Buildings should be extremely spread out", "Should mostly be nature", "Very few humans around"],
                "options": ["visual"],
                "conversation_option": "semantic",
                "name": "Rural"
            }
        ],
        "video_ids": [id]
    }
    headers = {
        "accept": "application/json",
        "x-api-key": API_KEY,
        "Content-Type": "application/json"
    }
    response = requests.post(classify_url, json=payload, headers=headers)
    response = response.json()
    print(response, file=sys.stderr)
    # Store the first returned class as the video's "Location Type" metadata
    video_class = response['data'][0]['classes'][0]['name']
    payload = { "metadata": { "Location Type": video_class } }
    headers = {
        "accept": "application/json",
        "x-api-key": API_KEY,
        "Content-Type": "application/json"
    }
    response = requests.put(meta_url, json=payload, headers=headers)
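The function above reads `response['data'][0]['classes'][0]['name']` directly, which raises an exception when no class passes the thresholds. The sketch below shows one defensive way to pick a class name from a response of the same shape. It assumes each class entry carries a numeric `score` alongside its `name`; the function `top_class_name` and the sample responses are hypothetical and not part of the repository.

```python
def top_class_name(response, default="Unknown"):
    """Return the highest-scoring class name from a classify response."""
    videos = response.get("data") or []
    if not videos:
        return default
    classes = videos[0].get("classes") or []
    if not classes:
        return default
    # Choose by score rather than by position in the list.
    best = max(classes, key=lambda c: c.get("score", 0))
    return best.get("name", default)


# Hypothetical responses shaped like the one the function above indexes into.
sample_response = {
    "data": [
        {"video_id": "v1", "classes": [
            {"name": "Suburban", "score": 41.0},
            {"name": "Rural", "score": 77.5},
        ]}
    ]
}
print(top_class_name(sample_response))
print(top_class_name({"data": []}))
```

Falling back to a default label keeps the metadata update from failing when a video matches none of the classes.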
RememberMe - Dementia Assistant
Summary: The project addresses the critical challenge of assisting individuals with dementia in retaining their independence and enhancing their quality of life.
Description: The application is a comprehensive digital support system with a home screen displaying the current date, important reminders, and action buttons. It also has a chatbot that users can ask questions about their lives. The application collects data such as video, audio, and personal notes, and it uses the Twelve Labs Video Understanding Platform to convert multimedia information into text that is organized and stored in the chatbot's database. The objective is to provide a seamless and intuitive platform that enables users to recall important details about their lives, manage daily tasks, and maintain connections with people and places that matter to them. The application was developed by Tatiane Wu Li, Pedro Goncalves de Paiva, Aleksei (Alex) Korablev, and Na Le.
GitHub: RememberMe.
Presentation: RememberMe.
Integration with Twelve Labs
The submit_video_for_processing function uploads a video to the platform by invoking the POST method of the /tasks/external-provider endpoint. Upon receiving the response, the function processes it to determine the outcome. If the upload is successful, the function returns the unique identifier of the submitted task. In case of an error, the function prints the status code and the response body, which details the specific reason for the failure, and returns None. This helps developers identify and resolve any issues with the video upload process.
import requests
from pprint import pprint
# Constants
API_URL = "https://api.twelvelabs.io/v1.2"
API_KEY = "<YOUR_API_KEY>"
INDEX_ID = "<YOUR_INDEX_ID>"  # Replace with your actual index ID obtained from creating an index
# Function to submit a video URL for processing by an external provider
def submit_video_for_processing(video_url):
    """Submit a video URL to an external processing service and return the task ID."""
    TASKS_URL = f"{API_URL}/tasks/external-provider"
    headers = {"x-api-key": API_KEY}
    data = {"index_id": INDEX_ID, "url": video_url}
    response = requests.post(TASKS_URL, headers=headers, json=data)
    if response.status_code == 201:
        task_id = response.json().get("_id")
        print(f"Task submitted successfully. Task ID: {task_id}")
        return task_id
    else:
        print(f"Failed to submit task: {response.status_code}")
        pprint(response.json())
        return None
# Example usage
video_url = "https://www.youtube.com/watch?v=TLwhqmf4Td4&ab_channel=RGSACHIN"
task_id = submit_video_for_processing(video_url)
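Submitting a task only starts the indexing process, so a caller typically has to wait until the video is ready before summarizing it. The following is a minimal polling sketch under stated assumptions: the status values `ready` and `failed` mark the end of a task, and `fetch_status` is a hypothetical callable that you would implement, for example by requesting the task from the platform and reading its status. The function `wait_for_task` is not part of the original application.

```python
import time


def wait_for_task(task_id, fetch_status, interval=5.0, timeout=600.0, sleep=time.sleep):
    """Poll until the task is ready or failed and return the final status.

    `fetch_status` is any callable that maps a task ID to a status string.
    `sleep` is injectable so the loop can be exercised without real delays.
    """
    waited = 0.0
    while True:
        status = fetch_status(task_id)
        if status in ("ready", "failed"):
            return status
        if waited >= timeout:
            raise TimeoutError(f"Task {task_id} still '{status}' after {timeout}s")
        sleep(interval)
        waited += interval


# Simulated status sequence so the example runs without network access.
_statuses = iter(["pending", "indexing", "ready"])
final_status = wait_for_task("task-1", lambda _id: next(_statuses), sleep=lambda s: None)
print(final_status)  # ready
```

Passing the status lookup in as a function keeps the polling logic independent of any particular HTTP client or SDK.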
The get_video_summary function takes the unique identifier of a video as a parameter and invokes the POST method of the /generate endpoint to summarize it. If successful, it returns the generated summary; otherwise, it prints an error message and returns None.
def get_video_summary(video_id):
    GENERATE_URL = f"{API_URL}/generate"  # Define the URL to generate the summary
    headers = {"x-api-key": API_KEY}  # Defined here because the headers in submit_video_for_processing are local to that function
    data = {"video_id": video_id, "prompt": "Make a summary"}  # Set up the data payload
    response = requests.post(GENERATE_URL, headers=headers, json=data)  # Make the POST request
    if response.status_code == 200:
        summary = response.json().get('data')  # Get the summary data from the response
        print("Video summary generated successfully.")
        return summary  # Return the summary
    else:
        print(f"Failed to generate summary: {response.status_code}")  # Print failure message
        pprint(response.json())
        return None  # Return None if summary generation fails
CamSense AI
Summary: "CamSense AI" is an AI-powered application that assesses webcam videos, providing instant insights and alerts. It uses the Twelve Labs Video Understanding Platform to analyze video content and identify significant changes or events.
Description: The application addresses the challenge of creating custom triggers based on the content of unattended video recordings. This solution is particularly useful in ecology, fire safety, and flood water level monitoring.
The typical workflow is as follows:
- The Twelve Labs Video Understanding Platform generates embeddings for the reference frame and the subsequent video clips and summarizes them.
- The application uses Groq to produce natural language descriptions of the differences.
- The application determines the significance of these differences.
- Clips that differ significantly are logged along with their timestamps and descriptions.
- The process concludes with the aggregation of all logs into a final report.
The application was developed by Daniel Talero, Paul Kubie, and Todd Gardiner.
Colab notebook: hackathon.ipynb.
Integration with Twelve Labs
The code below creates a video indexing task that uploads a video to the Twelve Labs Video Understanding Platform by invoking the create method of the task object:
# Upload the reference video
video_files = glob(reference_filename)  # Example: "/videos/*.mp4"
print(f"Uploading {reference_filename}")
task = client.task.create(index_id=index_obj.id, file=reference_filename, language="en")
print(f"Task id={task.id}")
print(f"Task_video_id = {task.video_id}")
ref_id = task.video_id
# Upload each subsequent video and collect its video ID
frame_id = []
if len(rawvids) > 0:
    for i in range(len(rawvids)):
        video_files = glob("/content/rawdata/" + str(rawvids[i]))  # Example: "/videos/*.mp4"
        print(f"Uploading {rawvids[i]}")
        task = client.task.create(index_id=index_obj.id, file="/content/rawdata/" + str(rawvids[i]), language="en")
        print(f"Task id={task.id}")
        frame_id.append(task.video_id)
The code below invokes the create method of the embed.task object to create an embedding for the reference frame:
task = client.embed.task.create(
    engine_name="Marengo-retrieval-2.6",
    video_url="https://storage.googleapis.com/lab-storage-items/sample-5s.mp4")
print(
    f"Created task: id={task.id} engine_name={task.engine_name} status={task.status}"
)
def on_task_update(task: EmbeddingsTask):
    print(f" Status={task.status}")
# Block until the embedding task finishes, printing each status update
status = task.wait_for_done(
    sleep_interval=5,
    callback=on_task_update
)
print(f"Embedding done: {status}")
task = client.embed.task.retrieve(task.id)
if task.video_embeddings is not None:
    for v in task.video_embeddings:
        print(
            f"embedding_scope={v.embedding_scope} start_offset_sec={v.start_offset_sec} end_offset_sec={v.end_offset_sec}"
        )
        print(f"embeddings: {', '.join([str(x) for x in v.embedding.float])}")
        ref_emb = np.array([str(x) for x in v.embedding.float])
The code below creates embeddings for the subsequent frames:
fref_emb = []
for i in range(len(rawvids)):
    task = client.embed.task.create(
        engine_name="Marengo-retrieval-2.6",
        video_url="https://storage.googleapis.com/lab-storage-items/sample-5s.mp4")
    print(
        f"Created task: id={task.id} engine_name={task.engine_name} status={task.status}"
    )
    status = task.wait_for_done(
        sleep_interval=5,
        callback=on_task_update
    )
    print(f"Embedding done: {status}")
    task = client.embed.task.retrieve(task.id)
    if task.video_embeddings is not None:
        for v in task.video_embeddings:
            print(
                f"embedding_scope={v.embedding_scope} start_offset_sec={v.start_offset_sec} end_offset_sec={v.end_offset_sec}"
            )
            print(f"embeddings: {', '.join([str(x) for x in v.embedding.float])}")
            fref_emb.append(np.array([str(x) for x in v.embedding.float]))
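Once the reference and clip embeddings exist, the workflow needs a way to decide which clips differ from the reference. One common choice is cosine similarity, sketched below in plain Python. This is an illustration under stated assumptions, not the application's confirmed method: the functions `cosine_similarity` and `changed_clips` and the threshold of 0.9 are hypothetical. Because the code above stores embedding values as strings, the sketch converts each value to `float` first.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def changed_clips(ref_emb, clip_embs, threshold=0.9):
    """Return indices of clips whose similarity to the reference is below threshold."""
    ref = [float(x) for x in ref_emb]  # accepts numeric or string values
    flagged = []
    for i, emb in enumerate(clip_embs):
        similarity = cosine_similarity(ref, [float(x) for x in emb])
        if similarity < threshold:
            flagged.append(i)
    return flagged


# Tiny hypothetical vectors; real embeddings have many more dimensions.
reference = ["1.0", "0.0", "0.0"]
clips = [["0.9", "0.1", "0.0"], ["0.0", "1.0", "0.0"]]
flagged_indices = changed_clips(reference, clips)
print(flagged_indices)
```

The threshold controls sensitivity: a higher value flags smaller changes, so it would need tuning against real footage.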
The code below invokes the summarize method of the generate object to summarize the reference frame:
res = client.generate.summarize(ref_id, type='summary', prompt="In a detailed way, describe this video clip." )
The code below summarizes each subsequent video and stores the results in a list:
fres = []
fres_emb = []
for i in range(len(rawvids)):
    res2 = client.generate.summarize(frame_id[i], type='summary', prompt="In a detailed way, describe this video clip." )
    fres.append(res2)
    fres_emb.append(model.encode(res2.summary))
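The workflow ends by aggregating the logged clips into a final report. The sketch below shows one possible shape for that step. The entry fields (`timestamp`, `difference`, `description`), the cutoff value, and the function `build_report` are all hypothetical; the notebook may structure its logs differently.

```python
def build_report(entries, min_difference=0.2):
    """Format clips whose difference score meets the cutoff as a text report."""
    significant = [e for e in entries if e["difference"] >= min_difference]
    # ISO 8601 timestamps sort chronologically as plain strings.
    significant.sort(key=lambda e: e["timestamp"])
    lines = [f"{len(significant)} of {len(entries)} clips changed significantly"]
    for e in significant:
        lines.append(f"{e['timestamp']} | diff={e['difference']:.2f} | {e['description']}")
    return "\n".join(lines)


# Hypothetical log entries, one per compared clip.
sample_log = [
    {"timestamp": "2024-01-01T12:00:00", "difference": 0.5,
     "description": "Water level rose above the marker."},
    {"timestamp": "2024-01-01T10:00:00", "difference": 0.05,
     "description": "No visible change."},
    {"timestamp": "2024-01-01T11:00:00", "difference": 0.25,
     "description": "A vehicle entered the frame."},
]
report = build_report(sample_log)
print(report)
```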