Social and public goods

The example projects on this page use the Twelve Labs Video Understanding Platform to create social and public goods, demonstrating how multimodal AI can drive positive change.

Israel Palestine Video Understanding

Summary: The "Israel Palestine Video Understanding" application addresses misinformation and promotes empathy regarding the Israel-Palestine conflict.

Description: The application aggregates and summarizes content from YouTube and Reddit, presenting diverse viewpoints on the issue. The resulting summaries, which cover a range of opinions, are then visualized using a t-SNE-like dimensionality reduction algorithm, offering a comprehensive view of the various perspectives on the conflict. The application was developed by Sasha Sheng.
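
As a minimal sketch of how such a visualization could be produced, the summaries could first be embedded with an off-the-shelf sentence-embedding model and then projected to two dimensions with t-SNE. The embedding model and plotting code below are illustrative assumptions, not the project's actual pipeline:

from sklearn.manifold import TSNE
from sentence_transformers import SentenceTransformer  # illustrative embedding model
import matplotlib.pyplot as plt

summaries = ["..."]  # summary strings produced by the /summarize endpoint

# Embed each summary, then reduce the embeddings to 2-D for plotting
model = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = model.encode(summaries)
coords = TSNE(n_components=2, perplexity=min(30, len(summaries) - 1)).fit_transform(embeddings)

plt.scatter(coords[:, 0], coords[:, 1])
plt.title("Summary clusters by stance")
plt.show()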

GitHub repo: Israel Palestine Video Understanding

Integration with Twelve Labs

This application invokes the /summarize endpoint to create summaries for videos based on their content, specifically focusing on their stance regarding the Israel-Palestine conflict and the level of violence depicted:

import csv
import requests

def generate_summary(videoID, videoID_to_filename, filename):
    SUMMARIZE_URL = f"{API_URL}/summarize"
    headers = {
        "x-api-key": API_KEY
    }

    data = {
        "video_id": videoID,
        "type": "summary",
        "prompt": "Summarize if this video is pro-israel or pro-palestine or else and how violent it is."
    }

    # Request a prompt-guided summary of the video
    response = requests.post(SUMMARIZE_URL, headers=headers, json=data)
    print(f"{videoID}: status code - {response.status_code}")

    summary_data = response.json()
    print(summary_data)

    # Append the video ID, its source details, and the summary to a tab-separated file
    with open(filename, 'a') as f:
        writer = csv.writer(f, delimiter='\t')
        writer.writerow([videoID, videoID_to_filename[videoID][0], videoID_to_filename[videoID][1], summary_data.get('summary')])
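
A minimal usage sketch, assuming videoID_to_filename maps each video ID to its source details; the output file name is illustrative:

for vid in videoID_to_filename:
    generate_summary(vid, videoID_to_filename, "summaries.tsv")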

Accelerate SF Notifications

Summary: The "Accelerate SF Notifications" application simplifies public hearings for residents and special interest groups, particularly those focused on San Francisco housing developments.

Description: The application addresses the challenge of keeping up with numerous and lengthy public hearings, where the critical issue is identifying relevant discussions without watching entire meetings. The application was developed by Rahul Pal, Lloyd Chang, and Haonan Chen.

Key features include:

  • Data scraping: Extract information from public agendas, live-streamed hearings, and sources like San Francisco Gov TV.
  • Issue tracking: Utilize algorithms to pinpoint and extract discussions about housing projects and specific issues within hearings.
  • Automated notifications: Implement a system that sends real-time alerts when tracked topics come up in a hearing (see the sketch after this list).
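
The following sketch illustrates how issue tracking and notifications could fit together using the platform's /search endpoint. The send_alert helper, the query string, and the API and index values are placeholders, not the project's actual implementation:

import requests

API_URL = "https://api.twelvelabs.io/v1.2"  # assumed base URL; adjust to your API version
API_KEY = "<YOUR_API_KEY>"
INDEX_ID = "<YOUR_INDEX_ID>"

def find_housing_discussions(query):
    # Search the indexed hearing footage for moments that match a tracked issue
    data = {
        "query": query,
        "index_id": INDEX_ID,
        "search_options": ["visual", "conversation"]
    }
    response = requests.post(f"{API_URL}/search", headers={"x-api-key": API_KEY}, json=data)
    return response.json().get("data", [])

def send_alert(match):
    # Hypothetical notification hook; replace with email, SMS, or webhook delivery
    print(f"Hearing {match['video_id']}: relevant discussion at {match['start']}s-{match['end']}s")

for match in find_housing_discussions("discussion of a new housing development proposal"):
    send_alert(match)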

GitHub repo: Accelerate SF Notifications

Integration with Twelve Labs

The application uses the /summarize endpoint to perform the following main functions: summarize videos and generate lists of chapters.

  • A summary concisely captures the key points of a video. The code below shows how the application generates summaries:

    
    data = {
        "video_id": "6545f931195730422cc38329",
        "type": "summary"
    }
    
    # Send request
    response = requests.post(f"{BASE_URL}/summarize", json=data, headers={"x-api-key": api_key})
    
    
  • A list of chapters provides a chronological breakdown of all the parts in a video. The following code shows how the application generates lists of chapters:

    data = {
        "video_id": "6545f931195730422cc38329",
        "type": "chapter"
    }
    
    # Send request
    response = requests.post(f"{BASE_URL}/summarize", json=data, headers={"x-api-key": api_key})
    

The /gist endpoint generates concise titles, topics, and hashtags that capture the essence of your videos. The following code shows how the application invokes this endpoint:

data = {
    "video_id": "6545f931195730422cc38329",
    "types": [
        "title",
        "hashtag",
        "topic"
    ]
}

# Send request
response = requests.post(f"{BASE_URL}/gist", json=data, headers={"x-api-key": api_key})

Deep Green

Summary: The "Dep Green" application uses the Twelve Labs Video Understanding Platform to accurately detect and map ocean trash using aerial and satellite imagery.

Description: The application helps address the problem of plastic pollution in the oceans. It detects different types of ocean trash with over 90% accuracy and can scan over 500 hours of video daily. Each piece of detected trash is timestamped and geographically pinpointed, allowing easy data analysis and export. The application was developed by Shalini Ananda and Hans Walker.

GitHub repo: Deep Green

Integration with Twelve Labs

The search_trash function searches for videos containing specific types of trash, returning a list of such videos with key information about each:

def search_trash(query, API_KEY):

    data = {
        "query": query,
        "index_id": INDEX_ID,
        "search_options": ["visual"]
    }

    response = requests.post(f"{API_URL}/search", headers={"x-api-key": API_KEY}, json=data)
    response = response.json()

    results = []

    # Getting thumbnail and relevant data for each matching clip
    for i in range(len(response['data'])):
        score = response['data'][i]['score']
        video_id = response['data'][i]['video_id']
        video_location = Get_Video_Metadata(video_id, API_KEY)['Location Type']
        thumbnail_url = response['data'][i]['thumbnail_url']
        results.append({"score": score, "video_location": video_location, "video_id": video_id, "thumbnail_url": thumbnail_url})

    return results
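
A minimal usage sketch; the query string is illustrative:

results = search_trash("plastic bottles floating on the water surface", API_KEY)
for match in results:
    print(match["score"], match["video_location"], match["thumbnail_url"])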

The search_video_single function finds specific content within a single video:

def search_video_single(video_id, query, API_KEY):

    headers = {
        "accept": "application/json",
        "x-api-key": API_KEY,
        "Content-Type": "application/json"
    }

    data = {
        "query": query,
        "search_options": ["visual", "conversation", "text_in_video", "logo"],
        "threshold": "high",
        "filter": { "id": [video_id] },
        "index_id": INDEX_ID
    }

    response = requests.post(f"{API_URL}/search", headers=headers, json=data)
    # Parse the response once instead of re-parsing it for every field
    matches = response.json()['data']

    results = []

    for match in matches:
        results.append({
            "score": match['score'],
            "start_time": match['start'],
            "end_time": match['end']
        })

    return results

The classify_latest_video function classifies videos into specific environmental categories:

def classify_latest_video(id, file_name, API_KEY):
    classify_url = f"{API_URL}/classify"
    file_name = file_name.split('.')[0]

    video_list = get_video_list(API_KEY)
    time_initiated = time.time()

    # waiting_for_upload stays True until the uploaded file appears in the index
    waiting_for_upload = True
    video_index = 0
    for i, next_video in enumerate(video_list):
        if next_video['metadata']['filename'] == file_name:
            waiting_for_upload = False
            video_index = i
            break

    # Poll the index every 60 seconds until the video finishes indexing
    while waiting_for_upload:
        time.sleep(60)
        video_list = get_video_list(API_KEY)
        for i, next_video in enumerate(video_list):
            print(next_video['metadata']['filename'], "   ", file_name)
            if next_video['metadata']['filename'] == file_name:
                waiting_for_upload = False
                video_index = i
                break

    id = video_list[video_index]["_id"]

    meta_url = f"{API_URL}/indexes/{INDEX_ID}/videos/{id}"

    print("\n\nStarting Metadata",time.time()-time_initiated,"\n\n", file=sys.stderr)
    payload = {
        "page_limit": 10,
        "include_clips": False,
        "threshold": {
            "min_video_score": 15,
            "min_clip_score": 15,
            "min_duration_ratio": 0.5
        },
        "show_detailed_score": False,
        "options": ["conversation"],
        "conversation_option": "semantic",
        "classes": [
            {
                "prompts": ["This video is taken in an urban enviorment", "This means a dense environment", "Lots of people, cars and buildings"],
                "options": ["visual"],
                "conversation_option": "semantic",
                "name": "Urban"
            },
            {
                "prompts": ["This video is taken in a suburban enviorment", "There should be buildings, roads", "Everything should be a lot more spread out", "The majority of the space should be developed"],
                "options": ["visual"],
                "conversation_option": "semantic",
                "name": "Suburban"
            },
            {
                "prompts": ["This video was taken in a rural enviorment", "There shouldn't be a ton of human development", "Buildings should be extremly spread out", "Should mostly be nature", "Very few humans around"],
                "options": ["visual"],
                "conversation_option": "semantic",
                "name": "Rural"
            }
        ],
        "video_ids": [id]
    }
    headers = {
        "accept": "application/json",
        "x-api-key": API_KEY,
        "Content-Type": "application/json"
    }

    response = requests.post(classify_url, json=payload, headers=headers)
    response = response.json()
    
    print(response, file=sys.stderr)
    video_class = response['data'][0]['classes'][0]['name']

    payload = { "metadata": { "Location Type": video_class } }
    headers = {
        "accept": "application/json",
        "x-api-key": API_KEY,
        "Content-Type": "application/json"
    }

    response = requests.put(meta_url, json=payload, headers=headers)
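
The "Location Type" value written here is the same metadata field that search_trash reads through Get_Video_Metadata. That helper is not shown in this excerpt; a minimal sketch of what it might look like, assuming the video retrieval endpoint returns the metadata object used elsewhere in this code, is:

def Get_Video_Metadata(video_id, API_KEY):
    # Retrieve a single video's record and return its metadata object,
    # which includes the "Location Type" field written by classify_latest_video
    url = f"{API_URL}/indexes/{INDEX_ID}/videos/{video_id}"
    response = requests.get(url, headers={"x-api-key": API_KEY})
    return response.json()["metadata"]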