Media and entertainment

The applications on this page demonstrate the capabilities of the TwelveLabs Video Understanding Platform in transforming how you interact with and consume digital content.

Ghool

Summary: Ghool is a trailer creation application that uses the TwelveLabs Video Understanding Platform to search, retrieve, and concatenate video clips based on specific prompts and quality metrics.

Description: The application uses both the Marengo and Pegasus video understanding models to process video content and create optimized video sequences with minimal user intervention. It retrieves relevant video clips based on user prompts. It then processes these videos by concatenating them and evaluating the transitions between different videos based on specific quality metrics, such as smoothness.

Ghool was created by Fazil Onuralp Adic and Oscar Chen and was one of the winners at the Cerebral Beach Hacks – LA Tech Week 2024 Kickoff Hackathon.

GitHub repo: cerebral_ghool

Integration with TwelveLabs

Ghool integrates with the TwelveLabs Video Understanding Platform to search, retrieve, and analyze video content based on specific prompts. The integration enables automated evaluation of video transitions and quality assessment without manual intervention.

The find_video function uses the TwelveLabs Python SDK to perform text queries against a specified index. It filters results by confidence level and evaluates each clip’s quality against user-defined criteria.

Python
def find_video(prompt, wanted, quality):
    page = client.search.query(index_id="66f1cde8163dbc55ba3bb220", query_text=prompt, options=["visual"])
    video_vec = []
    i = 0
    for clip in page.data:
        if clip.confidence == "high" and i < wanted + 4:
            i += 1
            video_dict = {"id": clip.video_id, "start": clip.start, "end": clip.end}
            video_quality = get_comment(quality, video_dict)
            # Cast the generated score to an integer so the sort below is
            # numeric rather than lexicographic (get_comment returns a string).
            video_dict["quality"] = int(video_quality)
            video_vec.append(video_dict)
    video_vec = sorted(video_vec, key=lambda x: x["quality"], reverse=True)[:wanted]
    return video_vec
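
For example, a call like the following (the prompt and quality criterion are illustrative) would return the three highest-scoring clips:

Python
# Hypothetical usage: retrieve the three clips that best match the prompt,
# ranked by a generated "scariness" score.
scary_clips = find_video("a monster emerging from the dark", wanted=3, quality="scariness")
for clip in scary_clips:
    print(clip["id"], clip["start"], clip["end"], clip["quality"])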

The get_video function retrieves the HLS streaming URL for a video by its ID and uses MoviePy to extract the relevant segment.

Python
def get_video(video_info, save_file=None, duration=9999):
    url = f"https://api.twelvelabs.io/v1.2/indexes/66f1cde8163dbc55ba3bb220/videos/{video_info['id']}"
    response = requests.get(url, headers=headers)
    video_url = response.json()["hls"]["video_url"]
    start = video_info["start"]
    end = video_info["end"]
    if duration < video_info["end"] - video_info["start"]:
        end = video_info["start"] + duration
    clip = VideoFileClip(video_url).subclip(start, end)
    # Additional handling for saving or previewing
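
The omitted handling might, for example, write the segment to disk with MoviePy when a file name is given (a sketch, not the application's actual code):

Python
    # Hypothetical continuation of get_video: save the extracted segment
    # if requested, otherwise hand the clip back for previewing.
    if save_file is not None:
        clip.write_videofile(save_file)
    return clip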

The get_comment function uses the Generate API to evaluate video clips, producing numeric quality scores for criteria like “scariness” or “smoothness of transitions.”

Python
def get_comment(prompt, video, headers=headers):
    url = "https://api.twelvelabs.io/v1.2/generate"

    payload = {
        "temperature": 0.7,
        "prompt": f"only output a number from 0-100 evaluating the following clip on these measures: {prompt}",
        "stream": False,
        "video_id": video["id"]
    }

    response = requests.post(url, json=payload, headers=headers)

    return response.json()["data"]
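
Because the endpoint returns the generated text as a string, callers that sort or compare scores should convert it to a number first. A hypothetical call (the video ID is a placeholder):

Python
# Hypothetical usage: score a single clip against one criterion.
score = int(get_comment("scariness", {"id": "<VIDEO_ID>"}))
print(f"Scariness: {score}/100")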

The compare_transition function creates and evaluates transitions between video clips by:

  • Concatenating pairs of videos
  • Uploading the concatenated videos to TwelveLabs
  • Analyzing transition quality using the get_comment function described above
  • Ranking the transitions based on smoothness

Python
def compare_transition(videos1, videos2, transition_name, wanted=2):
    combined_videos = []

    # Iterate over all combinations of videos1 and videos2
    for i, video1 in enumerate(videos1):
        for j, video2 in enumerate(videos2):
            concatenate_videos([video1, video2], f"{transition_name}{i}{j}.mp4")
            task = client_mine.task.create(index_id=upload_id, file=f"{transition_name}{i}{j}.mp4", language="en")
            task.wait_for_done(sleep_interval=10, callback=on_task_update)
            # Cast to int so the sort below ranks transitions numerically.
            quality = int(get_comment("smoothness of the transition between the two stitched videos", {"id": task.video_id}, headers_mine))
            video_props = {"name": f"{transition_name}{i}{j}.mp4", "id": task.video_id, "quality": quality}
            combined_videos.append(video_props)

    combined_videos = sorted(combined_videos, key=lambda x: x["quality"], reverse=True)[:wanted]

    return combined_videos
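
Putting the pieces together, a trailer-building flow might chain these helpers roughly as follows (a sketch; the prompts are illustrative, and the exact clip format that compare_transition expects depends on the repository's concatenate_videos helper):

Python
# Hypothetical flow: search for two sets of clips, then rank the
# transitions between them by smoothness.
openings = find_video("a quiet suburban street at dusk", wanted=2, quality="scariness")
reveals = find_video("a monster emerging from the dark", wanted=2, quality="scariness")
best_transitions = compare_transition(openings, reveals, "opening_to_reveal")
print(best_transitions[0]["name"])  # the smoothest stitched pair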

AI Sports Recap

Summary: AI Sports Recap is a Streamlit-based application that generates video highlights and textual summaries of sports press conferences from YouTube video links. The application utilizes the Pegasus video understanding engine, GPT-4o, and Docker to provide an efficient and user-friendly experience.

Description: The application streamlines the process of extracting essential information from sports press conferences. Users can input a YouTube video link and a specific query, and the application will generate relevant video highlights and a concise textual summary. This application is particularly useful for sports enthusiasts, journalists, and analysts who need to extract and share important information from lengthy press conferences quickly.

The application was developed by Prateek Chhikara, Omkar Masur, and Tanmay Rode, and won second place at the Multimodal AI Media & Entertainment Hackathon.

GitHub: sports-highlights

Integration with TwelveLabs

The code below initializes the TwelveLabs Python SDK and creates a new index with the Marengo and Pegasus video understanding engines enabled:

client = TwelveLabs(api_key=api_key)

engines = [
    {
        "name": "marengo2.6",
        "options": ["visual", "conversation", "text_in_video", "logo"]
    },
    {
        "name": "pegasus1",
        "options": ["visual", "conversation"]
    }
]

index = client.index.create(
    name="tlabs2",
    engines=engines,
    addons=["thumbnail"]  # Optional
)
print(f"A new index has been created: id={index.id} name={index.name} engines={index.engines}")

The upload_video function uploads a YouTube video to the platform by creating an external provider task:

def upload_video(index_id, video_url, transcription_url=None):
    print("INSIDE UPLOAD VIDEO FUNCTION")
    task = client.task.external_provider(
        index_id=index_id,
        url=video_url
    )

    print(f"Task id={task.id}")

    return task.id
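
A hypothetical call, reusing the index created above and a placeholder URL:

# Hypothetical usage: index a press conference directly from YouTube.
task_id = upload_video(index.id, "https://www.youtube.com/watch?v=<VIDEO>")
print(f"Upload task started: {task_id}")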

The get_transcript function generates the transcript for a specified video. For each segment, the function returns the text, start times, and end times in separate lists:

def get_transcript(index_id, video_id):
    print("INSIDE GET TRANSCRIPT FUNCTION")
    transcriptions = client.index.video.transcription(
        index_id=index_id,
        id=video_id
    )

    transcription_list = []
    start_points = []
    end_points = []

    for transcription in transcriptions:
        print(
            f"value={transcription.value} start={transcription.start} end={transcription.end}"
        )

        transcription_list.append(transcription.value)
        start_points.append(transcription.start)
        end_points.append(transcription.end)

    return transcription_list, start_points, end_points
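
Once the upload task completes, the transcript can be retrieved and printed segment by segment (a sketch; the video ID is a placeholder for the one returned by the finished task):

# Hypothetical usage: print the transcript as "start-end: text" lines.
texts, starts, ends = get_transcript(index.id, "<VIDEO_ID>")
for text, start, end in zip(texts, starts, ends):
    print(f"{start:.1f}-{end:.1f}: {text}")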

ThirteenLabs Smart AI Editor

Summary: ThirteenLabs Smart AI Editor uses artificial intelligence to process videos based on user prompts. It creates highlight reels, generates descriptions, and dubs in multiple languages.

Description: The application utilizes AI models from TwelveLabs, Gemini, and ElevenLabs to deliver high-quality video editing and transcription services. It offers a user-friendly interface built with Gradio.

The application was developed by Dylan Ler and won first place at the Multimodal AI Media & Entertainment Hackathon.

Integration with TwelveLabs

The get_transcript function retrieves the transcript of a video using the TwelveLabs Python SDK. It returns the text, start time, and end time for each segment as a list of dictionaries.

def get_transcript(video_file_name, video_id_input, which_index):
    video_id = get_video_id(video_file_name)
    if video_id is None or video_id_input != "":
        video_id = video_id_input
    client = TwelveLabs(api_key="YOUR_API_KEY")
    transcriptions = client.index.video.transcription(index_id="INDEX_ID", id=f"{video_id}")
    output = []
    for transcription in transcriptions:
        output.append({"transcription": transcription.value, "start_time": transcription.start, "end_time": transcription.end})
    return output
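
A hypothetical invocation (the file name and index argument are placeholders):

# Hypothetical usage: fetch segments for a previously indexed video.
segments = get_transcript("match_recap.mp4", "", "default")
for segment in segments:
    print(segment["start_time"], segment["end_time"], segment["transcription"])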

Cactus

Summary: Cactus is a content generation application that uses the TwelveLabs Video Understanding Platform to automatically transform long-form YouTube videos into engaging short-form reels. The application democratizes content creation, making it accessible and affordable for all creators, regardless of their resources.

Description: Cactus addresses the challenge of time-consuming video editing by bridging the gap between long-form and short-form content creation. The platform analyzes video content, identifies the most engaging moments, and compiles them into optimized highlight reels tailored for various social media platforms.

Cactus offers several key features and benefits for content creators:

  • Saves time by automating the editing process, allowing creators to focus on content creation.
  • Reduces costs by minimizing the need for professional editing services.
  • Enhances reach by enabling the quick production of more content.
  • Expands audiences by making content available across multiple platforms.

The application was developed by Saurabh Ghanekar, Noah Bergren, Christopher Kinoshita, and Shrutika Nikola.

GitHub: Cactus

Integration with TwelveLabs

The generate_segment_itinerary function invokes the POST method of the /generate endpoint to segment a video and create an itinerary based on activities and locations shown in the video. The function returns the segmented itinerary if successful or logs any errors encountered.

async def generate_segment_itinerary(video_id: str) -> str:
    url = f"{BASE_TWELVE_URL}/generate"
    payload = {
        "prompt": "Given the following video, segment the videos and provide corresponding timestamps based on the different activities and places that the subject does and visits so that the segmented videos can later be used to build an itinerary.\nMake the response concise & precise.",
        "video_id": f"{video_id}",
        "temperature": 0.4,
    }
    headers = {
        "accept": "application/json",
        "x-api-key": TWELVE_LABS_API_KEY,
        "Content-Type": "application/json",
    }

    logging.info("Generating segmented itinerary")
    async with aiohttp.ClientSession() as session:
        async with session.post(url, json=payload, headers=headers) as response:
            if response.status == 200:
                try:
                    result = await response.text()
                    segmented_itinerary = json.loads(result).get("data")
                    return segmented_itinerary
                except Exception as e:
                    logging.exception(e)
            else:
                logging.info(response.status)
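
Because the function is a coroutine, it can be driven with asyncio (a sketch; the video ID is a placeholder, and the module-level constants above are assumed to be defined):

import asyncio

# Hypothetical usage: segment an indexed travel vlog into an itinerary.
itinerary = asyncio.run(generate_segment_itinerary("<VIDEO_ID>"))
print(itinerary)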

Hello Garfield

Summary: The “Hello Garfield” application provides an immersive virtual reality experience combining traditional movie theaters with cutting-edge technology. It features a personalized AI concierge, themed environments, and interactive elements to enhance the movie-watching experience.

Description: The application transforms how you engage with movies in a virtual space. Upon entering the virtual theater, you are greeted by an AI concierge. This concierge offers personalized movie recommendations based on your preferences and viewing history.

Key features include:

  • Video Q&A chatbot: The application uses the Generate API to allow you to ask questions about the movies you’re watching.
  • Immersive VR/MR environment: A realistic virtual movie theater with a large screen, created using VR/MR development platforms such as Unity and Unreal Engine.
  • AI concierge: A chatbot that provides personalized movie suggestions and enhances the user experience through friendly interaction.
  • Enhanced viewing experience: The concierge suggests themed snacks, recipes, and merchandise related to the chosen movie, creating a more immersive and enjoyable viewing experience.
  • AR filters: You can “try on” costumes from your favorite films and decorate your virtual spaces using augmented reality technology.
  • Community interaction: A shared virtual theater space that allows you to connect with other film enthusiasts, fostering a sense of community.

The application was developed by Lauren Descher, Dulce Baerga, and Catherine Rhee.

GitHub: aila-hack


Integration with TwelveLabs

The code below handles different types of requests:

  1. It checks the type of request and prepares appropriate data for each.
  2. For each type of request, it constructs a data object with a specific video ID and other relevant parameters.
  3. It sends a POST request to the /gist or /summarize endpoints.

if (event.request === "gist") {
    // GIST REQUESTED: the video ID corresponds to the Garfield trailer
    data = {
        "video_id": "666581dbd22b3a3c97bf1d57",
        "types": [
            "title",
            "hashtag",
            "topic"
        ]
    };
    response = await fetch(baseUrl + "/gist", {
        method: "POST",
        headers: { "x-api-key": apiKey, "Content-Type": "application/json" },
        body: JSON.stringify(data)
    });
}
else if (event.request === "summary") {
    // SUMMARY REQUESTED
    data = {
        "video_id": "666581dbd22b3a3c97bf1d57",
        "type": "summary"
    };
    response = await fetch(baseUrl + "/summarize", {
        method: "POST",
        headers: { "x-api-key": apiKey, "Content-Type": "application/json" },
        body: JSON.stringify(data)
    });
}
else if (event.request === "chapters") {
    data = {
        "video_id": "666581dbd22b3a3c97bf1d57",
        "type": "chapter"
    };
    response = await fetch(baseUrl + "/summarize", {
        method: "POST",
        headers: { "x-api-key": apiKey, "Content-Type": "application/json" },
        body: JSON.stringify(data)
    });
}
else if (event.request === "highlights") {
    data = {
        "video_id": "666581dbd22b3a3c97bf1d57",
        "type": "highlight",
        "prompt": "tell me about food\n"
    };
    response = await fetch(baseUrl + "/summarize", {
        method: "POST",
        headers: { "x-api-key": apiKey, "Content-Type": "application/json" },
        body: JSON.stringify(data)
    });
}

Sports Recap

Summary: Sports Recap is a Next.js-based application that generates video highlights and summaries from sports press conferences.

Description: The application is designed to transform lengthy sports press conferences into concise, engaging highlight reels. The application utilizes the Generate API to create relevant highlights based on user-specified criteria.

The application was developed by Daniel Jacobs, Yurko Turskiy, Suxu Li, and Melissa Regan.

GitHub: hackathone-challenge-3

Integration with TwelveLabs

The function below extracts data from the incoming request and invokes the POST method of the /summarize endpoint, passing the unique identifier of a video and a prompt to generate highlights:

import { NextRequest, NextResponse } from "next/server";

export async function POST(req: NextRequest) {
  const { projectId, videoId, prompt } = await req.json();
  const baseUrl = "https://api.twelvelabs.io/v1.2";
  const apiKey = process.env.TWELVELABS_API as string;
  const data = {
    prompt: prompt,
    video_id: videoId,
    type: "highlight",
  };

  // Send request
  const response = await fetch(baseUrl + "/summarize", {
    method: "POST",
    headers: { "x-api-key": apiKey, "Content-Type": "application/json" },
    body: JSON.stringify(data),
  });

  // Relay the generated highlights back to the client
  return NextResponse.json(await response.json());
}

The function below extracts the unique identifier of a video from the incoming request and invokes the POST method of the /summarize endpoint. It passes the video ID along with a predefined prompt to generate a summary that identifies the main character and lists key topics discussed:

import { NextRequest, NextResponse } from "next/server";

export async function POST(req: NextRequest) {
  const { videoId } = await req.json();

  // Variables
  const baseUrl = "https://api.twelvelabs.io/v1.2";
  const apiKey = process.env.TWELVELABS_API as string;
  const data = {
    video_id: videoId,
    type: "summary",
    prompt:
      "Specify the name of the main character of the video. Generate bullet list of key topics the main character is talking about",
  };

  // Send request
  const response = await fetch(baseUrl + "/summarize", {
    method: "POST",
    headers: { "x-api-key": apiKey, "Content-Type": "application/json" },
    body: JSON.stringify(data),
  });

  // Relay the generated summary back to the client
  return NextResponse.json(await response.json());
}