Media and entertainment

The applications on this page demonstrate the capabilities of the Twelve Labs Video Understanding Platform in transforming how you interact with and consume digital content.

AI Sports Recap

Summary: AI Sports Recap is a Streamlit-based application that generates video highlights and textual summaries of sports press conferences from YouTube video links. The application utilizes the Pegasus video understanding engine, GPT-4o, and Docker to provide an efficient and user-friendly experience.

Description: The application streamlines the process of extracting essential information from sports press conferences. Users can input a YouTube video link and a specific query, and the application will generate relevant video highlights and a concise textual summary. This application is particularly useful for sports enthusiasts, journalists, and analysts who need to extract and share important information from lengthy press conferences quickly.

The application was developed by Prateek Chhikara, Omkar Masur, and Tanmay Rode, and it won second place at the Multimodal AI Media & Entertainment Hackathon.

GitHub: sports-highlights

Integration with Twelve Labs

The code below initializes the Twelve Labs Python SDK and creates a new index with the Marengo and Pegasus video understanding engines enabled:

from twelvelabs import TwelveLabs

client = TwelveLabs(api_key=api_key)

engines = [
    {
        "name": "marengo2.6",
        "options": ["visual", "conversation", "text_in_video", "logo"]
    },
    {
        "name": "pegasus1",
        "options": ["visual", "conversation"]
    }
]

index = client.index.create(
    name="tlabs2",
    engines=engines,
    addons=["thumbnail"]  # Optional
)
print(f"A new index has been created: id={index.id} name={index.name} engines={index.engines}")

The upload_video function uploads a video from a YouTube URL to the index through the external-provider upload method:

def upload_video(index_id, video_url, transcription_url=None):
    print("INSIDE UPLOAD VIDEO FUNCTION")
    task = client.task.external_provider(
        index_id=index_id,
        url=video_url
    )
    print(f"Task id={task.id}")
    return task.id
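
Indexing runs asynchronously, so the returned task ID is typically polled until the video is ready for downstream calls. Here is a minimal polling sketch, assuming the SDK's client.task.retrieve method, the task's status and video_id fields, and a placeholder YouTube URL:

import time

def wait_for_indexing(task_id, sleep_interval=10):
    # Poll until the task reaches a terminal state (assumes client.task.retrieve
    # and a status field whose terminal values include "ready" and "failed").
    while True:
        task = client.task.retrieve(task_id)
        if task.status in ("ready", "failed"):
            return task
        time.sleep(sleep_interval)

# The URL below is a placeholder for a real press-conference video.
task = wait_for_indexing(upload_video(index.id, "https://www.youtube.com/watch?v=..."))
print(f"Indexing finished: status={task.status} video_id={task.video_id}")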

The get_transcript function retrieves the transcript of a specified video and returns the segment texts, start times, and end times in three separate lists:

def get_transcript(index_id, video_id):
    print("INSIDE GET TRANSCRIPT FUNCTION")
    transcriptions = client.index.video.transcription(
        index_id=index_id,
        id=video_id
    )

    transcription_list = []
    start_points = []
    end_points = []

    for transcription in transcriptions:
        print(
            f"value={transcription.value} start={transcription.start} end={transcription.end}"
        )

        transcription_list.append(transcription.value)
        start_points.append(transcription.start)
        end_points.append(transcription.end)

    return transcription_list, start_points, end_points
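
The three lists can then be stitched into a single time-stamped transcript, the kind of text the application can hand to GPT-4o for the textual summary. A small illustrative sketch of that stitching step (the index and video IDs come from the earlier snippets):

def format_transcript(texts, starts, ends):
    # Emit one "[start-end] text" line per transcript segment.
    return "\n".join(
        f"[{start:.1f}s-{end:.1f}s] {text}"
        for text, start, end in zip(texts, starts, ends)
    )

texts, starts, ends = get_transcript(index.id, task.video_id)
print(format_transcript(texts, starts, ends))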

ThirteenLabs Smart AI Editor

Summary: ThirteenLabs Smart AI Editor uses artificial intelligence to process videos based on user prompts. It creates highlight reels, generates descriptions, and dubs in multiple languages.

Description: The application utilizes AI models from TwelveLabs, Gemini, and ElevenLabs to deliver high-quality video editing and transcription services. It offers a user-friendly interface built with Gradio.

The application was developed by Dylan Ler, and it won first place at the Multimodal AI Media & Entertainment Hackathon.

Integration with Twelve Labs

The get_transcript function retrieves the transcript of a video using the Twelve Labs Python SDK. It returns the text, start time, and end time for each segment as a list of dictionaries.

def get_transcript(video_file_name, video_id_input, which_index):
    video_id = get_video_id(video_file_name)
    if video_id is None or video_id_input != "":
        video_id = video_id_input
    client = TwelveLabs(api_key="YOUR_API_KEY")
    transcriptions = client.index.video.transcription(index_id="INDEX_ID", id=f"{video_id}")
    output = []
    for transcription in transcriptions:
        output.append({
            "transcription": transcription.value,
            "start_time": transcription.start,
            "end_time": transcription.end,
        })
    return output
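
A usage sketch: because each dictionary carries its own start and end time, the transcript can be filtered down to the segments that overlap a given clip window. The file name, video ID, index selector, and window bounds below are placeholders:

def segments_in_window(segments, window_start, window_end):
    # Keep transcript segments that overlap [window_start, window_end] seconds.
    return [
        s for s in segments
        if s["end_time"] >= window_start and s["start_time"] <= window_end
    ]

# Placeholder arguments; get_transcript resolves the video ID internally.
segments = segments_in_window(get_transcript("video.mp4", "", "default"), 30.0, 90.0)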

Cactus

Summary: Cactus is a content generation application that uses functionalities of the Twelve Labs Video Understanding Platform to automatically transform long-form YouTube videos into engaging short-form reels. The application democratizes content creation, making it accessible and affordable for all creators, regardless of their resources.

Description: Cactus addresses the challenge of time-consuming video editing by bridging the gap between long-form and short-form content creation. The platform analyzes video content, identifies the most engaging moments, and compiles them into optimized highlight reels tailored for various social media platforms.

Cactus offers several key features and benefits for content creators:

  • Saves time by automating the editing process, allowing creators to focus on content creation.
  • Reduces costs by minimizing the need for professional editing services.
  • Enhances reach by enabling the quick production of more content.
  • Expands audiences by making it easy to distribute content across multiple platforms.

The application was developed by Saurabh Ghanekar, Noah Bergren, Christopher Kinoshita, and Shrutika Nikola.

GitHub: Cactus

Integration with Twelve Labs

The generate_segment_itinerary function invokes the POST method of the /generate endpoint to segment a video and create an itinerary based on activities and locations shown in the video. The function returns the segmented itinerary if successful or logs any errors encountered.

import json
import logging

import aiohttp

# BASE_TWELVE_URL and TWELVE_LABS_API_KEY are defined elsewhere in the project.

async def generate_segment_itinerary(video_id: str) -> str:
    url = f"{BASE_TWELVE_URL}/generate"
    payload = {
        "prompt": "Given the following video, segment the videos and provide corresponding timestamps based on the different activities and places that the subject does and visits so that the segmented videos can later be used to build an itinerary.\nMake the response concise & precise.",
        "video_id": f"{video_id}",
        "temperature": 0.4,
    }
    headers = {
        "accept": "application/json",
        "x-api-key": TWELVE_LABS_API_KEY,
        "Content-Type": "application/json",
    }

    logging.info("Generating segmented itinerary")
    async with aiohttp.ClientSession() as session:
        async with session.post(url, json=payload, headers=headers) as response:
            if response.status == 200:
                try:
                    result = await response.text()
                    segmented_itinerary = json.loads(result).get("data")
                    return segmented_itinerary
                except Exception as e:
                    logging.exception(e)
            else:
                logging.info(response.status)
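
Because generate_segment_itinerary is a coroutine, it has to be driven by an event loop. A usage sketch, with a placeholder video ID:

import asyncio

async def main():
    # "YOUR_VIDEO_ID" is a placeholder for the ID of an indexed video.
    itinerary = await generate_segment_itinerary("YOUR_VIDEO_ID")
    print(itinerary)

asyncio.run(main())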

Hello Garfield

Summary: The "Hello Garfield" application provides an immersive virtual reality experience combining traditional movie theaters with cutting-edge technology. It features a personalized AI concierge, themed environments, and interactive elements to enhance the movie-watching experience.

Description: The application transforms how you engage with movies in a virtual space. Upon entering the virtual theater, you are greeted by an AI concierge. This concierge offers personalized movie recommendations based on your preferences and viewing history.

Key features include:

  • Video Q&A chatbot: The application uses the Generate API to allow you to ask questions about the movies you're watching.
  • Immersive VR/MR environment: A realistic virtual movie theater with a large screen, created using VR/MR development platforms such as Unity and Unreal Engine.
  • AI concierge: A chatbot that provides personalized movie suggestions and enhances the user experience through friendly interaction.
  • Enhanced viewing experience: The concierge suggests themed snacks, recipes, and merchandise related to the chosen movie, creating a more immersive and enjoyable viewing experience.
  • AR filters: You can "try on" costumes from your favorite films and decorate your virtual spaces using augmented reality technology.
  • Community interaction: A shared virtual theater space that allows you to connect with other film enthusiasts, fostering a sense of community.
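
The Video Q&A chatbot in the list above maps onto the open-ended /generate endpoint, the same endpoint Cactus calls. Below is a minimal sketch using the requests library; the API key, video ID, and question are placeholders:

import requests

def ask_about_movie(video_id: str, question: str) -> str:
    # POST an open-ended question about an indexed video to /generate.
    response = requests.post(
        "https://api.twelvelabs.io/v1.2/generate",
        headers={"x-api-key": "YOUR_API_KEY", "Content-Type": "application/json"},
        json={"video_id": video_id, "prompt": question},
    )
    response.raise_for_status()
    return response.json()["data"]

print(ask_about_movie("YOUR_VIDEO_ID", "Who directed this movie?"))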

The application was developed by Lauren Descher, Dulce Baerga, and Catherine Rhee.

GitHub: aila-hack

Integration with Twelve Labs

The code below handles different types of requests:

  1. It checks the type of request and prepares appropriate data for each.
  2. For each type of request, it constructs a data object with a specific video ID and other relevant parameters.
  3. It sends a POST request to the /gist or /summarize endpoints.

  if (event.request === "gist") {
    data = {
      "video_id": "666581dbd22b3a3c97bf1d57",
      "types": [
        "title",
        "hashtag",
        "topic"
      ]
    };
    response = await fetch(baseUrl + "/gist", {
      method: "POST",
      headers: { "x-api-key": apiKey, "Content-Type": "application/json" },
      body: JSON.stringify(data)
    });
  }
  else if (event.request === "summary") {
    // SUMMARY REQUESTED
    data = {
      "video_id": "666581dbd22b3a3c97bf1d57",
      "type": "summary"
    };
    response = await fetch(baseUrl + "/summarize", {
      method: "POST",
      headers: { "x-api-key": apiKey, "Content-Type": "application/json" },
      body: JSON.stringify(data)
    });
  }
  else if (event.request === "chapters") {
    data = {
      "video_id": "666581dbd22b3a3c97bf1d57",
      "type": "chapter"
    };
    response = await fetch(baseUrl + "/summarize", {
      method: "POST",
      headers: { "x-api-key": apiKey, "Content-Type": "application/json" },
      body: JSON.stringify(data)
    });
  }
  else if (event.request === "highlights") {
    data = {
      "video_id": "666581dbd22b3a3c97bf1d57",
      "type": "highlight",
      "prompt": "tell me about food\n"
    };
    response = await fetch(baseUrl + "/summarize", {
      method: "POST",
      headers: { "x-api-key": apiKey, "Content-Type": "application/json" },
      body: JSON.stringify(data)
    });
  }
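
Note the division of labor between the two endpoints: /gist produces short-form metadata (the title, hashtag, and topic types), while /summarize returns a summary, chapters, or highlights depending on its type field, which is why the gist branch is the only one that does not call /summarize.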

Sports Recap

Summary: Sports Recap is a Next.js-based application that generates video highlights and summaries from sports press conferences.

Description: The application is designed to transform lengthy sports press conferences into concise, engaging highlight reels. It utilizes the Generate API to create relevant highlights based on user-specified criteria.

The application was developed by Daniel Jacobs, Yurko Turskiy, Suxu Li, and Melissa Regan.

GitHub: hackathone-challenge-3

Integration with Twelve Labs

The function below extracts data from the incoming request and invokes the POST method of the /summarize endpoint, passing the unique identifier of a video and a prompt to generate highlights:

import { NextRequest, NextResponse } from "next/server";

export async function POST(req: NextRequest) {
  const { projectId, videoId, prompt } = await req.json();
  const baseUrl = "https://api.twelvelabs.io/v1.2";
  const apiKey = process.env.TWELVELABS_API as string;
  const data = {
    prompt: prompt,
    video_id: videoId,
    type: "highlight",
  };

  // Send request
  const response = await fetch(baseUrl + "/summarize", {
    method: "POST",
    headers: { "x-api-key": apiKey, "Content-Type": "application/json" },
    body: JSON.stringify(data),
  });

  // Relay the generated highlights to the caller
  return NextResponse.json(await response.json());
}

The function below extracts the unique identifier of a video from the incoming request and invokes the POST method of the /summarize endpoint. It passes the video ID along with a predefined prompt to generate a summary that identifies the main character and lists key topics discussed:

import { NextRequest, NextResponse } from "next/server";

export async function POST(req: NextRequest) {
  const { videoId } = await req.json();

  // Variables
  const baseUrl = "https://api.twelvelabs.io/v1.2";
  const apiKey = process.env.TWELVELABS_API as string;
  const data = {
    video_id: videoId,
    type: "summary",
    prompt:
      "Specify the name of the main character of the video. Generate bullet list of key topics the main character is talking about",
  };

  // Send request
  const response = await fetch(baseUrl + "/summarize", {
    method: "POST",
    headers: { "x-api-key": apiKey, "Content-Type": "application/json" },
    body: JSON.stringify(data),
  });

  // Relay the generated summary to the caller
  return NextResponse.json(await response.json());
}