Examples
This page shows examples of using the Marengo and Pegasus video understanding models. The screenshots in the sections below are from the Playground, but the same principles apply when you invoke the API programmatically.
Marengo
This section contains examples of using the Marengo video understanding model.
Steve Jobs introducing the iPhone
In the example screenshot below, the query was “How did Steve Jobs introduce the iPhone?”. The Marengo video understanding model used information found in the visual and conversation modalities to perform the following tasks:
- Visual recognition of a famous person (Steve Jobs)
- Joint speech and visual recognition to semantically search for the moment when Steve Jobs introduced the iPhone. Semantic search matches the intended meaning of the query rather than its literal words, so the platform identified the matching video fragments even if Steve Jobs didn’t explicitly say the words in the query.
To see this example in the Playground, ensure you’re logged in, and then open this URL in your browser.
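Because the same principles apply when you call the API from code, the sketch below shows one way such a search request could be assembled. It is only an illustration: the function name `build_search_request`, the field names `index_id`, `query`, and `modalities`, and the placeholder index ID are assumptions made for this example, not the platform's documented schema. The sketch builds the request body and does not send it anywhere.

```python
import json


def build_search_request(index_id, query, modalities):
    """Assemble a JSON-serializable search request.

    All field names here are hypothetical; check the API reference
    for the real parameter names before sending anything.
    """
    if not query.strip():
        raise ValueError("query must not be empty")
    if not modalities:
        raise ValueError("at least one modality is required")
    return {
        "index_id": index_id,
        "query": query,
        # De-duplicate and sort so the payload is deterministic.
        "modalities": sorted(set(modalities)),
    }


# The query and modalities mirror the Steve Jobs example above.
request = build_search_request(
    index_id="<YOUR_INDEX_ID>",  # placeholder, not a real identifier
    query="How did Steve Jobs introduce the iPhone?",
    modalities=["visual", "conversation"],
)
print(json.dumps(request, indent=2))
```

The query is passed as natural language, since semantic search works from the meaning of the sentence rather than from exact keywords.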
Polar bear holding a Coca-Cola bottle
In the example screenshot below, the query was “Polar bear holding a Coca-Cola bottle.” The Marengo video understanding model used information found in the visual and logo modalities to perform the following tasks:
- Recognition of a cartoon character (polar bear)
- Identification of an object (bottle)
- Detection of a specific brand logo (Coca-Cola)
- Identification of an action (polar bear holding a bottle)
To see this example in the Playground, ensure you’re logged in, and then open this URL in your browser.
Using different languages
This section provides examples of using different languages to perform search requests.
Spanish
In the example screenshot below, the query was “¿Cómo presentó Steve Jobs el iPhone?” (“How did Steve Jobs introduce the iPhone?”). The Marengo video understanding model used information from the visual and audio modalities.
To see this example in the Playground, ensure you’re logged in, and then open this URL in your browser.
Chinese
In the example screenshot below, the query was “猫做有趣的事情” (“Cats doing funny things.”). The Marengo video understanding model used information from the visual modality.
To see this example in the Playground, ensure you’re logged in, and then open this URL in your browser.
French
In the example screenshot below, the query was “J’ai trouvé la solution” (“I found the solution.”). The Marengo video understanding model used information from the visual modality (text displayed on the screen).
Pegasus
This section contains examples of using the Pegasus video understanding model.
Summarizing educational videos
In the example screenshot below, the platform has summarized an educational video using predefined templates without any customization:
To see this example in the Playground, ensure you’re logged in, and then open this URL in your browser.
Generating captions for social media
In the example screenshot below, the prompt instructs the platform to generate a caption for a social media post:
To see this example in the Playground, ensure you’re logged in, and then open this URL in your browser.
Writing police reports
In the example screenshot below, the prompt instructs the platform to write a police report using a specific template for a video showing a robbery:
To see this example in the Playground, ensure you’re logged in, and then open this URL in your browser.
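A prompt that asks for a report in a specific template can be composed in code before it is submitted. The following is a minimal sketch under stated assumptions: the helper `build_report_prompt` and the section names in `REPORT_SECTIONS` are hypothetical placeholders, not the template used in the screenshot, and the sketch only builds the prompt string without calling the platform.

```python
# Hypothetical section names, for illustration only.
REPORT_SECTIONS = """\
Incident type:
Date and time:
Location:
Description of events:
"""


def build_report_prompt(instruction, template):
    """Combine an instruction with a fill-in template into one prompt string."""
    # Keep only non-blank template lines; each becomes a required section.
    fields = [line.strip() for line in template.splitlines() if line.strip()]
    if not fields:
        raise ValueError("template must contain at least one field")
    bullet_list = "\n".join(f"- {field}" for field in fields)
    return f"{instruction.strip()}\n\nUse exactly these sections:\n{bullet_list}"


prompt = build_report_prompt(
    "Write a police report for the robbery shown in this video.",
    REPORT_SECTIONS,
)
print(prompt)
```

Listing the sections explicitly in the prompt is one way to make the output follow a fixed structure; the exact wording that works best is something to verify in the Playground.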
Using different languages
This section provides examples of using different languages to generate text from videos.
Spanish
The following example summarizes a video, indicating that the response should be in Spanish. Note that the prompt is in English, and the output is in Spanish.
To see this example in the Playground, ensure you’re logged in, and then open this URL in your browser.
French
The following example summarizes the three main takeaways from the video. Note that both the prompt and the output are in French.
To see this example in the Playground, ensure you’re logged in, and then open this URL in your browser.