Pegasus

Pegasus is a generative model for video-to-text generation. The current version is Pegasus 1.2.

Pegasus analyzes multiple modalities to generate contextually relevant text based on the content of your videos.

Key features

  • Video-to-text generation: Creates detailed textual descriptions based on video content
  • Extended processing capacity: Processes videos up to 1 hour in length
  • Granular visual comprehension: Analyzes objects, on-screen text, and numerical content
  • Temporal grounding: Accurately identifies timestamps of specific events
  • Multimodal understanding: Combines visual, audio, and textual information for comprehensive analysis

Use cases

  • Content summarization: Generate concise summaries of video content
  • Detailed descriptions: Create comprehensive textual descriptions of visual scenes
  • Timestamp identification: Answer questions about when specific events occur in videos
  • Content analysis: Extract key information from video content for further processing

Input requirements

The specifications on this page reflect the maximum capabilities of the model. Your actual requirements depend on the upload method and operation you choose. For details about the available upload methods and the corresponding limits, see the Upload methods page.

Video file requirements

| Version | Duration | File size | Resolution | Aspect ratio | Formats |
|---|---|---|---|---|---|
| Pegasus 1.2 | 4 sec - 1 hour (2 hours in a future release) | ≤ 2 GB | 360x360 - 5184x2160 | Between 1:1 and 1:2.4, or between 2.4:1 and 1:1. For example, you can use 1:1, 4:3, 4:5, 5:4, 16:9, 9:16, or 17:9. | FFmpeg supported |

Audio and video stream durations must not differ by more than 0.5 seconds.

If your videos are in other formats or you require different options, contact us at support@twelvelabs.io.
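The limits above can be checked locally before you upload. The following is a minimal sketch, not part of any official SDK: the `VideoInfo` container and the `check_video` function are hypothetical names, and the metadata is assumed to come from a tool of your choice (for example, FFmpeg's probing utilities). It also assumes that "2 GB" means 2 GiB and that the resolution range applies regardless of orientation (short side between 360 and 2160 pixels, long side at most 5184 pixels).

```python
from dataclasses import dataclass

# Limits taken from the video file requirements above.
MIN_DURATION_S = 4
MAX_DURATION_S = 60 * 60           # 1 hour
MAX_FILE_BYTES = 2 * 1024**3       # assumption: "2 GB" read as 2 GiB
MIN_SHORT_SIDE_PX = 360
MAX_SHORT_SIDE_PX = 2160           # assumption: applies to the shorter side
MAX_LONG_SIDE_PX = 5184            # assumption: applies to the longer side
MAX_STREAM_DRIFT_S = 0.5


@dataclass
class VideoInfo:
    """Hypothetical container for metadata you extracted yourself."""
    duration_s: float        # video stream duration in seconds
    audio_duration_s: float  # audio stream duration in seconds
    size_bytes: int
    width: int
    height: int


def check_video(info: VideoInfo) -> list[str]:
    """Return the names of the requirements the video appears to violate."""
    problems = []
    if not MIN_DURATION_S <= info.duration_s <= MAX_DURATION_S:
        problems.append("duration")
    if info.size_bytes > MAX_FILE_BYTES:
        problems.append("file size")
    short, long_ = sorted((info.width, info.height))
    if short < MIN_SHORT_SIDE_PX or short > MAX_SHORT_SIDE_PX or long_ > MAX_LONG_SIDE_PX:
        problems.append("resolution")
    # Aspect ratio must stay within 2.4:1 (or 1:2.4). Integer arithmetic:
    # long / short > 2.4 is equivalent to long * 5 > short * 12.
    if short <= 0 or long_ * 5 > short * 12:
        problems.append("aspect ratio")
    if abs(info.duration_s - info.audio_duration_s) > MAX_STREAM_DRIFT_S:
        problems.append("audio/video stream duration")
    return problems
```

An empty list means the file passed these local checks. It does not guarantee that the platform accepts the file, because the actual limits depend on the upload method and operation you choose.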

Note

If you upload files using publicly accessible URLs, use direct links to raw video files that play without user interaction or custom video players (example: https://example.com/videos/sample-video.mp4). Video hosting platforms like YouTube and cloud storage sharing links are not supported.
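As a rough illustration of the note above, the following sketch screens a URL before you submit it. It is a heuristic only: the function name `looks_like_direct_video_url`, the extension list, and the host list are assumptions chosen for illustration, not an official list of supported or unsupported sources.

```python
import posixpath
from urllib.parse import urlparse

# Illustrative values only; neither set is an official list.
VIDEO_EXTENSIONS = {".mp4", ".mov", ".mkv", ".webm", ".avi"}
UNSUPPORTED_HOSTS = {"youtube.com", "youtu.be", "drive.google.com", "dropbox.com"}


def looks_like_direct_video_url(url: str) -> bool:
    """Heuristically check that a URL points at a raw video file."""
    parts = urlparse(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    host = parts.hostname.lower()
    if host.startswith("www."):
        host = host[4:]
    # Reject hosting platforms and sharing services, including subdomains.
    if any(host == h or host.endswith("." + h) for h in UNSUPPORTED_HOSTS):
        return False
    # A direct link usually ends in a video file extension.
    extension = posixpath.splitext(parts.path)[1].lower()
    return extension in VIDEO_EXTENSIONS
```

A URL that passes this check may still fail if the server requires authentication or redirects to a player page, so treat the result as a first filter rather than a guarantee.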

Supported languages

Pegasus supports the following languages for processing visual and audio content, understanding prompts, and generating outputs:

  • Full support: English
  • Partial support: Arabic, Chinese, French, German, Italian, Japanese, Korean, Portuguese, Russian, Spanish, Thai, Vietnamese

Examples

Summarizing educational videos

This example prompt summarizes an educational video without any customization.

Summarize this video

Generating captions for social media

This example prompt generates a caption for a social media post.

Generate an attention-grabbing caption for a social media post. Keep it shorter than 200 characters.

Table of contents

This example prompt creates a table of contents detailing the main sections.

Provide a table of contents detailing the main sections of this video.

Company-wide memo

This example prompt generates a company-wide memo.

Generate a company-wide memo based on the content of this video.

Video annotations

This example prompt identifies and lists key visual elements, scene changes, and notable events, briefly describing each.

Identify and list key visual elements, scene changes, and notable events in the video, briefly describing each.

Video question answering

This example prompt identifies the key takeaways.

What are the key takeaways of this video?

Timestamp breakdown

This example prompt lists all timestamps in an advertisement where a close-up of the product appears.

Tell me all the timestamps in the advertisement where a closeup of the product appears.

Police report

This example prompt creates a police report for a video showing a robbery and asks for the exact time range in which the suspect appears.

Create a police report based on what happened in the video. Provide the exact time range where the suspect appears in the video.

Using different languages

Spanish

This example prompt summarizes a video and indicates that the response should be in Spanish. Note that the prompt is in English, and the output is in Spanish.

Write a summary in Spanish.

French

This example prompt summarizes the three main takeaways of a video. Note that the prompt and the output are in French.

Résumez les trois principaux points à retenir de cette vidéo

Support

For support or feedback regarding Pegasus, contact support@twelvelabs.io.