Pegasus
Pegasus is a generative model for video-to-text generation. The current version is Pegasus 1.2.
Pegasus analyzes multiple modalities to generate contextually relevant text based on the content of your videos.
Key features
- Video-to-text generation: Creates detailed textual descriptions based on video content
- Extended processing capacity: Processes videos up to 1 hour in length
- Granular visual comprehension: Analyzes objects, on-screen text, and numerical content
- Temporal grounding: Accurately identifies timestamps of specific events
- Multimodal understanding: Combines visual, audio, and textual information for comprehensive analysis
Use cases
- Content summarization: Generate concise summaries of video content
- Detailed descriptions: Create comprehensive textual descriptions of visual scenes
- Timestamp identification: Answer questions about when specific events occur in videos
- Content analysis: Extract key information from video content for further processing
Input requirements
The specifications on this page reflect the maximum capabilities of the model. Your actual requirements depend on the upload method and operation you choose. For details about the available upload methods and the corresponding limits, see the Upload methods page.
Video file requirements
Audio and video stream durations must not differ by more than 0.5 seconds.
If your videos use other formats or you require different options, contact us at support@twelvelabs.io.
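The stream-duration rule above, together with the one-hour limit listed under Key features, can be checked locally before you upload. The following is a minimal sketch, not part of any Twelve Labs SDK: the function name `check_durations` and its constants are illustrative, and it assumes you have already measured both stream durations in seconds with a tool of your choice.

```python
# Hypothetical pre-upload check; names and structure are illustrative only.
MAX_DURATION_S = 60 * 60    # one-hour limit from "Key features"
MAX_STREAM_DRIFT_S = 0.5    # allowed audio/video duration difference


def check_durations(video_s: float, audio_s: float) -> list[str]:
    """Return a list of problems; an empty list means both checks pass."""
    problems = []
    # The longer of the two streams decides whether the file is too long.
    if max(video_s, audio_s) > MAX_DURATION_S:
        problems.append("longer than 1 hour")
    # Audio and video stream durations must stay within 0.5 seconds.
    if abs(video_s - audio_s) > MAX_STREAM_DRIFT_S:
        problems.append("audio and video durations differ by more than 0.5 s")
    return problems
```

Returning a list of problems rather than a single boolean lets a caller report every failed check at once. The actual limits that apply to you may be lower, depending on the upload method.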
Note
If you upload files using publicly accessible URLs, use direct links to raw video files that play without user interaction or custom video players (example: https://example.com/videos/sample-video.mp4). Video hosting platforms like YouTube and cloud storage sharing links are not supported.
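As a rough illustration of the note above, the sketch below screens a URL before you submit it. It is a heuristic only and rests on assumptions: the extension list and the host list are placeholders chosen for the example, not an official list of supported formats or blocked sites, and the function name `looks_like_direct_video_url` is hypothetical. It does not verify that the link actually serves a playable file.

```python
from urllib.parse import urlparse

# Placeholder values for illustration; adjust to the formats and hosts you use.
VIDEO_EXTENSIONS = (".mp4", ".mov", ".mkv", ".webm")
UNSUPPORTED_HOSTS = ("youtube.com", "youtu.be", "drive.google.com", "dropbox.com")


def looks_like_direct_video_url(url: str) -> bool:
    """Heuristic: True if the URL appears to point straight at a video file."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return False
    host = (parsed.hostname or "").lower()
    # Reject known hosting or sharing sites, including their subdomains.
    if any(host == h or host.endswith("." + h) for h in UNSUPPORTED_HOSTS):
        return False
    # A direct link normally ends in a video file extension.
    return parsed.path.lower().endswith(VIDEO_EXTENSIONS)
```

For instance, the example link from the note passes this check, while a YouTube watch page does not.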
Supported languages
Pegasus supports the following languages for processing visual and audio content, understanding prompts, and generating outputs:
- Full support: English
- Partial support: Arabic, Chinese, French, German, Italian, Japanese, Korean, Portuguese, Russian, Spanish, Thai, Vietnamese
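If your application accepts videos or prompts in several languages, it can help to encode the two support tiers above as data. This is a minimal sketch; the function name `support_level` and the "unlisted" label are illustrative choices, and the sets simply mirror the list above.

```python
# Language tiers copied from the list above.
FULL_SUPPORT = {"English"}
PARTIAL_SUPPORT = {
    "Arabic", "Chinese", "French", "German", "Italian", "Japanese",
    "Korean", "Portuguese", "Russian", "Spanish", "Thai", "Vietnamese",
}


def support_level(language: str) -> str:
    """Return "full", "partial", or "unlisted" for a language name."""
    name = language.strip().title()  # normalize " spanish " to "Spanish"
    if name in FULL_SUPPORT:
        return "full"
    if name in PARTIAL_SUPPORT:
        return "partial"
    return "unlisted"
```

A caller could use the result to warn users that output quality may vary for partially supported languages.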
Examples
Summarizing educational videos
This example prompt summarizes an educational video without any customization.
Generating captions for social media
This example prompt generates a caption for a social media post.
Table of contents
This example prompt creates a table of contents detailing the main sections.
Company-wide memo
This example prompt generates a company-wide memo.
Video annotations
This example prompt identifies and lists key visual elements, scene changes, and notable events, briefly describing each.
Video question answering
This example prompt identifies the key takeaways.
Timestamp breakdown
This example prompt lists all timestamps in an advertisement where a close-up of the product appears.
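When you post-process a response to a timestamp prompt, you may want the timestamps as numbers. The sketch below assumes the generated text contains timestamps written as `M:SS`, `MM:SS`, or `H:MM:SS`; that format is an assumption made for this example, not a documented output format, and `extract_timestamps` is a hypothetical helper.

```python
import re

# Matches M:SS, MM:SS, or H:MM:SS (assumed format, see the lead-in).
TIMESTAMP = re.compile(r"\b(?:(\d{1,2}):)?(\d{1,2}):(\d{2})\b")


def extract_timestamps(text: str) -> list[int]:
    """Return every timestamp found in the text, converted to seconds."""
    seconds = []
    for hours, minutes, secs in TIMESTAMP.findall(text):
        # The hours group is an empty string when the timestamp has no hours part.
        seconds.append(int(hours or 0) * 3600 + int(minutes) * 60 + int(secs))
    return seconds
```

If the format matters to your pipeline, consider stating the exact timestamp format you expect in the prompt itself.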
Police report
This example prompt creates a police report using a specific template for a video showing a robbery.
Using different languages
Spanish
This example prompt summarizes a video and indicates that the response should be in Spanish. Note that the prompt is in English, and the output is in Spanish.
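One simple way to request output in another language is to append an explicit instruction to an English prompt. The helper below is a hypothetical sketch: the name `with_output_language` and the wording of the instruction are assumptions, not a prescribed prompt format.

```python
def with_output_language(prompt: str, language: str) -> str:
    """Append an instruction asking for the response in the given language."""
    base = prompt.strip()
    # Make sure the original prompt ends as a sentence before appending.
    if not base.endswith((".", "?", "!")):
        base += "."
    return f"{base} Respond in {language}."


# Example: an English prompt that asks for Spanish output.
spanish_prompt = with_output_language("Summarize this video", "Spanish")
```

Here `spanish_prompt` is the text you would pass as the prompt. The prompt itself stays in English and only the requested output language changes.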
French
This example prompt summarizes the three main takeaways of a video. Note that the prompt and the output are in French.
Support
For support or feedback regarding Pegasus, contact support@twelvelabs.io.