After your videos have been indexed by the Pegasus video language model, you can prompt the platform to analyze your video content and generate text outputs. Prompt engineering is the process of iteratively refining the instructions or questions you give the model to improve the quality, relevance, and precision of its responses. It makes the model more effective across use cases, from content creation and summarization to question answering, and offers the following benefits:

Improves accuracy: Tailored prompts produce more precise outputs by clearly specifying the task.
Reduces ambiguity: Well-designed prompts limit the model’s scope for interpretation, ensuring relevant responses.
Enhances efficiency: Effective prompts reduce the need for post-processing, saving time and resources.
Customizes outputs: Through prompt engineering, you can tailor the outputs to specific tones, styles, or formats, meeting diverse requirements.
Provides context: Prompts can provide essential context to the model, ensuring the output is relevant and appropriate for the given situation or domain.

Steps in prompt engineering

The typical steps involved in prompt engineering are as follows:

Define the objective: Identify what you need from the model, such as generating a description of a video segment or answering a question based on video content.
Craft the initial prompt: Based on your objective, develop the initial version of your prompt. Include all necessary details and context, and specify the expected output format.
Test and iterate: Analyze the output and refine your prompt based on the results. This step may involve several iterations.
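
The three steps above can be sketched as a small loop. This is a minimal illustration, not part of the platform: `refine_prompt`, `fake_analyze`, and the acceptance check are hypothetical names, and `fake_analyze` stands in for the real call that sends a prompt to the platform so that the sketch runs offline.

```python
from typing import Callable, List, Tuple


def refine_prompt(
    candidates: List[str],
    analyze: Callable[[str], str],
    is_acceptable: Callable[[str], bool],
) -> Tuple[str, str]:
    """Try prompt versions in order; return the first (prompt, output) that passes.

    If no version passes, the last attempt is returned so it can be inspected.
    """
    last_attempt = ("", "")
    for prompt in candidates:
        output = analyze(prompt)          # step 3: test the prompt
        last_attempt = (prompt, output)
        if is_acceptable(output):         # the objective from step 1, as a check
            return last_attempt
    return last_attempt


# Hypothetical stand-in for the platform call (assumption, not a real API).
def fake_analyze(prompt: str) -> str:
    if "numbered list" in prompt:
        return "1. Squats\n2. Lunges"
    return "The video shows squats and lunges."


prompts = [
    "Describe the workouts in this video.",                           # initial prompt
    "List the workouts mentioned in this video as a numbered list.",  # refined prompt
]
best_prompt, best_output = refine_prompt(
    prompts, fake_analyze, lambda out: out.startswith("1.")
)
print(best_prompt)
```

Writing the objective as an explicit check, even a rough one, makes it easier to tell whether a new prompt version is actually an improvement.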

Tips for writing better prompts

Crafting the perfect prompt is not achieved through a universal solution, as the effectiveness of a specific method can vary widely depending on the task at hand. However, the tips provided in this section can help enhance your prompt-writing skills. By experimenting with them, you can discover approaches that lead to more accurate and relevant responses.

Provide examples

Examples guide the model in generating the expected output, reducing ambiguity, and ensuring the platform generates relevant responses. The following example creates a police report based on surveillance footage. It includes an example of a similar report to guide the model’s response.

Write a police report based on this video using the following example:
Date: 11/01/2020
Location: San Francisco Police Department
Witness’s full name: John Smith
Reporter: Barbara Lim
On 11/01/2020 around 5 PM, I saw a suspect walking in a retail store on Haight Street…
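
If you reuse the same sample output across many videos, it can help to keep the instruction and the example separate and join them when you build the prompt. The sketch below assumes a hypothetical helper named `prompt_with_example`; the sample report is shortened to a few of its header fields for brevity.

```python
def prompt_with_example(instruction: str, example: str) -> str:
    """Join an instruction and a sample output into a single prompt string."""
    if not example.strip():
        raise ValueError("the example must not be empty")
    return f"{instruction.rstrip()}\n{example.strip()}"


# Shortened sample report used only to show the expected layout.
SAMPLE_REPORT = """
Date: 11/01/2020
Location: San Francisco Police Department
Reporter: Barbara Lim
"""

prompt = prompt_with_example(
    "Write a police report based on this video using the following example:",
    SAMPLE_REPORT,
)
print(prompt)
```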

Provide context

Providing context in prompts helps the platform understand your requirements, so that the generated response is tailored to your needs and less likely to be irrelevant. The following example states who will use the output and where: a reporter reading a script on the news.

Write a script for a reporter to read on the news about the event shown in this video.

Be specific

Specificity guides the model in producing highly relevant and targeted responses by aligning the output with your intentions. The following example names the exact aspect of the video to focus on (the workout routine) and the exact deliverable (a daily workout plan for this week). This helps the model understand the scope of the prompt and generate a targeted response.

Create a daily workout plan for this week based on the workout routine mentioned in this video.

Choose the type of prompt

Based on your requirements, differentiate between question-answering and description-based prompts, as each will guide the model’s focus differently. The example prompt below is phrased as a question and instructs the model to list the filming techniques used in a video.

What kind of filming techniques were used in this video?

Specify the desired style and format for the output

Clearly state the desired output’s length, style, and format (examples: JSON format, email) to ensure the output meets your requirements. The example below summarizes a video as an email, focusing on the five most important points.

Write a summary in the form of an email. In your summary, focus on the five most important points presented in the video.
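
When you ask for a machine-readable format such as JSON, it is worth checking that the response actually follows it. The sketch below is one way to do that under stated assumptions: the helper names, the `subject` and `body` keys, and the sample reply are all hypothetical, and the reply is hard-coded rather than returned by the platform.

```python
import json
from typing import Dict, List


def with_json_format(task: str, keys: List[str]) -> str:
    """Append an explicit JSON format requirement to a task prompt."""
    key_list = ", ".join(f'"{key}"' for key in keys)
    return (
        f"{task} Respond only with a JSON object that has exactly "
        f"these keys: {key_list}."
    )


def parse_reply(text: str, keys: List[str]) -> Dict[str, object]:
    """Parse a reply as JSON and verify that every requested key is present."""
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = [key for key in keys if key not in data]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    return data


KEYS = ["subject", "body"]
prompt = with_json_format(
    "Summarize this video as an email that covers the five most important points.",
    KEYS,
)

# Hard-coded sample reply standing in for the model's response.
sample_reply = '{"subject": "Video summary", "body": "Hello team, ..."}'
email = parse_reply(sample_reply, KEYS)
print(email["subject"])
```

If parsing fails, that is a signal to tighten the format instruction in the prompt and try again.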

Choose the language for the output

Specify if you want the output in a different language. The following example summarizes a video, indicating that the response should be in Spanish.

Write a summary in Spanish.

Be concise

Being concise helps the model focus on the essential information. This speeds up processing and increases the likelihood of generating precise, relevant responses.

Tune the temperature

Tuning the temperature controls the randomness of the text output. A lower temperature produces more deterministic output, which is ideal for tasks requiring high accuracy and specificity. In contrast, a higher temperature produces more varied, creative text, which is suitable for brainstorming or creative writing tasks. Experiment with this setting to find the balance that meets your objectives. For details, see the Tune the temperature page.
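
To build intuition for why a lower temperature is more deterministic, the sketch below applies temperature scaling to a softmax, which is how sampling temperature commonly works in language models. It is a general illustration with made-up scores; this guide does not describe how Pegasus applies the setting internally, so treat the mapping to the platform as an assumption.

```python
import math
from typing import List


def softmax_with_temperature(logits: List[float], temperature: float) -> List[float]:
    """Convert raw scores into probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


# Made-up scores for three candidate next words.
logits = [2.0, 1.0, 0.5]
for temperature in (0.2, 0.7, 1.5):
    probabilities = softmax_with_temperature(logits, temperature)
    print(temperature, [round(p, 3) for p in probabilities])
```

At the lowest temperature nearly all of the probability goes to the top-scoring candidate, while higher temperatures spread it across the alternatives.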

Practical examples of prompt engineering

From overview to detailed scene-by-scene descriptions

Objective: Generate increasingly detailed video descriptions by iteratively refining your prompts from a general summary to timestamped scene breakdowns.

Start with a general summary

Summarize this video in 2-3 sentences.

Request a structured breakdown with time ranges

Break down the video into main segments with their approximate start and end times.

Specify the exact format with precise timestamps

Provide a scene-by-scene description with precise timestamps in the format [start_time, end_time] in seconds. Keep each description to one sentence.
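
Asking for a fixed format such as [start_time, end_time] makes the response easy to process in code. The following sketch parses such lines with a regular expression. The function name and the sample response are hypothetical, and real responses may deviate from the requested format, so lines that do not match are skipped.

```python
import re
from typing import List, Tuple

# Matches "[start, end] description" where start and end are seconds.
SEGMENT_RE = re.compile(r"\[\s*(\d+(?:\.\d+)?)\s*,\s*(\d+(?:\.\d+)?)\s*\]\s*(.*)")


def parse_segments(text: str) -> List[Tuple[float, float, str]]:
    """Extract (start, end, description) tuples from a scene-by-scene response."""
    segments = []
    for line in text.splitlines():
        match = SEGMENT_RE.search(line)
        if match is None:
            continue  # not a segment line
        start, end = float(match.group(1)), float(match.group(2))
        if end < start:
            continue  # ignore inverted time ranges
        segments.append((start, end, match.group(3).strip()))
    return segments


# Hypothetical response text, written to follow the requested format.
sample_response = """[0, 12] A runner warms up on the track.
[12, 47.5] The race starts and the runners leave the blocks.
This line has no timestamps."""

for start, end, description in parse_segments(sample_response):
    print(f"{start:>6.1f} to {end:>6.1f}  {description}")
```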

Find segments with precise timestamps

Objective: Locate specific moments in a video by progressively narrowing your search criteria from broad segments to exact timestamps.

Search for relevant segments

Find all the replays in this video and list their timestamps.

Narrow down by adding specific criteria

Show me only the slow-motion replays with their timestamps.

Pinpoint the exact moment within a segment

At what timestamp does the slow-motion replay show Usain Bolt leaving the starting blocks?

Write recipes from videos

Objective: Extract a complete, formatted recipe from a cooking video by iteratively adding detail and structure to your prompts.

Extract the structure of the recipe

List the ingredients and steps for making macaroni and cheese from this video.

Add specificity

Provide the recipe with exact measurements for each ingredient and detailed step-by-step instructions.

Format as a complete recipe

Write a complete recipe for macaroni and cheese from this video. Include an ingredients list with measurements, numbered cooking steps, and any tips or techniques mentioned.

Write a workout plan

Objective: Transform video content into a structured workout plan by extracting information, organizing it, and formatting it for a specific output format.

Extract workout information

List the workouts mentioned in this video.

Add structure and detail

List the workouts mentioned in this video. Add a brief description of up to two short sentences for each workout.

Format the output as an email

List the workouts mentioned in this video. Add a brief description of up to two short sentences for each workout. Format your response as an email.
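
In this example each iteration keeps the previous prompt and appends one more requirement. A minimal sketch of that pattern, using a hypothetical helper named `build_iterations`, looks like this:

```python
from typing import List


def build_iterations(base: str, refinements: List[str]) -> List[str]:
    """Return the base prompt followed by versions that each add one refinement."""
    prompts = [base]
    for refinement in refinements:
        prompts.append(f"{prompts[-1]} {refinement}")
    return prompts


prompts = build_iterations(
    "List the workouts mentioned in this video.",
    [
        "Add a brief description of up to two short sentences for each workout.",
        "Format your response as an email.",
    ],
)
for number, prompt in enumerate(prompts, start=1):
    print(f"Iteration {number}: {prompt}")
```

Keeping every intermediate version makes it easy to compare outputs and to see which added requirement changed the result.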