Open-ended text
Use the /generate endpoint to generate open-ended text from videos when you need results that are more customizable than those provided by the /summarize endpoint. This endpoint can generate diverse results based on your prompts, including, but not limited to, tables of contents, action items, memos, reports, and comprehensive analyses.
Below are some examples of prompts tailored to generate specific content types:
Content type | Prompt example |
---|---|
Table of contents | Provide a table of contents detailing the main sections of this video. |
Action items | Identify and list all the action items assigned to each team member. |
Memo | Generate a company-wide memo based on the announcements made in the video. |
Police report | Write a police report based on this video using the following example: Date: 11/01/2020 Location: San Francisco Police Department Witnesser’s full name: John Smith Reporter: Barbara Lim On 11/01/2020 around 5 PM, I saw a suspect walking in a retail store on Height Street… |
Meeting minutes | Generate detailed meeting minutes from this video, including discussion points, decisions made, and follow-up actions assigned. |
Video annotations | Identify and list key visual elements, scene changes, and notable events in the video, briefly describing each. |
Video question answering | - What are the key takeaways of this video? - What is the creative approach of this video? |
For a description of each field in the request and response, see the API Reference > Generate open-ended texts page.
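If your application generates several content types, it can help to keep the prompts in one place. The following is a minimal sketch, not part of the SDK: the `PROMPTS` dictionary, its keys, and the `get_prompt` helper are hypothetical names introduced for illustration, and the prompt strings are copied from the table above.

```python
# Hypothetical prompt catalog; the keys and the helper name are illustrative only.
PROMPTS = {
    "table_of_contents": "Provide a table of contents detailing the main sections of this video.",
    "action_items": "Identify and list all the action items assigned to each team member.",
    "memo": "Generate a company-wide memo based on the announcements made in the video.",
    "video_annotations": (
        "Identify and list key visual elements, scene changes, and notable events "
        "in the video, briefly describing each."
    ),
}


def get_prompt(content_type: str) -> str:
    """Return the prompt for a content type, or raise ValueError if it is unknown."""
    try:
        return PROMPTS[content_type]
    except KeyError:
        known = ", ".join(sorted(PROMPTS))
        raise ValueError(f"Unknown content type {content_type!r}; expected one of: {known}") from None


print(get_prompt("memo"))
```

The returned string can then be passed as the prompt when you call the platform.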
Prerequisites
The examples in this guide assume the following:
- You’re familiar with the concepts that are described on the Platform overview page.
- You’ve already created an index and the Pegasus video understanding model is enabled for this index.
- You've uploaded a video, and the platform has finished indexing it.
Examples
When generating open-ended texts, the default behavior of the platform is to stream responses. This enables real-time processing of partial results, enhances the user experience with immediate feedback and significantly reduces the perceived latency.
For a description of each field in the request and response, see the API Reference > Open-ended texts page.
You can choose whether the model generates streaming responses or non-streaming responses. For details, see one of the sections below:
Streaming responses
For streaming responses, you must invoke the text_stream method of the generate object. The response consists of a stream of JSON objects, each on its own line, following the NDJSON format. Each object represents an event in the generation process. There are three event types:
- stream_start: Indicates the beginning of the stream. When you receive this event, initialize your processing logic.
  Example: { "event_type": "stream_start", "metadata": { "generation_id": "2f6d0bdd-aed8-47b1-8124-3c9d8006cdc9" } }
- text_generation: Contains a fragment of generated text. As text_generation events arrive, handle the text fragments based on your application's needs. This might involve displaying the text in real-time, analyzing it, or storing it for later use. Note that these fragments may be of varying lengths and are not guaranteed to align with word or sentence boundaries.
  Example: { "event_type": "text_generation", "text": "Dive into the delightful world" }
- stream_end: Indicates the end of the stream. When you receive this event, finalize your processing logic.
  Example: { "event_type": "stream_end", "metadata": { "generation_id": "2f6d0bdd-aed8-47b1-8124-3c9d8006cdc9" } }
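If you read the raw NDJSON stream yourself rather than through an SDK, the three event types can be handled with a small dispatch loop. The following is a minimal sketch that assumes you already have the response as an iterable of text lines; the `consume_events` function is a hypothetical helper, and the sample lines reuse the event examples above.

```python
import json
from typing import Iterable, Optional, Tuple


def consume_events(lines: Iterable[str]) -> Tuple[Optional[str], str]:
    """Process NDJSON event lines and return (generation_id, full_text)."""
    generation_id = None
    fragments = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # Skip blank lines between events.
        event = json.loads(line)
        event_type = event.get("event_type")
        if event_type == "stream_start":
            # Initialize processing state.
            generation_id = event.get("metadata", {}).get("generation_id")
            fragments = []
        elif event_type == "text_generation":
            # Fragments may split words or sentences, so join them without separators.
            fragments.append(event.get("text", ""))
        elif event_type == "stream_end":
            break  # Finalize processing.
    return generation_id, "".join(fragments)


sample_lines = [
    '{"event_type": "stream_start", "metadata": {"generation_id": "2f6d0bdd-aed8-47b1-8124-3c9d8006cdc9"}}',
    '{"event_type": "text_generation", "text": "Dive into the "}',
    '{"event_type": "text_generation", "text": "delightful world"}',
    '{"event_type": "stream_end", "metadata": {"generation_id": "2f6d0bdd-aed8-47b1-8124-3c9d8006cdc9"}}',
]
generation_id, full_text = consume_events(sample_lines)
print(generation_id)
print(full_text)
```

Unknown event types are ignored in this sketch, which keeps the loop tolerant of events not described here.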
To use streaming responses in your application:
- Start a stream by invoking the text_stream method of the generate object (textStream in JavaScript) with the following parameters:
  - video_id: A string representing the unique identifier of your video.
  - prompt: A string that guides the model on the desired format or content.
- Use a loop to iterate over the stream.
- Inside the loop, handle each text fragment as it arrives. This example prints each fragment to the standard output.
- (Optional) After the stream ends, use the aggregated_text field of the stream (textStream.aggregatedText in JavaScript) if you need the full generated text.
The example code below demonstrates using the SDKs to generate and process a streaming response. It starts a stream for a specified video and prompt, prints each text fragment as it arrives, and prints the complete aggregated text. Ensure you replace the placeholders surrounded by <> with your values.
from twelvelabs import TwelveLabs

client = TwelveLabs(api_key="<YOUR_API_KEY>")

# Start a stream for the specified video and prompt.
text_stream = client.generate.text_stream(
    video_id="<YOUR_VIDEO_ID>",
    prompt="<YOUR_PROMPT>"
)

# Print each text fragment as it arrives.
for text in text_stream:
    print(text)

print(f"Aggregated text: {text_stream.aggregated_text}")
import { TwelveLabs } from 'twelvelabs-js';

const client = new TwelveLabs({ apiKey: '<YOUR_API_KEY>' });

// Start a stream for the specified video and prompt.
const textStream = await client.generate.textStream({
  videoId: '<YOUR_VIDEO_ID>',
  prompt: '<YOUR_PROMPT>',
});

// Print each text fragment as it arrives.
for await (const text of textStream) {
  console.log(text);
}

console.log(`Aggregated text: ${textStream.aggregatedText}`);
The output should look similar to the following:
This
video charmingly captures the
whims
ical and playful nature of
cats engaging
in a variety of activities
,
from frolicking and
exploring
to moments of relaxation and
quirky
interactions with their environment.
It highlights their
curious behaviors and the
joy they bring to everyday
scenes.
Aggregated text: This video charmingly captures the whimsical and playful nature of cats engaging in a variety of activities, from frolicking and exploring to moments of relaxation and quirky interactions with their environment. It highlights their curious behaviors and the joy they bring to everyday scenes.
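Because fragments do not align with word or sentence boundaries, an application that displays or analyzes complete sentences needs to buffer them. The following is a minimal sketch of one way to do this; the `sentences` helper is hypothetical and uses a naive rule (a sentence ends at `.`, `!`, or `?` followed by whitespace), so abbreviations such as "e.g." would be split incorrectly.

```python
import re
from typing import Iterable, Iterator

# Naive sentence boundary: whitespace that follows ".", "!", or "?".
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def sentences(fragments: Iterable[str]) -> Iterator[str]:
    """Yield complete sentences from a stream of arbitrary text fragments."""
    buffer = ""
    for fragment in fragments:
        buffer += fragment
        parts = _SENTENCE_END.split(buffer)
        # Every part except the last is followed by a boundary, so it is complete.
        for sentence in parts[:-1]:
            yield sentence
        buffer = parts[-1]
    # Flush whatever remains when the stream ends.
    if buffer.strip():
        yield buffer.strip()


# Sample fragments; with the SDK you could pass the stream object instead.
sample_fragments = ["This video ", "captures cats", ". It high", "lights their joy."]
for sentence in sentences(sample_fragments):
    print(sentence)
```

With the Python example above, passing `text_stream` in place of `sample_fragments` should work the same way, since the stream yields text fragments when iterated.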
Non-streaming responses
For non-streaming responses, you must invoke the text method of the generate object. The following example generates a brief summary with a specific format by invoking the text method of the generate object with the following parameters:
- video_id: A string representing the unique identifier of the video for which you want to generate text.
- prompt: A string that guides the model on the desired format or content.
from twelvelabs import TwelveLabs

client = TwelveLabs(api_key="<YOUR_API_KEY>")

res = client.generate.text(
    video_id="<YOUR_VIDEO_ID>",
    prompt="I want to generate a description for my video with the following format: Title of the video, followed by a summary in 2-3 sentences, highlighting the main topics."
)

print(f"{res.data}")

import { TwelveLabs } from 'twelvelabs-js';

const client = new TwelveLabs({ apiKey: '<YOUR_API_KEY>' });

const text = await client.generate.text(
  '<YOUR_VIDEO_ID>',
  'I want to generate a description for my video with the following format: Title of the video, followed by a summary in 2-3 sentences, highlighting the main topics.',
);

console.log(`${text.data}`);
The output should be similar to the following one:
Title: A Summer Day in Minnesota: College Graduation, Sun, Shopping, and Pennyboarding
Summary: In this video, a woman shares her summer day in Minnesota after her college graduation. She vlogs about her temporary move back home, showing her childhood home and expressing her love for getting some sun. The video captures various activities, including applying sunscreen, discovering a foul smell in her car, a shopping haul from favorite stores, the preparation of a bread salad, and meeting up with a friend to go pennyboarding at a parking garage. It's a fun and eventful day filled with sunshine, shopping, and outdoor adventures.
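When a prompt requests a labeled format like the one above, you may want to split the generated text into fields. The following is a minimal sketch that assumes the output uses `Title:` and `Summary:` labels at the start of a line; the model is not guaranteed to follow the requested format exactly, and the `parse_title_summary` helper is hypothetical.

```python
from typing import Dict


def parse_title_summary(text: str) -> Dict[str, str]:
    """Split generated text into 'title' and 'summary' fields when the labels are present."""
    result: Dict[str, str] = {}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        for label in ("Title", "Summary"):
            prefix = label + ":"
            if stripped.startswith(prefix):
                current = label.lower()
                result[current] = stripped[len(prefix):].strip()
                break
        else:
            # No label on this line: treat it as a continuation of the current field.
            if current and stripped:
                result[current] = (result[current] + " " + stripped).strip()
    return result


sample_output = (
    "Title: A Summer Day in Minnesota\n"
    "Summary: In this video, a woman shares her summer day in Minnesota.\n"
    "It's a fun and eventful day."
)
parsed = parse_title_summary(sample_output)
print(parsed["title"])
print(parsed["summary"])
```

If a label is missing, the corresponding key is absent from the result, so callers should check for it before use.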
The following example generates a police report based on the provided template:
from twelvelabs import TwelveLabs

client = TwelveLabs(api_key="<YOUR_API_KEY>")

res = client.generate.text(
    video_id="<YOUR_VIDEO_ID>",
    prompt="Write a police report based on this video with the following example:\nDate: \n11/01/2020\nLocation: San Francisco Police Department\nWitnesser’s full name: John Smith\nReporter: Barbara Lim\n\nOn 11/01/2020 around 5 PM, I saw a suspect walking in a retail store on Height Street…"
)

print(f"{res.data}")

import { TwelveLabs } from 'twelvelabs-js';

const client = new TwelveLabs({ apiKey: '<YOUR_API_KEY>' });

const text = await client.generate.text(
  '<YOUR_VIDEO_ID>',
  'Write a police report based on this video with the following example:\nDate: \n11/01/2020\nLocation: San Francisco Police Department\nWitnesser’s full name: John Smith\nReporter: Barbara Lim\n\nOn 11/01/2020 around 5 PM, I saw a suspect walking in a retail store on Height Street…',
);

console.log(`${text.data}`);
The output should be similar to the following one:
Date: 11/01/2020
Location: San Francisco Police Department
Witness's full name: John Smith
Reporter: Barbara Lim
On 11/01/2020 around 5 PM, I, John Smith, witnessed a suspect walking in a retail store on
Height Street. The suspect was observed stealing items from the store, including an item
directly from the cash register. Two other individuals were also seen engaging in theft within
the store.
The video evidence obtained from the store's surveillance cameras clearly captures the suspect's
actions. The suspect was seen walking through the store and discreetly taking items without being
noticed by anyone. Additionally, the video shows two other individuals stealing multiple items
from the store before leaving.
One particular moment in the video shows a woman entering the camera's view, picking up a bottle
of alcohol from the shelf, and putting it inside her bag. This incident adds to the evidence of
theft within the store.
Based on the video footage and witness testimony, it is evident that multiple instances of theft
occurred within the retail store on Height Street. The stolen items include those taken directly
from the cash register, as well as various other items throughout the store.
We request further investigation into this matter to identify and apprehend the suspects involved
in these thefts. The video evidence should be analyzed thoroughly to assist in the identification
and prosecution of the individuals responsible.
Witnesser: John Smith
Reporter: Barbara Lim
The following example displays the key takeaways of a video:
from twelvelabs import TwelveLabs

client = TwelveLabs(api_key="<YOUR_API_KEY>")

res = client.generate.text(
    video_id="<YOUR_VIDEO_ID>",
    prompt="What are the key takeaways of this video?"
)

print(f"{res.data}")

import { TwelveLabs } from 'twelvelabs-js';

const client = new TwelveLabs({ apiKey: '<YOUR_API_KEY>' });

const text = await client.generate.text(
  '<YOUR_VIDEO_ID>',
  'What are the key takeaways of this video?',
);

console.log(`${text.data}`);
The output should be similar to the following one:
The key takeaways from the video are as follows:
Good posture is crucial for maintaining physical and mental health.
Poor posture can lead to discomfort, impaired body mechanics, and musculoskeletal issues.
Maintaining proper postural alignment is essential for overall physical well-being.
Posture can affect emotional state, sensitivity to pain, and overall balance.
Prolonged awkward positions and looking downwards while using electronic devices can lead to
musculoskeletal problems.
Proper spinal alignment and understanding the structure of the spine are important for preventing issues.
Babies develop more curves in their spine as their muscles strengthen, enabling them to stay upright.
Posture plays a role in reducing stress and maintaining alignment.
Good postural alignment is essential while sitting, especially for those working at a computer.
Tips for maintaining good posture include using ergonomic aids, wearing suitable footwear, and
keeping muscles and joints active.
Regular exercise and using muscles effectively support proper posture.
Consulting a physical therapist can provide guidance on proper postural alignment.
The following example identifies the creative approach of a video:
from twelvelabs import TwelveLabs

client = TwelveLabs(api_key="<YOUR_API_KEY>")

res = client.generate.text(
    video_id="<YOUR_VIDEO_ID>",
    prompt="What is the creative approach of this video?"
)

print(f"{res.data}")

import { TwelveLabs } from 'twelvelabs-js';

const client = new TwelveLabs({ apiKey: '<YOUR_API_KEY>' });

const text = await client.generate.text(
  '<YOUR_VIDEO_ID>',
  'What is the creative approach of this video?',
);

console.log(`${text.data}`);
The output should be similar to the following one:
The creative approach of the video is to showcase a "Joyful Journey" theme by featuring a man
exploring different locations and opening doors to reveal various settings. The video transitions
between scenes of a jungle, an industrial area, and a hillside with water falling from above.
It also includes animated scenes and people enjoying Coca-Cola drinks together. The advertisement
aims to convey a sense of joy and togetherness associated with drinking Coca-Cola.
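The question-answering examples above each send one prompt. To ask several questions about the same video, you can loop over them. The following is a minimal sketch: `ask_questions` is a hypothetical helper that takes the text-generation call as a function argument, and `fake_generate_text` is a stand-in used here so the sketch runs without an API key.

```python
from typing import Callable, Dict, Iterable


def ask_questions(
    generate_text: Callable[[str, str], str],
    video_id: str,
    questions: Iterable[str],
) -> Dict[str, str]:
    """Send each question as a separate prompt and map it to the generated answer."""
    return {question: generate_text(video_id, question) for question in questions}


# With the SDK client from the examples above, the callable could be (assumption):
#   lambda video_id, prompt: client.generate.text(video_id=video_id, prompt=prompt).data
def fake_generate_text(video_id: str, prompt: str) -> str:
    """Stand-in for the real call; returns a placeholder answer."""
    return f"[{video_id}] answer to: {prompt}"


answers = ask_questions(
    fake_generate_text,
    "<YOUR_VIDEO_ID>",
    [
        "What are the key takeaways of this video?",
        "What is the creative approach of this video?",
    ],
)
for question, answer in answers.items():
    print(f"{question}\n{answer}\n")
```

Each question results in a separate request, so the total time grows with the number of questions.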