Migration guide
This guide shows how to migrate your applications to version 1.3 of the API, which introduces significant improvements to the platform's video understanding capabilities, simplified modalities, and a streamlined endpoint structure.
SDK version compatibility with API versions:
- SDK 0.4.x and later supports API 1.3
- SDK 0.3.x and earlier supports API 1.2
Note
All changes described in this guide apply to the 1.3 version of the API. Features marked as deprecated, renamed, or removed are unavailable in v1.3 but remain functional in v1.2 until further notice. The only exception is the Classify API, which was deprecated in v1.2 on February 28, 2025.
What’s new in v1.3?
- Marengo 2.7: A new version of the Marengo video understanding model has been released. The 1.3 version of the API only supports Marengo 2.7. This new version improves accuracy and performance in the following areas:
- Multimodal processing that combines visual, audio, and text elements.
- Fine-grained image-to-video search: detect brand logos, text, and small objects (as small as 10% of the video frame).
- Improved motion search.
- Counting capabilities.
- More nuanced audio comprehension: music, lyrics, sound, and silence.
- Simplified modalities:
  - `visual`: includes objects, actions, text (OCR), and logos.
  - `audio`: includes speech, music, and ambient sounds.
  - `conversation` has been deprecated. `text_in_video` and `logo` are now part of `visual`.
- Streamlined endpoint structure: Several endpoints and parameters have been deprecated, removed, or renamed.
Note
This guide presents the changes to the API. Since the SDKs reflect the structure of the API, review the Migration examples section below and the relevant SDK reference sections to understand how these changes have been implemented.
Breaking changes
This section presents the changes that require updates to your code and includes the following subsections:
- Global changes that affect multiple endpoints
- Changes organized by endpoint and functionality (example: upload videos, manage indexes, etc.)
In the sections below, see the Required Action column for each change, then use the corresponding example in the Migration examples section to update your code.
Global changes
Deprecated endpoints
Upload videos
Manage indexes
Manage videos
Search
Generate text from video
Non-breaking changes
These changes add new functionality while maintaining backward compatibility.
Upload videos
Migration steps
Migrating to v1.3 involves two main steps:
- Update your integration
- Update your code. Refer to the Migration examples section for details.
1. Update your integration
Choose the appropriate method based on how you interact with the TwelveLabs API:
- Official SDKs: Install version 0.4.x or later.
- HTTP client: Update your base URL, as shown in the sketch below.
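A minimal sketch of both paths, assuming the Python SDK is published as the `twelvelabs` package and that the v1.3 base URL follows the same pattern as v1.2; verify both against the official documentation:

```python
# Official SDK: upgrade to a 1.3-compatible release (0.4.x or later).
#   pip install --upgrade "twelvelabs>=0.4"

# HTTP client: swap the version segment of the base URL.
OLD_BASE_URL = "https://api.twelvelabs.io/v1.2"  # legacy
NEW_BASE_URL = "https://api.twelvelabs.io/v1.3"  # current
```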
2. Migration examples
Below are examples showing how to update your code for key breaking changes. Choose the examples matching your integration type.
Create indexes
Creating an index in version 1.3 includes the following key changes (a sketch of the updated request follows the list):
- Renamed parameters: The parameters that previously began with `engine*` have been renamed to `model*`.
- Simplified modalities: The previous modalities of [`visual`, `conversation`, `text_in_video`, `logo`] have been simplified to [`visual`, `audio`].
- Marengo version update: Use “marengo2.7” instead of “marengo2.6”.
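Below is a minimal sketch using Python and `requests`. The endpoint path and the exact field names (`index_name`, `models`, `model_name`, `model_options`) are assumptions inferred from the renames above; check the API reference for the authoritative schema.

```python
import os
import requests

API_KEY = os.environ["TWELVE_LABS_API_KEY"]  # hypothetical environment variable
BASE_URL = "https://api.twelvelabs.io/v1.3"

# v1.2 used engine-prefixed parameters and four modalities;
# v1.3 uses model-prefixed parameters and two modalities.
payload = {
    "index_name": "my-index",
    "models": [
        {
            "model_name": "marengo2.7",            # was "marengo2.6"
            "model_options": ["visual", "audio"],  # was ["visual", "conversation", "text_in_video", "logo"]
        }
    ],
}

resp = requests.post(f"{BASE_URL}/indexes", json=payload, headers={"x-api-key": API_KEY})
resp.raise_for_status()
print(resp.json())  # the new index ID is in the response body
```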
Perform a search request
Performing a search request includes the following key changes (a sketch of the updated request follows the list):
- Simplified modalities: The previous modalities of [`visual`, `conversation`, `text_in_video`, `logo`] have been simplified to [`visual`, `audio`].
- Deprecated parameter: The `conversation_option` parameter has been deprecated.
- Streamlined response: The `metadata` and `modules` fields in the response have been deprecated.
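A hedged sketch of a v1.3 search request follows. The form field names (`index_id`, `query_text`, `search_options`) and the response layout are assumptions; consult the API reference for the exact contract.

```python
import os
import requests

API_KEY = os.environ["TWELVE_LABS_API_KEY"]  # hypothetical environment variable
BASE_URL = "https://api.twelvelabs.io/v1.3"

# search_options is now limited to "visual" and "audio"; conversation_option
# is gone, as are the metadata/modules response fields.
resp = requests.post(
    f"{BASE_URL}/search",
    headers={"x-api-key": API_KEY},
    data={
        "index_id": "<YOUR_INDEX_ID>",
        "query_text": "people unboxing a new phone",
        "search_options": ["visual", "audio"],
    },
)
resp.raise_for_status()
for clip in resp.json().get("data", []):
    print(clip.get("video_id"), clip.get("start"), clip.get("end"), clip.get("score"))
```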
Create embeddings
Creating embeddings includes the following key changes:
- Marengo version update: Use “Marengo-retrieval-2.7” instead of “Marengo-retrieval-2.6”.
- Renamed parameters: The parameters that previously began with `engine*` have been renamed to `model*`.
The following example creates a text embedding, but the principles demonstrated are similar for image, audio, and video embeddings:
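The sketch below uses Python and `requests`; the `/embed` path and the `model_name` and `text` field names are assumptions inferred from the renames above.

```python
import os
import requests

API_KEY = os.environ["TWELVE_LABS_API_KEY"]  # hypothetical environment variable
BASE_URL = "https://api.twelvelabs.io/v1.3"

# Model name updated: "Marengo-retrieval-2.7" replaces "Marengo-retrieval-2.6",
# and the engine-prefixed parameter is now model_name.
resp = requests.post(
    f"{BASE_URL}/embed",
    headers={"x-api-key": API_KEY},
    data={
        "model_name": "Marengo-retrieval-2.7",
        "text": "A person assembling furniture in a bright room",
    },
)
resp.raise_for_status()
print(resp.json())  # the embedding vector is returned in the response body
```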
Use Pegasus to classify videos
The Pegasus video understanding model offers flexible video classification through its text generation capabilities. You can use established category systems such as YouTube video categories or the IAB Tech Lab Content Taxonomy. You can also define custom categories for your specific needs.
The example below classifies a video based on YouTube’s video categories:
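A sketch of this approach, assuming an open-ended text generation endpoint that accepts a `video_id` and a `prompt`; the `/generate` path and field names are assumptions, so confirm them against the API reference.

```python
import os
import requests

API_KEY = os.environ["TWELVE_LABS_API_KEY"]  # hypothetical environment variable
BASE_URL = "https://api.twelvelabs.io/v1.3"

# Prompt Pegasus to pick exactly one YouTube-style category.
PROMPT = (
    "Classify this video into exactly one of the following YouTube categories: "
    "Autos & Vehicles, Comedy, Education, Entertainment, Gaming, Howto & Style, "
    "Music, News & Politics, Pets & Animals, Science & Technology, Sports, "
    "Travel & Events. Respond with the category name only."
)

resp = requests.post(
    f"{BASE_URL}/generate",  # assumed open-ended generation endpoint
    headers={"x-api-key": API_KEY},
    json={"video_id": "<YOUR_VIDEO_ID>", "prompt": PROMPT},
)
resp.raise_for_status()
print(resp.json())  # the generated category label is in the response body
```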
Detect logos
You can search for logos using text or image queries:
- Text queries: For logos that include text (example: Nike).
- Image queries: For logos without text (example: Apple’s apple symbol).
The following example searches for the Nike logo using a text query:
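A minimal sketch, with the same assumed search field names as above:

```python
import os
import requests

API_KEY = os.environ["TWELVE_LABS_API_KEY"]  # hypothetical environment variable
BASE_URL = "https://api.twelvelabs.io/v1.3"

# Logo detection is part of the "visual" modality in v1.3.
resp = requests.post(
    f"{BASE_URL}/search",
    headers={"x-api-key": API_KEY},
    data={
        "index_id": "<YOUR_INDEX_ID>",
        "query_text": "Nike logo",
        "search_options": ["visual"],
    },
)
resp.raise_for_status()
print(resp.json())
```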
The following example searches for the Apple logo using an image query:
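A sketch of the image-query variant; the `query_media_type` and `query_media_file` field names are assumptions, so verify them against the API reference.

```python
import os
import requests

API_KEY = os.environ["TWELVE_LABS_API_KEY"]  # hypothetical environment variable
BASE_URL = "https://api.twelvelabs.io/v1.3"

# Image query: supply a reference image of the logo instead of query text.
with open("apple_logo.png", "rb") as image_file:
    resp = requests.post(
        f"{BASE_URL}/search",
        headers={"x-api-key": API_KEY},
        data={
            "index_id": "<YOUR_INDEX_ID>",
            "query_media_type": "image",   # assumed field name
            "search_options": ["visual"],
        },
        files={"query_media_file": image_file},  # assumed field name
    )
resp.raise_for_status()
print(resp.json())
```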
Search for text shown in videos
To search for text in videos, use text queries that target either on-screen text or spoken words in transcriptions rather than objects or concepts. The platform searches across both:
- Text shown on screen (such as titles, captions, or signs)
- Spoken words from audio transcriptions
Note that the platform may return both textual and visual matches. For example, searching for the word “smartphone” might return:
- Segments where “smartphone” appears as on-screen text.
- Segments where “smartphone” is spoken.
- Segments where smartphones are visible as objects.
The example below finds all the segments where the word “innovation” appears as on-screen text or as a spoken word in transcriptions:
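A minimal sketch, again assuming the search field names used in the earlier examples:

```python
import os
import requests

API_KEY = os.environ["TWELVE_LABS_API_KEY"]  # hypothetical environment variable
BASE_URL = "https://api.twelvelabs.io/v1.3"

# "visual" covers on-screen text (formerly text_in_video);
# "audio" covers spoken words in transcriptions (formerly conversation).
resp = requests.post(
    f"{BASE_URL}/search",
    headers={"x-api-key": API_KEY},
    data={
        "index_id": "<YOUR_INDEX_ID>",
        "query_text": "innovation",
        "search_options": ["visual", "audio"],
    },
)
resp.raise_for_status()
for clip in resp.json().get("data", []):
    print(clip.get("video_id"), clip.get("start"), clip.get("end"))
```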