Navigate to the section that best addresses your query. If you don’t find an answer to your question, please contact us.
This section answers frequently asked general questions.
We use positional encoding, a technique from the Transformer architecture that conveys the position of each token in the input sequence. Here, the tokens are the key scenes of the video. Positional encoding lets the model use sequential information while preserving the parallel processing capability of self-attention.
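To make the idea concrete, here is a minimal sketch of one common variant, the sinusoidal positional encoding from the original Transformer design. This is an illustration only: the platform's actual encoding scheme is not described here, and the function name, the 8-dimensional embeddings, and the four placeholder scenes are all hypothetical.

```python
import math


def sinusoidal_positional_encoding(num_positions: int, dim: int) -> list[list[float]]:
    """Return a num_positions x dim table of sinusoidal position encodings.

    Illustrative only: the sinusoidal variant is an assumption, not a
    description of the platform's implementation.
    """
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    table = []
    for pos in range(num_positions):
        row = []
        for i in range(0, dim, 2):
            # Each (sin, cos) pair uses a different wavelength.
            angle = pos / (10000 ** (i / dim))
            row.append(math.sin(angle))
            row.append(math.cos(angle))
        table.append(row)
    return table


# Hypothetical input: 4 key scenes, each with an 8-dimensional embedding.
scene_embeddings = [[0.0] * 8 for _ in range(4)]
encoding = sinusoidal_positional_encoding(4, 8)

# Adding the encoding element-wise tags each scene with its position,
# while all scenes can still be processed in parallel by self-attention.
encoded_scenes = [
    [e + p for e, p in zip(embedding, position)]
    for embedding, position in zip(scene_embeddings, encoding)
]
```

Because the position signal is added to each embedding up front, no recurrence is needed to recover scene order.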
Video hours measure the total duration of video you index. The limits depend on your plan.
Note the following about the Free plan:
For details about each plan, see the Pricing page. To increase your limits, upgrade to the Developer plan.
Indexing typically takes 30-40% of the video’s duration. Indexing time also depends on the number of concurrent indexing tasks, and delays can occur when too many tasks are processed at once. If you’re on the Free plan, consider upgrading to the Developer plan, which supports more concurrent tasks, for faster indexing. We also offer a dedicated cloud deployment option for enterprise customers. To discuss this option, contact us at sales@twelvelabs.io.
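As a rough planning aid, the 30-40% rule of thumb can be turned into a quick estimate. The helper below is a hypothetical sketch, not part of any SDK, and it ignores the queueing delays that concurrent tasks can add.

```python
def estimate_indexing_seconds(video_seconds: float) -> tuple[float, float]:
    """Return a (low, high) indexing-time estimate in seconds.

    Applies the typical 30-40% range only; real indexing time may be
    longer when many tasks are queued.
    """
    if video_seconds < 0:
        raise ValueError("video_seconds must be non-negative")
    return video_seconds * 30 / 100, video_seconds * 40 / 100


# A 10-minute (600-second) video.
low, high = estimate_indexing_seconds(600)
print(f"Expect roughly {low:.0f} to {high:.0f} seconds of indexing")
```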
Yes, the model analyzes both visual and audio information and learns correlations between visual objects or situations and the sounds that frequently accompany them.
Yes, the models support multiple languages. See the Supported languages page for details.
The platform utilizes a multimodal approach for video understanding. Instead of relying on textual input like traditional LLMs, the platform interprets visuals, sounds, and spoken words to deliver comprehensive and accurate results.
You can optionally integrate our video-to-text model (Pegasus) with your LLMs.
To change your login method (for example, from username/password to SSO or vice versa), contact our support team at support@twelvelabs.io to delete your current account, then create a new one with your preferred login method.
If you’re on the Developer plan, TwelveLabs provides invoices that include a detailed cost breakdown. You can view your invoice using one of the following methods:
If you’re on the Enterprise plan, TwelveLabs provides invoices without detailed cost breakdowns.
This section answers frequently asked questions related to the Embed API.
The Embed API and built-in search service offer different functionalities for working with visual content.
Embed API
Built-in search service
This section answers frequently asked questions related to the Analyze API.
The Analyze API employs our foundational Visual Language Model (VLM), which integrates a language encoder to extract multimodal data from videos and a decoder to generate concise text representations.
Yes, you must reindex videos using the Pegasus engine. See the Analyze videos and Pricing pages for details.
Pricing depends on your plan.
The video duration counts toward your plan quota. The number of segment definitions does not affect this quota.
You pay based on how much video you process and how many segment definitions you include. If you provide the start_time and end_time parameters, you pay for that time range only. Otherwise, you pay for the full video duration. Each segment definition multiplies the cost.
Examples:
Time ranges within individual segment definitions control which portions of the video are analyzed. The billable duration is always the full start_time–end_time span.
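The billing rule described above can be sketched as a small calculation. This is one reading of the text, not an official formula: the function name is hypothetical, "multiplies the cost" is interpreted as billable seconds times the number of segment definitions, and the time range is applied only when both start_time and end_time are provided. Convert the result to a price using the current rates.

```python
from typing import Optional


def billable_seconds(
    video_duration: float,
    num_segment_definitions: int,
    start_time: Optional[float] = None,
    end_time: Optional[float] = None,
) -> float:
    """Estimate billable video seconds for one request (hypothetical helper).

    Assumption: the start_time-end_time span is billed only when both
    values are given; otherwise the full video duration is billed.
    """
    if num_segment_definitions < 1:
        raise ValueError("at least one segment definition is required")
    if start_time is not None and end_time is not None:
        if not 0 <= start_time < end_time <= video_duration:
            raise ValueError("time range must lie within the video")
        span = end_time - start_time
    else:
        span = video_duration
    # Each segment definition multiplies the billable duration.
    return span * num_segment_definitions


# 10-minute video, 3 segment definitions, no time range: full duration billed.
full = billable_seconds(600, 3)

# Same video, restricted to the 60-180 second range.
ranged = billable_seconds(600, 3, start_time=60, end_time=180)
```

Under these assumptions, narrowing the start_time-end_time range is the main lever for reducing cost, because time ranges inside individual segment definitions do not change the billed span.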
For current rates, see the Pricing page.