Rate limits control how many requests you can make within a specific time window. These limits ensure fair usage and maintain optimal performance for all users.
The platform uses multiple dimensions to measure your usage. Each dimension tracks different aspects of your requests. Understanding how rate limits work helps you optimize your usage and avoid errors.
The platform measures your usage across these dimensions:
Rate limits vary by endpoint based on computational requirements:
The platform checks your usage against each applicable dimension. You receive an error when you exceed any limit, even if other dimensions have remaining capacity.
For example, you might have remaining request capacity (RPM) but exceed your duration limit (DPH) when processing a long video. The platform returns an error based on the duration limit.
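The check described above can be sketched as a loop over dimensions that stops at the first one a request would exceed. This is only an illustration: the `Dimension` class, the `first_exceeded` helper, and all numeric values are assumptions made up for the example, not the platform's actual implementation or limits.

```python
from dataclasses import dataclass


@dataclass
class Dimension:
    name: str          # dimension label, e.g. "RPM" or "DPH"
    limit: float       # hypothetical capacity for the current window
    used: float = 0.0  # usage already counted in the current window


def first_exceeded(dimensions, cost):
    """Return the name of the first dimension the request would exceed, else None.

    `cost` maps a dimension name to what this request would consume.
    """
    for dim in dimensions:
        if dim.used + cost.get(dim.name, 0) > dim.limit:
            return dim.name
    return None


# Made-up numbers: plenty of request capacity left, little duration left.
dims = [
    Dimension("RPM", limit=50, used=10),
    Dimension("DPH", limit=3600, used=3000),
]
long_video = {"RPM": 1, "DPH": 900}

print(first_exceeded(dims, long_video))  # DPH
```

The request fits within the RPM budget, but it is still rejected because the DPH budget would be exceeded.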
When using an organization account, rate limits apply in aggregate to all the API keys in the organization.
Your plan and monthly spending determine your rate limits.
TwelveLabs offers three plans: Free, Developer, and Enterprise.
Rate limits vary by modality. Endpoints in the same category share a rate limit. All requests to these endpoints count toward the shared limit. An endpoint can have different rate limits based on the type of content you process.
POST /tasks, POST /indexes/{index-id}/indexed-assets
POST /assets
POST /embed, POST /embed/tasks, POST /embed-v2, POST /embed-v2/tasks
POST /embed, POST /embed-v2, POST /embed-v2/tasks
POST /embed, POST /embed-v2
POST /embed, POST /embed-v2
POST /embed-v2
POST /search
POST /analyze

New accounts start with the Free plan at no cost.
The Developer plan has three tiers. You start at Tier 1 when you add a payment method. Your tier increases automatically based on your monthly spending.
See the Pricing page to calculate your spending.
The Enterprise plan provides custom rate limits. Contact us to discuss your requirements.
Upgrades: The platform upgrades your account when you reach the spending requirement for a higher tier. The upgrade takes effect immediately. You receive an email notification when your tier changes.
Downgrades within the Developer plan: When your monthly spending falls below your current tier’s threshold, a one-month grace period applies before a downgrade.
Example:
Plan downgrades: When you downgrade from the Developer plan to the Free plan, the tier change takes effect immediately, with no grace period.
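The tier rules within the Developer plan can be modeled as a small state transition evaluated once per billing month. The sketch below is a simplification under several assumptions: the spending thresholds in `THRESHOLDS` are invented placeholders, the function names are hypothetical, upgrades are evaluated monthly rather than the moment the spending requirement is reached, and a downgrade after the grace period is assumed to land on the tier that the current spending qualifies for.

```python
# Hypothetical minimum monthly spend (USD) per Developer tier.
# The real thresholds are not stated here; these are placeholders.
THRESHOLDS = {1: 0, 2: 100, 3: 1000}


def tier_for_spend(spend):
    """Highest tier whose (hypothetical) threshold the spend meets."""
    return max(tier for tier, minimum in THRESHOLDS.items() if spend >= minimum)


def next_tier(current, spend, in_grace):
    """Return (tier, in_grace) after one billing month.

    current  -- tier held at the start of the month
    spend    -- spending during the month
    in_grace -- True if the previous month was already below the threshold
    """
    earned = tier_for_spend(spend)
    if earned >= current:
        return earned, False   # upgrade or hold; any grace period is cleared
    if not in_grace:
        return current, True   # first month below threshold: grace period
    return earned, False       # grace period used up: downgrade


print(next_tier(1, 150, False))  # (2, False)
print(next_tier(2, 50, False))   # (2, True)
print(next_tier(2, 50, True))    # (1, False)
```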
Use response headers to track your usage, implement best practices to handle errors, and upgrade your plan when you need higher limits.
Check your usage using HTTP response headers.
Each response includes headers for the active dimensions:
Responses include only the headers that apply to the endpoint you call.
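Because the set of headers differs by endpoint, client code is safer when it collects whatever rate limit headers are present instead of expecting a fixed list. The following is a minimal sketch; the `x-ratelimit-` prefix and the sample header name are assumptions made for illustration, so substitute the header names the platform actually returns.

```python
def rate_limit_headers(headers, prefix="x-ratelimit-"):
    """Collect headers whose names start with `prefix`, case-insensitively.

    `prefix` is a hypothetical default; adjust it to the real header names.
    """
    return {
        name.lower(): value
        for name, value in headers.items()
        if name.lower().startswith(prefix)
    }


# A made-up response header mapping standing in for a real HTTP response.
sample = {
    "Content-Type": "application/json",
    "X-RateLimit-Example-Remaining": "42",
}

print(rate_limit_headers(sample))  # {'x-ratelimit-example-remaining': '42'}
```

The same function works with the header mapping exposed by most HTTP clients, since it only needs an object with an `items()` method.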
These headers provide aggregate rate limit information. The platform maintains them for backward compatibility, but they will be removed in a future release:
The platform returns an HTTP 429 (Too Many Requests) error when you exceed a rate limit. The error response indicates which limit you exceeded and when it resets.
Follow these practices to handle rate limit errors:
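One widely used practice for HTTP 429 responses is to retry with exponential backoff and jitter. The sketch below is a generic illustration, not platform-specific code: `call_with_backoff` and its parameters are hypothetical names, and `send` stands for any function you supply that performs the request and returns a `(status, body)` pair.

```python
import random
import time


def call_with_backoff(send, max_attempts=5, base_delay=1.0, max_delay=60.0,
                      sleep=time.sleep):
    """Call send() until it returns a non-429 status or attempts run out.

    send  -- caller-supplied function returning (status_code, body)
    sleep -- injectable for testing; defaults to time.sleep
    """
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        if attempt == max_attempts - 1:
            break  # no point in waiting after the final attempt
        # Delay doubles on each retry, capped at max_delay.
        delay = min(max_delay, base_delay * 2 ** attempt)
        # Random jitter keeps many clients from retrying in lockstep.
        sleep(delay + random.uniform(0, delay / 2))
    raise RuntimeError(f"still rate limited after {max_attempts} attempts")


# Simulated server: two rate limit errors, then success.
responses = iter([(429, None), (429, None), (200, "ok")])
waits = []
result = call_with_backoff(lambda: next(responses), sleep=waits.append)
print(result)      # (200, 'ok')
print(len(waits))  # 2
```

If the error response tells you when the limit resets, waiting until that time is usually a better choice than a computed delay.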
Upgrade your plan or tier to increase rate limits: