Batch analysis

Requires Pegasus 1.5

The AnalyzeAsyncClient.BatchesClient class provides methods to submit up to 1,000 video analysis requests in a single call, monitor batch progress, retrieve per-item results, and cancel or delete batches.

When to use this class:

  • Run the same model and analysis settings across many videos.
  • Track a single batch instead of many individual analysis tasks.

Do not use this class for:

  • Single videos that require immediate results. Use the analyze method instead.
  • Background processing of a single video. Use the analyze_async.tasks.create method instead.

Model requirement

Batch analysis requires Pegasus 1.5. Set the model_name parameter to pegasus1.5.

Working with batches involves the following steps:

  1. Create the batch using the create method. The platform returns a batch identifier and one analysis task identifier per request.
  2. Monitor the batch using the retrieve method. Wait until the status reaches completed, canceled, or expired.
  3. Retrieve the per-item results using the results method.
  4. (Optional) Cancel the batch using the cancel method while the status is pending or processing.
  5. (Optional) Delete the batch using the delete method after the status reaches completed, canceled, or expired.

For schema details and status definitions, see The batch object page.
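The required steps above (create, monitor, retrieve results) can be sketched as one helper. This is a minimal sketch, not the SDK's own workflow: it takes the batches client as an argument (assumed, not confirmed here, to be reachable as client.analyze_async.batches), assumes status values compare equal to plain strings, and passes plain dicts where the SDK may expect BatchItemRequest objects. FakeBatches is a hypothetical offline stand-in so the sketch runs without network access.

```python
import time
from types import SimpleNamespace

TERMINAL_STATUSES = {"completed", "canceled", "expired"}


def run_batch(batches, requests, defaults=None, poll_seconds=30.0, sleep=time.sleep):
    """Create a batch, wait for a terminal status, and return its result items."""
    kwargs = {"model_name": "pegasus1.5", "analysis_mode": "general", "requests": requests}
    if defaults is not None:
        kwargs["defaults"] = defaults
    created = batches.create(**kwargs)  # step 1: create the batch
    # Step 2: poll until the batch is completed, canceled, or expired.
    while batches.retrieve(created.batch_id).status not in TERMINAL_STATUSES:
        sleep(poll_seconds)
    return list(batches.results(created.batch_id))  # step 3: per-item results


class FakeBatches:
    """Hypothetical stand-in for the real client, used only to exercise run_batch."""

    def __init__(self):
        self._statuses = iter(["pending", "processing", "completed"])

    def create(self, **kwargs):
        self.created_with = kwargs
        return SimpleNamespace(batch_id="batch_1")

    def retrieve(self, batch_id):
        return SimpleNamespace(status=next(self._statuses))

    def results(self, batch_id):
        yield SimpleNamespace(task_id="task_1", custom_id="clip-1", status="ready")


fake = FakeBatches()
requests = [{"video": {"type": "asset_id", "asset_id": "<asset-id>"}}]
items = run_batch(fake, requests, sleep=lambda _seconds: None)
```

With a real client, keep the default sleep and choose a polling interval that suits the size of the batch.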

Methods

Create a batch

Description: Use this method to submit many video analysis requests in a single call. Each request creates an analysis task. The response contains one batch identifier and one task identifier per request. Use the batch identifier to check progress and retrieve results.

Limits:

  • Up to 1,000 requests per batch.
  • Up to 2,000 total content hours per batch.
  • Up to 5 active batches per account.

Retention: Batches expire 24 hours after creation. You can retrieve results for 30 days after creation. If processing does not finish for some items in time, resubmit them in a new batch.
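If you have more videos than one batch allows, you can split them client-side before calling create. The sketch below groups assets so that each group stays within the request and content-hour limits listed above. It assumes you already know each video's duration in hours; the function name and the (asset_id, duration_hours) input shape are illustrative and not part of the SDK. It does not handle the limit of 5 active batches.

```python
MAX_REQUESTS_PER_BATCH = 1_000
MAX_HOURS_PER_BATCH = 2_000.0


def split_into_batches(videos, max_requests=MAX_REQUESTS_PER_BATCH, max_hours=MAX_HOURS_PER_BATCH):
    """Group (asset_id, duration_hours) pairs into lists that respect both limits."""
    groups, current, hours = [], [], 0.0
    for asset_id, duration in videos:
        if duration > max_hours:
            raise ValueError(f"{asset_id} alone exceeds {max_hours} content hours")
        # Start a new group when adding this video would break either limit.
        if current and (len(current) >= max_requests or hours + duration > max_hours):
            groups.append(current)
            current, hours = [], 0.0
        current.append(asset_id)
        hours += duration
    if current:
        groups.append(current)
    return groups


# 2,500 videos of 1.5 hours each: the request-count limit is reached first.
by_count = split_into_batches([(f"asset_{i}", 1.5) for i in range(2500)])
# 1,000 videos of 3 hours each: the content-hour limit is reached first.
by_hours = split_into_batches([(f"asset_{i}", 3.0) for i in range(1000)])
```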

Function signature and example:

def create(
    self,
    *,
    model_name: CreateAnalyzeBatchRequestModelName,
    analysis_mode: CreateAnalyzeBatchRequestAnalysisMode,
    requests: typing.Sequence[BatchItemRequest],
    defaults: typing.Optional[BatchDefaults] = OMIT,
    request_options: typing.Optional[RequestOptions] = None,
) -> CreateAnalyzeBatchResponse

Parameters:

Name | Type | Required | Description
model_name | CreateAnalyzeBatchRequestModelName | Yes | The video understanding model to use for every item in this batch. Batch analysis requires Pegasus 1.5. Value: "pegasus1.5".
analysis_mode | CreateAnalyzeBatchRequestAnalysisMode | Yes | The analysis approach for every item in this batch. Batches with mixed modes are not supported. Values:
    - "general": Generate text from each video based on the prompt (the item’s prompt field if set, otherwise defaults.prompt). Supports structured JSON output by using json_schema in the response_format.type field.
    - "time_based_metadata": Extract timestamped metadata by using segment_definitions in the response_format.type field.
requests | typing.Sequence[BatchItemRequest] | Yes | The analysis requests in the batch. Provide 1 to 1,000 requests, with a combined video duration of up to 2,000 hours. See BatchItemRequest.
defaults | typing.Optional[BatchDefaults] | No | Default values applied to every item that does not override them. See BatchDefaults.
request_options | typing.Optional[RequestOptions] | No | Request-specific configuration.

BatchItemRequest

The BatchItemRequest class describes a single analysis request inside a batch. Identify the asset in the video field. Every other field is optional and overrides the matching field in the defaults object.

Name | Type | Required | Description
video | BatchVideoContext | Yes | The asset to analyze. See BatchVideoContext.
custom_id | Optional[str] | No | An optional identifier you set per item when you create the batch. Use this field to map batch results back to records in your system, for example, to correlate each result with the source video in your database. The platform stores this value unchanged and returns it in batch responses and task responses. Format: 1 to 64 characters. Alphanumeric, hyphens (-), and underscores (_) only. The value must be unique within a batch.
prompt | Optional[BatchPrompt] | No | Override the defaults.prompt value for this item. See BatchPrompt.
response_format | Optional[AsyncResponseFormat] | No | Override the defaults.response_format value for this item. For the type definition, see the Create an async analysis task method.
temperature | Optional[float] | No | Override the defaults.temperature value for this item.
max_tokens | Optional[int] | No | Override the defaults.max_tokens value for this item.
min_segment_duration | Optional[float] | No | Override the defaults.min_segment_duration value for this item.
max_segment_duration | Optional[float] | No | Override the defaults.max_segment_duration value for this item.
start_time | Optional[float] | No | Override the defaults.start_time value for this item.
end_time | Optional[float] | No | Override the defaults.end_time value for this item.
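Because custom_id has a strict format and must be unique within a batch, it can help to check your identifiers before you submit. The following is a minimal client-side sketch of the format rules stated for custom_id; it assumes "alphanumeric" means ASCII letters and digits, and the function name is illustrative.

```python
import re
from collections import Counter

# 1 to 64 characters: letters, digits, hyphens, and underscores only.
CUSTOM_ID_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,64}")


def find_invalid_custom_ids(custom_ids):
    """Return (malformed, duplicated) lists for the given custom_id values."""
    malformed = [cid for cid in custom_ids if not CUSTOM_ID_PATTERN.fullmatch(cid)]
    counts = Counter(custom_ids)
    duplicated = sorted(cid for cid, count in counts.items() if count > 1)
    return malformed, duplicated


candidates = ["clip-001", "clip_002", "bad id!", "", "x" * 65, "clip-001"]
malformed, duplicated = find_invalid_custom_ids(candidates)
```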

BatchVideoContext

The BatchVideoContext class identifies the asset to analyze. Batch analysis accepts only assets. URLs and base64-encoded video data are not supported.

Name | Type | Required | Description
type | str | Yes | Set this field to "asset_id".
asset_id | str | Yes | The unique identifier of an asset from a direct or multipart upload. The asset status must be ready. Use assets.retrieve to check the status.
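As an illustration of the item shape, the sketch below builds one request item from an asset identifier. It uses plain dicts as stand-ins for BatchItemRequest and BatchVideoContext; whether the SDK accepts dicts in place of those classes is an assumption, and make_item is a hypothetical helper.

```python
# Fields an item may override, taken from the BatchItemRequest table.
OVERRIDABLE_FIELDS = {
    "prompt", "response_format", "temperature", "max_tokens",
    "min_segment_duration", "max_segment_duration", "start_time", "end_time",
}


def make_item(asset_id, custom_id=None, **overrides):
    """Build a dict shaped like a BatchItemRequest for a single asset."""
    unknown = set(overrides) - OVERRIDABLE_FIELDS
    if unknown:
        raise TypeError(f"unsupported override(s): {sorted(unknown)}")
    # Batch analysis accepts only assets, so the type is always "asset_id".
    item = {"video": {"type": "asset_id", "asset_id": asset_id}}
    if custom_id is not None:
        item["custom_id"] = custom_id
    item.update(overrides)
    return item


item = make_item("<asset-id>", custom_id="clip-001", temperature=0.5)
```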

BatchDefaults

The BatchDefaults class defines default values applied to every item that does not override them. Every field is optional. Items in the requests array override these values. To override the prompt or response_format field, provide the full object on the item. You cannot change only some of its nested fields.

Name | Type | Required | Description
prompt | Optional[BatchPrompt] | No | The default prompt applied to every item. See BatchPrompt.
response_format | Optional[AsyncResponseFormat] | No | The default response format applied to every item. For the type definition, see the Create an async analysis task method.
temperature | Optional[float] | No | Controls the randomness of the text output. Default: 0.2. Min: 0. Max: 1.
max_tokens | Optional[int] | No | The maximum number of tokens to generate per item. For Pegasus 1.5 general mode: Min: 512, Max: 98,304. For Pegasus 1.5 time_based_metadata mode: Min: 2,048, Max: 98,304.
min_segment_duration | Optional[float] | No | Minimum duration for each extracted segment, in seconds. Applies only when the analysis_mode field is "time_based_metadata". Min: 2.
max_segment_duration | Optional[float] | No | Maximum duration for each extracted segment, in seconds. Must be greater than or equal to the min_segment_duration field. Applies only when the analysis_mode field is "time_based_metadata". Min: 2.
start_time | Optional[float] | No | Start of the analysis window, in seconds, applied to every item. Use with end_time to analyze only the [start_time, end_time) portion of each video. If omitted, defaults to 0. Must be less than end_time. Mutually exclusive with response_format.segment_definitions[].time_ranges.
end_time | Optional[float] | No | End of the analysis window, in seconds, applied to every item. Use with start_time to analyze only the [start_time, end_time) portion of each video. If omitted, defaults to the video duration. Must be greater than start_time. Mutually exclusive with response_format.segment_definitions[].time_ranges.
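To make the override rule concrete, the sketch below models how an item's fields combine with the batch defaults: an item value replaces the default as a whole, so an item-level prompt does not inherit nested fields from defaults.prompt. This is only a client-side illustration of the behavior described above, using plain dicts; the platform performs the actual resolution, and effective_settings is a hypothetical helper.

```python
ITEM_ONLY_FIELDS = {"video", "custom_id"}  # these never come from defaults


def effective_settings(item, defaults):
    """Return the settings that apply to one item after overriding the defaults."""
    merged = dict(defaults)
    for key, value in item.items():
        if key not in ITEM_ONLY_FIELDS:
            merged[key] = value  # whole-value replacement, no deep merge
    return merged


defaults = {
    "prompt": {"input_text": "Summarize the video.", "media_sources": ["<source>"]},
    "temperature": 0.2,
    "start_time": 0.0,
    "end_time": 60.0,
}
item = {
    "video": {"type": "asset_id", "asset_id": "<asset-id>"},
    "prompt": {"input_text": "List the speakers."},
    "end_time": 30.0,
}
settings = effective_settings(item, defaults)
```

In this example the item's prompt has no media_sources, because the item supplied its own prompt object, while temperature and start_time still come from the defaults.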

BatchPrompt

The BatchPrompt class defines a structured prompt with <@name> placeholders for referencing images. Not supported when the analysis_mode field is "time_based_metadata".

Name | Type | Required | Description
input_text | str | Yes | The text of the prompt. Use <@name> placeholders to reference images declared in media_sources. This text counts toward the context window.
media_sources | Optional[List[SmeMediaSource]] | No | Reference images for the <@name> placeholders in the prompt. Maximum 4 sources. For the type definition, see SmeMediaSource on the Async analysis page.
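A quick local check can catch placeholders that have no matching image before you submit a batch. The sketch below rests on two assumptions that this page does not confirm: that placeholder names consist of letters, digits, hyphens, and underscores, and that each media source carries a name that the placeholder refers to. Both helper functions are illustrative.

```python
import re

# Assumed placeholder syntax: <@name> with a simple identifier-like name.
PLACEHOLDER = re.compile(r"<@([A-Za-z0-9_-]+)>")


def placeholder_names(input_text):
    """Return the distinct placeholder names in order of first appearance."""
    return list(dict.fromkeys(PLACEHOLDER.findall(input_text)))


def check_prompt(input_text, source_names, max_sources=4):
    """Return a list of problems found in a prompt and its media source names."""
    problems = []
    if len(source_names) > max_sources:
        problems.append(f"{len(source_names)} media sources given, maximum is {max_sources}")
    declared = set(source_names)
    for name in placeholder_names(input_text):
        if name not in declared:
            problems.append(f"placeholder <@{name}> has no matching media source")
    return problems


text = "Find <@logo> next to <@mascot>, then <@logo> again."
names = placeholder_names(text)
problems = check_prompt(text, ["logo"])
```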

Return value: Returns a CreateAnalyzeBatchResponse object.

The CreateAnalyzeBatchResponse class contains the following properties:

Name | Type | Description
batch_id | str | The unique identifier of the batch.
status | BatchStatus | The initial status of the batch. Values: pending, processing, canceling, canceled, completed, expired.
total_items | int | The number of items submitted in the batch.
created_at | datetime | The date and time, in the RFC 3339 format, when the batch was created.
expires_at | datetime | The date and time, in the RFC 3339 format, when the batch expires. Unfinished items at expiration are canceled.
items | List[CreatedBatchItem] | One entry per submitted item. Each entry pairs the unique task identifier generated by the platform with the custom identifier you provided. See CreatedBatchItem.

CreatedBatchItem

The CreatedBatchItem class contains the following properties:

Name | Type | Description
task_id | str | The unique task identifier generated by the platform for this item. Use this value with analyze_async.tasks.retrieve to retrieve the task’s status and results.
custom_id | str | The custom identifier you provided when you created the batch. If you did not provide one, this field is null.
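One way to use the items list is to build a lookup from your own identifiers to the platform's task identifiers right after the batch is created. The sketch below reads only the items, task_id, and custom_id properties described above; the response object is simulated with SimpleNamespace, and the helper name is illustrative.

```python
from types import SimpleNamespace


def task_ids_by_custom_id(created):
    """Map each custom_id to its task_id, skipping items without a custom_id."""
    return {
        item.custom_id: item.task_id
        for item in created.items
        if item.custom_id is not None
    }


# Simulated CreateAnalyzeBatchResponse, for illustration only.
created = SimpleNamespace(
    batch_id="batch_1",
    items=[
        SimpleNamespace(task_id="task_a", custom_id="clip-001"),
        SimpleNamespace(task_id="task_b", custom_id=None),
        SimpleNamespace(task_id="task_c", custom_id="clip-003"),
    ],
)
lookup = task_ids_by_custom_id(created)
```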

API Reference: Create a batch

List batches

Description: This method returns a list of the batch objects in your account. The platform returns your batches sorted by creation date, with the newest batch first.

Function signature and example:

def list(
    self,
    *,
    page: typing.Optional[int] = None,
    page_limit: typing.Optional[int] = None,
    status: typing.Optional[typing.Union[BatchStatus, typing.Sequence[BatchStatus]]] = None,
    analysis_mode: typing.Optional[
        typing.Union[BatchesListRequestAnalysisModeItem, typing.Sequence[BatchesListRequestAnalysisModeItem]]
    ] = None,
    request_options: typing.Optional[RequestOptions] = None,
) -> SyncPager[AnalyzeBatchStatusResponse]

Parameters:

Name | Type | Required | Description
page | Optional[int] | No | A number that identifies the page to retrieve. Default: 1.
page_limit | Optional[int] | No | The number of items to return on each page. Default: 10. Max: 50.
status | Optional[Union[BatchStatus, Sequence[BatchStatus]]] | No | Filter batches by status. Provide one or more values to include batches matching any of them. If omitted, the response includes batches in all statuses.
analysis_mode | Optional[Union[BatchesListRequestAnalysisModeItem, Sequence[BatchesListRequestAnalysisModeItem]]] | No | Filter batches by analysis mode. Provide one or more values to include batches matching any of them. Values: "general", "time_based_metadata". If omitted, the response includes batches in all analysis modes.
request_options | Optional[RequestOptions] | No | Request-specific configuration.

Return value: Returns a SyncPager[AnalyzeBatchStatusResponse] object. Iterate over it directly to receive batches across pages, or call iter_pages() to receive each page as a list. For the per-batch object schema, see Retrieve batch status below.
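Since the pager can be iterated directly across pages, a summary over all batches takes only a few lines. The sketch below counts batches per status; it takes the batches client as an argument, assumes status values behave like plain strings, and uses a hypothetical FakeBatches stand-in so it runs offline.

```python
from collections import Counter
from types import SimpleNamespace


def count_batches_by_status(batches, **filters):
    """Iterate every batch returned by list() and count them per status."""
    return Counter(batch.status for batch in batches.list(**filters))


class FakeBatches:
    """Hypothetical stand-in whose list() returns a plain iterator of batches."""

    def list(self, **filters):
        self.filters = filters
        return iter([
            SimpleNamespace(batch_id="b1", status="completed"),
            SimpleNamespace(batch_id="b2", status="processing"),
            SimpleNamespace(batch_id="b3", status="completed"),
        ])


fake = FakeBatches()
counts = count_batches_by_status(fake, analysis_mode="general", page_limit=50)
```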

API Reference: List batches

Retrieve batch status

Description: Use this method to monitor a batch. The response includes the current batch status and counts for queued, processing, ready, failed, and canceled items.

Poll this method until the batch reaches the completed, canceled, or expired status. To retrieve the results, call the results method.

Note

Do not treat the completed status as a success signal. It means processing has finished for every item, not that every analysis succeeded. To see how many items succeeded, failed, or were canceled, check the ready_items, failed_items, and canceled_items fields. A batch never has the failed status.
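The distinction in the note above can be expressed as a small helper that reads the item counts instead of relying on the status alone. This is a sketch under the assumption that ready, failed, and canceled items together account for all finished items; the response object is simulated with SimpleNamespace and summarize is a hypothetical name.

```python
from types import SimpleNamespace

TERMINAL_STATUSES = {"completed", "canceled", "expired"}


def summarize(batch):
    """Report whether a batch is done, whether every item succeeded, and progress."""
    finished = batch.ready_items + batch.failed_items + batch.canceled_items
    return {
        "done": batch.status in TERMINAL_STATUSES,
        # "completed" alone is not enough: every item must also be ready.
        "all_succeeded": batch.status == "completed" and batch.ready_items == batch.total_items,
        "progress": finished / batch.total_items if batch.total_items else 0.0,
    }


completed_with_failure = summarize(SimpleNamespace(
    status="completed", total_items=4, ready_items=3, failed_items=1, canceled_items=0,
))
still_running = summarize(SimpleNamespace(
    status="processing", total_items=10, ready_items=3, failed_items=1, canceled_items=0,
))
```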

Function signature and example:

def retrieve(
    self,
    batch_id: str,
    *,
    request_options: typing.Optional[RequestOptions] = None,
) -> AnalyzeBatchStatusResponse

Parameters:

Name | Type | Required | Description
batch_id | str | Yes | The unique identifier of the batch.
request_options | Optional[RequestOptions] | No | Request-specific configuration.

Return value: Returns an AnalyzeBatchStatusResponse object.

The AnalyzeBatchStatusResponse class contains the following properties:

Name | Type | Description
batch_id | str | The unique identifier of the batch.
analysis_mode | str | The analysis mode applied to every item in this batch. Values: "general", "time_based_metadata".
model_name | str | The model used for every item in this batch.
status | BatchStatus | The current batch status. Values: pending, processing, canceling, canceled, completed, expired.
total_items | int | The number of items submitted in the batch.
queued_items | int | The number of items in the queued status.
processing_items | int | The number of items in the processing status.
ready_items | int | The number of items that completed successfully.
failed_items | int | The number of items that failed.
canceled_items | int | The number of items that were canceled, either because the batch was canceled while the item was in the queued status, or because the batch expired before the item finished processing.
created_at | datetime | The date and time, in the RFC 3339 format, when the batch was created.
expires_at | datetime | The date and time, in the RFC 3339 format, when the batch expires.
completed_at | Optional[datetime] | The date and time, in the RFC 3339 format, when the batch status became completed. Present only when the status is completed.
canceled_at | Optional[datetime] | The date and time, in the RFC 3339 format, when the batch status became canceled. Present only when the status is canceled.
expired_at | Optional[datetime] | The date and time, in the RFC 3339 format, when the batch status became expired. Present only when the status is expired.
webhooks | Optional[List[AnalyzeTaskWebhookInfo]] | The delivery status of each webhook endpoint for the batch completion notification. Present only after the platform sends the webhook notifications. To register webhooks, see the Webhooks page.

API Reference: Retrieve batch status

Retrieve batch results

Description: Use this method to retrieve the results for each item in a batch. You do not have to wait for the batch to finish: you can also call this method while the batch has the pending or processing status.

Each result entry has a status. For details on each status, see the Item statuses section on The batch object page.

Each result entry includes a task identifier in the task_id field. Use this value with analyze_async.tasks.retrieve if you need the full analysis task response.

You can retrieve results for 30 days after batch creation.

The method streams the response. Iterate over the returned generator to receive one BatchResultItem per batch item.

Function signature and example:

def results(
    self,
    batch_id: str,
    *,
    request_options: typing.Optional[RequestOptions] = None,
) -> typing.Iterator[BatchResultItem]

Parameters:

Name | Type | Required | Description
batch_id | str | Yes | The unique identifier of the batch.
request_options | Optional[RequestOptions] | No | Request-specific configuration.

Yields: An iterator of BatchResultItem objects. Each entry corresponds to one batch item.

The BatchResultItem class contains the following properties:

Name | Type | Description
task_id | str | The unique task identifier generated by the platform for this item.
custom_id | str | The custom identifier you provided when you created the batch. If you did not provide one, this field is null.
status | BatchItemStatus | The current status of the result entry. Values: queued, processing, ready, failed, canceled.
data | Optional[AnalyzeTaskResult] | The analysis result for an entry in the ready status. The analysis text is in this object’s nested data field. The envelope uses the same schema as the result field on analyze_async.tasks.retrieve.
error | Optional[BatchItemError] | Failure details for an entry in the failed status. See BatchItemError.

BatchItemError

The BatchItemError class contains the following properties:

Name | Type | Description
code | Optional[str] | A machine-readable error code identifying the failure category. May be omitted.
message | str | A human-readable explanation of the failure.
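When you consume the results iterator, it is usually convenient to sort entries by status in a single pass. The sketch below groups result items into ready, failed, canceled, and unfinished, keyed by custom_id when present and by task_id otherwise. It reads only the properties described above, assumes status values compare equal to plain strings, and simulates the items with SimpleNamespace; partition_results is a hypothetical helper.

```python
from types import SimpleNamespace


def partition_results(result_items):
    """Group BatchResultItem-like objects by status in a single pass."""
    groups = {"ready": {}, "failed": {}, "canceled": [], "unfinished": []}
    for item in result_items:
        key = item.custom_id or item.task_id
        if item.status == "ready":
            groups["ready"][key] = item.data
        elif item.status == "failed":
            groups["failed"][key] = item.error.message if item.error else None
        elif item.status == "canceled":
            groups["canceled"].append(key)  # candidates for a new batch
        else:  # queued or processing
            groups["unfinished"].append(key)
    return groups


# Simulated result stream, for illustration only.
stream = iter([
    SimpleNamespace(task_id="t1", custom_id="clip-001", status="ready", data="<result>", error=None),
    SimpleNamespace(task_id="t2", custom_id=None, status="failed", data=None,
                    error=SimpleNamespace(code=None, message="<failure message>")),
    SimpleNamespace(task_id="t3", custom_id="clip-003", status="canceled", data=None, error=None),
    SimpleNamespace(task_id="t4", custom_id="clip-004", status="queued", data=None, error=None),
])
groups = partition_results(stream)
```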

API Reference: Retrieve batch results

Cancel a batch

Description: Use this method to request cancellation for a batch with the pending or processing status.

When you invoke this method, the platform performs the following steps:

  • Cancels the items in the queued status.
  • Finishes the analysis for the items in the processing status.

The batch status changes to canceling immediately, and to canceled after every item reaches ready, failed, or canceled. You are not billed for canceled or failed items.

Function signature and example:

def cancel(
    self,
    batch_id: str,
    *,
    request_options: typing.Optional[RequestOptions] = None,
) -> AnalyzeBatchStatusResponse

Parameters:

Name | Type | Required | Description
batch_id | str | Yes | The unique identifier of the batch.
request_options | Optional[RequestOptions] | No | Request-specific configuration.

Return value: Returns an AnalyzeBatchStatusResponse object that reflects the batch state at the time of the request. The batch status is canceling until every item reaches ready, failed, or canceled, then becomes canceled. To confirm the batch is fully canceled, call the retrieve method. For the return type properties, see Retrieve batch status.
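Because cancellation applies only to batches with the pending or processing status, a guard that checks the status first can avoid an unnecessary request. The sketch below is one possible pattern, not SDK behavior: it takes the batches client as an argument and uses a hypothetical FakeBatches stand-in so it runs offline. The status can still change between the check and the cancel call, so real code should also handle an error from cancel.

```python
from types import SimpleNamespace

CANCELABLE_STATUSES = {"pending", "processing"}


def cancel_if_active(batches, batch_id):
    """Cancel the batch only if it is pending or processing; return its latest state."""
    current = batches.retrieve(batch_id)
    if current.status not in CANCELABLE_STATUSES:
        return current  # nothing to cancel
    return batches.cancel(batch_id)


class FakeBatches:
    """Hypothetical stand-in that tracks one status and counts cancel calls."""

    def __init__(self, status):
        self.status = status
        self.cancel_calls = 0

    def retrieve(self, batch_id):
        return SimpleNamespace(batch_id=batch_id, status=self.status)

    def cancel(self, batch_id):
        self.cancel_calls += 1
        self.status = "canceling"
        return SimpleNamespace(batch_id=batch_id, status=self.status)


active = FakeBatches("processing")
finished = FakeBatches("completed")
after_active = cancel_if_active(active, "batch_1")
after_finished = cancel_if_active(finished, "batch_2")
```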

API Reference: Cancel a batch

Delete a batch

Description: Use this method to delete a batch and all the tasks associated with it. You can delete only batches with the completed, canceled, or expired status.

Deleting a batch does not affect billing. You are billed for every completed analysis regardless of whether you delete the batch afterward.

To stop a batch with the pending or processing status, use the cancel method.

The platform automatically deletes batches 30 days after creation.

Function signature and example:

def delete(
    self,
    batch_id: str,
    *,
    request_options: typing.Optional[RequestOptions] = None,
) -> None

Parameters:

Name | Type | Required | Description
batch_id | str | Yes | The unique identifier of the batch.
request_options | Optional[RequestOptions] | No | Request-specific configuration.

Return value: Returns None. If successful, the platform returns a 204 No Content response.
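If you want to clean up before the automatic deletion, you can combine list and delete. The sketch below lists batches in the three deletable statuses and deletes each one. It takes the batches client as an argument and passes the statuses as plain strings, which is an assumption; FakeBatches is a hypothetical offline stand-in. The identifiers are collected before any deletion so that removing batches does not shift the pages still being iterated.

```python
from types import SimpleNamespace

DELETABLE_STATUSES = ["completed", "canceled", "expired"]


def delete_finished_batches(batches):
    """Delete every batch in a deletable status and return the deleted ids."""
    # Materialize the ids first, then delete, to keep pagination stable.
    batch_ids = [batch.batch_id for batch in batches.list(status=DELETABLE_STATUSES)]
    for batch_id in batch_ids:
        batches.delete(batch_id)
    return batch_ids


class FakeBatches:
    """Hypothetical stand-in backed by a dict of batch_id -> status."""

    def __init__(self, statuses):
        self.statuses = dict(statuses)

    def list(self, status=None):
        wanted = set(status or self.statuses.values())
        return iter([
            SimpleNamespace(batch_id=batch_id, status=value)
            for batch_id, value in self.statuses.items()
            if value in wanted
        ])

    def delete(self, batch_id):
        del self.statuses[batch_id]


fake = FakeBatches({"b1": "completed", "b2": "processing", "b3": "expired"})
deleted = delete_finished_batches(fake)
```

Deleting a batch also deletes its tasks, so retrieve any results you still need first.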

API Reference: Delete a batch