Use the Marengo video understanding model to generate embeddings from video, audio, text, and image inputs. These embeddings enable similarity search, content clustering, recommendation systems, and other machine learning applications.
Marengo 2.7 will be deprecated. Embeddings created with Marengo 2.7 are not compatible with Marengo 3.0. You must migrate to 3.0 and regenerate all your embeddings. For details, see the Migration guide page.
Marengo is available in the following regions: US East (N. Virginia), Europe (Ireland), and Asia Pacific (Seoul).
The model has two types of limits: the maximum input size you can submit and the portion of content that it embeds.
This table shows the maximum size for each type of input:
This table shows what portion of your input the model processes into embeddings:
For details on pricing, see the Amazon Bedrock pricing page.
Select the processing method based on your use case and performance requirements. Synchronous processing returns embeddings immediately in the API response, while asynchronous processing handles larger files and batch operations by saving results to S3.
Synchronous processing supports text and image inputs. Asynchronous processing supports video, audio, and image inputs.
Use synchronous processing to:
Use asynchronous processing to:
Before you start, ensure you have the following:
- boto3 library.
Marengo supports base64-encoded strings and S3 URIs for media input. The base64 method has a 36 MB file size limit. This guide uses S3 URIs.
Your S3 input and output buckets must be in the same region as the model. If regions don’t match, the API returns a ValidationException error.
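The two media input methods can be sketched as a pair of small helpers. This is a minimal illustration, not the official client code: the helper names are hypothetical, the `uri` and `bucketOwner` keys inside `s3Location` are assumptions to check against the Amazon Bedrock reference, and the 36 MB limit is assumed to apply to the raw file size.

```python
import base64
from pathlib import Path

# Assumption: the 36 MB limit is measured on the raw file, in binary megabytes.
BASE64_LIMIT_BYTES = 36 * 1024 * 1024


def s3_media_source(uri: str, bucket_owner: str) -> dict:
    """Build a mediaSource that points at an object in S3.

    The "uri" and "bucketOwner" key names are assumed, not taken from this guide.
    """
    return {"s3Location": {"uri": uri, "bucketOwner": bucket_owner}}


def base64_media_source(path: str) -> dict:
    """Build a mediaSource that inlines a local file as a base64 string."""
    data = Path(path).read_bytes()
    if len(data) > BASE64_LIMIT_BYTES:
        raise ValueError(f"{path} is larger than 36 MB; upload it to S3 instead")
    return {"base64String": base64.b64encode(data).decode("ascii")}
```

Either helper returns a dictionary that can be placed in the `mediaSource` field of a request. The S3 variant avoids the size check, which is one reason this guide uses S3 URIs.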
To generate embeddings from your content, you use one of two Amazon Bedrock APIs, depending on your processing needs.
The InvokeModel API processes your request synchronously and returns embeddings directly in the response.
The InvokeModel API requires two parameters:
- modelId: The inference profile ID for the model.
- body: A JSON-encoded string containing your input parameters.
The request body contains the following fields:
- inputType: The type of content. Values: “text”, “image”, or “text_image”.
- inputText with the text to embed.
- image with the following fields:
  - mediaSource: The image source containing either base64String or s3Location.
- text_image with the following fields:
  - inputText: The text to embed.
  - mediaSource: The image source containing either base64String or s3Location.
Replace the placeholders surrounded by <> with your values.
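A synchronous request built from the fields above might look like the following sketch. The nesting of `inputText` inside a `text` object is an assumption (it mirrors the `image` and `text_image` objects), the function names are hypothetical, and `<YOUR_INFERENCE_PROFILE_ID>` is an illustrative placeholder. The response is returned as parsed JSON without assuming its field names.

```python
import json


def build_text_body(text: str) -> str:
    """JSON body for a text input (assumed shape: a "text" object holding inputText)."""
    return json.dumps({"inputType": "text", "text": {"inputText": text}})


def build_image_body(media_source: dict) -> str:
    """JSON body for an image input; media_source holds base64String or s3Location."""
    return json.dumps({"inputType": "image", "image": {"mediaSource": media_source}})


def build_text_image_body(text: str, media_source: dict) -> str:
    """JSON body for a combined text and image input."""
    payload = {
        "inputType": "text_image",
        "text_image": {"inputText": text, "mediaSource": media_source},
    }
    return json.dumps(payload)


def invoke_embedding(client, model_id: str, body: str) -> dict:
    """Call InvokeModel and return the parsed JSON response."""
    response = client.invoke_model(modelId=model_id, body=body)
    return json.loads(response["body"].read())


def main() -> None:
    # Imported here so the builders above can be used without boto3 installed.
    import boto3

    client = boto3.client("bedrock-runtime", region_name="<YOUR_REGION>")
    result = invoke_embedding(
        client, "<YOUR_INFERENCE_PROFILE_ID>", build_text_body("<YOUR_TEXT>")
    )
    print(result)
```

Call `main()` after replacing the placeholders. Keeping the body builders separate from the network call makes the request shape easy to inspect before sending it.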
The StartAsyncInvoke API processes your request asynchronously, storing the results in your S3 bucket.
To create embeddings asynchronously, you must complete the following steps:
The StartAsyncInvoke API requires three parameters:
- modelId: The model ID.
- modelInput: A dictionary containing your input parameters.
- outputDataConfig: A dictionary specifying where to save the results.
The modelInput dictionary contains the following required fields:
- inputType: The type of content (“video”, “audio”, “image”, “text”, or “text_image”).
- video containing at least the following fields:
  - mediaSource: The S3 location of your video file.
- audio containing at least the following required fields:
  - mediaSource: The S3 location of your audio file.
- image object with:
  - mediaSource: The S3 location of your image file.
- text object with:
  - inputText: The text to embed.
- text_image object with:
  - inputText: The text to embed.
  - mediaSource: The S3 location of your image file.
Each invocation creates a unique directory in your S3 bucket with two files:
- manifest.json: Contains metadata including the request ID.
- output.json: Contains the actual embeddings.
Replace the placeholders surrounded by <> with your values.
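The asynchronous flow can be sketched as follows: build the three parameters, start the job, then poll until it leaves the in-progress state. The helper names and the `<YOUR_MODEL_ID>` and `<YOUR_OUTPUT_PREFIX>` placeholders are hypothetical. The `uri` and `bucketOwner` keys inside `s3Location` and the `s3OutputDataConfig` wrapper are assumptions to confirm against the Amazon Bedrock reference.

```python
import time


def video_model_input(s3_uri: str, bucket_owner: str) -> dict:
    """modelInput for a video stored in S3 (key names inside s3Location are assumed)."""
    media_source = {"s3Location": {"uri": s3_uri, "bucketOwner": bucket_owner}}
    return {"inputType": "video", "video": {"mediaSource": media_source}}


def build_async_request(model_id: str, model_input: dict, output_s3_uri: str) -> dict:
    """Collect the three StartAsyncInvoke parameters into keyword arguments."""
    return {
        "modelId": model_id,
        "modelInput": model_input,
        "outputDataConfig": {"s3OutputDataConfig": {"s3Uri": output_s3_uri}},
    }


def wait_for_completion(client, invocation_arn: str,
                        poll_seconds: float = 10.0,
                        timeout_seconds: float = 3600.0) -> str:
    """Poll GetAsyncInvoke until the job is no longer in progress; return the final status."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = client.get_async_invoke(invocationArn=invocation_arn)["status"]
        if status != "InProgress":
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{invocation_arn} still in progress after {timeout_seconds}s")
        time.sleep(poll_seconds)


def main() -> None:
    # Imported here so the helpers above can be used without boto3 installed.
    import boto3

    client = boto3.client("bedrock-runtime", region_name="<YOUR_REGION>")
    request = build_async_request(
        "<YOUR_MODEL_ID>",
        video_model_input("s3://<YOUR_BUCKET_NAME>/<YOUR_FILE>", "<YOUR_ACCOUNT_ID>"),
        "s3://<YOUR_BUCKET_NAME>/<YOUR_OUTPUT_PREFIX>/",
    )
    arn = client.start_async_invoke(**request)["invocationArn"]
    print(wait_for_completion(client, arn))
```

Call `main()` after replacing the placeholders. Once the job completes, the output.json file in the invocation's directory under the output prefix holds the embeddings.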
After generating embeddings, you can store them in a vector database for efficient similarity search and retrieval.
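To illustrate what similarity search over embeddings involves, here is a minimal in-memory sketch using cosine similarity. It stands in for a vector database and is not a replacement for one. The function names and the toy two-dimensional vectors are made up for illustration, and real embeddings have many more dimensions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / norm


def top_k(query: list[float], index: dict[str, list[float]], k: int = 3) -> list[tuple[str, float]]:
    """Return the k entries of the index most similar to the query, best first."""
    scored = [(name, cosine_similarity(query, vector)) for name, vector in index.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]
```

Compare only embeddings produced by the same model version, since embeddings created with Marengo 2.7 are not compatible with Marengo 3.0.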
The typical workflow is as follows:
For a complete list of request parameters and response fields, see the TwelveLabs Marengo Embed 3.0 page in the Amazon Bedrock documentation.
Marengo 2.7 will be deprecated. Embeddings created with Marengo 2.7 are not compatible with Marengo 3.0. You must migrate to 3.0 and regenerate all your embeddings. For details, see the Migration guide page.
The request body contains the following fields:
- inputType: The type of content. Values: “text” or “image”.
- inputText: The text to embed. Required for text inputs.
- mediaSource: The image source containing either base64String or s3Location. Required for image inputs.
Replace <YOUR_TEXT> with the text for which you wish to create an embedding.
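The flat request body described by these fields can be sketched as below. The function names are hypothetical, and `<YOUR_INFERENCE_PROFILE_ID>` is an illustrative placeholder that does not appear in the original guide.

```python
import json


def build_flat_text_body(text: str) -> str:
    """JSON body for a text input: inputType plus a top-level inputText."""
    return json.dumps({"inputType": "text", "inputText": text})


def build_flat_image_body(media_source: dict) -> str:
    """JSON body for an image input: inputType plus a top-level mediaSource."""
    return json.dumps({"inputType": "image", "mediaSource": media_source})


def main() -> None:
    # Imported here so the builders above can be used without boto3 installed.
    import boto3

    client = boto3.client("bedrock-runtime", region_name="<YOUR_REGION>")
    response = client.invoke_model(
        modelId="<YOUR_INFERENCE_PROFILE_ID>",
        body=build_flat_text_body("<YOUR_TEXT>"),
    )
    print(json.loads(response["body"].read()))
```

Call `main()` after replacing the placeholders.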
The modelInput dictionary contains the following fields:
- inputType: The type of content (“video”, “audio”, “image”, or “text”).
- mediaSource: The S3 location of your input file (for video, audio, and image).
- inputText: The text content (for text inputs only).
Replace the following placeholders with your values:
- <YOUR_REGION>: Your AWS region
- <YOUR_ACCOUNT_ID>: Your AWS account ID
- <YOUR_BUCKET_NAME>: The name of your S3 bucket
- <YOUR_FILE>: The name of your file
- <YOUR_INPUT_TYPE>: The type of media (“video”, “audio”, or “image”)
For a complete list of request parameters and response fields, see the TwelveLabs Marengo Embed 2.7 page in the Amazon Bedrock documentation.
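As a rough sketch of how these placeholders might combine into a flat modelInput dictionary: the helper name is hypothetical, and using the account ID as `bucketOwner` inside `s3Location` is an assumption to verify against the Amazon Bedrock reference.

```python
from typing import Optional

MEDIA_TYPES = {"video", "audio", "image"}


def build_flat_model_input(input_type: str, *,
                           bucket: Optional[str] = None,
                           file_name: Optional[str] = None,
                           account_id: Optional[str] = None,
                           text: Optional[str] = None) -> dict:
    """Build a flat modelInput: mediaSource for media inputs, inputText for text."""
    if input_type == "text":
        if text is None:
            raise ValueError("text inputs require text")
        return {"inputType": "text", "inputText": text}
    if input_type in MEDIA_TYPES:
        if not (bucket and file_name and account_id):
            raise ValueError("media inputs require bucket, file_name, and account_id")
        # Assumed s3Location shape: object URI plus the owning account ID.
        location = {"uri": f"s3://{bucket}/{file_name}", "bucketOwner": account_id}
        return {"inputType": input_type, "mediaSource": {"s3Location": location}}
    raise ValueError(f"unsupported input type: {input_type}")
```

For example, `build_flat_model_input("video", bucket="<YOUR_BUCKET_NAME>", file_name="<YOUR_FILE>", account_id="<YOUR_ACCOUNT_ID>")` yields a dictionary that can be passed as the modelInput parameter of StartAsyncInvoke.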