Create embeddings

Use the Marengo video understanding model to generate embeddings from video, audio, text, and image inputs. These embeddings enable similarity search, content clustering, recommendation systems, and other machine learning applications.

Note

Amazon Bedrock supports Marengo 3.0 and Marengo 2.7. Marengo 2.7 will be deprecated in a future release. Migrate to Marengo 3.0 to ensure continued support and access to new features. For details, see the Migration guide page.

For the key enhancements in Marengo 3.0, see the TwelveLabs Marengo Embed 3.0 page in the Amazon Bedrock documentation.

Regional availability

Marengo is available in the following regions: US East (N. Virginia), Europe (Ireland), Asia Pacific (Seoul)

Model specification

| Specification | Marengo 3.0 | Marengo 2.7 |
| --- | --- | --- |
| Model ID | twelvelabs.marengo-embed-3-0-v1:0 | twelvelabs.marengo-embed-2-7-v1:0 |
| Input | Video, audio, image, text, text with image | Video, audio, image, text |
| Input methods | S3 URI or base64-encoded string | S3 URI or base64-encoded string |
| Output | 512-dimensional embeddings | 1024-dimensional embeddings |
| Similarity metric | Cosine similarity | Cosine similarity |

The model has two types of limits: the maximum input size you can submit and the portion of content that it embeds.

Input requirements

This table shows the maximum size for each type of input:

| Input type | Marengo 3.0 | Marengo 2.7 |
| --- | --- | --- |
| Video | S3: 6 GB; base64: 36 MB; duration: 4 hours | S3: 2 GB; base64: 36 MB; duration: 2 hours |
| Audio | S3: 6 GB; base64: 36 MB; duration: 4 hours | S3: 2 GB; base64: 36 MB; duration: 2 hours |
| Image | 5 MB | 5 MB |
| Text | 500 tokens | 77 tokens |

Embedding coverage per input type

This table shows what portion of your input the model processes into embeddings:

| Input type | Embedding behavior |
| --- | --- |
| Video | Creates multiple embeddings for segments throughout the video. Segments are 1-10 seconds each. You can specify which portion of the video to process. |
| Audio | Creates multiple embeddings, dividing the audio into segments as close to 10 seconds as possible. You can specify which portion of the audio to process. |
| Image | Processes the entire image. |
| Text | Processes up to the maximum number of tokens supported and automatically truncates text that exceeds the limit from the end. |
| Text with image | Processes both text and image together to create a single embedding. |

Pricing

For details on pricing, see the Amazon Bedrock pricing page.

Choose the processing method

Select the processing method based on your use case and performance requirements. Synchronous processing returns embeddings immediately in the API response, while asynchronous processing handles larger files and batch operations by saving results to S3.

Note

Synchronous processing supports text, image, and text-with-image inputs. Asynchronous processing supports video, audio, image, text, and text-with-image inputs.

Use synchronous processing to:

  • Build real-time applications like chatbots, search, and recommendation systems.
  • Enable interactive features that require immediate results.

Use asynchronous processing to:

  • Build applications that process video, audio, and image files.
  • Run batch operations and background workflows.

Prerequisites

Before you start, ensure you have the following:

  • An AWS account with access to a region where the TwelveLabs models are supported.
  • An AWS IAM principal with sufficient Amazon Bedrock permissions. For details on setting permissions, see the Identity and access management for Amazon Bedrock page.
  • S3 permissions to read input files and write output files for Marengo operations.
  • The AWS CLI installed and configured with your credentials.
  • Python 3.7 or later with the boto3 library.
  • Access to the model you want to use. Navigate to the AWS Console > Bedrock > Model Access page and request access. Note that model availability varies by region.

Create embeddings

Marengo accepts media input as base64-encoded strings or S3 URIs. Note that the base64 method has a 36 MB file size limit. This guide uses S3 URIs.
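
If you use the base64 path instead, the following is a minimal sketch of producing the base64String value described later in this guide; the file path is a placeholder.

Python

import base64

# Read a local file and encode it as a base64 string (must stay under 36 MB)
with open("<YOUR_IMAGE_FILE>", "rb") as f:
    encoded = base64.b64encode(f.read()).decode("utf-8")

# Use the string wherever a mediaSource accepts base64String
media_source = {"base64String": encoded}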

Note

Your S3 input and output buckets must be in the same region as the model. If regions don’t match, the API returns a ValidationException error.

To generate embeddings from your content, you use one of two Amazon Bedrock APIs, depending on your processing needs.

Synchronous processing

The InvokeModel API processes your request synchronously and returns embeddings directly in the response.

The InvokeModel API requires two parameters:

  • modelId: The inference profile ID for the model.
  • body: A JSON-encoded string containing your input parameters.

The request body contains the following fields:

  • inputType: The type of content. Values: “text”, “image”, or “text_image”.
  • For text inputs, include an object named text with the following field:
    • inputText: The text to embed
  • For image inputs, include an object named image with the following field:
    • mediaSource: The image source containing either base64String or s3Location
  • For text with image inputs, include an object named text_image with the following fields (see the sketch after this list):
    • inputText: The text to embed
    • mediaSource: The image source containing either base64String or s3Location
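
For example, a text-with-image request body referencing an S3 image might look like the following sketch; replace the placeholder values surrounded by <> with your own.

Python

# A sketch of a text_image request body; pass it to invoke_model as a
# JSON-encoded string
model_input = {
    "inputType": "text_image",
    "text_image": {
        "inputText": "<YOUR_TEXT>",
        "mediaSource": {
            "s3Location": {
                "uri": "s3://<YOUR_BUCKET_NAME>/<YOUR_IMAGE_FILE>",
                "bucketOwner": "<YOUR_ACCOUNT_ID>"
            }
        }
    }
}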

Examples

Ensure you replace the placeholders surrounded by <> with your values.

Python
import boto3
import json

INFERENCE_PROFILE_ID = "twelvelabs.marengo-embed-3-0-v1:0"
REGION_NAME = "<YOUR_REGION_NAME>"
PROFILE_NAME = "<YOUR_PROFILE_NAME>"
INPUT_TEXT = "<YOUR_TEXT>"

model_input = {
    "inputType": "text",
    "text": {
        "inputText": INPUT_TEXT
    }
}

# Initialize the Bedrock Runtime client
boto3_session = boto3.Session(profile_name=PROFILE_NAME, region_name=REGION_NAME)
client = boto3_session.client('bedrock-runtime')

# Make the request
response = client.invoke_model(
    modelId=INFERENCE_PROFILE_ID,
    body=json.dumps(model_input)
)

# Print the response body
response_body = json.loads(response['body'].read().decode('utf-8'))
print(response_body)

Asynchronous processing

The StartAsyncInvoke API processes your request asynchronously, storing the results in your S3 bucket.

To create embeddings asynchronously, you must complete the following steps:

1. Submit your request, providing an S3 location for your input media file and an S3 location for the output. Note that this example uses the same bucket for both.

2. Check the job status using the returned invocation ARN.

3. Retrieve the results from the S3 output location once the job has completed.

The StartAsyncInvoke API requires three parameters:

  • modelId: The model ID.
  • modelInput: A dictionary containing your input parameters.
  • outputDataConfig: A dictionary specifying where to save the results.

The modelInput dictionary contains the following required fields:

  • inputType: The type of content (“video”, “audio”, “image”, “text”, or “text_image”)
  • For video inputs, include an object named video containing at least the following field:
    • mediaSource: The S3 location of your video file
  • For audio inputs, include an object named audio containing at least the following field (a minimal audio sketch follows this list):
    • mediaSource: The S3 location of your audio file
  • For image inputs, include an image object with:
    • mediaSource: The S3 location of your image file
  • For text inputs, include a text object with:
    • inputText: The text to embed
  • For text with image inputs, include a text_image object with:
    • inputText: The text to embed
    • mediaSource: The S3 location of your image file
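
For example, a minimal audio modelInput mirrors the video example shown later in this section; this sketch uses placeholder values surrounded by <>.

Python

# A sketch of an audio modelInput for StartAsyncInvoke; only the required
# mediaSource field is shown
model_input = {
    "inputType": "audio",
    "audio": {
        "mediaSource": {
            "s3Location": {
                "uri": "s3://<YOUR_BUCKET_NAME>/<YOUR_AUDIO_FILE>",
                "bucketOwner": "<YOUR_ACCOUNT_ID>"
            }
        }
    }
}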

S3 output structure

Each invocation creates a unique directory in your S3 bucket with two files:

  • manifest.json: Contains metadata including the request ID.
  • output.json: Contains the actual embeddings.
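
Once a job completes, you can locate and read output.json with a sketch like the following. It assumes output_s3_uri holds the s3Uri returned by GetAsyncInvoke (see the example below) and that your credentials can read the bucket.

Python

import json
import boto3

s3 = boto3.client('s3')

# Split "s3://bucket/prefix" into bucket and key prefix
bucket, _, prefix = output_s3_uri[len("s3://"):].partition("/")

# Find output.json and parse it. In production, scope the prefix to a
# single invocation's directory to avoid matching other jobs' results.
listing = s3.list_objects_v2(Bucket=bucket, Prefix=prefix)
for obj in listing.get("Contents", []):
    if obj["Key"].endswith("output.json"):
        body = s3.get_object(Bucket=bucket, Key=obj["Key"])["Body"].read()
        print(json.loads(body))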

Examples

Ensure you replace the placeholders surrounded by <> with your values.

Python
import boto3
import time

REGION_NAME = "<YOUR_REGION_NAME>"
PROFILE_NAME = "<YOUR_PROFILE_NAME>"
MODEL_ID = "twelvelabs.marengo-embed-3-0-v1:0"
ACCOUNT_ID = "<YOUR_ACCOUNT_ID>"
BUCKET = "<YOUR_BUCKET_NAME>"
FILE_NAME = "<YOUR_FILE>"

boto3_session = boto3.Session(profile_name=PROFILE_NAME, region_name=REGION_NAME)
bedrock_client = boto3_session.client('bedrock-runtime')

# Start async video embedding
model_input = {
    "inputType": "video",
    "video": {
        "mediaSource": {
            "s3Location": {
                "uri": f"s3://{BUCKET}/{FILE_NAME}",
                "bucketOwner": ACCOUNT_ID
            }
        }
    }
}

async_request_response = bedrock_client.start_async_invoke(
    modelId=MODEL_ID,
    modelInput=model_input,
    outputDataConfig={
        "s3OutputDataConfig": {
            "s3Uri": f"s3://{BUCKET}",
            "bucketOwner": ACCOUNT_ID
        }
    }
)

print("async_request_response: ", async_request_response)

# Get the invocation ARN
invocation_arn = async_request_response.get("invocationArn")

# Poll until the job finishes (Completed or Failed) or the retry limit is hit
max_retries = 60
retries = 0
while True:
    response = bedrock_client.get_async_invoke(
        invocationArn=invocation_arn
    )
    print(f"status: {response.get('status')}")
    if response.get("status") in ("Completed", "Failed"):
        break
    time.sleep(1)
    retries += 1
    if retries > max_retries:
        break

print(response)

# Extract the S3 URI where results are stored
output_s3_uri = response.get("outputDataConfig", {}).get("s3OutputDataConfig", {}).get("s3Uri")
print(f"Results stored at: {output_s3_uri}")

Use embeddings

After generating embeddings, you can store them in a vector database for efficient similarity search and retrieval.

The typical workflow is as follows:

1. Generate embeddings for your content.

2. Store embeddings with metadata in your chosen vector database.

3. Generate an embedding for user queries.

4. Use cosine similarity to find the most relevant content (see the sketch after this list).

5. Retrieve the original content or use the results for RAG applications.
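
A minimal sketch of the similarity step, using NumPy with illustrative random vectors in place of real Marengo embeddings:

Python

import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Illustrative vectors only; in practice these come from Marengo
# (512 dimensions for Marengo 3.0, 1024 for Marengo 2.7)
rng = np.random.default_rng(0)
query_embedding = rng.normal(size=512)
stored_embeddings = [rng.normal(size=512) for _ in range(5)]

# Rank stored embeddings by similarity to the query, highest first
scores = sorted(
    ((i, cosine_similarity(query_embedding, v)) for i, v in enumerate(stored_embeddings)),
    key=lambda pair: pair[1],
    reverse=True,
)
print(scores)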

Request parameters and response fields

For a complete list of request parameters and response fields, see the TwelveLabs Marengo Embed 3.0 page in the Amazon Bedrock documentation.


Using Marengo 2.7

Note

Marengo 2.7 will be deprecated in a future release. Migrate to Marengo 3.0 to ensure continued support and access to new features. For details, see the Migration guide page.

Synchronous processing

Request body structure

The request body contains the following fields:

  • inputType: The type of content. Values: “text” or “image”.
  • inputText: The text to embed. Required for text inputs.
  • mediaSource: The image source containing either base64String or s3Location. Required for image inputs.
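
For example, an image request body referencing an S3 object might look like the following sketch; replace the placeholder values surrounded by <> with your own.

Python

# A sketch of a Marengo 2.7 image request body; note that mediaSource sits
# at the top level rather than inside a nested object as in Marengo 3.0
model_input = {
    "inputType": "image",
    "mediaSource": {
        "s3Location": {
            "uri": "s3://<YOUR_BUCKET_NAME>/<YOUR_IMAGE_FILE>",
            "bucketOwner": "<YOUR_ACCOUNT_ID>"
        }
    }
}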

Examples

Replace <YOUR_TEXT> with the text for which you wish to create an embedding.

Python
import boto3
import json

# Replace the `us` prefix depending on your region
INFERENCE_PROFILE_ID = "us.twelvelabs.marengo-embed-2-7-v1:0"
INPUT_TEXT = "<YOUR_TEXT>"

model_input = {
    "inputType": "text",
    "inputText": INPUT_TEXT
}

# Initialize the Bedrock Runtime client
client = boto3.client('bedrock-runtime')

# Make the request
response = client.invoke_model(
    modelId=INFERENCE_PROFILE_ID,
    body=json.dumps(model_input)
)

# Print the response body
response_body = json.loads(response['body'].read().decode('utf-8'))
print(response_body)

Asynchronous processing

Model input structure

The modelInput dictionary contains the following fields:

  • inputType: The type of content (“video”, “audio”, “image”, or “text”)
  • mediaSource: The S3 location of your input file (for video, audio, and image)
  • inputText: The text content (for text inputs only)

Examples

Replace the following placeholders with your values:

  • <YOUR_REGION>: Your AWS region
  • <YOUR_ACCOUNT_ID>: Your AWS account ID
  • <YOUR_BUCKET_NAME>: The name of your S3 bucket
  • <YOUR_FILE>: The name of your file
  • <YOUR_INPUT_TYPE>: The type of media (“video”, “audio”, or “image”)
Python
import boto3
import time

REGION = "<YOUR_REGION>"
MODEL_ID = "twelvelabs.marengo-embed-2-7-v1:0"
ACCOUNT_ID = "<YOUR_ACCOUNT_ID>"
BUCKET = "<YOUR_BUCKET_NAME>"
FILE_NAME = "<YOUR_FILE>"
INPUT_TYPE = "<YOUR_INPUT_TYPE>"

bedrock_client = boto3.client(service_name="bedrock-runtime", region_name=REGION)

# Start async embedding
model_input = {
    "mediaSource": {
        "s3Location": {
            "uri": f"s3://{BUCKET}/{FILE_NAME}",
            "bucketOwner": ACCOUNT_ID
        }
    },
    "inputType": INPUT_TYPE
}

async_request_response = bedrock_client.start_async_invoke(
    modelId=MODEL_ID,
    modelInput=model_input,
    outputDataConfig={
        "s3OutputDataConfig": {
            "s3Uri": f"s3://{BUCKET}",
            "bucketOwner": ACCOUNT_ID
        }
    }
)

print("async_request_response: ", async_request_response)

# Get the invocation ARN
invocation_arn = async_request_response.get("invocationArn")

# Poll until the job finishes (Completed or Failed) or the retry limit is hit
max_retries = 60
retries = 0
while True:
    response = bedrock_client.get_async_invoke(
        invocationArn=invocation_arn
    )
    print(f"status: {response.get('status')}")
    if response.get("status") in ("Completed", "Failed"):
        break
    time.sleep(1)
    retries += 1
    if retries > max_retries:
        break

print(response)

# Extract the S3 URI where results are stored
output_s3_uri = response.get("outputDataConfig", {}).get("s3OutputDataConfig", {}).get("s3Uri")
print(f"Results stored at: {output_s3_uri}")

Request parameters and response fields

For a complete list of request parameters and response fields, see the TwelveLabs Marengo Embed 2.7 page in the Amazon Bedrock documentation.