Create embeddings for text, image, and audio

This method creates embeddings for text, image, and audio content.

Before you create an embedding, ensure that the following prerequisites are met:

Parameters for embeddings (see the example request after this list):

  • Common parameters:
    • engine_name: The video understanding engine you want to use. Example: "Marengo-retrieval-2.6".
  • Text embeddings:
    • text: Text for which to create an embedding.
  • Image embeddings:
    Provide one of the following:
    • image_url: Publicly accessible URL of your image file.
    • image_file: Local image file.
  • Audio embeddings:
    Provide one of the following:
    • audio_url: Publicly accessible URL of your audio file.
    • audio_file: Local audio file.
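
As a rough sketch, the request below combines the text, image, and audio parameters described above in a single call. Only the parameter names and the engine name come from this page; the endpoint URL, header name, and media URLs are placeholders you should replace with the values from the API reference.

```python
import os
import requests

# Placeholder endpoint and auth header name: substitute the values from the API reference.
EMBED_URL = "https://api.example.com/embed"
headers = {"x-api-key": os.environ["API_KEY"]}

# Request text, image, and audio embeddings in one call by combining the
# per-modality parameters. Use image_file/audio_file (via the files= argument)
# instead of the *_url fields when uploading local files.
data = {
    "engine_name": "Marengo-retrieval-2.6",
    "text": "A dog playing fetch on the beach",
    "image_url": "https://example.com/dog.jpg",   # placeholder URL
    "audio_url": "https://example.com/waves.mp3", # placeholder URL
}

response = requests.post(EMBED_URL, headers=headers, data=data)
response.raise_for_status()
print(response.json().keys())
```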

NOTES:

  • The "Marengo-retrieval-2.6" video understanding engine generates embeddings for all modalities in the same latent space. This shared space enables any-to-any searches across different types of content (see the sketch after these notes).
  • You can create multiple types of embeddings in a single API call.
  • Audio embeddings combine generic sound and human speech into a single embedding. For videos with transcriptions, you can retrieve the transcription and then create a text embedding from it.
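
To illustrate what the shared latent space enables, the sketch below compares a text embedding with an audio embedding using cosine similarity. The vectors are made-up placeholders standing in for embeddings returned by the API; any similarity metric you prefer works the same way.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Because all modalities share one latent space, a text embedding can be
# compared directly with an image or audio embedding.
text_vec = [0.12, -0.03, 0.88]   # illustrative values, not real embeddings
audio_vec = [0.10, 0.01, 0.91]   # illustrative values, not real embeddings
print(cosine_similarity(text_vec, audio_vec))
```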

🚧 Important

The response includes breaking changes that might require updates to your application code.
Common changes:

  • The is_success boolean flag has been removed.

Media-specific changes:

  • Text and audio: The embedding vectors are now nested under an array named segments (see the sketch below).
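
The sketch below shows how code that previously read a top-level vector (or checked is_success) would adapt to the new nesting. Only the segments key comes from this note; the surrounding key names used here ("text_embedding", "embedding") are assumptions, so verify them against a real response.

```python
# Hypothetical response shape, shown only to illustrate the new nesting:
# text and audio embedding vectors now sit inside a "segments" array.
result = {
    "text_embedding": {
        "segments": [
            {"embedding": [0.12, -0.03, 0.88]},  # placeholder vector
        ]
    }
}

# Instead of reading one top-level vector, iterate over the segments array.
for segment in result["text_embedding"]["segments"]:
    vector = segment["embedding"]
    print(len(vector))
```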