Create sync embeddings
The Embed.V2 interface provides methods to create embeddings synchronously for multimodal content. The endpoint returns the embeddings immediately in the response.
Note
This interface only supports Marengo version 3.0 or newer.
When to use this interface:
- Create embeddings for text, images, audio, or video content
- Get immediate results without waiting for background processing
- Process audio or video content up to 10 minutes in duration
Do not use this interface for:
- Audio or video content longer than 10 minutes. Use the embed.v2.create method instead.
Methods
Create embeddings
Description: This method synchronously creates embeddings for multimodal content and returns the results immediately in the response.
Input requirements
Text:
- Maximum length: 500 tokens
Images:
- Formats: JPEG, PNG
- Minimum size: 128x128 pixels
- Maximum file size: 5 MB
Audio and video:
- Maximum duration: 10 minutes
- Maximum file size for base64-encoded strings: 36 MB
- Audio formats: WAV (uncompressed), MP3 (lossy), FLAC (lossless)
- Video formats: Any format supported by FFmpeg
- Video resolution: 360x360 to 3840x2160 pixels
- Aspect ratio: Between 1:2.4 and 2.4:1
Function signature and example:
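The example below is a minimal sketch using the TwelveLabs Node.js SDK. The method name (createSync), the model name, and the request field names are assumptions for illustration; confirm the exact signature against the API reference linked at the end of this page.

```typescript
import { TwelveLabs } from "twelvelabs-js";

// Placeholder names: the method (createSync), the model name, and the request
// fields below are illustrative only; use the exact names from this reference.
const client = new TwelveLabs({ apiKey: process.env.TWELVELABS_API_KEY! });

async function main(): Promise<void> {
  const response = await client.embed.v2.createSync({
    modelName: "Marengo-retrieval-3.0", // Marengo 3.0 or newer is required
    inputType: "text",
    text: {
      text: "A person surfing at sunset", // up to 500 tokens
    },
  });
  console.log(response);
}

main().catch(console.error);
```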
Parameters:
TextInputRequest
The TextInputRequest interface specifies configuration for processing text content. Required when inputType is text.
ImageInputRequest
The ImageInputRequest interface specifies configuration for processing image content. Required when inputType is image.
TextImageInputRequest
The TextImageInputRequest interface specifies configuration for processing combined text and image content. Required when inputType is text_image.
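As a sketch, and continuing from the client created in the example above, a combined text-and-image request might be constructed like this; the textImage field name and its nested shape are assumptions:

```typescript
// Hypothetical combined text-and-image request; the textImage field name and
// its nested shape are assumptions for illustration.
const textImageResponse = await client.embed.v2.createSync({
  modelName: "Marengo-retrieval-3.0",
  inputType: "text_image",
  textImage: {
    text: "red sports car parked outside a cafe",
    image: {
      mediaSource: { url: "https://example.com/car.jpg" }, // JPEG or PNG, at least 128x128, up to 5 MB
    },
  },
});
```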
AudioInputRequest
The AudioInputRequest interface specifies configuration for processing audio content. Required when inputType is audio.
VideoInputRequest
The VideoInputRequest interface specifies configuration for processing video content. Required when inputType is video.
MediaSource
The MediaSource interface specifies the source of the media file. Provide exactly one of the following:
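For illustration, a media source could be supplied either by URL or as a base64-encoded string; the url and base64String field names here are assumptions, and only one source should be set per request:

```typescript
import { readFileSync } from "node:fs";

// Option 1: media hosted at a publicly accessible URL (assumed field name: url).
const fromUrl = { url: "https://example.com/clip.mp4" };

// Option 2: media passed inline as a base64-encoded string
// (assumed field name: base64String; keep the encoded payload within 36 MB).
const fromBase64 = { base64String: readFileSync("clip.mp4").toString("base64") };
```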
AudioSegmentation
The AudioSegmentation interface specifies how the platform divides the audio into segments using fixed-length intervals.
AudioSegmentationFixed
The AudioSegmentationFixed interface configures fixed-length segmentation for audio.
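A sketch of an audio request with fixed-length segments, continuing from the client above; the audio, segmentation, and durationSec names are assumptions:

```typescript
// Hypothetical audio request with 10-second fixed-length segments; the audio,
// segmentation, and durationSec names are assumptions.
const audioResponse = await client.embed.v2.createSync({
  modelName: "Marengo-retrieval-3.0",
  inputType: "audio",
  audio: {
    mediaSource: { url: "https://example.com/podcast-clip.mp3" }, // up to 10 minutes
    segmentation: { durationSec: 10 },
  },
});
```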
VideoSegmentation
The VideoSegmentation type specifies how the platform divides the video into segments. Use one of the following:
- Fixed segmentation: Divides the video into equal-length segments. See VideoSegmentationFixed below.
- Dynamic segmentation: Divides the video into adaptive segments based on scene changes. See VideoSegmentationDynamic below.
VideoSegmentationFixed
The VideoSegmentationFixed interface configures fixed-length segmentation for video.
VideoSegmentationDynamic
The VideoSegmentationDynamic interface configures dynamic segmentation for video based on scene changes.
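A sketch of a video request showing the two segmentation options, continuing from the client above; the strategy and durationSec names are assumptions:

```typescript
// Hypothetical video request; the strategy and durationSec names are assumptions.
const videoResponse = await client.embed.v2.createSync({
  modelName: "Marengo-retrieval-3.0",
  inputType: "video",
  video: {
    mediaSource: { url: "https://example.com/demo.mp4" }, // up to 10 minutes
    // Fixed segmentation: equal-length segments.
    segmentation: { strategy: "fixed", durationSec: 6 },
    // Dynamic segmentation alternative (segments follow scene changes):
    // segmentation: { strategy: "dynamic" },
  },
});
```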
Return value: Returns a Promise that resolves to an EmbeddingSuccessResponse object containing the embedding results.
The EmbeddingSuccessResponse interface contains the following properties:
The EmbeddingData interface contains the following properties:
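As a sketch of reading the results from the videoResponse above, assuming the response exposes a data array of EmbeddingData items with embedding, startSec, and endSec properties:

```typescript
// Hypothetical response traversal; the data, embedding, startSec, and endSec
// property names are assumptions.
for (const item of videoResponse.data ?? []) {
  console.log(`segment ${item.startSec}-${item.endSec}s: ${item.embedding.length} dimensions`);
}
```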
API Reference: Create sync embeddings