Create text, image, and audio embeddings

The resources.Embed class provides methods to create text, image, and audio embeddings.

Description: This method creates embeddings for text, image, and audio inputs. A single call can create embeddings for multiple input types.

Note that you must specify at least the following parameters:

  • model_name: The name of the video understanding model to use.

  • One or more of the following input types:

    • text: For text embeddings
    • audio_url or audio_file: For audio embeddings. If you specify both, the audio_url parameter takes precedence.
    • image_url or image_file: For image embeddings. If you specify both, the image_url parameter takes precedence.

You must provide at least one input type, but you can include multiple types in a single function call.

Function signature and example:

def create(
    self,
    model_name: Literal["Marengo-retrieval-2.7"],
    *,
    # Text parameters
    text: Optional[str] = None,
    text_truncate: Optional[Literal["none", "start", "end"]] = None,
    # Audio parameters
    audio_url: Optional[str] = None,
    audio_file: Union[str, BinaryIO, None] = None,
    # Image parameters
    image_url: Optional[str] = None,
    image_file: Union[str, BinaryIO, None] = None,
    **kwargs,
) -> models.CreateEmbeddingsResult
def print_segments(segments: List[SegmentEmbedding], max_elements: int = 5):
    for segment in segments:
        print(
            f"  embedding_scope={segment.embedding_scope} "
            f"start_offset_sec={segment.start_offset_sec} "
            f"end_offset_sec={segment.end_offset_sec}"
        )
        print(f"  embeddings: {segment.embeddings_float[:max_elements]}")
res = client.embed.create(
    model_name="Marengo-retrieval-2.7",
    text="<YOUR_TEXT>",
    text_truncate="start",
    audio_url="<YOUR_AUDIO_URL>",
    image_url="<YOUR_IMAGE_URL>",
)

print(f"Model: {res.model_name}")
if res.text_embedding is not None and res.text_embedding.segments is not None:
    print("Created text embeddings:")
    print_segments(res.text_embedding.segments)
if res.image_embedding is not None and res.image_embedding.segments is not None:
    print("Created image embeddings:")
    print_segments(res.image_embedding.segments)
if res.audio_embedding is not None and res.audio_embedding.segments is not None:
    print("Created audio embeddings:")
    print_segments(res.audio_embedding.segments)
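The example above uses publicly accessible URLs. To embed a local file instead, open it in binary mode and pass it via the `image_file` or `audio_file` parameter. The helper below is a hypothetical sketch (not part of the SDK) showing one way to dispatch between a URL and a local path, assuming an initialized `client`:

```python
def embed_image(client, source: str):
    """Create an image embedding from either a URL or a local file path.

    `client` is assumed to be an initialized TwelveLabs client;
    `embed_image` itself is a hypothetical helper, not an SDK method.
    """
    if source.startswith(("http://", "https://")):
        return client.embed.create(
            model_name="Marengo-retrieval-2.7",
            image_url=source,
        )
    # For local paths, pass an open binary file object as image_file.
    with open(source, "rb") as f:
        return client.embed.create(
            model_name="Marengo-retrieval-2.7",
            image_file=f,
        )
```

The same pattern applies to audio input with `audio_url` and `audio_file`.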

Parameters:

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| model_name | Literal["Marengo-retrieval-2.7"] | Yes | The name of the video understanding model to use. Example: "Marengo-retrieval-2.7". |
| text | str | No | The text for which you want to create an embedding. |
| text_truncate | Literal["none", "start", "end"] | No | Specifies how to truncate the text if it exceeds the maximum length of 77 tokens. |
| audio_url | str | No | A publicly accessible URL of your audio file. Takes precedence over audio_file if both are specified. |
| audio_file | Union[str, BinaryIO, None] | No | A local audio file. |
| image_url | str | No | A publicly accessible URL of your image file. Takes precedence over image_file if both are specified. |
| image_file | Union[str, BinaryIO, None] | No | A local image file. |
| **kwargs | dict | No | Additional keyword arguments for the request. |

Return value: Returns a models.CreateEmbeddingsResult object containing the embedding results.
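Each segment in the result exposes its vector as a plain list of floats in `embeddings_float`, so you can compare embeddings across modalities directly. The `cosine_similarity` function below is a hypothetical helper (not part of the SDK), sketched under the assumption that both vectors come from the same model and therefore have the same dimensionality:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# With a CreateEmbeddingsResult `res`, you might compare, for example:
# score = cosine_similarity(
#     res.text_embedding.segments[0].embeddings_float,
#     res.image_embedding.segments[0].embeddings_float,
# )
```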

API Reference: For a description of each field in the request and response, see the Create text embeddings page.

Related guides: