This guide shows how you can create text embeddings.

The following table lists the available models for generating text embeddings and their key characteristics:

Model	Description	Dimensions	Max tokens	Similarity metric
Marengo-retrieval-2.6	Use this model to create embeddings that you can use in various downstream tasks	1024	77	Cosine similarity

The “Marengo-retrieval-2.6” video understanding engine generates embeddings for all modalities in the same latent space. This shared space enables any-to-any searches across different types of content.
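Because all embeddings share one latent space and the table lists cosine similarity as the metric, comparing two embeddings comes down to a single calculation. The sketch below is a minimal, standard-library illustration of that calculation. The helper name cosine_similarity and the toy four-dimensional vectors are assumptions made for readability; real embeddings from this model have 1024 dimensions.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("Vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("Cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors standing in for real embeddings (hypothetical values).
query_embedding = [0.1, 0.3, -0.2, 0.4]
candidates = {
    "clip_a": [0.1, 0.29, -0.21, 0.41],
    "clip_b": [-0.4, 0.1, 0.3, -0.2],
}

# Rank candidates from most to least similar to the query.
ranked = sorted(
    candidates,
    key=lambda name: cosine_similarity(query_embedding, candidates[name]),
    reverse=True,
)
print(ranked)
```

A higher score means the two items are closer in the shared space, so the same ranking step could apply whether the candidates come from text or from another modality.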

To create text embeddings, invoke the create method of the embed class, specifying the following parameters:

engine_name: The name of the video understanding engine you want to use.
text: The text for which you want to create an embedding.
(Optional) text_truncate: Specifies the behavior for text that exceeds 77 tokens. It can take one of the following values:
- start: Truncate the beginning of the text.
- end: Truncate the end of the text (default).
- none: Return an error if the text exceeds the token limit.
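To make the three text_truncate values concrete, the following sketch simulates them locally on a list of tokens. It is not the platform's implementation: the function name apply_text_truncate is hypothetical, and pre-split tokens stand in for the platform's tokenizer, which this guide does not describe.

```python
from typing import List

MAX_TOKENS = 77  # token limit stated in the model table


def apply_text_truncate(tokens: List[str], mode: str = "end", limit: int = MAX_TOKENS) -> List[str]:
    """Simulate the documented truncation modes on a token list."""
    if mode not in ("start", "end", "none"):
        raise ValueError(f"Unsupported text_truncate value: {mode!r}")
    if len(tokens) <= limit:
        return list(tokens)  # within the limit: nothing to truncate
    if mode == "none":
        raise ValueError(f"Text has {len(tokens)} tokens; the limit is {limit}")
    if mode == "start":
        return tokens[-limit:]  # drop tokens from the beginning
    return tokens[:limit]  # "end": drop tokens from the end


# 80 placeholder tokens, three more than the limit.
tokens = [f"w{i}" for i in range(80)]
kept_end = apply_text_truncate(tokens, "end")
kept_start = apply_text_truncate(tokens, "start")
print(kept_end[0], kept_end[-1])      # w0 w76
print(kept_start[0], kept_start[-1])  # w3 w79
```

With start, the tail of the text survives; with end, the head survives. Which one fits depends on where the most informative part of your text sits.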

The response is an object containing the following fields:

engine_name: The name of the engine the platform has used to create this embedding.
text_embedding: An object that contains the embedding.
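The sketch below shows one way to pull the vector out of a response with this shape and check its length against the 1024 dimensions listed in the table. The helper first_embedding_vector is hypothetical, the segments and embeddings_float attribute names are taken from the example later in this guide, and the response here is a locally built stand-in, not a real API result.

```python
from types import SimpleNamespace
from typing import List, Optional

EXPECTED_DIMENSIONS = 1024  # from the model table


def first_embedding_vector(response) -> Optional[List[float]]:
    """Return the first segment's float vector, or None if none is present."""
    text_embedding = getattr(response, "text_embedding", None)
    segments = getattr(text_embedding, "segments", None)
    if not segments:
        return None
    return list(segments[0].embeddings_float)


# Stand-in response object (assumption: mirrors the fields described above).
fake_response = SimpleNamespace(
    engine_name="Marengo-retrieval-2.6",
    text_embedding=SimpleNamespace(
        segments=[SimpleNamespace(embeddings_float=[0.0] * EXPECTED_DIMENSIONS)]
    ),
)

vector = first_embedding_vector(fake_response)
if vector is not None and len(vector) != EXPECTED_DIMENSIONS:
    raise ValueError(f"Unexpected embedding size: {len(vector)}")
print(len(vector))  # 1024
```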

For a description of each field in the request and response, see the Create embeddings for text, image, and audio page.

Prerequisites

You’re familiar with the concepts that are described on the Platform overview page.
To use the platform, you need an API key:
1. If you don’t have an account, sign up for a free account.
2. Go to the API Key page.
3. Select the Copy icon next to your key.
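Once you have copied the key, you may prefer not to paste it directly into source files. The following sketch reads it from an environment variable instead. The variable name TWELVE_LABS_API_KEY and the helper load_api_key are assumptions chosen for illustration; the platform does not require this name.

```python
import os


def load_api_key(env_var: str = "TWELVE_LABS_API_KEY") -> str:
    """Read the API key from the environment and fail early if it is missing."""
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable to your API key")
    return api_key
```

You could then pass the result wherever the guide uses the <YOUR_API_KEY> placeholder, for example TwelveLabs(api_key=load_api_key()).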

Example

The example code below creates a text embedding using the default behavior for handling text that is too long. Ensure you replace the placeholders surrounded by <> with your values.

from twelvelabs import TwelveLabs
from typing import List
from twelvelabs.models.embed import SegmentEmbedding


def print_segments(segments: List[SegmentEmbedding], max_elements: int = 5):
    # Print each segment's metadata and the first few values of its vector.
    for segment in segments:
        print(
            f"  embedding_scope={segment.embedding_scope} start_offset_sec={segment.start_offset_sec} end_offset_sec={segment.end_offset_sec}"
        )
        print(f"  embeddings: {segment.embeddings_float[:max_elements]}")


client = TwelveLabs(api_key="<YOUR_API_KEY>")

# text_truncate is omitted, so the default behavior (end) applies.
res = client.embed.create(
    engine_name="Marengo-retrieval-2.6",
    text="<YOUR_TEXT>",
)

print("Created a text embedding")
print(f" Engine: {res.engine_name}")
if res.text_embedding is not None and res.text_embedding.segments is not None:
    print_segments(res.text_embedding.segments)

The output should look similar to the following:

Created a text embedding
 Engine: Marengo-retrieval-2.6
  embedding_scope=None start_offset_sec=None end_offset_sec=None
  embeddings: [-0.0... (truncated)]