This guide shows how you can create text embeddings.

The following table lists the available models for generating text embeddings and their key characteristics:

Model	Description	Dimensions	Max tokens	Similarity metric
Marengo-retrieval-2.7	Use this model to create embeddings that you can use in various downstream tasks	1024	77	Cosine similarity

The “Marengo-retrieval-2.7” video understanding model generates embeddings for all modalities in the same latent space. This shared space enables any-to-any searches across different types of content.
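Because the table lists cosine similarity as the metric for these embeddings, comparing two of them comes down to the cosine of the angle between the vectors. The following is a minimal sketch in plain Python, not part of the SDK: the helper name cosine_similarity and the toy 3-dimensional vectors are illustrative assumptions that stand in for real 1024-dimensional embeddings.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors (assumption): real embeddings from this model have 1024 values.
query = [1.0, 0.0, 0.0]
same_direction = [2.0, 0.0, 0.0]
orthogonal = [0.0, 1.0, 0.0]

print(cosine_similarity(query, same_direction))  # 1.0
print(cosine_similarity(query, orthogonal))      # 0.0
```

Values close to 1 indicate vectors that point in nearly the same direction, and values close to 0 indicate unrelated directions.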

Prerequisites

To use the platform, you need an API key:

1. If you don’t have an account, sign up for a free account.
2. Go to the API Key page.
3. Select the Copy icon next to your key.

Ensure the TwelveLabs SDK is installed on your computer:

$ pip install twelvelabs

Complete example

This complete example shows how you can create text embeddings. Ensure you replace the placeholders surrounded by <> with your values.

from typing import List

from twelvelabs import TwelveLabs
from twelvelabs.types import BaseSegment

# 1. Initialize the client
client = TwelveLabs(api_key="<YOUR_API_KEY>")

# 2. Create text embeddings
res = client.embed.create(
    model_name="Marengo-retrieval-2.7",
    text="<YOUR_TEXT>",
    # text_truncate="start"
)

# 3. Process the results
def print_segments(segments: List[BaseSegment], max_elements: int = 5):
    for segment in segments:
        first_few = segment.float_[:max_elements]
        print(
            f"  embeddings: [{', '.join(str(x) for x in first_few)}...] (total: {len(segment.float_)} values)"
        )

print("Created text embedding")
if res.text_embedding is not None and res.text_embedding.segments is not None:
    print_segments(res.text_embedding.segments)

Step-by-step guide

Import the SDK and initialize the client

Create a client instance to interact with the TwelveLabs Video Understanding Platform.
Function call: You call the constructor of the TwelveLabs class.
Parameters:

api_key: The API key to authenticate your requests to the platform.

Return value: An object of type TwelveLabs configured for making API calls.
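One way to avoid hard-coding the key when you call the constructor is to read it from an environment variable. The following is a sketch under assumptions: the helper load_api_key and the variable name TWELVE_LABS_API_KEY are hypothetical choices for illustration, not names defined by the SDK.

```python
import os


def load_api_key(env_var: str = "TWELVE_LABS_API_KEY") -> str:
    """Read the API key from an environment variable (name is an assumption)."""
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable to your API key")
    return api_key


# Usage with the SDK (requires `pip install twelvelabs`):
# from twelvelabs import TwelveLabs
# client = TwelveLabs(api_key=load_api_key())
```

Failing early with a clear message makes a missing key easier to diagnose than an authentication error on the first request.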

Create text embeddings

Function call: You call the embed.create function.
Parameters:

model_name: The name of the model you want to use (“Marengo-retrieval-2.7”).
text: The text for which you wish to create an embedding.
(Optional) text_truncate: A string that specifies how the platform truncates text that exceeds 77 tokens to fit the maximum length allowed for an embedding. This parameter can take one of the following values:
- start: The platform will truncate the start of the provided text.
- end: The platform will truncate the end of the provided text. This is the default value.
- none: The platform will return an error if the text is longer than the maximum token limit.
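The three text_truncate values can be illustrated locally with a small sketch. It is not the platform's implementation: the helper truncate_tokens is hypothetical, and the tokens here are plain list items, whereas the platform applies its own tokenizer, so real token counts will differ.

```python
from typing import List

MAX_TOKENS = 77  # limit stated for Marengo-retrieval-2.7


def truncate_tokens(
    tokens: List[str], mode: str = "end", max_tokens: int = MAX_TOKENS
) -> List[str]:
    """Mimic the described start / end / none behavior on a list of tokens."""
    if len(tokens) <= max_tokens:
        return list(tokens)  # nothing to truncate
    if mode == "end":
        # Drop the end of the text: keep the first max_tokens tokens.
        return tokens[:max_tokens]
    if mode == "start":
        # Drop the start of the text: keep the last max_tokens tokens.
        return tokens[len(tokens) - max_tokens:]
    if mode == "none":
        raise ValueError(
            f"text has {len(tokens)} tokens, which exceeds the limit of {max_tokens}"
        )
    raise ValueError(f"unknown mode: {mode!r}")


tokens = [f"t{i}" for i in range(100)]
print(truncate_tokens(tokens, "end")[0], truncate_tokens(tokens, "end")[-1])
print(truncate_tokens(tokens, "start")[0], truncate_tokens(tokens, "start")[-1])
```

Use start when the most important content sits at the end of your text, and none when silently losing content is not acceptable.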

Return value: The response contains the following fields:

text_embedding: An object that contains the embedding data for your text. It includes the following fields:
- segments: A list of segment objects. Each segment contains the following:
  - float_: An array of floats representing the embedding.
- metadata: An object that contains metadata about the embedding.
model_name: The name of the video understanding model the platform has used to create this embedding.
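The fields described above can be read defensively, since text_embedding and segments may be absent. The following sketch uses SimpleNamespace stand-ins rather than real SDK objects so that it runs without an API key; the helper collect_vectors and the sample values are illustrative assumptions.

```python
from types import SimpleNamespace
from typing import Any, List


def collect_vectors(res: Any) -> List[List[float]]:
    """Return one list of floats per segment, or an empty list if none exist."""
    text_embedding = getattr(res, "text_embedding", None)
    if text_embedding is None or text_embedding.segments is None:
        return []
    return [list(segment.float_) for segment in text_embedding.segments]


# Stand-in response (assumption): mirrors only the fields described above.
fake_res = SimpleNamespace(
    model_name="Marengo-retrieval-2.7",
    text_embedding=SimpleNamespace(
        segments=[SimpleNamespace(float_=[0.1, 0.2, 0.3])],
        metadata=None,
    ),
)

vectors = collect_vectors(fake_res)
print(fake_res.model_name, len(vectors), len(vectors[0]))
```

With a real response from embed.create, the same attribute names apply, and each vector is expected to contain 1024 values.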

Process the results

This example prints the first five values of each embedding segment, followed by the total number of values in that segment, to the standard output.