Embeddings for indexed videos
The platform allows you to retrieve embeddings for videos you’ve already uploaded and indexed. The embeddings are generated using video scene detection, which segments videos into semantically meaningful parts by identifying boundaries between scenes, where a scene is a series of frames depicting a continuous action or theme. Each segment is between 2 and 10 seconds long.
Prerequisites
Your video must be indexed with the Marengo video understanding model version 2.7 or later. For details on enabling this model for an index, see the Create an index page.
Complete example
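The following is a minimal Python sketch of the complete example (the platform also documents a Node.js equivalent). The client setup, package name, and the TL_API_KEY environment variable are assumptions; the function name, parameters, and response fields are those described in the Code explanation section below.

```python
import os

from twelvelabs import TwelveLabs  # assumed SDK package and client class

# Assumption: your API key is available in the TL_API_KEY environment variable.
client = TwelveLabs(api_key=os.environ["TL_API_KEY"])

# Retrieve the visual, audio, and transcription embeddings for an indexed video.
indexed_asset = client.indexes.indexed_assets.retrieve(
    index_id="YOUR_INDEX_ID",          # placeholder: ID of the index
    indexed_asset_id="YOUR_VIDEO_ID",  # placeholder: ID of the indexed video
    embedding_option=["visual", "audio", "transcription"],
)

# Print one line per segment: embedding type, scope, time range, and vector length.
for segment in indexed_asset.embedding.video_embedding.segments:
    print(
        f"{segment.embedding_option} ({segment.embedding_scope}): "
        f"{segment.start_offset_sec}s-{segment.end_offset_sec}s, "
        f"{len(segment.float_)} dimensions"
    )
```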
Code explanation
Retrieve the embeddings
Retrieve the embeddings for an indexed video.
Function call: You call the indexes.indexed_assets.retrieve function.
Parameters:
- index_id: The unique identifier of the index containing your video.
- indexed_asset_id: The unique identifier of your indexed video.
- embedding_option: The types of embeddings to retrieve. Valid values are visual, audio, and transcription. You can specify multiple options. This example uses ["visual", "audio", "transcription"]. See the Embedding options section for details.
Return value: The response includes, among other information, an object named embedding that holds the embedding data for your video. The embedding.video_embedding.segments field is a list of segment objects. Each segment object includes the following fields:
- float_: The embedding vector (a list of floats).
- embedding_option: The type of embedding (visual, audio, or transcription).
- embedding_scope: The scope of the embedding (clip).
- start_offset_sec: The start time of the segment in seconds.
- end_offset_sec: The end time of the segment in seconds.
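To illustrate how you might consume these fields, here is a short sketch that filters the segments by embedding type and compares two visual segment vectors using cosine similarity. It assumes indexed_asset was retrieved as in the complete example above and that NumPy is installed; the cosine_similarity helper is hypothetical, not part of the SDK.

```python
import numpy as np

# Assumption: `indexed_asset` was retrieved as shown in the complete example.
segments = indexed_asset.embedding.video_embedding.segments

# Keep only the visual embeddings.
visual = [s for s in segments if s.embedding_option == "visual"]

def cosine_similarity(a, b):
    """Hypothetical helper: cosine similarity between two embedding vectors."""
    a = np.asarray(a, dtype=np.float32)
    b = np.asarray(b, dtype=np.float32)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

if len(visual) >= 2:
    sim = cosine_similarity(visual[0].float_, visual[1].float_)
    print(f"Similarity between the first two visual segments: {sim:.3f}")
```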