The video embedding object

The video embedding object has the following fields:

  • engine_name: A string representing the name of the video understanding engine the platform used to create the embedding.
  • video_embeddings: An array of objects containing your embeddings. Each object corresponds to an embedding and has the following fields:
    • start_offset_time: The start time of the video segment for this embedding. If the embedding scope is video, this field equals 0.
    • end_offset_time: The end time of the video segment for this embedding. If the embedding scope is video, this field equals the duration of the video.
    • embedding_scope: Indicates the scope of the embedding. It can take the following values:
      • video: The platform has generated the embedding for the entire video.
      • clip: The platform has generated the embedding for a specific segment of the video.
    • float: An array of floating point numbers representing an embedding. This array has 1024 dimensions, and you can use it with cosine similarity for various downstream tasks.
  • metadata: An object containing metadata about the video. This object contains the following fields:
    • input_filename: The name of the video file. The platform returns this field when you upload a video from your local file system.
    • input_url: The URL of the video. The platform returns this field when you upload a video from a publicly accessible URL.
    • video_clip_length: The duration of each clip in seconds, as specified by the video_clip_length parameter of the POST method of the /embed/task endpoint. Note that the platform automatically truncates video segments shorter than 2 seconds. For example, when a 31-second video is divided into 6-second segments, the final 1-second segment is truncated. This truncation applies only to the last segment, and only when it is shorter than the 2-second minimum.
    • video_embedding_scope: The scope of the video embedding. It can take one of the following values: ['clip'] or ['clip', 'video'].
    • duration: The total duration of the video in seconds.
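
The fields above can be illustrated with a minimal Python sketch. It assumes the object has already been parsed into a dictionary. The sample values (engine name, URL, offsets) are hypothetical placeholders, and the vectors are shortened to 4 dimensions for readability, whereas real embeddings have 1024. The helper names split_by_scope, cosine_similarity, and expected_clip_count are illustrative and not part of the platform or any SDK.

```python
import math

# Hypothetical sample shaped like the video embedding object described above.
# Vectors are shortened to 4 dimensions; real ones have 1024.
sample = {
    "engine_name": "example-engine",  # placeholder, not a real engine name
    "video_embeddings": [
        {"start_offset_time": 0, "end_offset_time": 6,
         "embedding_scope": "clip", "float": [1.0, 0.0, 0.0, 0.0]},
        {"start_offset_time": 6, "end_offset_time": 12,
         "embedding_scope": "clip", "float": [0.0, 1.0, 0.0, 0.0]},
        {"start_offset_time": 0, "end_offset_time": 12,
         "embedding_scope": "video", "float": [1.0, 1.0, 0.0, 0.0]},
    ],
    "metadata": {
        "input_url": "https://example.com/video.mp4",  # placeholder URL
        "video_clip_length": 6,
        "video_embedding_scope": ["clip", "video"],
        "duration": 12.0,
    },
}


def split_by_scope(embedding_object):
    """Separate clip-level embeddings from video-level embeddings."""
    items = embedding_object["video_embeddings"]
    clips = [e for e in items if e["embedding_scope"] == "clip"]
    videos = [e for e in items if e["embedding_scope"] == "video"]
    return clips, videos


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def expected_clip_count(duration, clip_length, min_length=2.0):
    """Number of clips, assuming a final segment shorter than min_length is dropped."""
    full_clips, remainder = divmod(duration, clip_length)
    count = int(full_clips)
    if remainder >= min_length:
        count += 1  # the trailing segment meets the minimum, so it is kept
    return count


clips, videos = split_by_scope(sample)
# Compare the first clip against the whole-video embedding.
similarity = cosine_similarity(clips[0]["float"], videos[0]["float"])
print(len(clips), len(videos), round(similarity, 4))
print(expected_clip_count(31, 6))  # 31-second video, 6-second clips: 5 clips
```

The last call reproduces the 31-second example from the video_clip_length description: five full 6-second segments are kept and the trailing 1-second segment is left out. The clip-count logic is an interpretation of that description, so treat it as an estimate rather than a guarantee of what the platform returns.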