An engine indexes your videos and allows you to find specific moments in your video library.

While using the API, you interact with an engine in the following ways:

  • Creating indexes: Each engine uses its own family of deep learning models to index videos. When you create an index, you assign it to an engine and specify how it will process your videos by passing the index_options parameter in the body of the request. These settings apply to all the videos you upload to your index and cannot be changed after the index is created.

  • Performing searches: When you perform a search, you pass at least the following parameters:

    • Your search query. Note that the API supports full natural-language search. The following examples are valid queries: "birds flying near a castle", "sun shining on water", "chickens on the road", "an officer holding a child's hand", "crowd cheering in the stadium".
    • The unique identifier of the index that you want to search
    • The source of information the engine uses when performing a search. For details, see the Search options page.

    The engine uses these parameters to find the moments in your videos that match your requirements and returns an array of objects. The structure and fields of the response depend on whether you use simple or combined queries; they are described on the API Reference > Search and API Reference > Combined queries pages, respectively.
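As an illustration, the two interactions above can be sketched in Python. The helper functions below only build the request bodies; the base URL, endpoint paths, the engine_id parameter name, and the x-api-key header shown in the comments are assumptions for illustration and should be verified against the API Reference.

```python
API_BASE = "https://api.twelvelabs.io/v1.2"  # assumed base URL

def build_create_index_payload(index_name, engine_id="marengo2",
                               index_options=("visual",)):
    """Body for creating an index. index_options applies to every video
    uploaded to the index and cannot be changed afterwards."""
    return {
        "index_name": index_name,
        "engine_id": engine_id,
        "index_options": list(index_options),
    }

def build_search_payload(query, index_id, search_options=("visual",)):
    """Body for a simple search: a natural-language query, the unique
    identifier of the index, and the sources of information (search
    options) the engine should use."""
    return {
        "query": query,
        "index_id": index_id,
        "search_options": list(search_options),
    }

# Sending the requests would look roughly like this (requires a valid
# API key and the requests library; header name is an assumption):
# import requests
# headers = {"x-api-key": "YOUR_API_KEY"}
# index = requests.post(f"{API_BASE}/indexes",
#                       json=build_create_index_payload("my-index"),
#                       headers=headers).json()
# hits = requests.post(f"{API_BASE}/search",
#                      json=build_search_payload("birds flying near a castle",
#                                                "YOUR_INDEX_ID"),
#                      headers=headers).json()
```

Keeping payload construction separate from the HTTP call makes the required parameters easy to see and to test without network access.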

To handle different use cases and to improve the performance of the API service, Twelve Labs has developed the engines described in the sections below.

Marengo 2

The second version of the Marengo engine offers significant improvements over the first. Its state-of-the-art AI technology provides a 70% relative performance increase for visual searches compared to the first version.



Twelve Labs strongly recommends that you use marengo2.

Marengo

Marengo is the engine that was available when the API service launched. It allows you to find the exact moments in your videos by writing semantic queries in everyday language.