Provide transcriptions from URLs

In this guide, you'll learn how to upload a video and provide a transcription when both files are available at public URLs. The platform retrieves the files directly from the specified URLs, so your application doesn't need to store them locally and upload the bytes. For a description of each field in the request and response, see the API Reference > Create a video indexing task page.

Prerequisites

  • You’re familiar with the concepts that are described on the Platform overview page.
  • You’ve created at least one index, and the unique identifier of your index is stored in a variable named INDEX_ID. For details, see the Create indexes page.
  • Your transcription must be in the SRT or VTT format.
  • Your video must meet the following requirements:
    • Video resolution: Must be greater than or equal to 360p and less than or equal to 4K.
    • Duration: For Marengo, it must be between 4 seconds and 2 hours (7,200s). For Pegasus, it must be between 5 seconds and 30 minutes (1,800s).
    • File size: Must not exceed 2 GB.
      If you require different options, send us an email at [email protected].
    • Audio track: If the conversation engine option is selected, the video you're uploading must contain an audio track.
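
The requirements above can be checked locally before you create a task. The following is a minimal sketch, not part of the platform API: the function name `check_prerequisites` and its parameters are hypothetical, 4K is assumed to mean a height of 2160 pixels, 2 GB is assumed to mean 2 × 1024³ bytes, and the transcription format is inferred only from the file extension in the URL.

```python
from urllib.parse import urlparse

# Limits taken from the prerequisites list.
MIN_HEIGHT = 360                 # 360p
MAX_HEIGHT = 2160                # assumption: "4K" means 2160 pixels high
MAX_SIZE_BYTES = 2 * 1024 ** 3   # assumption: 2 GB counted in binary units
DURATION_LIMITS_S = {            # (minimum, maximum) in seconds, per engine
    "marengo": (4, 7200),
    "pegasus": (5, 1800),
}
TRANSCRIPTION_EXTENSIONS = (".srt", ".vtt")


def check_prerequisites(height, duration_s, size_bytes, transcription_url,
                        engine="marengo"):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    if not MIN_HEIGHT <= height <= MAX_HEIGHT:
        problems.append(f"height {height} is outside {MIN_HEIGHT}-{MAX_HEIGHT}")
    low, high = DURATION_LIMITS_S[engine]
    if not low <= duration_s <= high:
        problems.append(f"duration {duration_s}s is outside {low}-{high}s")
    if size_bytes > MAX_SIZE_BYTES:
        problems.append(f"file size {size_bytes} exceeds {MAX_SIZE_BYTES} bytes")
    # Look only at the URL path so that query strings don't affect the check.
    path = urlparse(transcription_url).path.lower()
    if not path.endswith(TRANSCRIPTION_EXTENSIONS):
        problems.append("transcription must be an SRT or VTT file")
    return problems


print(check_prerequisites(480, 810.84, 150_000_000,
                          "https://example.com/captions.srt"))  # prints []
```

Returning a list of problems instead of raising on the first one lets you report everything that needs fixing in a single pass.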

📘

Note

For consistent search results, Twelve Labs recommends you upload 360p videos.

Procedure
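
The snippets in this procedure use `API_URL` and `headers` without defining them, and the Python versions also call `requests`, `time`, and `pprint`. A minimal Python sketch of that setup follows. The environment variable names are hypothetical, the placeholder values must be replaced with your own, and the `x-api-key` header name is an assumption that you should verify against the API Reference.

```python
import os
import time                # used by the polling loop in step 5
from pprint import pprint  # used to print the JSON responses

# The procedure also needs the third-party requests package:
#   import requests

# Hypothetical environment variable names; the fallbacks are placeholders.
API_URL = os.environ.get("TWELVE_LABS_API_URL", "<YOUR_API_URL>")
API_KEY = os.environ.get("TWELVE_LABS_API_KEY", "<YOUR_API_KEY>")

# Assumption: the platform authenticates requests with an x-api-key header.
headers = {"x-api-key": API_KEY}
```

Reading the key from the environment keeps it out of your source code.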

  1. Declare the /tasks endpoint:

    TASKS_URL = f"{API_URL}/tasks"
    
    const TASKS_URL = `${API_URL}/tasks`
    
  2. To provide a transcription file, set the provide_transcription parameter to true and specify the index ID, the language of your video, and the URLs of your video and transcription file. If you're using Python, declare a dictionary named data and use it to store all the required parameters. If you're using Node.js, declare a variable named formData of type FormData and use it to store all the required parameters.

    data = {
      "provide_transcription": True,
      "index_id": INDEX_ID,
      "language": "en",
      "video_url": "<YOUR_VIDEO_URL>",
      "transcription_url": "<YOUR_TRANSCRIPTION_URL>"
    }
    
    let formData = new FormData()
    formData.append('provide_transcription', 'true')
    formData.append('index_id', INDEX_ID)
    formData.append('language', 'en')
    formData.append('video_url', '<YOUR_VIDEO_URL>')
    formData.append('transcription_url', '<YOUR_TRANSCRIPTION_URL>')
    
  3. Upload your video. Call the POST method of the /tasks endpoint and store the result in a variable named response:

    response = requests.post(TASKS_URL, headers=headers, data=data)
    
    let config = {
        method: 'post',
        url: TASKS_URL,
        headers: headers,
        data : formData
     }
    let resp = await axios(config)
    let response = resp.data
    
  4. Store the ID of your task in a variable named TASK_ID and print the status code and the response:

    TASK_ID = response.json().get("_id")
    print(f"Status code: {response.status_code}")
    pprint(response.json())
    
    const TASK_ID = response._id
    console.log(`Status code: ${resp.status}`)
    console.log(response)
    
  5. (Optional) You can use the GET method of the /tasks/{_id} endpoint to monitor the indexing process. Construct the URL for retrieving the status of your video indexing task based on the TASK_ID variable you’ve declared in the previous step, and wait until the status shows as ready:

    TASK_STATUS_URL = f"{API_URL}/tasks/{TASK_ID}"
    # Poll every 10 seconds until the task status is "ready"
    while True:
        response = requests.get(TASK_STATUS_URL, headers=headers)
        STATUS = response.json().get("status")
        if STATUS == "ready":
            # Print the HTTP status code, then the task details
            print(f"Status code: {response.status_code}")
            pprint(response.json())
            break
        time.sleep(10)
    
    const TASK_STATUS_URL = `${API_URL}/tasks/${TASK_ID}`
    config = {
      method: 'get',
      url: TASK_STATUS_URL,
      headers: headers,
    }
    let STATUS
    // Poll every 10 seconds until the task status is 'ready'
    do {
      resp = await axios(config)
      response = resp.data
      STATUS = response.status
      if (STATUS !== 'ready')
        await new Promise(r => setTimeout(r, 10000))
    } while (STATUS !== 'ready')
    // Print the HTTP status code, then the task details
    console.log(`Status code: ${resp.status}`)
    console.log(response)
    

    The output should look similar to the following:

    Status code: 200
    {
      "_id": "6391c89afb14854546891276",
      "created_at": "2022-12-08T11:20:58.859Z",
      "estimated_time": "2022-12-08T11:34:54.585Z",
      "index_id": "6391c88bfb14854546891275",
      "metadata": {
        "duration": 810.84,
        "filename": "best-racing-moments.mp4",
        "height": 480,
        "width": 854
      },
      "status": "ready",
      "type": "index_task_info",
      "updated_at": "2022-12-08T11:34:50.474Z",
      "video_id": "6391c8a669ff3402ec515aba"
    }
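
The timestamps in the response show how much time passed between the creation of the task and its last update. The following is a minimal sketch, assuming that `created_at` and `updated_at` always use the UTC format shown in the sample output. The helper names `parse_timestamp` and `elapsed_seconds` are hypothetical.

```python
from datetime import datetime, timezone

# Format of the timestamps in the sample output, for example
# "2022-12-08T11:20:58.859Z" (assumed to be constant across responses).
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def parse_timestamp(value):
    """Parse a timestamp string into a timezone-aware UTC datetime."""
    return datetime.strptime(value, TIMESTAMP_FORMAT).replace(tzinfo=timezone.utc)


def elapsed_seconds(task):
    """Seconds between the creation of a task and its last update."""
    created = parse_timestamp(task["created_at"])
    updated = parse_timestamp(task["updated_at"])
    return (updated - created).total_seconds()


# Values copied from the sample output.
task = {
    "created_at": "2022-12-08T11:20:58.859Z",
    "updated_at": "2022-12-08T11:34:50.474Z",
}
print(f"{elapsed_seconds(task):.3f}")  # prints 831.615
```

For the sample task, this is just under 14 minutes for a video that is about 13.5 minutes long.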
    

📘

Note

For details about the possible statuses, see the API Reference > Tasks page.
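
The polling loop in step 5 exits only when the status is ready, so it runs forever if the task ends in any other terminal state. The following is a minimal sketch of a bounded loop. It assumes that `failed` is a terminal status (check the Tasks page for the actual list), and the names `wait_for_task` and `fetch_status` are hypothetical. The status lookup is passed in as a function so that the sketch doesn't depend on a particular HTTP client.

```python
import time

# Assumption: statuses after which the task will never become ready.
TERMINAL_FAILURES = {"failed"}


def wait_for_task(fetch_status, timeout_s=3600, interval_s=10, sleep=time.sleep):
    """Call fetch_status() until it returns "ready".

    Raises RuntimeError on a terminal failure status and TimeoutError
    once roughly timeout_s seconds have been spent waiting.
    """
    waited = 0
    while True:
        status = fetch_status()
        if status == "ready":
            return status
        if status in TERMINAL_FAILURES:
            raise RuntimeError(f"Task ended with status: {status}")
        if waited >= timeout_s:
            raise TimeoutError(f"Task not ready after {waited}s (status: {status})")
        sleep(interval_s)
        waited += interval_s
```

With the variables from step 5, the lookup could be `lambda: requests.get(TASK_STATUS_URL, headers=headers).json().get("status")`, passed as the first argument to `wait_for_task`.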