Provide transcriptions from the local file system

In this guide, you'll learn how to upload a video and provide a transcription when both files reside on the local file system. For a description of each field in the request and response, see the API Reference > Create a video indexing task page.

Prerequisites

  • You’re familiar with the concepts that are described on the Platform overview page.
  • You’ve created at least one index, and the unique identifier of your index is stored in a variable named INDEX_ID. For details, see the Create indexes page.
  • Your transcription must be in the SRT or VTT format.
  • Your video must meet the following requirements:
    • Video resolution: Must be greater than or equal to 360p and less than or equal to 4K.
    • Duration: For Marengo, it must be between 4 seconds and 2 hours (7,200s). For Pegasus, it must be between 5 seconds and 30 minutes (1,800s).
    • File size: Must not exceed 2 GB.
      If you require different options, send us an email at [email protected].
    • Audio track: If the conversation engine option is selected, the video you're uploading must contain an audio track.
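
The file-level prerequisites above can be checked locally before you start an upload. The following is a minimal sketch, not part of the API: the function name `check_upload_files` and the constants are hypothetical, the 2 GB limit is assumed to mean 2 × 1024³ bytes, and the transcription format is inferred from the file extension only. It does not inspect resolution, duration, or the audio track.

```python
from pathlib import Path
from typing import List

# Assumption: "2 GB" is interpreted as 2 * 1024**3 bytes.
MAX_VIDEO_BYTES = 2 * 1024**3
# Assumption: the format is inferred from the extension, not the file contents.
TRANSCRIPTION_SUFFIXES = {".srt", ".vtt"}


def check_upload_files(video_path: str, transcription_path: str) -> List[str]:
    """Return a list of problems found; an empty list means the checks passed."""
    problems: List[str] = []
    video = Path(video_path)
    transcription = Path(transcription_path)

    if not video.is_file():
        problems.append(f"Video file not found: {video}")
    elif video.stat().st_size > MAX_VIDEO_BYTES:
        problems.append(f"Video file is larger than 2 GB: {video}")

    if not transcription.is_file():
        problems.append(f"Transcription file not found: {transcription}")
    elif transcription.suffix.lower() not in TRANSCRIPTION_SUFFIXES:
        problems.append(f"Transcription must be an SRT or VTT file: {transcription}")

    return problems
```

Calling `check_upload_files("<YOUR_VIDEO_FILE_PATH>", "<YOUR_TRANSCRIPTION_FILE_PATH>")` before the procedure below lets you stop early when the returned list is not empty.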

📘

Note

For consistent search results, Twelve Labs recommends you upload 360p videos.

Procedure

  1. Declare the /tasks endpoint:

    TASKS_URL = f"{API_URL}/tasks"
    
    const TASKS_URL = `${API_URL}/tasks`
    
  2. Read your video file. Open a stream, making sure to replace the placeholders surrounded by <> with your values:

    video_file_path = "<YOUR_VIDEO_FILE_PATH>"
    video_file_name = "<YOUR_VIDEO_FILE_NAME>"
    video_file_stream = open(video_file_path, "rb")
    
    const video_file_path = '<YOUR_VIDEO_FILE_PATH>'
    const video_file_stream = fs.createReadStream(video_file_path)
    
  3. Read your transcription file. Open a stream, making sure to replace the placeholders surrounded by <> with your values:

    transcription_file_path = "<YOUR_TRANSCRIPTION_FILE_PATH>"
    transcription_file_name = "<YOUR_TRANSCRIPTION_FILE_NAME>"
    transcription_file_stream = open(transcription_file_path, "rb")
    
    const transcription_file_path = '<YOUR_TRANSCRIPTION_FILE_PATH>'
    const transcription_file_stream = fs.createReadStream(transcription_file_path)
    
  4. To provide a transcription file, you must set the provide_transcription parameter to true and specify the index ID, the language of your video, and the video and transcription files.
    If you're using Python, store the provide_transcription parameter, the index ID, and the language of your video in a dictionary named data. Then, store the video and transcription files in a list named file_param.
    If you're using Node.js, store all the required parameters in a variable named formData of type FormData:

    data = {
       "provide_transcription": True,
       "index_id": INDEX_ID,
       "language": "en",
    }
    file_param = [
        ("video_file", (video_file_name, video_file_stream, "application/octet-stream")),
        ("transcription_file", (transcription_file_name, transcription_file_stream, "application/octet-stream")),
    ]
    
      let formData = new FormData()
      formData.append('provide_transcription', 'true')
      formData.append('index_id', INDEX_ID)
      formData.append('language', 'en')
      formData.append('video_file', video_file_stream)
      formData.append('transcription_file', transcription_file_stream)
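
The Python snippets in steps 2 and 3 open both files with `open()` and never close them. One possible way to make sure the streams are released after the upload is to open them through `contextlib.ExitStack`. This is a sketch under that assumption: the helper name `open_task_files` is hypothetical, and it derives the file names from the paths instead of using separate name variables.

```python
from contextlib import ExitStack
from pathlib import Path


def open_task_files(stack: ExitStack, video_path: str, transcription_path: str):
    """Open both files on the given stack and return a requests-style files list."""
    video = Path(video_path)
    transcription = Path(transcription_path)
    # Files registered on the stack are closed when the stack exits.
    video_stream = stack.enter_context(video.open("rb"))
    transcription_stream = stack.enter_context(transcription.open("rb"))
    return [
        ("video_file", (video.name, video_stream, "application/octet-stream")),
        ("transcription_file",
         (transcription.name, transcription_stream, "application/octet-stream")),
    ]


# Usage with the variables from this guide (shown as comments because it
# needs the requests package and a network connection):
#
# with ExitStack() as stack:
#     file_param = open_task_files(stack, video_file_path, transcription_file_path)
#     response = requests.post(TASKS_URL, headers=headers, data=data, files=file_param)
```

The upload call has to stay inside the `with` block, because both streams are closed as soon as the block ends.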
    
  5. Upload your video and transcription files. Call the POST method of the /tasks endpoint and store the result in a variable named response:

    response = requests.post(TASKS_URL, headers=headers, data=data, files=file_param)
    
    let config = {
          method: 'post',
          url: TASKS_URL,
          headers: headers,
          data : formData
    };
    let resp = await axios(config)
    let response = await resp.data
    
  6. Store the ID of your task in a variable named TASK_ID and print the status code and response:

    TASK_ID = response.json().get("_id")
    print(f"Status code: {response.status_code}")
    pprint(response.json())
    
    const TASK_ID = response._id
    console.log(`Status code: ${resp.status}`)
    console.log(response)
    
  7. (Optional) You can use the GET method of the /tasks/{_id} endpoint to monitor the indexing process. Construct the URL for retrieving the status of your video indexing task based on the TASK_ID variable you’ve declared in the previous step, and wait until the status shows as ready:

    TASK_STATUS_URL = f"{API_URL}/tasks/{TASK_ID}"
    # Poll every 10 seconds until the task status is "ready"
    while True:
        response = requests.get(TASK_STATUS_URL, headers=headers)
        STATUS = response.json().get("status")
        if STATUS == "ready":
            # Print the HTTP status code, not the task status
            print(f"Status code: {response.status_code}")
            pprint(response.json())
            break
        time.sleep(10)
    
    const TASK_STATUS_URL = `${API_URL}/tasks/${TASK_ID}`
    config = {
      method: 'get',
      url: TASK_STATUS_URL,
      headers: headers,
    }
    let STATUS
    do {
      resp = await axios(config)
      response = await resp.data
      STATUS = response.status
      if (STATUS !== 'ready')
        await new Promise(r => setTimeout(r, 10000))
    } while (STATUS !== 'ready')
    console.log(`Status code: ${resp.status}`)
    console.log(response)
    

    The output should look similar to the following one:

    Status code: 200
    {
      "_id": "6391c89afb14854546891276",
      "created_at": "2022-12-08T11:20:58.859Z",
      "estimated_time": "2022-12-08T11:34:54.585Z",
      "index_id": "6391c88bfb14854546891275",
      "metadata": {
        "duration": 810.84,
        "filename": "best-racing-moments.mp4",
        "height": 480,
        "width": 854
      },
      "status": "ready",
      "type": "index_task_info",
      "updated_at": "2022-12-08T11:34:50.474Z",
      "video_id": "6391c8a669ff3402ec515aba"
    }
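
The polling loop in step 7 exits only when the status is ready, so it waits indefinitely if the task never reaches that state. A minimal sketch of a bounded wait follows. The function `wait_for_task` and its parameters are hypothetical, and the `"failed"` status name is an assumption that you should check against the API Reference > Tasks page. The status lookup is passed in as a callable so that the loop itself doesn't depend on the network.

```python
import time
from typing import Callable, Iterable


def wait_for_task(
    fetch_status: Callable[[], str],
    done_status: str = "ready",
    failure_statuses: Iterable[str] = ("failed",),  # assumption: verify the name
    interval_s: float = 10.0,
    timeout_s: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll fetch_status until it returns done_status, a failure, or times out."""
    failures = set(failure_statuses)
    deadline = time.monotonic() + timeout_s
    while True:
        status = fetch_status()
        if status == done_status:
            return status
        if status in failures:
            raise RuntimeError(f"Task ended with status {status!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Task still {status!r} after {timeout_s} seconds")
        sleep(interval_s)
```

With the variables from this guide, the lookup could be `lambda: requests.get(TASK_STATUS_URL, headers=headers).json().get("status")`.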
    

📘

Note

For details about the possible statuses, see the API Reference > Tasks page.