Backblaze B2 - Media management application

Summary: The Media Asset Management example application demonstrates how you can add video understanding capabilities to a typical media asset management application using Twelve Labs for video understanding and Backblaze B2 for cloud storage.

Description: The application allows you to upload videos and perform deep semantic searches across multiple modalities such as visual, conversation, text-in-video, and logo. The matching video segments in the response are grouped by video, and you can view details about each segment.

Step-by-step guide: The b2-twelvelabs-example GitHub repository provides detailed instructions on setting up the example application on your computer and using it.

GitHub: Backblaze B2 + Twelve Labs Media Asset Management Example


Integration with Twelve Labs

Integrating with Twelve Labs adds video understanding capabilities to a typical media management application. The process consists of two main steps:

  • Upload and index videos
  • Search videos

Upload and index videos

A Huey task automates the process of uploading and indexing videos. For each video, the application invokes the create method of the task object with the following parameters and values:

  • index_id: A string representing the unique identifier of the index to which the video will be uploaded.
  • url: A string representing the URL of the video to be uploaded.
  • disable_video_stream: A boolean indicating that the platform shouldn't store the video for streaming.

@huey.db_task()
def do_video_indexing(video_tasks):
    print(f'Creating tasks: {video_tasks}')

    # Create a task for each video we want to index
    for video_task in video_tasks:
        task = TWELVE_LABS_CLIENT.task.create(
            TWELVE_LABS_INDEX_ID,
            url=default_storage.url(video_task['video']),
            disable_video_stream=True
        )
        print(f'Created task: {task}')
        video_task['task_id'] = task.id

    print(f'Created {len(video_tasks)} tasks')

Then, the same task monitors the status of each indexing task by repeatedly invoking the retrieve method of the task object, passing the unique identifier of the task as a parameter, until every task reports the ready status:

    print(f'Polling Twelve Labs for {video_tasks}')

    # Do a single database query for all the videos we're interested in
    video_ids = [video_task['id'] for video_task in video_tasks]
    videos = Video.objects.filter(id__in=video_ids)

    while True:
        done = True
        videos_to_save = []

        # Retrieve status for each task we created
        for video_task in video_tasks:
            # What's our current state for this video?
            video = videos.get(video__exact=video_task['video'])

            # Do we still need to retrieve status for this task?
            if video.status != 'Ready':
                task = TWELVE_LABS_CLIENT.task.retrieve(video_task['task_id'])
                if task.status != 'ready':
                    # We'll need to go round the loop again
                    done = False

                # Do we need to write a new status to the DB?
                if video.status.lower() != task.status:
                    # We store the status in the DB in title case, so it's ready to render on the page
                    new_status = task.status.title()
                    print(f'Updating status for {video_task["video"]} from {video.status} to {new_status}')
                    video.status = new_status
                    if task.status == 'ready':
                        video.video_id = task.video_id
                        get_all_video_data(video)
                    videos_to_save.append(video)

        if len(videos_to_save) > 0:
            Video.objects.bulk_update(videos_to_save, ['status', 'video_id', THUMBNAILS_PATH, TRANSCRIPTS_PATH, TEXT_PATH, LOGOS_PATH])

        if done:
            break

        sleep(TWELVE_LABS_POLL_INTERVAL)

    print(f'Done polling {video_tasks}')
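The loop above exits only when every task reports ready. The following is a minimal sketch, not part of the example application, of the same polling pattern with two additions: it treats a failed status as terminal (an assumption about the statuses a task can report) and it stops after a timeout. The names poll_until_terminal, retrieve_status, and TERMINAL_STATUSES are hypothetical; the status lookup is injected as a callable so the sketch runs without the Twelve Labs client.

```python
import time
from typing import Callable, Dict, Iterable

# Assumption: a task never changes again once it reaches one of these statuses.
TERMINAL_STATUSES = {"ready", "failed"}


def poll_until_terminal(
    task_ids: Iterable[str],
    retrieve_status: Callable[[str], str],
    poll_interval: float = 5.0,
    timeout: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, str]:
    """Poll each task until it is terminal or the timeout expires.

    Returns a mapping of task ID to the last status seen.
    """
    statuses = {task_id: "" for task_id in task_ids}
    deadline = time.monotonic() + timeout

    while True:
        # Only ask about tasks that have not finished yet
        pending = [t for t, s in statuses.items() if s not in TERMINAL_STATUSES]
        for task_id in pending:
            statuses[task_id] = retrieve_status(task_id)

        if all(s in TERMINAL_STATUSES for s in statuses.values()):
            return statuses
        if time.monotonic() >= deadline:
            # Give up; the caller can inspect which tasks are unfinished
            return statuses

        sleep(poll_interval)
```

In the example application, the callable could wrap the call shown above, for instance lambda task_id: TWELVE_LABS_CLIENT.task.retrieve(task_id).status, with TWELVE_LABS_POLL_INTERVAL passed as poll_interval.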

Search videos


The application invokes the query method of the search object with the following parameters:

  • index_id: A string representing the unique identifier of the index containing the videos to be searched.
  • query: A string representing the query the user has provided.
  • options: An array of strings representing the sources of information the Twelve Labs video understanding platform should consider when performing the search.
  • group_by: A string specifying that the matching video clips in the response must be grouped by video.
  • threshold: A string specifying the strictness of the thresholds used to assign the high, medium, or low confidence levels to search results. See the Filter on the level of confidence section for details.

def get_queryset(self):
    """
    Search Twelve Labs for videos matching the query
    """
    query = self.request.GET.get("query", None)

    result = TWELVE_LABS_CLIENT.search.query(
        TWELVE_LABS_INDEX_ID,
        query,
        ["visual", "conversation", "text_in_video", "logo"],
        group_by="video",
        threshold="medium"
    )

    # Search results may be in multiple pages, so we need to loop until we're done retrieving them
    search_data = result.data
    print(f"First page's data: {search_data}")

    search_results = []
    while True:
        # Do a database query to get the videos for each page of results
        video_ids = [group.id for group in search_data]
        videos = Video.objects.filter(video_id__in=video_ids)
        for group in search_data:
            try:
                search_results.append(SearchResult(video=videos.get(video_id__exact=group.id),
                                                   clip_count=len(group.clips),
                                                   clips=group.clips.model_dump_json()))
            except self.model.DoesNotExist:
                # There is a video in Twelve Labs, but no corresponding row in the database.
                # Just report it and carry on.
                print(f'Can\'t find match for video_id {group.id}')

        # Is there another page?
        try:
            search_data = next(result)
            print(f"Next page's data: {search_data}")
        except StopIteration:
            print("There is no next page in search result")
            break

    return search_results
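The paging logic in the view above can be isolated into a small helper. The sketch below is illustrative only and assumes the behavior the view relies on: result.data holds the first page, and next(result) returns each later page until StopIteration is raised. The names collect_all_pages and FakePagedResult are hypothetical; FakePagedResult is a stand-in that lets the sketch run without the Twelve Labs client.

```python
from typing import Any, Iterator, List


def collect_all_pages(result: Any) -> List[Any]:
    """Return the items from every page of a paged search result."""
    # The first page is already available on the result object
    items = list(result.data)
    while True:
        try:
            page = next(result)
        except StopIteration:
            # No more pages
            return items
        items.extend(page)


class FakePagedResult:
    """Hypothetical stand-in for a paged search result (illustration only)."""

    def __init__(self, pages: List[List[str]]) -> None:
        self.data = pages[0] if pages else []
        self._rest: Iterator[List[str]] = iter(pages[1:])

    def __iter__(self) -> "FakePagedResult":
        return self

    def __next__(self) -> List[str]:
        # Raises StopIteration once the remaining pages are exhausted
        return next(self._rest)


if __name__ == "__main__":
    result = FakePagedResult([["video-1", "video-2"], ["video-3"]])
    print(collect_all_pages(result))
```

One possible design choice: collecting every group first would let the view run a single database query for all pages instead of one query per page, at the cost of holding all results in memory.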

Next steps

After reading this page, you have several options:

  • Customize and use the example application: Explore the Media Asset Management example application on GitHub to understand its features and implementation. You can modify the application and add functionality to suit your specific use case.
  • Explore further: Try the applications built by the community or our sample applications to get more insights into the Twelve Labs Video Understanding Platform's diverse capabilities and learn more about integrating the platform into your applications.