Build a content agent

Build a multi-step agent that analyzes video footage and produces structured creative output by combining domain-specific ingestion, instructions, an output schema, and multi-turn refinement.

What you’ll build

An agent workflow that analyzes a video collection for narrative elements and produces structured output. This example builds a micro-drama outline with scene breakdowns, character arcs, and clip references. The same pattern applies to any agent that synthesizes material from video, such as training curricula, compliance audits, or content plans.

Prerequisites

  • Complete the Quickstart to create a knowledge store with at least one item in ready status.
  • Read Create a response to understand the request and response format.
  • For best results, configure an ingestion config tailored to your domain.

Step 1: Create a domain-focused knowledge store

Tailor the ingestion config to what your agent needs to extract. This example focuses on narrative elements - characters, emotions, conflicts, and dramatic turning points.

import json
import requests

API_KEY = "YOUR_API_KEY"
BASE_URL = "https://api.twelvelabs.io/v1.3"
HEADERS = {"x-api-key": API_KEY, "Content-Type": "application/json"}

# Create a knowledge store whose ingestion config targets narrative elements.
response = requests.post(
    f"{BASE_URL}/knowledge-stores",
    headers=HEADERS,
    json={
        "name": "Microdrama Source Material",
        "ingestion_config": {
            "enrichment_config": {
                "description": "Focus on characters and their emotions, interpersonal dynamics, conflicts, tension points, visual mood shifts, dialogue tone, and dramatic turning points. Track recurring characters across videos."
            }
        }
    }
)
response.raise_for_status()  # Stop early on an HTTP error

# Keep the store ID; Steps 2 and 3 reference it.
store_id = response.json()["_id"]

After creating the store, add your video assets and wait for indexing to complete. See the Quickstart for the full upload and indexing workflow.
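
Waiting for indexing usually means polling until the item reports a ready status. The sketch below shows one way to structure that wait without assuming any particular endpoint: `wait_until_ready` and its `fetch_status` callable are hypothetical names, and you would implement `fetch_status` with whichever status call the Quickstart describes. Only the `"ready"` value comes from this guide.

```python
import time
from typing import Callable


def wait_until_ready(
    fetch_status: Callable[[], str],
    timeout_s: float = 600.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll fetch_status() until it returns "ready" or the timeout elapses.

    fetch_status is a caller-supplied function (hypothetical here) that
    returns the current status string of the item being indexed.
    Returns True if the item became ready, False if the wait timed out.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if fetch_status() == "ready":
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)
```

Injecting `sleep` keeps the helper easy to test, and returning `False` on timeout lets the caller decide whether to retry or abort.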

Step 2: Generate structured output with domain instructions

Combine domain instructions with a schema to extract structured narrative elements. The instructions set Jockey’s creative perspective; the schema defines the output format.

# JSON schema describing the micro-drama outline.
drama_schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "logline": {"type": "string"},
        "characters": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "role": {"type": "string"},
                    "arc": {"type": "string"}
                }
            }
        },
        "scenes": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "scene_number": {"type": "integer"},
                    "description": {"type": "string"},
                    "video_reference": {"type": "string"},
                    "timestamp": {"type": "string"},
                    "dramatic_function": {"type": "string"}
                }
            }
        },
        "central_conflict": {"type": "string"},
        "resolution": {"type": "string"}
    }
}

# Ask Jockey for an outline that conforms to the schema.
response = requests.post(
    f"{BASE_URL}/responses",
    headers=HEADERS,
    json={
        "model": "jockey1.0",
        "instructions": "You are a micro-drama creator. Analyze the video collection for narrative potential. Identify characters, conflicts, and dramatic moments. Construct a compelling 60-second micro-drama outline.",
        "input": [
            {
                "type": "message",
                "role": "user",
                "content": "Create a micro-drama from these videos. Focus on the strongest emotional arc you can find."
            }
        ],
        "tools": [
            {"type": "knowledge_store", "knowledge_store_id": store_id}
        ],
        "text": {"format": "json_schema", "json_schema": drama_schema}
    }
)
response.raise_for_status()  # Stop early on an HTTP error

result = response.json()
session_id = result["session_id"]  # Reused in Step 3 to continue the session

# Message outputs carry the structured result as a JSON string.
for output in result["output"]:
    if output["type"] == "message":
        for content in output["content"]:
            drama = json.loads(content["text"])
            print(f"Title: {drama['title']}")
            print(f"Logline: {drama['logline']}")
            print(f"Conflict: {drama['central_conflict']}")
            for scene in drama["scenes"]:
                print(f"  Scene {scene['scene_number']}: {scene['description']}")
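
The parsing loop above assumes every content item holds valid JSON with all the expected keys. A more defensive variant might look like the following sketch. `extract_drama` and `REQUIRED_KEYS` are hypothetical names introduced for illustration; the response shape (`output`, `type`, `content`, `text`) is taken from the example above, and the key list mirrors the top-level properties of `drama_schema`.

```python
import json
from typing import Any

# Top-level properties of drama_schema, treated here as mandatory (an assumption).
REQUIRED_KEYS = (
    "title", "logline", "characters", "scenes", "central_conflict", "resolution",
)


def extract_drama(result: dict[str, Any]) -> dict[str, Any]:
    """Return the first message content item that parses as a JSON object.

    Skips non-message outputs and content items that are not JSON, and
    raises ValueError if the parsed object lacks any expected key.
    """
    for output in result.get("output", []):
        if output.get("type") != "message":
            continue
        for content in output.get("content", []):
            text = content.get("text")
            if not isinstance(text, str):
                continue
            try:
                parsed = json.loads(text)
            except json.JSONDecodeError:
                continue
            if not isinstance(parsed, dict):
                continue
            missing = [key for key in REQUIRED_KEYS if key not in parsed]
            if missing:
                raise ValueError(f"Structured output is missing keys: {missing}")
            return parsed
    raise ValueError("No structured JSON message found in the response")
```

With this helper, `drama = extract_drama(result)` replaces the nested loops, and a malformed response fails with a descriptive error instead of a `KeyError`.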

Step 3: Refine in a multi-turn session

Pass the session_id returned in Step 2 to iterate toward the desired output. Jockey remembers the previous results and adjusts based on your feedback.

# Continue the Step 2 session with follow-up feedback.
response = requests.post(
    f"{BASE_URL}/responses",
    headers=HEADERS,
    json={
        "model": "jockey1.0",
        "session_id": session_id,
        "input": [
            {
                "type": "message",
                "role": "user",
                "content": "Make the conflict sharper. Can you find a stronger turning point moment?"
            }
        ],
        "tools": [
            {"type": "knowledge_store", "knowledge_store_id": store_id}
        ],
        "text": {"format": "json_schema", "json_schema": drama_schema}
    }
)
response.raise_for_status()  # Stop early on an HTTP error
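
When you expect several rounds of feedback, it can help to wrap the request above in a small loop. The following is a sketch only: `build_refinement_payload`, `refine`, and the injected `send` callable are hypothetical names, and the payload fields simply mirror the request shown above.

```python
from typing import Any, Callable, Iterable


def build_refinement_payload(
    session_id: str, store_id: str, schema: dict[str, Any], feedback: str
) -> dict[str, Any]:
    """Build a follow-up request body that continues an existing session."""
    return {
        "model": "jockey1.0",
        "session_id": session_id,
        "input": [{"type": "message", "role": "user", "content": feedback}],
        "tools": [{"type": "knowledge_store", "knowledge_store_id": store_id}],
        "text": {"format": "json_schema", "json_schema": schema},
    }


def refine(
    send: Callable[[dict[str, Any]], dict[str, Any]],
    session_id: str,
    store_id: str,
    schema: dict[str, Any],
    feedback_items: Iterable[str],
) -> list[dict[str, Any]]:
    """Send each feedback message in order and collect the parsed responses.

    send is supplied by the caller; it posts one payload and returns the
    decoded JSON response.
    """
    results = []
    for feedback in feedback_items:
        payload = build_refinement_payload(session_id, store_id, schema, feedback)
        results.append(send(payload))
    return results
```

In this guide's setup, `send` could be `lambda payload: requests.post(f"{BASE_URL}/responses", headers=HEADERS, json=payload).json()`. Keeping the HTTP call outside the loop makes the refinement logic testable without network access.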

The pattern

This workflow demonstrates a reusable agent pattern you can adapt to any domain:

  1. Domain ingestion config - tell Jockey what to focus on during indexing
  2. Domain instructions - specialize the response-time behavior
  3. Structured output - get typed, parseable results
  4. Multi-turn refinement - iterate toward the desired output

Swap the domain to build different agents:

| Agent type | Ingestion focus | Instructions | Output schema |
|---|---|---|---|
| Micro-drama | Characters, emotions, conflicts | “You are a micro-drama creator…” | Scenes, characters, arcs |
| Training curriculum | Learning objectives, demonstrations | “You are an instructional designer…” | Modules, objectives, assessments |
| Compliance auditor | Claims, disclosures, regulations | “You are a compliance reviewer…” | Findings, severity, recommendations |
| Content planner | Topics, audience signals, gaps | “You are a content strategist…” | Calendar, themes, priorities |
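
One way to make the swap explicit is to gather the three domain-specific pieces into a single object and derive both request bodies from it. The sketch below assumes the request shapes used in Steps 1 and 2; `AgentProfile` and its method names are hypothetical, not part of the API.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class AgentProfile:
    """Bundle the domain-specific parts of the pattern (hypothetical helper)."""

    name: str
    ingestion_focus: str   # becomes the enrichment_config description
    instructions: str      # response-time instructions
    output_schema: dict[str, Any] = field(default_factory=dict)

    def store_payload(self) -> dict[str, Any]:
        """Request body for creating the knowledge store (shape from Step 1)."""
        return {
            "name": self.name,
            "ingestion_config": {
                "enrichment_config": {"description": self.ingestion_focus}
            },
        }

    def response_payload(self, store_id: str, prompt: str) -> dict[str, Any]:
        """Request body for generating structured output (shape from Step 2)."""
        return {
            "model": "jockey1.0",
            "instructions": self.instructions,
            "input": [{"type": "message", "role": "user", "content": prompt}],
            "tools": [{"type": "knowledge_store", "knowledge_store_id": store_id}],
            "text": {"format": "json_schema", "json_schema": self.output_schema},
        }
```

Switching agents then means constructing a different `AgentProfile` with your own focus text, full instructions, and schema, while the calling code stays the same.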

Variations

  • Genre shift: Change instructions to “horror micro-drama”, “comedy sketch”, or “documentary short”
  • Character focus: “Build the drama around the most frequently appearing person”
  • Multi-episode: Use session continuity to create a series: “Now create episode 2 continuing from this ending”
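
As a rough illustration of the genre-shift and character-focus variations, the instructions string from Step 2 can be turned into a template. `INSTRUCTION_TEMPLATE` and `build_instructions` are hypothetical names; the default output reproduces the Step 2 instructions exactly, and the templated wording for other genres is an assumption you may want to adjust.

```python
from typing import Optional

# The Step 2 instructions with the genre made substitutable.
INSTRUCTION_TEMPLATE = (
    "You are a {genre} creator. Analyze the video collection for narrative "
    "potential. Identify characters, conflicts, and dramatic moments. "
    "Construct a compelling 60-second {genre} outline."
)


def build_instructions(genre: str = "micro-drama", extra: Optional[str] = None) -> str:
    """Return instructions for the given genre, optionally followed by an extra directive."""
    text = INSTRUCTION_TEMPLATE.format(genre=genre)
    if extra:
        text += " " + extra.strip()
    return text
```

For example, `build_instructions("horror micro-drama", "Build the drama around the most frequently appearing person.")` combines two of the variations in one instructions string.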

Jupyter notebook

Download the notebook to run this recipe interactively.
