Dataset examples
A good fine-tuning dataset improves the performance of the model in specific domains. This section provides examples of effective datasets across various fields or industries. Each example includes the following elements:
- Domain: The specific field or industry.
- Dataset content: Types of data to include.
- Annotation guidance: How to label or annotate the data.
- Specific examples: Detailed instances within the domain.
Healthcare
Domain: Different surgeries and medical procedures.
Dataset content: Include diverse videos showing different surgeries and procedures.
Annotation guidance: Identify key steps, label medical tools, and define basic terms.
Specific examples:
- Procedure: Cardiac catheterization.
- Key steps: Patient preparation, catheter insertion, contrast dye injection, image acquisition, etc.
- Annotation: Mark procedural phases, label catheter types, and identify anatomical landmarks.
- Procedure: Annual health check-up.
- Key steps: Check vital signs, examine body systems, discuss health history, etc.
- Annotation: Mark examination steps, label medical tools, and identify patient-doctor interactions.
Sports
Domain: Various sporting events and activities
Dataset content: Include videos of different sports, focusing on specific actions, plays, and rules.
Annotation guidance: Label techniques, mark key events, and annotate rule applications.
Specific examples:
- American football: Stiff arm.
- Definition: Defensive maneuver to fend off a tackler.
Annotation: Label the technique, ensuring spatial and temporal tightness.
- Definition: Defensive maneuver to fend off a tackler.
- Ice hockey: Slapshot
- Definition: A powerful shot where the player swings their stick in a wide arc before striking the puck.
- Annotation: Label the technique, ensuring spatial and temporal tightness.
- Basketball: Pick-and-roll
- Definition: A play involving a player setting a screen (pick) for a teammate and then moving toward the basket (roll).
- Annotation: Label the technique, ensuring spatial and temporal tightness.
Media & Entertainment (M&E)
Domain: Television shows, movies, and live performances.
Dataset content: Include a variety of videos, such as TV episodes, movie scenes, music videos, and live concert recordings.
Annotation Guidance: Identify key scenes, label dialogue segments, and identify special effects.
Specific Examples:
- Scenario: Golden Buzzer Moment from talent shows like "America's Got Talent"
- Key Elements: Performance build-up, emotional peak, judges' and host's reactions, special effects.
- Annotation: Label the exact moment the golden buzzer is pressed, the visual effects (confetti, lighting), and the ensuing reactions.
- Scenario: Dramatic TV series scene "Big Reveal Moment"
- Key elements: The moment when a major plot twist or hidden truth is revealed, drastically changing the narrative or characters' understanding.
- Annotation: Label the suspenseful dialogue and visual cues leading up to the reveal, and mark the exact moment the critical information is disclosed
Updated 4 months ago