The Activity Grouping extractor identifies human activities in videos, tracks them across frames, and groups similar activities for semantic video understanding.

Overview

The Activity Grouping feature extractor analyzes video content to detect human actions and activities, classifies them into categories, and groups similar activities throughout the video. This enables activity-based search and retrieval in video content.

Required Inputs

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| video_url | string | Yes | - | URL pointing to video file. Supported formats: MP4, MOV, AVI |
| min_activity_duration | float | No | 1.0 | Minimum activity duration in seconds |
| detection_threshold | float | No | 0.7 | Confidence threshold for activity detection (0.0-1.0) |
| similarity_threshold | float | No | 0.8 | Similarity threshold for grouping activities (0.0-1.0) |
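
The parameter constraints above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the extractor's API: `build_inputs` is a hypothetical helper, and it assumes the video format can be inferred from the file extension in the URL path and that a negative `min_activity_duration` is invalid. Neither assumption is stated in this document.

```python
import posixpath
from urllib.parse import urlparse

# Formats listed in the parameter table.
SUPPORTED_EXTENSIONS = {".mp4", ".mov", ".avi"}

# Defaults copied from the parameter table.
DEFAULTS = {
    "min_activity_duration": 1.0,
    "detection_threshold": 0.7,
    "similarity_threshold": 0.8,
}


def build_inputs(video_url, **overrides):
    """Return a validated input dict (hypothetical helper); raise ValueError on bad values."""
    # Assumption: the format is taken from the extension of the URL path.
    # Using only the path keeps query strings from hiding the extension.
    ext = posixpath.splitext(urlparse(video_url).path)[1].lower()
    if ext not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"unsupported video format: {ext or '(none)'}")

    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")

    inputs = {"video_url": video_url, **DEFAULTS, **overrides}

    # Both thresholds are documented as lying in the range 0.0-1.0.
    for name in ("detection_threshold", "similarity_threshold"):
        if not 0.0 <= inputs[name] <= 1.0:
            raise ValueError(f"{name} must be between 0.0 and 1.0")

    # Assumption: a duration in seconds cannot be negative.
    if inputs["min_activity_duration"] < 0:
        raise ValueError("min_activity_duration must not be negative")
    return inputs
```

Calling `build_inputs("https://example.com/clip.mp4", detection_threshold=0.85)` returns a dict with the override applied and the other two parameters left at their documented defaults.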

Configurations

Detection Modes

The extractor supports different detection modes based on use case requirements:

| Mode | Description | Best For |
| --- | --- | --- |
| standard | Balanced accuracy and performance | General activity detection |
| high-accuracy | Prioritizes detection accuracy | Action recognition |
| high-performance | Prioritizes processing speed | Real-time applications |

Configuration Examples

Standard
{
  "detection_mode": "standard",
  "min_activity_duration": 1.0,
  "detection_threshold": 0.7,
  "similarity_threshold": 0.8
}
High Accuracy
{
  "detection_mode": "high-accuracy",
  "min_activity_duration": 0.5,
  "detection_threshold": 0.85,
  "similarity_threshold": 0.9,
  "detect_fine_grained": true
}
High Performance
{
  "detection_mode": "high-performance",
  "min_activity_duration": 1.5,
  "detection_threshold": 0.6,
  "similarity_threshold": 0.7,
  "frame_sampling": 5
}
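
One way to keep these three configurations in client code is as named presets with optional per-call overrides. This is a sketch under assumptions: `PRESETS` and `config_for_mode` are hypothetical names, and the values are copied from the examples above rather than taken from any official client library.

```python
# Values copied from the three configuration examples above.
PRESETS = {
    "standard": {
        "min_activity_duration": 1.0,
        "detection_threshold": 0.7,
        "similarity_threshold": 0.8,
    },
    "high-accuracy": {
        "min_activity_duration": 0.5,
        "detection_threshold": 0.85,
        "similarity_threshold": 0.9,
        "detect_fine_grained": True,
    },
    "high-performance": {
        "min_activity_duration": 1.5,
        "detection_threshold": 0.6,
        "similarity_threshold": 0.7,
        "frame_sampling": 5,
    },
}


def config_for_mode(mode, **overrides):
    """Build a configuration dict for a detection mode (hypothetical helper).

    Keyword overrides replace the preset values for that call only.
    """
    if mode not in PRESETS:
        raise ValueError(f"unknown detection mode: {mode!r}")
    return {"detection_mode": mode, **PRESETS[mode], **overrides}
```

For example, `config_for_mode("standard", detection_threshold=0.75)` keeps the standard preset but raises the detection threshold, without modifying `PRESETS` itself.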

Activity Grouping Options

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| detect_fine_grained | boolean | false | Detect detailed sub-categories of activities |
| track_participants | boolean | true | Track the people participating in activities |
| generate_embedding | boolean | true | Generate activity embedding for similarity matching |
| extract_representative_frames | boolean | true | Extract key frames representing the activity |
| frame_sampling | integer | 1 | Process every Nth frame for performance |

Configuration Examples

Sample
{
  "detect_fine_grained": true,             // Detect detailed activity types
  "track_participants": true,              // Track people in activities
  "generate_embedding": true,              // Generate activity embedding
  "extract_representative_frames": true,   // Extract key frames
  "frame_sampling": 2                      // Process every 2nd frame
}
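
To make the effect of `frame_sampling` concrete, the sketch below lists which frame indices would be processed and the resulting analysis rate. It assumes that sampling starts at frame 0 and takes every Nth frame after that; the document only says "every Nth frame", so the starting offset is an assumption, and both function names are hypothetical.

```python
def sampled_frames(total_frames, frame_sampling=1):
    """Return the frame indices processed when every Nth frame is kept.

    Assumption: sampling starts at frame 0.
    """
    if frame_sampling < 1:
        raise ValueError("frame_sampling must be a positive integer")
    return range(0, total_frames, frame_sampling)


def effective_fps(fps, frame_sampling=1):
    """Return how many frames per second of video are actually analyzed."""
    if frame_sampling < 1:
        raise ValueError("frame_sampling must be a positive integer")
    return fps / frame_sampling
```

Under these assumptions, a 30 fps video with `frame_sampling` set to 5 is analyzed at 6 frames per second, and with the default of 1 every frame is processed.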

Output Schema

The extractor writes its results to the feature store as features. An example output:

{
  "document_id": "doc_abc123",
  "collection_id": "col_xyz789",
  "source_object_id": "obj_def456",
  
  // Activity grouping results
  "activity": {
    "id": "activity_12345",
    "class": "dancing",
    "fine_grained_class": "ballroom_dancing",
    "confidence": 0.92,
    "instances": [
      {
        "start_frame": 120,
        "end_frame": 180,
        "start_time": 4.0,
        "end_time": 6.0,
        "duration": 2.0,
        "participants": [
          "person_123", "person_456"
        ],
        "confidence": 0.94,
        "bounding_box": {
          "x": 125,
          "y": 80,
          "width": 320,
          "height": 450
        }
      },
      {
        "start_frame": 300,
        "end_frame": 390,
        "start_time": 10.0,
        "end_time": 13.0,
        "duration": 3.0,
        "participants": [
          "person_123", "person_456"
        ],
        "confidence": 0.91,
        "bounding_box": {
          "x": 145,
          "y": 85,
          "width": 310,
          "height": 440
        }
      }
      // Additional instances...
    ],
    "embedding": {
      "model": "activity_encoder_v1",
      "dimension": 512,
      "vector": [0.12, 0.34, ...], // Truncated for brevity
      "normalized": true
    },
    "total_duration": 5.0,              // Total duration of all instances in seconds
    "representative_frame_urls": [
      "https://storage.example.com/activities/activity_12345_1.jpg",
      "https://storage.example.com/activities/activity_12345_2.jpg"
    ]
  },
  
  // Video metadata
  "video": {
    "filename": "dance_performance.mp4",
    "width": 1920,
    "height": 1080,
    "fps": 30,
    "duration": 180.5
  }
}
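
The fields in this output are related to each other, which a consumer can check or reuse. The sketch below converts frame numbers to seconds using `video.fps`, sums instance durations, and compares two embedding vectors against a `similarity_threshold`. All function names are hypothetical, and the use of cosine similarity is an assumption: this document does not state which similarity measure the extractor applies when grouping.

```python
import math


def instance_seconds(instance, fps):
    """Convert an instance's frame range to (start, end) in seconds."""
    return instance["start_frame"] / fps, instance["end_frame"] / fps


def total_duration(instances):
    """Sum the per-instance durations, in seconds."""
    return sum(inst["duration"] for inst in instances)


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def same_group(vec_a, vec_b, similarity_threshold=0.8):
    """Assumption: two activities are grouped when cosine similarity meets the threshold."""
    return cosine_similarity(vec_a, vec_b) >= similarity_threshold


# The two instances from the example output, reduced to the fields used here.
instances = [
    {"start_frame": 120, "end_frame": 180, "duration": 2.0},
    {"start_frame": 300, "end_frame": 390, "duration": 3.0},
]
first_span = instance_seconds(instances[0], fps=30)  # matches start_time / end_time
combined = total_duration(instances)                 # matches total_duration
```

With the example values, the first instance spans 4.0 to 6.0 seconds at 30 fps, and the two durations add up to the reported `total_duration` of 5.0.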