Activity Grouping - Mixpeek

The Activity Grouping extractor identifies human activities in videos, tracks them across frames, and groups similar activities for semantic video understanding.

Overview

The Activity Grouping feature extractor analyzes video content to detect human actions and activities, classifies them into categories, and groups similar activities throughout the video. This enables activity-based search and retrieval in video content.

Required Inputs

Parameter	Type	Required	Default	Description
video_url	string	Yes	-	URL pointing to video file. Supported formats: MP4, MOV, AVI
min_activity_duration	float	No	1.0	Minimum activity duration in seconds
detection_threshold	float	No	0.7	Confidence threshold for activity detection (0.0-1.0)
similarity_threshold	float	No	0.8	Similarity threshold for grouping activities (0.0-1.0)

Configurations

Detection Modes

The extractor supports different detection modes based on use case requirements:

Mode	Description	Best For
`standard`	Balanced accuracy and performance	General activity detection
`high-accuracy`	Prioritizes detection accuracy	Action recognition
`high-performance`	Prioritizes processing speed	Real-time applications

Configuration Examples

Standard

{
  "detection_mode": "standard",
  "min_activity_duration": 1.0,
  "detection_threshold": 0.7,
  "similarity_threshold": 0.8
}

High Accuracy

{
  "detection_mode": "high-accuracy",
  "min_activity_duration": 0.5,
  "detection_threshold": 0.85,
  "similarity_threshold": 0.9,
  "detect_fine_grained": true
}

High Performance

{
  "detection_mode": "high-performance",
  "min_activity_duration": 1.5,
  "detection_threshold": 0.6,
  "similarity_threshold": 0.7,
  "frame_sampling": 5
}

Activity Grouping Options

Option	Type	Default	Description
`detect_fine_grained`	boolean	`false`	Detect detailed sub-categories of activities
`track_participants`	boolean	`true`	Track the people participating in activities
`generate_embedding`	boolean	`true`	Generate activity embedding for similarity matching
`extract_representative_frames`	boolean	`true`	Extract key frames representing the activity
`frame_sampling`	integer	`1`	Process every Nth frame for performance

Configuration Examples

Sample

{
  "detect_fine_grained": true,             // Detect detailed activity types
  "track_participants": true,              // Track people in activities
  "generate_embedding": true,              // Generate activity embedding
  "extract_representative_frames": true,   // Extract key frames
  "frame_sampling": 2                      // Process every 2nd frame
}

Processing Flow

Output Schema

This feature extractor will output as features in the feature store.

{
  "document_id": "doc_abc123",
  "collection_id": "col_xyz789",
  "source_object_id": "obj_def456",
  
  // Activity grouping results
  "activity": {
    "id": "activity_12345",
    "class": "dancing",
    "fine_grained_class": "ballroom_dancing",
    "confidence": 0.92,
    "instances": [
      {
        "start_frame": 120,
        "end_frame": 180,
        "start_time": 4.0,
        "end_time": 6.0,
        "duration": 2.0,
        "participants": [
          "person_123", "person_456"
        ],
        "confidence": 0.94,
        "bounding_box": {
          "x": 125,
          "y": 80,
          "width": 320,
          "height": 450
        }
      },
      {
        "start_frame": 300,
        "end_frame": 390,
        "start_time": 10.0,
        "end_time": 13.0,
        "duration": 3.0,
        "participants": [
          "person_123", "person_456"
        ],
        "confidence": 0.91,
        "bounding_box": {
          "x": 145,
          "y": 85,
          "width": 310,
          "height": 440
        }
      }
      // Additional instances...
    ],
    "embedding": {
      "model": "activity_encoder_v1",
      "dimension": 512,
      "vector": [0.12, 0.34, ...], // Truncated for brevity
      "normalized": true
    },
    "total_duration": 5.0,              // Total duration of all instances in seconds
    "representative_frame_urls": [
      "https://storage.example.com/activities/activity_12345_1.jpg",
      "https://storage.example.com/activities/activity_12345_2.jpg"
    ]
  },
  
  // Video metadata
  "video": {
    "filename": "dance_performance.mp4",
    "width": 1920,
    "height": 1080,
    "fps": 30,
    "duration": 180.5
  }
}

On this page

Overview
Required Inputs
Configurations
Detection Modes
Configuration Examples
Activity Grouping Options
Configuration Examples
Processing Flow
Output Schema