The Multimodal Data Warehouse
Welcome to the Mixpeek documentation! We're excited to help you build multimodal understanding applications.
Before we dive in, let’s quickly define what a Multimodal Warehouse is:
Multimodal Warehouse: A specialized data platform that ingests, processes, and enables retrieval across diverse media types (text, images, videos, audio, PDFs) by extracting features and storing them in optimized collections and feature stores.
It serves as the foundation for building AI-powered search and analysis applications that can work seamlessly across different content types.
Within Mixpeek, developers have access to pre-built feature extraction pipelines and retrieval stages that enable ad-hoc search and discovery across any file type.
Mixpeek manages all feature extractors and retriever stages, continuously updating them to incorporate the latest advancements in AI and ML.
Our extractors and retrievers are designed as an integrated system. This tight coupling enables advanced techniques like late interaction models (e.g., ColBERT).
Unlike traditional solutions that require rebuilding your entire index when upgrading to new models, Mixpeek performs upgrades seamlessly behind the scenes—keeping you on the cutting edge without disruption.
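For intuition, late interaction scoring compares a query and a document token by token rather than through a single pooled vector. The snippet below is a minimal, generic MaxSim sketch in NumPy; it is illustrative only and is not Mixpeek's implementation.

```python
import numpy as np

def maxsim_score(query_vecs: np.ndarray, doc_vecs: np.ndarray) -> float:
    """ColBERT-style late interaction: for each query token embedding,
    take its maximum similarity over all document token embeddings,
    then sum those maxima.
    Shapes: query_vecs (num_query_tokens, dim), doc_vecs (num_doc_tokens, dim)."""
    # Normalize so the dot product becomes cosine similarity
    q = query_vecs / np.linalg.norm(query_vecs, axis=1, keepdims=True)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    sim = q @ d.T                         # (num_query_tokens, num_doc_tokens)
    return float(sim.max(axis=1).sum())   # best match per query token, summed
```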
Upload Objects
Upload and store your multimodal data in buckets, organizing related files as objects
Extract Features
Extract features using custom pipelines with specialized extractors for different content types
Enrich Documents
Apply taxonomies (joins) and clustering (groups) to categorize and group related content
Retrieve & Analyze
Search and retrieve content using advanced multimodal search capabilities
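To make the four steps concrete, here is a hypothetical end-to-end sketch using plain HTTP calls. The endpoint paths, payload fields, and resource names are illustrative assumptions, not the actual Mixpeek API; see the Quickstart Guide and API reference for the real calls.

```python
import requests

BASE = "https://api.mixpeek.com"                      # placeholder base URL
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}    # hypothetical auth header

# 1. Upload an object (a group of related files) into a bucket
obj = requests.post(f"{BASE}/buckets/products/objects", headers=HEADERS, json={
    "key_prefix": "sku-1234/",
    "files": ["catalog.pdf", "demo.mp4", "hero.jpg"],
}).json()

# 2. Extract features by running a pipeline over the new object
requests.post(f"{BASE}/pipelines/product-ingest/run", headers=HEADERS,
              json={"object_id": obj["object_id"]})

# 3. Enrich the resulting documents with a taxonomy (join) and clustering (groups)
requests.post(f"{BASE}/taxonomies/product-categories/apply", headers=HEADERS,
              json={"collection": "products"})

# 4. Retrieve with a multimodal search query against the enriched collection
hits = requests.post(f"{BASE}/retrievers/product-search/query", headers=HEADERS,
                     json={"query": "red running shoes on a track", "limit": 5}).json()
```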
The fastest way to start using Mixpeek is to follow our Quickstart Guide, which will walk you through setting up your first project.
For a deeper understanding of how Mixpeek works, check out our Core Concepts page.
Organize and store content to enable efficient search across different modalities:
Storage Pattern
Example Object Structure
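As a hypothetical illustration of this pattern, one object can group every rendition of a piece of content so retrieval can match against any modality. The field names below are assumptions, not a fixed Mixpeek schema.

```python
search_object = {
    "object_id": "article-2024-001",
    "bucket": "knowledge-base",
    "files": [
        {"path": "article.md",        "modality": "text"},
        {"path": "figures/chart.png", "modality": "image"},
        {"path": "walkthrough.mp4",   "modality": "video"},
    ],
    "metadata": {"topic": "quarterly-report", "language": "en"},
}
```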
Structure content to support comprehensive analytics processing:
Storage Pattern
Example Object Structure
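A hypothetical analytics object might bundle a recording, its transcript, and related media so downstream extractors can analyze them together; field names here are illustrative.

```python
analytics_object = {
    "object_id": "support-call-8812",
    "bucket": "contact-center",
    "files": [
        {"path": "call.wav",         "modality": "audio"},
        {"path": "transcript.txt",   "modality": "text"},
        {"path": "screen-share.mp4", "modality": "video"},
    ],
    "metadata": {"agent_id": "a-204", "channel": "phone", "duration_sec": 540},
}
```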
Organize datasets for machine learning model training:
Storage Pattern
Example Object Structure
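A hypothetical training sample might pair raw media with its labels and split assignment; field names are illustrative.

```python
training_object = {
    "object_id": "sample-000042",
    "bucket": "training-data",
    "files": [
        {"path": "image.jpg",   "modality": "image"},
        {"path": "labels.json", "modality": "text"},
    ],
    "metadata": {"split": "train", "label_set": "v3", "source": "vendor-upload"},
}
```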
Manage data through multi-stage processing pipelines:
Storage Pattern
Example Object Structure
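A hypothetical pipeline object might carry its processing state in metadata so each stage knows what has already run; field names and stage values are illustrative.

```python
pipeline_object = {
    "object_id": "ingest-2024-06-01-0007",
    "bucket": "raw-uploads",
    "files": [{"path": "upload.pdf", "modality": "text"}],
    "metadata": {
        "stage": "extracted",          # e.g. received -> extracted -> enriched -> indexed
        "pipeline": "document-ingest",
        "attempts": 1,
    },
}
```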
Each use case leverages Mixpeek’s ability to process and understand relationships across different content types - from text and images to video and audio - providing a unified view of your data.
Ready to get started? Create your first project →