Welcome to the Mixpeek documentation! Use our API to ingest multimodal data, extract features, enrich documents, and retrieve results with hybrid search.
Mixpeek converts unstructured media files (text, images, video, audio, PDFs) into structured documents and optimized feature stores. Query your data using flexible, multi-stage retrievers through a clean API with primitives for data isolation, ingestion, enrichment, and retrieval.
Unified Multimodal Platform
- Store raw files as Objects in Buckets
- Generate Documents in Collections with configured Feature Extractors
- Track lineage from object → features → document → query
Programmable Feature Extraction
- Define Collections with input/output schemas and Feature Extractors
- Ingest at scale with Batches and bulk uploads
Advanced Retrieval
- Compose Retrievers from stages (KNN, filters, reranking, LLM gen, etc.)
- Execute with filters, grouping, and URL generation via Retrievers
- Learn about stage building blocks on Retrievers
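To make the idea of composing a retriever from stages concrete, here is a minimal in-memory sketch in Python. It is a conceptual illustration only, not the Mixpeek API: the helper names (knn_stage, filter_stage, rerank_stage, run_pipeline), the document shape, and the "boost" metadata field are all assumptions invented for this example.

```python
from math import sqrt
from typing import Callable

# Hypothetical document shape: {"id": str, "vector": list[float], "metadata": dict}
Doc = dict
Stage = Callable[[list], list]


def cosine(a: list, b: list) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def knn_stage(query: list, k: int) -> Stage:
    """Score every document against the query and keep the top k."""
    def run(docs: list) -> list:
        scored = [{**d, "score": cosine(query, d["vector"])} for d in docs]
        return sorted(scored, key=lambda d: d["score"], reverse=True)[:k]
    return run


def filter_stage(field: str, value) -> Stage:
    """Keep only documents whose metadata field equals the given value."""
    def run(docs: list) -> list:
        return [d for d in docs if d["metadata"].get(field) == value]
    return run


def rerank_stage(boost_field: str) -> Stage:
    """Reorder by similarity score plus a per-document metadata boost."""
    def run(docs: list) -> list:
        return sorted(
            docs,
            key=lambda d: d["score"] + d["metadata"].get(boost_field, 0.0),
            reverse=True,
        )
    return run


def run_pipeline(docs: list, stages: list) -> list:
    """Feed the output of each stage into the next one."""
    for stage in stages:
        docs = stage(docs)
    return docs


corpus = [
    {"id": "a", "vector": [1.0, 0.0], "metadata": {"type": "image", "boost": 0.0}},
    {"id": "b", "vector": [0.9, 0.1], "metadata": {"type": "video", "boost": 0.0}},
    {"id": "c", "vector": [0.6, 0.8], "metadata": {"type": "image", "boost": 0.5}},
    {"id": "d", "vector": [0.0, 1.0], "metadata": {"type": "image", "boost": 0.0}},
]

results = run_pipeline(
    corpus,
    [knn_stage([1.0, 0.0], k=3), filter_stage("type", "image"), rerank_stage("boost")],
)
print([d["id"] for d in results])  # prints ['c', 'a']
```

Each stage is a plain function from a list of documents to a list of documents, so stages can be reordered or swapped without touching the others. A hosted retriever would presumably express the same chain as configuration instead of code.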
Organization & Isolation
- Multi-tenant isolation via Namespaces and the X-Namespace header
- Manage API keys under Organizations (see Quickstart for setup)
- Notify external systems with webhooks (see Quickstart for setup)
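As a rough sketch of how the X-Namespace header might be attached to requests, the Python helper below builds a header dictionary. Only the X-Namespace header name comes from this page; the build_headers helper, the Bearer authorization scheme, and the example values are assumptions, so check the API reference for the actual authentication format.

```python
from typing import Optional


def build_headers(api_key: str, namespace: Optional[str] = None) -> dict:
    """Build request headers, adding X-Namespace only when a namespace is given."""
    headers = {
        # Assumption: bearer-token auth; not confirmed by this page.
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    if namespace:
        # Header name taken from the documentation above.
        headers["X-Namespace"] = namespace
    return headers


# Placeholder values for illustration only.
headers = build_headers("YOUR_API_KEY", namespace="my-namespace")
print(headers["X-Namespace"])  # prints my-namespace
```

Building the headers in one place keeps the namespace consistent across calls, which matters when the namespace is the isolation boundary.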
Data Organization & Enrichment
Taxonomies (Similarity Joins)
Attach or validate Taxonomies to enrich documents with versioning support.
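The "similarity join" idea can be sketched in a few lines of Python: each document is matched to its nearest taxonomy node by vector similarity and enriched with that node's label. This is a conceptual illustration under stated assumptions, not how Mixpeek implements Taxonomies; the similarity_join function, the field names, and the 0.8 threshold are all hypothetical.

```python
from math import sqrt


def cosine(a: list, b: list) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def similarity_join(documents: list, taxonomy_nodes: list, threshold: float = 0.8) -> list:
    """Attach the best-matching taxonomy label to each document.

    Documents whose best match falls below the threshold get a label of None.
    """
    enriched = []
    for doc in documents:
        best = max(taxonomy_nodes, key=lambda n: cosine(doc["vector"], n["vector"]))
        score = cosine(doc["vector"], best["vector"])
        label = best["label"] if score >= threshold else None
        enriched.append({**doc, "taxonomy_label": label})
    return enriched


# Toy data for illustration only.
nodes = [
    {"label": "animal", "vector": [1.0, 0.0]},
    {"label": "vehicle", "vector": [0.0, 1.0]},
]
docs = [
    {"id": "d1", "vector": [0.9, 0.1]},  # close to "animal"
    {"id": "d2", "vector": [0.5, 0.5]},  # ambiguous, below the threshold
]

print([d["taxonomy_label"] for d in similarity_join(docs, nodes)])  # prints ['animal', None]
```

The threshold keeps ambiguous documents unlabeled instead of forcing them into the nearest category.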
Clustering (Grouping)
Build Clusters over features with multiple algorithms and apply enrichments.
Headers You’ll Use
How It Works
1
Create a Namespace (optional)
Use Namespaces to establish isolation boundaries for data and queries. Pass X-Namespace on subsequent calls.
2
3
Define Collections & Feature Extractors
Create a Collection with input/output schemas and Feature Extractors. Describe available features to understand addresses and indexes.
4
Enrich with Taxonomies & Clustering
Configure Taxonomies and Clusters to enrich and organize documents.
5
Retrieve & Analyze
Use Retrievers to combine vector search, filters, grouping, and ranking, optionally generating presigned URLs.
Getting Started
The fastest path is the Quickstart Guide: create a namespace, bucket, and collection, upload objects, and run your first retriever. For platform concepts, see Core Concepts.
Common Use Cases
For ready-to-use multimodal patterns, see Recipes.
- Index images, PDFs, video, and text into Collections with appropriate feature_extractors
- Execute a Retriever that combines vector similarity, metadata filters, and reranking
- Optional: return presigned URLs for assets in results (return_urls=true)
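As a rough illustration of the return_urls option, the Python sketch below builds a request URL for executing a retriever. Only the return_urls=true flag comes from this page. The execute_url helper, the base URL, the path layout, and the choice to send the flag as a query parameter instead of a body field are all assumptions, so consult the API reference for the real endpoint.

```python
from urllib.parse import urlencode


def execute_url(base_url: str, retriever_id: str, return_urls: bool = False) -> str:
    """Build a retriever-execution URL, optionally asking for presigned asset URLs."""
    # Hypothetical path layout, for illustration only.
    path = f"{base_url.rstrip('/')}/retrievers/{retriever_id}/execute"
    if return_urls:
        # Flag name taken from the documentation above.
        return f"{path}?{urlencode({'return_urls': 'true'})}"
    return path


# Placeholder host and retriever ID.
print(execute_url("https://api.example.com/v1", "ret_123", return_urls=True))
# prints https://api.example.com/v1/retrievers/ret_123/execute?return_urls=true
```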
Mixpeek’s API gives you typed resources to build reliable multimodal applications end‑to‑end:
- Namespaces
- Buckets, Objects, Batches
- Collections, Documents, Feature Extractors
- Taxonomies
- Clusters
- Retrievers
- Tasks
- Webhooks