Retrievers
Configure and use retrieval pipelines for powerful multimodal search
Retrievers are the core search components of Mixpeek: customizable pipelines for searching across your multimodal content.
Overview
Retrievers in Mixpeek are configurable search pipelines that allow you to search across your processed content using a combination of vector similarity, metadata filtering, and other search techniques.
They provide a flexible way to build sophisticated search experiences tailored to your specific use cases.
Define Retriever Query Schema
Specify the structure of queries that your retriever will accept, including required and optional parameters.
Select Stages
Choose which retrieval stages to include in your pipeline, such as vector search, filtering, reranking, or fusion stages.
Configure Inputs and Outputs
For each stage, define how it receives inputs from previous stages and how its outputs will be passed to subsequent stages.
Save Retriever
Save your configured retriever to make it available for queries within your namespace.
Execute Query
Run search operations using your retriever with queries that match the defined schema structure.
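The five steps above can be sketched in plain Python. This is only an in-memory illustration of the flow under stated assumptions, not the Mixpeek SDK or API: the schema format, `save_retriever`, `execute`, and both stage functions are hypothetical names, and a simple keyword match stands in for vector search.

```python
from typing import Any, Callable

Doc = dict[str, Any]
Query = dict[str, Any]
Stage = Callable[[Query, list[Doc]], list[Doc]]

# Step 1: query schema (hypothetical format): parameter name -> type and required flag.
QUERY_SCHEMA = {
    "text": {"type": str, "required": True},
    "limit": {"type": int, "required": False},
}


def validate_query(schema: dict[str, dict[str, Any]], query: Query) -> None:
    """Raise ValueError if a required parameter is missing or has the wrong type."""
    for name, spec in schema.items():
        if name not in query:
            if spec["required"]:
                raise ValueError(f"missing required parameter: {name}")
            continue
        if not isinstance(query[name], spec["type"]):
            raise ValueError(f"parameter {name!r} has the wrong type")


# Steps 2 and 3: each stage takes the query plus the previous stage's output.
def keyword_stage(query: Query, docs: list[Doc]) -> list[Doc]:
    """Stand-in for a vector search stage: keep docs containing the query text."""
    return [d for d in docs if query["text"].lower() in d["text"].lower()]


def limit_stage(query: Query, docs: list[Doc]) -> list[Doc]:
    """Truncate results to the optional 'limit' parameter (default 10)."""
    return docs[: query.get("limit", 10)]


# Step 4: "save" the retriever in a local registry (hypothetical).
RETRIEVERS: dict[str, dict[str, Any]] = {}


def save_retriever(name: str, schema: dict, stages: list[Stage]) -> None:
    RETRIEVERS[name] = {"schema": schema, "stages": stages}


# Step 5: execute a query that matches the schema.
def execute(name: str, query: Query, docs: list[Doc]) -> list[Doc]:
    retriever = RETRIEVERS[name]
    validate_query(retriever["schema"], query)
    results = docs
    for stage in retriever["stages"]:
        results = stage(query, results)
    return results


save_retriever("demo", QUERY_SCHEMA, [keyword_stage, limit_stage])
DOCS = [
    {"id": 1, "text": "Red sports car"},
    {"id": 2, "text": "Blue bicycle"},
    {"id": 3, "text": "Vintage car show"},
]
hits = execute("demo", {"text": "car", "limit": 1}, DOCS)
```

A query without the required `text` parameter is rejected by `validate_query` before any stage runs.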
Search Pipelines
Create multi-stage search pipelines that combine different search techniques
Multimodal Retrieval
Search across text, images, videos, and other content types seamlessly
Key Concepts
Retriever Architecture
Creating a Basic Retriever
Searching with a Retriever
Once you’ve created a retriever, you can use it to search your content:
Query Parameters
Different retriever stages can utilize different query parameters:
Retriever Use Cases
Content-Based Semantic Search
Retrieve documents based on meaning rather than exact keyword matches:
Implementation Pattern
- Use embedding-based retrievers for semantic understanding
- Optimize for capturing conceptual relationships
- Configure appropriate similarity thresholds
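As a rough illustration of a similarity threshold, the sketch below ranks documents by cosine similarity and drops anything below a cutoff. It assumes embeddings are already available as plain lists of floats; `semantic_search`, the toy two-dimensional vectors, and the 0.7 threshold are illustrative choices, not Mixpeek defaults.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def semantic_search(query_vec, index, threshold=0.7, limit=10):
    """Return (doc_id, score) pairs scoring at least `threshold`, best first."""
    scored = [(doc_id, cosine_similarity(query_vec, vec)) for doc_id, vec in index.items()]
    hits = [pair for pair in scored if pair[1] >= threshold]
    hits.sort(key=lambda pair: pair[1], reverse=True)
    return hits[:limit]


# Toy 2-dimensional "embeddings"; real embeddings have hundreds of dimensions.
INDEX = {"a": [1.0, 0.0], "b": [0.8, 0.6], "c": [0.0, 1.0]}
results = semantic_search([1.0, 0.0], INDEX, threshold=0.7)
result_ids = [doc_id for doc_id, _ in results]
```

Raising the threshold trades recall for precision: here document `c` is orthogonal to the query and is excluded at any positive threshold.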
Cross-Modal Content Discovery
Search across different content types using a unified query approach:
Implementation Pattern
- Configure multi-stage retrievers that understand different modalities
- Balance weighting between different feature types
- Optimize for cross-modal relevance
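One simple way to balance weighting between feature types is a weighted sum of per-modality scores. The sketch below assumes each modality has already produced a score per document; `fuse_scores` and the weights are hypothetical, and weighted-sum fusion is just one possible strategy rather than the method Mixpeek necessarily uses.

```python
def fuse_scores(per_modality: dict[str, dict[str, float]],
                weights: dict[str, float]) -> list[tuple[str, float]]:
    """Combine per-modality scores into one ranking via a weighted sum.

    A document missing from a modality contributes 0 for that modality.
    """
    fused: dict[str, float] = {}
    for modality, scores in per_modality.items():
        weight = weights.get(modality, 0.0)
        for doc_id, score in scores.items():
            fused[doc_id] = fused.get(doc_id, 0.0) + weight * score
    return sorted(fused.items(), key=lambda pair: pair[1], reverse=True)


scores = {
    "text": {"v1": 0.9, "v2": 0.2},
    "image": {"v2": 0.8, "v3": 0.5},
}
ranking = fuse_scores(scores, {"text": 0.5, "image": 0.5})
ranked_ids = [doc_id for doc_id, _ in ranking]
```

With equal weights, `v2` ranks first because it scores in both modalities; shifting weight toward `text` would favor `v1` instead.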
Combined Vector and Metadata Filtering
Leverage both semantic understanding and structured metadata:
Implementation Pattern
- Configure vector retrieval with metadata post-filtering
- Use metadata to narrow results after semantic matching
- Balance recall and precision through stage configuration
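The recall and precision trade-off in post-filtering comes largely from how many candidates the vector stage hands to the filter. The sketch below assumes documents already carry a similarity score; `search_then_filter` and its parameters are hypothetical names for illustration.

```python
from typing import Any, Callable


def search_then_filter(scored_docs: list[dict[str, Any]],
                       predicate: Callable[[dict[str, Any]], bool],
                       candidate_limit: int,
                       final_limit: int) -> list[dict[str, Any]]:
    """Take the top `candidate_limit` docs by score, then apply a metadata filter."""
    candidates = sorted(scored_docs, key=lambda d: d["score"], reverse=True)[:candidate_limit]
    return [d for d in candidates if predicate(d["metadata"])][:final_limit]


SCORED = [
    {"id": 1, "score": 0.9, "metadata": {"lang": "en"}},
    {"id": 2, "score": 0.8, "metadata": {"lang": "fr"}},
    {"id": 3, "score": 0.7, "metadata": {"lang": "en"}},
    {"id": 4, "score": 0.6, "metadata": {"lang": "en"}},
]


def is_english(metadata: dict[str, Any]) -> bool:
    return metadata["lang"] == "en"


# A small candidate pool can leave too few results after filtering.
narrow = search_then_filter(SCORED, is_english, candidate_limit=2, final_limit=2)
# Over-fetching candidates restores the requested number of results.
wide = search_then_filter(SCORED, is_english, candidate_limit=4, final_limit=2)
```

Here `narrow` returns a single document because the filter removed half of its two candidates, while `wide` fills both result slots.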
Retrieval-Augmented Generation
Enhance generative AI with contextually relevant retrieved information:
Implementation Pattern
- Configure high-precision retrievers for factual accuracy
- Implement multi-stage retrieval for comprehensive context
- Optimize for diverse and representative results
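To show where retrieved results fit in a generation workflow, the sketch below packs ranked passages into a prompt under a character budget. It assumes passages arrive already ranked by a retriever; `build_prompt`, the prompt wording, and the budget are illustrative, and the call to a language model is left out.

```python
def build_prompt(question: str, passages: list[str], max_chars: int = 500) -> str:
    """Number the passages in rank order, stopping once the budget is exhausted."""
    context: list[str] = []
    used = 0
    for rank, passage in enumerate(passages, start=1):
        entry = f"[{rank}] {passage}"
        if used + len(entry) > max_chars:
            break  # Keep rank order: do not skip ahead to shorter passages.
        context.append(entry)
        used += len(entry)
    return (
        "Context:\n" + "\n".join(context)
        + f"\n\nQuestion: {question}\nAnswer using only the context above."
    )


PASSAGES = [
    "Retrievers are configurable search pipelines.",
    "Stages can include vector search and filtering.",
    "Unrelated long passage " * 20,
]
prompt = build_prompt("What is a retriever?", PASSAGES, max_chars=120)
```

The numbered markers let the generated answer cite which passage supported each claim.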
Filters and Query Operators
Numeric and Date Comparisons
Operators for comparing numeric values and dates:
Available Operators
- eq - Equal to
- neq - Not equal to
- gt - Greater than
- gte - Greater than or equal to
- lt - Less than
- lte - Less than or equal to
- between - Within range (inclusive)
Example Usage
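The original example is not available here, so the following is a minimal sketch of how these operators behave, written as a local Python evaluator. The operator names come from the list above; `compare`, the `(low, high)` tuple form for `between`, and the sample fields are assumptions, not the Mixpeek filter syntax.

```python
from datetime import date

COMPARISON_OPS = {
    "eq": lambda value, target: value == target,
    "neq": lambda value, target: value != target,
    "gt": lambda value, target: value > target,
    "gte": lambda value, target: value >= target,
    "lt": lambda value, target: value < target,
    "lte": lambda value, target: value <= target,
    # `target` is assumed to be a (low, high) pair; both ends are inclusive.
    "between": lambda value, target: target[0] <= value <= target[1],
}


def compare(value, op: str, target) -> bool:
    """Apply a named comparison operator to a numeric or date value."""
    return COMPARISON_OPS[op](value, target)


doc = {"duration_seconds": 120, "created_at": date(2024, 3, 15)}

long_enough = compare(doc["duration_seconds"], "gte", 120)
in_first_quarter = compare(
    doc["created_at"], "between", (date(2024, 1, 1), date(2024, 3, 31))
)
```

The same operators work for dates because `datetime.date` supports ordering comparisons.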
String Matching
Operators for text field filtering:
Available Operators
- exact - Exact string match
- contains - Contains substring
- startswith - String starts with
- endswith - String ends with
- regex - Regular expression match
- iexact, icontains, etc. - Case-insensitive variants
Example Usage
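The original example is not available here, so this is a minimal local sketch of the string operators. It assumes that every operator has an `i`-prefixed case-insensitive variant (the source names only `iexact` and `icontains`), and `string_match` is a hypothetical helper rather than part of Mixpeek.

```python
import re

BASE_STRING_OPS = {
    "exact": lambda value, pattern: value == pattern,
    "contains": lambda value, pattern: pattern in value,
    "startswith": lambda value, pattern: value.startswith(pattern),
    "endswith": lambda value, pattern: value.endswith(pattern),
}
KNOWN_OPS = set(BASE_STRING_OPS) | {"regex"}


def string_match(value: str, op: str, pattern: str) -> bool:
    """Evaluate a string operator; an 'i' prefix means case-insensitive (assumed)."""
    ignore_case = op not in KNOWN_OPS and op.startswith("i") and op[1:] in KNOWN_OPS
    base = op[1:] if ignore_case else op
    if base not in KNOWN_OPS:
        raise ValueError(f"unknown operator: {op}")
    if base == "regex":
        flags = re.IGNORECASE if ignore_case else 0
        return re.search(pattern, value, flags) is not None
    if ignore_case:
        value, pattern = value.lower(), pattern.lower()
    return BASE_STRING_OPS[base](value, pattern)


filename = "Quarterly Report.PDF"
is_pdf = string_match(filename, "iendswith", ".pdf")
is_pdf_strict = string_match(filename, "endswith", ".pdf")
```

The case-sensitive `endswith` check fails on the upper-case extension, which is the usual reason to reach for the `i` variants on user-supplied text.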
Boolean Logic
Operators for combining multiple conditions:
Available Operators
- and - All conditions must match
- or - Any condition can match
- not - Negate a condition
- nor - None of the conditions should match
Example Usage
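The original example is not available here, so the sketch below shows how nested boolean conditions can be evaluated recursively. The nested-dictionary shape and the `{"field": ..., "value": ...}` equality leaf are assumptions made for illustration, not the Mixpeek filter syntax.

```python
from typing import Any


def evaluate(condition: dict[str, Any], doc: dict[str, Any]) -> bool:
    """Recursively evaluate a nested and/or/not/nor condition against a document."""
    if "and" in condition:
        return all(evaluate(c, doc) for c in condition["and"])
    if "or" in condition:
        return any(evaluate(c, doc) for c in condition["or"])
    if "not" in condition:
        return not evaluate(condition["not"], doc)
    if "nor" in condition:
        return not any(evaluate(c, doc) for c in condition["nor"])
    # Leaf condition (assumed shape): simple equality on one field.
    return doc.get(condition["field"]) == condition["value"]


doc = {"type": "video", "status": "published", "lang": "en"}

condition = {
    "and": [
        {"field": "type", "value": "video"},
        {"or": [
            {"field": "lang", "value": "en"},
            {"field": "lang", "value": "fr"},
        ]},
        {"not": {"field": "status", "value": "draft"}},
    ]
}
matched = evaluate(condition, doc)
```

Note that `nor` is equivalent to `not` applied to an `or` over the same conditions.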
List Field Operations
Operators for working with array/list fields:
Available Operators
- contains_any - Array contains any of the specified values
- contains_all - Array contains all of the specified values
- size - Array has specific length
- empty - Check if array is empty
Example Usage
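The original example is not available here, so this is a minimal local sketch of the list operators. The operator names come from the list above; `list_match`, and the convention that `empty` takes a boolean target, are assumptions.

```python
LIST_OPS = {
    "contains_any": lambda items, target: any(t in items for t in target),
    "contains_all": lambda items, target: all(t in items for t in target),
    "size": lambda items, target: len(items) == target,
    # `target` is assumed to be True (must be empty) or False (must be non-empty).
    "empty": lambda items, target: (len(items) == 0) == target,
}


def list_match(items: list, op: str, target) -> bool:
    """Apply a named list operator to an array field."""
    return LIST_OPS[op](items, target)


tags = ["outdoor", "sports", "summer"]

has_any = list_match(tags, "contains_any", ["winter", "summer"])
has_all = list_match(tags, "contains_all", ["outdoor", "winter"])
```

`contains_any` succeeds on a single overlap, while `contains_all` fails as soon as one requested value is missing.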
Advanced Filtering
Specialized operators for complex conditions:
Available Operators
- exists - Field exists and is not null
- type - Field is of specific type
- within_radius - Geo point within distance of coordinates
- within_box - Geo point within bounding box
- similarity - Similar to reference value (fuzzy matching)
Example Usage
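The original example is not available here, so the sketch below illustrates three of these operators locally. It assumes geo points are `(latitude, longitude)` tuples in degrees and uses the haversine formula with a mean Earth radius of 6371 km for `within_radius`; the function names and argument shapes are hypothetical, and Mixpeek may compute distances differently.

```python
import math

EARTH_RADIUS_KM = 6371.0  # Mean Earth radius; an approximation.


def exists(doc: dict, field: str) -> bool:
    """True if the field is present and not None."""
    return doc.get(field) is not None


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_radius(point, center, radius_km: float) -> bool:
    return haversine_km(point[0], point[1], center[0], center[1]) <= radius_km


def within_box(point, south_west, north_east) -> bool:
    """Bounding-box check; does not handle boxes crossing the antimeridian."""
    return (south_west[0] <= point[0] <= north_east[0]
            and south_west[1] <= point[1] <= north_east[1])


PARIS = (48.8566, 2.3522)
LONDON = (51.5074, -0.1278)
doc = {"location": PARIS, "caption": None}

near_london = within_radius(doc["location"], LONDON, radius_km=400)
has_caption = exists(doc, "caption")
```

A field explicitly set to `None` counts as missing here, matching the "exists and is not null" description above.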
Best Practices
Start Simple
Begin with a simple retriever design and add complexity as needed. Often a basic vector search with filtering is sufficient.
Use Appropriate Indexes
Choose the right vector indexes for your content type. Use “text” for text-heavy content, “multimodal” for mixed content, and “image” for visual search.
Pre-filter When Possible
Apply metadata filters early in the pipeline to reduce the number of documents that need vector similarity calculation.
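The effect of pre-filtering can be seen by counting how many similarity calculations each approach performs. This sketch uses a dot product over toy vectors; `search` and its return shape are hypothetical and only illustrate the ordering of stages.

```python
from typing import Any, Callable, Optional


def dot(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def search(query_vec: list[float],
           docs: list[dict[str, Any]],
           metadata_filter: Optional[Callable[[dict[str, Any]], bool]] = None,
           limit: int = 3):
    """Filter first, then score only the survivors.

    Returns (ranked ids, number of similarity calculations performed).
    """
    candidates = [d for d in docs if metadata_filter is None or metadata_filter(d)]
    scored = [(d["id"], dot(query_vec, d["vector"])) for d in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [doc_id for doc_id, _ in scored[:limit]], len(candidates)


DOCS = [
    {"id": "a", "vector": [0.9, 0.1], "kind": "image"},
    {"id": "b", "vector": [0.2, 0.8], "kind": "video"},
    {"id": "c", "vector": [0.7, 0.3], "kind": "image"},
    {"id": "d", "vector": [0.5, 0.5], "kind": "video"},
]

all_ids, all_comparisons = search([1.0, 0.0], DOCS)
image_ids, image_comparisons = search([1.0, 0.0], DOCS, lambda d: d["kind"] == "image")
```

In this toy corpus the filter halves the number of vectors that need scoring; on a large collection the saving grows with the selectivity of the filter.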
Mind Your Limits
Set appropriate limits at each stage. Start with larger limits in early stages and narrow down in later stages.
Leverage Caching
Use caching to improve performance for frequently accessed queries. See the Caching documentation for details.
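As a general illustration of why caching helps repeated queries, the sketch below memoizes a search function in-process with Python's standard `functools.lru_cache`. This is not Mixpeek's caching feature; `cached_search`, the tiny corpus, and the call counter are made up for the example.

```python
from functools import lru_cache

CALLS = {"count": 0}  # Counts how often the underlying search actually runs.
CORPUS = ("red car", "blue car", "green bike")


@lru_cache(maxsize=128)
def cached_search(query_text: str, limit: int) -> tuple[str, ...]:
    """Substring search over a tiny corpus; results are cached per (query, limit)."""
    CALLS["count"] += 1
    return tuple(text for text in CORPUS if query_text in text)[:limit]


first = cached_search("car", 2)   # Runs the search.
second = cached_search("car", 2)  # Served from the cache.
cache_stats = cached_search.cache_info()
```

Arguments must be hashable for `lru_cache`, which is why the function takes a string and an integer and returns a tuple rather than a list.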
Complex retrievers with many stages can impact search latency. Start with a simple design and add complexity only when needed for your use case.
Retrievers vs Direct Document Queries
When to Use Retrievers
- Semantic search based on meaning
- Multimodal search across different content types
- Complex search pipelines with multiple stages
- When relevance ranking is important
When to Use Document Queries
- Simple metadata filtering
- Exact match requirements
- When performance is critical for simple queries
- For administrative operations
API Reference
For complete details on working with retrievers, see our Retrievers API Reference.