Filters allow you to narrow down search results based on document metadata, enhancing search precision by combining semantic search with exact metadata matching.
Overview
Filters in Mixpeek enable you to refine search results by applying conditions to document metadata fields. They complement semantic vector search by allowing precise matching on structured data, such as dates, categories, numerical values, and tags.
Metadata Filtering
Apply conditions to document metadata fields to narrow down search results
Hybrid Search
Combine semantic similarity with metadata filtering for precise retrieval
Filter Usage
Filters can be used in two main contexts within Mixpeek:
Filter Stages in Retrievers
Filters can be applied as dedicated stages within retriever pipelines, allowing you to refine results from previous stages based on metadata criteria.
Search Query Filters
Filters can be applied directly in search queries to filter results at query time.
Filter Operators
Mixpeek supports a comprehensive set of filter operators for different data types:
Operator | Description | Example |
---|---|---|
$eq | Equals | {"field": {"$eq": value}} |
$ne | Not equals | {"field": {"$ne": value}} |
$gt | Greater than | {"field": {"$gt": value}} |
$gte | Greater than or equal | {"field": {"$gte": value}} |
$lt | Less than | {"field": {"$lt": value}} |
$lte | Less than or equal | {"field": {"$lte": value}} |
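To make the operator semantics concrete, here is a minimal local sketch that evaluates the six comparison operators from the table against in-memory metadata. It is not Mixpeek's implementation: the `matches` helper, the sample documents, and the rule that a missing field never matches are all assumptions made for illustration.

```python
import operator

# Comparison operators from the table above, mapped to Python equivalents.
OPERATORS = {
    "$eq": operator.eq,
    "$ne": operator.ne,
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
}


def matches(metadata, filter_spec):
    """Return True if `metadata` satisfies every condition in `filter_spec`.

    Assumption: a field missing from the metadata never matches.
    """
    for field, conditions in filter_spec.items():
        if field not in metadata:
            return False
        for op_name, expected in conditions.items():
            if not OPERATORS[op_name](metadata[field], expected):
                return False
    return True


# Hypothetical documents, for illustration only.
documents = [
    {"id": "a", "category": "sports", "views": 120},
    {"id": "b", "category": "news", "views": 40},
    {"id": "c", "category": "sports", "views": 15},
]

popular = [d["id"] for d in documents if matches(d, {"views": {"$gt": 30}})]
not_news = [d["id"] for d in documents if matches(d, {"category": {"$ne": "news"}})]
print(popular)   # ['a', 'b']
print(not_news)  # ['a', 'c']
```

A helper like this can be handy for unit-testing filter definitions locally before sending them to the service.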
Basic Filter Examples
Simple Equality Filter
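An equality filter follows the `{"field": {"$eq": value}}` shape from the operator table. The sketch below builds one as a Python dictionary and serializes it to JSON; the field name `category` and the value `"sports"` are hypothetical.

```python
import json

# Hypothetical field name and value, for illustration only.
equality_filter = {"category": {"$eq": "sports"}}

# Serialize to the JSON form shown in the operator table.
payload = json.dumps(equality_filter)
print(payload)  # {"category": {"$eq": "sports"}}
```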
Numeric Range Filter
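A numeric range can be expressed with the `$gt`, `$gte`, `$lt`, and `$lte` operators. The sketch below assumes that two bounds placed under the same field are combined with a logical AND, which is a common convention but is not confirmed by this page; the `price` field and the `in_range` helper are hypothetical.

```python
# Hypothetical field name; combining both bounds with AND is an assumption.
price_filter = {"price": {"$gte": 10, "$lt": 50}}


def in_range(value, bounds):
    """Check `value` against whichever range operators are present in `bounds`."""
    checks = {
        "$gt": lambda v, b: v > b,
        "$gte": lambda v, b: v >= b,
        "$lt": lambda v, b: v < b,
        "$lte": lambda v, b: v <= b,
    }
    return all(checks[op](value, bound) for op, bound in bounds.items())


prices = [5, 10, 49.99, 50]
kept = [p for p in prices if in_range(p, price_filter["price"])]
print(kept)  # [10, 49.99]
```

Using an inclusive lower bound and an exclusive upper bound keeps adjacent ranges from overlapping.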
Date Range Filter
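Date ranges use the same range operators as numeric ranges. The sketch below assumes dates are encoded as ISO 8601 strings, which this page does not specify; the `created_at` field and the `in_date_range` helper are hypothetical, and the check runs locally rather than against Mixpeek.

```python
from datetime import datetime

# Hypothetical field name; ISO 8601 strings are an assumed date encoding.
date_filter = {
    "created_at": {"$gte": "2024-01-01T00:00:00", "$lt": "2024-02-01T00:00:00"}
}


def in_date_range(timestamp, bounds):
    """Parse ISO 8601 strings and compare them as datetimes."""
    value = datetime.fromisoformat(timestamp)
    start = datetime.fromisoformat(bounds["$gte"])
    end = datetime.fromisoformat(bounds["$lt"])
    return start <= value < end


stamps = ["2023-12-31T23:59:59", "2024-01-15T08:30:00", "2024-02-01T00:00:00"]
january = [s for s in stamps if in_date_range(s, date_filter["created_at"])]
print(january)  # ['2024-01-15T08:30:00']
```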
Array Contains Filter
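The `$contains` operator is named on this page as an array filter, but its exact syntax is not shown. The sketch below assumes it follows the same `{"field": {"$contains": value}}` shape as the comparison operators and matches when the array holds the given value; the `tags` field and the `array_contains` helper are hypothetical.

```python
# Hypothetical field name; the {"field": {"$contains": value}} shape is assumed.
tag_filter = {"tags": {"$contains": "outdoor"}}


def array_contains(metadata, filter_spec):
    """Return True if every filtered array field holds the requested value."""
    for field, condition in filter_spec.items():
        values = metadata.get(field, [])  # a missing field is treated as empty
        if condition["$contains"] not in values:
            return False
    return True


# Hypothetical documents, for illustration only.
documents = [
    {"id": "a", "tags": ["outdoor", "hiking"]},
    {"id": "b", "tags": ["indoor"]},
    {"id": "c"},
]

hits = [d["id"] for d in documents if array_contains(d, tag_filter)]
print(hits)  # ['a']
```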
Advanced Filter Examples
Logical Combinations
Negation
Complex Array Operations
Using Filters in Retrievers
Filter Stage in a Retriever Pipeline
Creating a Retriever with Filters
Using Filters in Search Queries
Filters can be applied directly in search queries to filter results at query time.
Filter Optimization
Pre-filtering vs. Post-filtering
Pre-filtering
When to use:
- To reduce the dataset size before vector search
- For filters that can significantly reduce the number of candidates
- When metadata fields have indexes
Post-filtering
When to use:
- After vector search to refine semantically relevant results
- For more complex filters or combinations
- When vector similarity is the primary ranking factor
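The difference between the two strategies can be seen in a toy, in-memory sketch. It uses brute-force cosine similarity over hypothetical two-dimensional vectors and a plain Python predicate in place of a real filter; none of the function names below are part of Mixpeek.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def top_k(docs, query, k):
    """Return the k documents most similar to the query vector."""
    return sorted(docs, key=lambda d: cosine(d["vector"], query), reverse=True)[:k]


def pre_filter_search(docs, query, predicate, k):
    """Filter first, then rank only the surviving candidates."""
    return top_k([d for d in docs if predicate(d)], query, k)


def post_filter_search(docs, query, predicate, k):
    """Rank everything first, then drop results that fail the filter."""
    return [d for d in top_k(docs, query, k) if predicate(d)]


def is_english(doc):
    return doc["lang"] == "en"


# Hypothetical documents and query vector, for illustration only.
docs = [
    {"id": "a", "lang": "en", "vector": [1.0, 0.0]},
    {"id": "b", "lang": "de", "vector": [0.9, 0.1]},
    {"id": "c", "lang": "en", "vector": [0.6, 0.8]},
    {"id": "d", "lang": "en", "vector": [0.0, 1.0]},
]
query = [1.0, 0.0]

pre = [d["id"] for d in pre_filter_search(docs, query, is_english, k=2)]
post = [d["id"] for d in post_filter_search(docs, query, is_english, k=2)]
print(pre)   # ['a', 'c']
print(post)  # ['a']
```

Pre-filtering always fills the requested `k` slots when enough documents pass the filter, while post-filtering can return fewer than `k` results because filtered-out documents still occupy places in the initial ranking.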
Indexing for Filters
For optimal filter performance, ensure that frequently filtered fields are properly indexed in your collections. Index the following types of fields:
Equality Filters
Fields used in equality filters ($eq, $in, exact matches) should be indexed as keyword type.
Range Filters
Fields used in range filters ($gt, $lt, etc.) should be indexed as their appropriate numeric or date types.
Text Filters
Fields used in text search filters ($text, $regex) should be indexed as text type.
Array Filters
Fields used in array filters ($contains, $all) should be indexed as keyword arrays.
Best Practices
1. Filter Early
Apply filters as early as possible in the pipeline to reduce the dataset size before more expensive operations.
2. Index Key Fields
Ensure fields used frequently in filters are properly indexed in your collections.
3. Use Precise Filters
Be as specific as possible with filter criteria to narrow down results effectively.
4. Avoid Over-Filtering
Balance filter specificity with result diversity. Overly restrictive filters may eliminate potentially relevant results.
Complex filters with many nested logical operations can impact query performance. When possible, simplify filters and ensure indexed fields are used for optimal performance.
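One way to check that a filter relies only on suitably indexed fields is to encode the indexing guidance from this page as a lookup table. The sketch below is a local helper, not a Mixpeek feature: `suggested_index_types` is a hypothetical name, it handles only flat filters of the form `{"field": {"$op": value}}`, and operators the page gives no guidance for are reported as "unspecified".

```python
# Operator-to-index-type guidance, taken from the "Indexing for Filters" section.
INDEX_TYPE_BY_OPERATOR = {
    "$eq": "keyword",
    "$in": "keyword",
    "$gt": "numeric or date",
    "$gte": "numeric or date",
    "$lt": "numeric or date",
    "$lte": "numeric or date",
    "$text": "text",
    "$regex": "text",
    "$contains": "keyword array",
    "$all": "keyword array",
}


def suggested_index_types(filter_spec):
    """Map each field in a flat filter to the index types its operators suggest."""
    suggestions = {}
    for field, conditions in filter_spec.items():
        types = {INDEX_TYPE_BY_OPERATOR.get(op, "unspecified") for op in conditions}
        suggestions[field] = sorted(types)
    return suggestions


# Hypothetical filter, for illustration only.
example_filter = {
    "category": {"$eq": "sports"},
    "views": {"$gte": 10, "$lt": 100},
    "tags": {"$contains": "outdoor"},
}

report = suggested_index_types(example_filter)
print(report)
# {'category': ['keyword'], 'views': ['numeric or date'], 'tags': ['keyword array']}
```

Comparing a report like this with the index configuration of a collection highlights filtered fields that are missing a matching index.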