The MMR (Maximal Marginal Relevance) stage diversifies search results by iteratively selecting documents that are both relevant to the query and different from already-selected documents. This prevents redundant results and surfaces a broader range of relevant content.
**Stage Category**: SORT (reorders with diversity)
**Transformation**: N documents → top_n documents (diverse selection)
When to Use
| Use Case | Description |
| --- | --- |
| Reduce redundancy | Avoid showing near-duplicate results |
| Exploration | Surface different aspects of a topic |
| Coverage | Ensure results span multiple subtopics |
| Recommendation diversity | Show varied options |
When NOT to Use
| Scenario | Recommended Alternative |
| --- | --- |
| Pure relevance ranking | `rerank` |
| Simple sorting | `sort_by_field` |
| Already diverse corpus | Skip MMR |
| Very small result sets | Not enough to diversify |
Parameters
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `lambda` | float | 0.5 | Balance: 0 = max diversity, 1 = max relevance |
| `top_n` | integer | 10 | Number of results to return |
| `embedding_field` | string | auto | Field containing document embeddings |
Lambda Parameter
The lambda parameter controls the relevance-diversity trade-off:
| Lambda | Behavior |
| --- | --- |
| 1.0 | Pure relevance (no diversification) |
| 0.7 | Slightly favor relevance |
| 0.5 | Balanced (default) |
| 0.3 | Favor diversity |
| 0.0 | Maximum diversity |
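Concretely, lambda weights the relevance term and (1 − lambda) weights the redundancy penalty in the scoring formula (see How MMR Works below), so at lambda = 0.3 a candidate's similarity to already-selected results counts more than twice as heavily as its relevance.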
Configuration Examples
Balanced MMR
```json
{
  "stage_type": "sort",
  "stage_id": "mmr",
  "parameters": {
    "lambda": 0.5,
    "top_n": 10
  }
}
```
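The remaining variants differ only in parameter values. The settings below are illustrative starting points rather than prescribed values: the lambda choices follow the lambda table above, and the `embedding_field` value is a placeholder for your own field name.

High Diversity

```json
{
  "stage_type": "sort",
  "stage_id": "mmr",
  "parameters": {
    "lambda": 0.3,
    "top_n": 10
  }
}
```

Relevance-Focused

```json
{
  "stage_type": "sort",
  "stage_id": "mmr",
  "parameters": {
    "lambda": 0.8,
    "top_n": 10
  }
}
```

Custom Embedding Field

```json
{
  "stage_type": "sort",
  "stage_id": "mmr",
  "parameters": {
    "lambda": 0.5,
    "top_n": 10,
    "embedding_field": "my_custom_embedding"
  }
}
```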
How MMR Works
The MMR algorithm iteratively selects documents using:
MMR = argmax_{d ∈ R \ S} [ λ × Sim(d, query) − (1 − λ) × max_{d' ∈ S} Sim(d, d') ]

where R is the candidate set and S is the set of already-selected documents.
1. **First selection**: Choose the most relevant document (S is empty, so there is no diversity penalty).
2. **Subsequent selections**: Score each remaining candidate by its relevance minus its similarity to the closest already-selected document, and take the highest scorer.
3. **Repeat**: Continue until top_n documents are selected.
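The loop is easy to express directly. Here is a minimal Python sketch of greedy MMR selection (names like `mmr_select` are illustrative, not the stage's internals); it assumes precomputed query-relevance scores and document embeddings, with cosine similarity as Sim():

```python
import numpy as np

def mmr_select(relevance, embeddings, lam=0.5, top_n=10):
    """Greedy MMR: relevance is (N,), embeddings is (N, D)."""
    # Normalize rows once so plain dot products are cosine similarities.
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    candidates = list(range(len(relevance)))
    selected = []
    while candidates and len(selected) < top_n:
        best, best_score = None, -np.inf
        for i in candidates:
            # Penalty: similarity to the closest already-selected document.
            max_sim = max((float(normed[i] @ normed[j]) for j in selected),
                          default=0.0)
            score = lam * relevance[i] - (1 - lam) * max_sim
            if score > best_score:
                best, best_score = i, score
        selected.append(best)
        candidates.remove(best)
    return selected  # indices in selection order
```

On the first iteration the selected set is empty, so the penalty is zero and the pick is pure relevance, matching step 1 above.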
Example Selection Process
| Iteration | Selected | Reason |
| --- | --- | --- |
| 1 | Doc A (0.95) | Highest relevance |
| 2 | Doc C (0.82) | High relevance, different from A |
| 3 | Doc E (0.78) | Good relevance, different from A and C |
| 4 | Doc B (0.90) | Skipped earlier due to similarity to A |
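To see why Doc B loses at iteration 2 despite outranking Doc C, assume (for illustration only) that B's similarity to A is 0.90 while C's is 0.30. At lambda = 0.5, B scores 0.5 × 0.90 − 0.5 × 0.90 = 0.00 while C scores 0.5 × 0.82 − 0.5 × 0.30 = 0.26. B's score stays flat in later rounds, but by iteration 4 the remaining candidates score lower still, so B is finally selected.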
Output Schema
```json
{
  "document_id": "doc_123",
  "content": "Document content...",
  "score": 0.85,
  "mmr": {
    "relevance_score": 0.92,
    "diversity_penalty": 0.07,
    "mmr_score": 0.85,
    "selection_order": 3
  }
}
```
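In this example, `mmr_score` is `relevance_score` minus `diversity_penalty` (0.92 − 0.07 = 0.85), and the top-level `score` matches the final MMR score.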
Performance

| Metric | Value |
| --- | --- |
| Latency | 10-50 ms |
| Complexity | O(N × top_n) |
| Memory | O(N) embeddings |
| Cost | Free (no API calls) |
Common Pipeline Patterns
Search + MMR
```json
[
  {
    "stage_type": "filter",
    "stage_id": "semantic_search",
    "parameters": {
      "query": "{{INPUT.query}}",
      "vector_index": "text_extractor_v1_embedding",
      "top_k": 100
    }
  },
  {
    "stage_type": "sort",
    "stage_id": "mmr",
    "parameters": {
      "lambda": 0.5,
      "top_n": 10
    }
  }
]
```
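MMR can only diversify the candidates the previous stage hands it, so fetch a generous pool: here top_k = 100 candidates feed a final top_n = 10, giving the algorithm room to trade redundant hits for distinct ones.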
Search + Filter + MMR + Enrich
```json
[
  {
    "stage_type": "filter",
    "stage_id": "hybrid_search",
    "parameters": {
      "query": "{{INPUT.query}}",
      "vector_index": "text_extractor_v1_embedding",
      "top_k": 50
    }
  },
  {
    "stage_type": "filter",
    "stage_id": "structured_filter",
    "parameters": {
      "conditions": {
        "field": "metadata.category",
        "operator": "in",
        "value": ["tech", "science", "business"]
      }
    }
  },
  {
    "stage_type": "sort",
    "stage_id": "mmr",
    "parameters": {
      "lambda": 0.6,
      "top_n": 15
    }
  },
  {
    "stage_type": "apply",
    "stage_id": "llm_enrichment",
    "parameters": {
      "model": "gpt-4o-mini",
      "prompt": "Summarize in one sentence",
      "output_field": "summary"
    }
  }
]
```
Diverse RAG Pipeline
```json
[
  {
    "stage_type": "filter",
    "stage_id": "semantic_search",
    "parameters": {
      "query": "{{INPUT.query}}",
      "vector_index": "text_extractor_v1_embedding",
      "top_k": 50
    }
  },
  {
    "stage_type": "sort",
    "stage_id": "mmr",
    "parameters": {
      "lambda": 0.4,
      "top_n": 8
    }
  },
  {
    "stage_type": "reduce",
    "stage_id": "summarize",
    "parameters": {
      "model": "gpt-4o",
      "prompt": "Synthesize diverse perspectives on: {{INPUT.query}}"
    }
  }
]
```
MMR vs Rerank
| Aspect | MMR | Rerank |
| --- | --- | --- |
| Goal | Diversity + relevance | Maximum relevance |
| Method | Embedding similarity | Cross-encoder scoring |
| Speed | Fast (10-50 ms) | Slower (50-100 ms) |
| Best for | Exploration, coverage | Precision, accuracy |
Tuning Lambda
| Use Case | Recommended Lambda |
| --- | --- |
| News aggregation | 0.3-0.4 (high diversity) |
| Product search | 0.5-0.6 (balanced) |
| Technical docs | 0.7-0.8 (relevance focus) |
| Legal/compliance | 0.8-0.9 (high precision) |
Start with lambda = 0.5 and adjust based on user feedback: if users complain about redundant results, lower lambda; if relevant results are being diversified away, raise it.
Error Handling
| Error | Behavior |
| --- | --- |
| Missing embeddings | Fall back to relevance sort |
| Empty input | Return an empty result set |
| top_n > input size | Return all documents |
| Invalid lambda | Clamp to [0, 1] |
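These behaviors are straightforward to mirror in a wrapper. A sketch of the guard logic (illustrative only; `mmr_select` is the toy selector from How MMR Works above, and the `score`/`embedding` document keys are assumptions based on the output schema):

```python
import numpy as np

def run_mmr_stage(docs, lam, top_n):
    lam = min(max(lam, 0.0), 1.0)  # invalid lambda: clamp to [0, 1]
    if not docs:
        return []  # empty input: return an empty result set
    if any(d.get("embedding") is None for d in docs):
        # missing embeddings: fall back to a plain relevance sort
        return sorted(docs, key=lambda d: d["score"], reverse=True)[:top_n]
    top_n = min(top_n, len(docs))  # top_n > input size: return all documents
    order = mmr_select(
        np.array([d["score"] for d in docs]),
        np.stack([np.asarray(d["embedding"]) for d in docs]),
        lam,
        top_n,
    )
    return [docs[i] for i in order]
```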