*Figure: MMR stage showing diversity-aware result selection.*
The MMR (Maximal Marginal Relevance) stage diversifies search results by iteratively selecting documents that are both relevant to the query and different from already-selected documents. This prevents redundant results and surfaces a broader range of relevant content.
**Stage Category:** SORT (reorders with diversity)

**Transformation:** N documents → top_n documents (diverse selection)

## When to Use

| Use Case | Description |
| --- | --- |
| Reduce redundancy | Avoid showing near-duplicate results |
| Exploration | Surface different aspects of a topic |
| Coverage | Ensure results span multiple subtopics |
| Recommendation diversity | Show varied options |

## When NOT to Use

| Scenario | Recommended Alternative |
| --- | --- |
| Pure relevance ranking | `rerank` |
| Simple sorting | `sort_by_field` |
| Already diverse corpus | Skip MMR |
| Very small result sets | Skip MMR (too few results to diversify) |

## Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `lambda` | float | 0.5 | Relevance-diversity balance: 0 = maximum diversity, 1 = maximum relevance |
| `top_n` | integer | 10 | Number of results to return |
| `embedding_field` | string | auto | Field containing document embeddings |

### Lambda Parameter

The `lambda` parameter controls the relevance-diversity trade-off:

| Lambda | Behavior |
| --- | --- |
| 1.0 | Pure relevance (no diversification) |
| 0.7 | Slightly favor relevance |
| 0.5 | Balanced (default) |
| 0.3 | Favor diversity |
| 0.0 | Maximum diversity |

## Configuration Examples

```json
{
  "stage_type": "sort",
  "stage_id": "mmr",
  "parameters": {
    "lambda": 0.5,
    "top_n": 10
  }
}
```

## How MMR Works

The MMR algorithm iteratively selects documents from the candidate set R, given the query q and the set S of already-selected documents:

```
MMR = argmax_{d ∈ R \ S} [ λ · Sim(d, q) − (1 − λ) · max_{d' ∈ S} Sim(d, d') ]
```

1. **First selection:** choose the most relevant document.
2. **Subsequent selections:** balance relevance against similarity to already-selected documents.
3. **Repeat** until `top_n` documents are selected.
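
As a concrete illustration, here is a minimal Python sketch of this greedy loop over raw embeddings. It is a sketch under assumptions, not the stage's actual implementation: `cosine_sim` and `mmr_select` are hypothetical names, and the real stage reads vectors from the configured `embedding_field`.

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity between two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def mmr_select(query_emb, doc_embs, lam=0.5, top_n=10):
    """Greedy MMR selection; returns document indices in selection order."""
    relevance = [cosine_sim(query_emb, e) for e in doc_embs]
    candidates = set(range(len(doc_embs)))

    # Step 1: the first pick is simply the most relevant document.
    first = max(candidates, key=lambda i: relevance[i])
    selected = [first]
    candidates.remove(first)

    # Steps 2-3: trade relevance off against similarity to picks so far.
    while candidates and len(selected) < top_n:
        def mmr_score(i):
            # Diversity penalty: max similarity to any already-selected doc.
            penalty = max(cosine_sim(doc_embs[i], doc_embs[j]) for j in selected)
            return lam * relevance[i] - (1 - lam) * penalty
        best = max(candidates, key=mmr_score)
        selected.append(best)
        candidates.remove(best)
    return selected
```

With `lam=1.0` the loop degenerates to a pure relevance sort; lowering `lam` increases the weight of the diversity penalty, matching the lambda table above.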

### Example Selection Process

| Iteration | Selected | Reason |
| --- | --- | --- |
| 1 | Doc A (0.95) | Highest relevance |
| 2 | Doc C (0.82) | High relevance, different from A |
| 3 | Doc E (0.78) | Good relevance, different from A and C |
| 4 | Doc B (0.90) | Skipped earlier due to similarity to A |
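
To see why Doc B was deferred in iteration 2 despite its 0.90 relevance, plug numbers into the formula. The relevance scores come from the table above, but the document-to-document similarities (0.85 and 0.20) are hypothetical values chosen for illustration:

```python
lam = 0.5

# Doc B: highly similar to already-selected Doc A (assumed Sim = 0.85)
mmr_b = lam * 0.90 - (1 - lam) * 0.85   # -> 0.025

# Doc C: dissimilar to Doc A (assumed Sim = 0.20)
mmr_c = lam * 0.82 - (1 - lam) * 0.20   # -> 0.310

assert mmr_c > mmr_b  # Doc C wins iteration 2 despite lower raw relevance
```

Doc B is not discarded; it is selected in iteration 4, once the remaining candidates score lower still.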

## Output Schema

```json
{
  "document_id": "doc_123",
  "content": "Document content...",
  "score": 0.85,
  "mmr": {
    "relevance_score": 0.92,
    "diversity_penalty": 0.07,
    "mmr_score": 0.85,
    "selection_order": 3
  }
}
```

In this example, `mmr_score` is `relevance_score` minus `diversity_penalty` (0.92 − 0.07 = 0.85), and the top-level `score` matches it.

## Performance

| Metric | Value |
| --- | --- |
| Latency | 10-50 ms |
| Complexity | O(N × top_n) |
| Memory | O(N) embeddings |
| Cost | Free (no API calls) |

## Common Pipeline Patterns

### Search + MMR

```json
[
  {
    "stage_type": "filter",
    "stage_id": "semantic_search",
    "parameters": {
      "query": "{{INPUT.query}}",
      "vector_index": "text_extractor_v1_embedding",
      "top_k": 100
    }
  },
  {
    "stage_type": "sort",
    "stage_id": "mmr",
    "parameters": {
      "lambda": 0.5,
      "top_n": 10
    }
  }
]
```

### Search + Filter + MMR + Enrich

```json
[
  {
    "stage_type": "filter",
    "stage_id": "hybrid_search",
    "parameters": {
      "query": "{{INPUT.query}}",
      "vector_index": "text_extractor_v1_embedding",
      "top_k": 50
    }
  },
  {
    "stage_type": "filter",
    "stage_id": "structured_filter",
    "parameters": {
      "conditions": {
        "field": "metadata.category",
        "operator": "in",
        "value": ["tech", "science", "business"]
      }
    }
  },
  {
    "stage_type": "sort",
    "stage_id": "mmr",
    "parameters": {
      "lambda": 0.6,
      "top_n": 15
    }
  },
  {
    "stage_type": "apply",
    "stage_id": "llm_enrichment",
    "parameters": {
      "model": "gpt-4o-mini",
      "prompt": "Summarize in one sentence",
      "output_field": "summary"
    }
  }
]
```

### Diverse RAG Pipeline

```json
[
  {
    "stage_type": "filter",
    "stage_id": "semantic_search",
    "parameters": {
      "query": "{{INPUT.query}}",
      "vector_index": "text_extractor_v1_embedding",
      "top_k": 50
    }
  },
  {
    "stage_type": "sort",
    "stage_id": "mmr",
    "parameters": {
      "lambda": 0.4,
      "top_n": 8
    }
  },
  {
    "stage_type": "reduce",
    "stage_id": "summarize",
    "parameters": {
      "model": "gpt-4o",
      "prompt": "Synthesize diverse perspectives on: {{INPUT.query}}"
    }
  }
]
```

## MMR vs Rerank

| Aspect | MMR | Rerank |
| --- | --- | --- |
| Goal | Diversity + relevance | Maximum relevance |
| Method | Embedding similarity | Cross-encoder scoring |
| Speed | Fast (10-50 ms) | Slower (50-100 ms) |
| Best for | Exploration, coverage | Precision, accuracy |

## Tuning Lambda

| Use Case | Recommended Lambda |
| --- | --- |
| News aggregation | 0.3-0.4 (high diversity) |
| Product search | 0.5-0.6 (balanced) |
| Technical docs | 0.7-0.8 (relevance focus) |
| Legal/compliance | 0.8-0.9 (high precision) |

Start with `lambda = 0.5` and adjust based on user feedback: if users complain about redundant results, lower `lambda`; if they miss relevant results, raise it.
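
As a hedged sketch of that feedback loop (the feedback labels and the 0.1 step size below are assumptions for illustration, not part of the stage API):

```python
def adjust_lambda(current: float, feedback: str, step: float = 0.1) -> float:
    """Nudge lambda from coarse user feedback, clamped to [0, 1]."""
    if feedback == "redundant":   # results too similar -> favor diversity
        current -= step
    elif feedback == "missing":   # relevant results absent -> favor relevance
        current += step
    return min(1.0, max(0.0, current))
```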

## Error Handling

| Error | Behavior |
| --- | --- |
| Missing embeddings | Fall back to relevance sort |
| Empty input | Return empty |
| `top_n` > input size | Return all documents |
| Invalid `lambda` | Clamp to [0, 1] |
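
The same fallbacks can be sketched in Python, reusing the hypothetical `mmr_select` from "How MMR Works" above. The list-of-dict document shape and field names here are assumptions for illustration, not the stage's real internals:

```python
def mmr_with_fallbacks(docs, query_emb, lam, top_n):
    """Apply the error-handling rules from the table above."""
    if not docs:                        # empty input -> return empty
        return []
    lam = min(1.0, max(0.0, lam))       # invalid lambda -> clamp to [0, 1]
    if any(d.get("embedding") is None for d in docs):
        # Missing embeddings -> fall back to a plain relevance sort.
        return sorted(docs, key=lambda d: d.get("score", 0.0), reverse=True)[:top_n]
    top_n = min(top_n, len(docs))       # top_n > input size -> return all
    order = mmr_select(query_emb, [d["embedding"] for d in docs], lam, top_n)
    return [docs[i] for i in order]
```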