Limit stage showing result truncation to top-N documents
The Limit stage truncates the document set to a maximum number of results, optionally with an offset for pagination-style behavior. This is the retriever pipeline equivalent of SQL’s LIMIT/OFFSET clause.
Stage Category: REDUCE (Truncates documents)
Transformation: N documents → min(N, limit) documents

When to Use

| Use Case | Description |
| --- | --- |
| Top-K results | Return only the best N results after reranking |
| Pagination | Implement page-based result access with offset |
| Cost control | Cap document count before expensive LLM stages |
| Fixed output | Guarantee exactly N results for downstream consumers |
| Mid-pipeline trim | Reduce candidates between expensive stages |

When NOT to Use

| Scenario | Recommended Alternative |
| --- | --- |
| Random sampling | sample stage |
| Filtering by criteria | attribute_filter or llm_filter |
| Initial retrieval limit | Set limit in feature_search directly (see the sketch after this table) |
| Statistical reduction | aggregate stage |
| Grouping results | group_by stage |
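
If all that is needed is a cap on initial retrieval, set limit on feature_search itself instead of adding a separate stage. A minimal sketch, reusing the feature_search configuration from the pipeline patterns below:

{
  "stage_type": "filter",
  "stage_id": "feature_search",
  "parameters": {
    "feature_uris": [{"input": {"text": "{{INPUT.query}}"}, "uri": "mixpeek://text_extractor@v1/embedding"}],
    "limit": 10
  }
}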

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| limit | integer | 10 | Maximum number of documents to return (1-10000) |
| offset | integer | 0 | Number of documents to skip from the beginning (0-10000) |

Configuration Examples

{
  "stage_type": "reduce",
  "stage_id": "limit",
  "parameters": {
    "limit": 10
  }
}
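
For pagination-style access, combine limit with the offset parameter. As a sketch, this configuration skips the first 10 documents and returns the next 10, i.e. the second page when pages hold 10 results:

{
  "stage_type": "reduce",
  "stage_id": "limit",
  "parameters": {
    "limit": 10,
    "offset": 10
  }
}
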
Place the limit stage after sorting/reranking to ensure you’re keeping the highest-quality results. Limiting before reranking loses potentially relevant documents.

Performance

| Metric | Value |
| --- | --- |
| Latency | < 1 ms |
| Memory | O(1) |
| Cost | Free |
| Complexity | O(1) list slicing |

Common Pipeline Patterns

Rerank Then Limit

[
  {
    "stage_type": "filter",
    "stage_id": "feature_search",
    "parameters": {
      "feature_uris": [{"input": {"text": "{{INPUT.query}}"}, "uri": "mixpeek://text_extractor@v1/embedding"}],
      "limit": 100
    }
  },
  {
    "stage_type": "sort",
    "stage_id": "rerank",
    "parameters": {
      "inference_name": "baai_bge_reranker_v2_m3",
      "query": "{{INPUT.query}}",
      "document_field": "content"
    }
  },
  {
    "stage_type": "reduce",
    "stage_id": "limit",
    "parameters": {
      "limit": 10
    }
  }
]

Cost-Controlled LLM Pipeline

[
  {
    "stage_type": "filter",
    "stage_id": "feature_search",
    "parameters": {
      "feature_uris": [{"input": {"text": "{{INPUT.query}}"}, "uri": "mixpeek://text_extractor@v1/embedding"}],
      "limit": 200
    }
  },
  {
    "stage_type": "reduce",
    "stage_id": "limit",
    "parameters": {
      "limit": 20
    }
  },
  {
    "stage_type": "apply",
    "stage_id": "llm_enrich",
    "parameters": {
      "provider": "openai",
      "model_name": "gpt-4o-mini",
      "prompt": "Summarize: {{DOC.content}}",
      "output_field": "summary"
    }
  }
]
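
Paginated Results

A sketch of page-based access, assuming 10 results per page and a request for the third page (offset 20). It reuses the retrieval and rerank stages from the patterns above:

[
  {
    "stage_type": "filter",
    "stage_id": "feature_search",
    "parameters": {
      "feature_uris": [{"input": {"text": "{{INPUT.query}}"}, "uri": "mixpeek://text_extractor@v1/embedding"}],
      "limit": 100
    }
  },
  {
    "stage_type": "sort",
    "stage_id": "rerank",
    "parameters": {
      "inference_name": "baai_bge_reranker_v2_m3",
      "query": "{{INPUT.query}}",
      "document_field": "content"
    }
  },
  {
    "stage_type": "reduce",
    "stage_id": "limit",
    "parameters": {
      "limit": 10,
      "offset": 20
    }
  }
]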

Error Handling

| Error | Behavior |
| --- | --- |
| Limit > input count | Returns all available documents |
| Offset > input count | Returns empty result set |
| Empty input | Returns empty result set |
| Offset + Limit > count | Returns documents from offset to end |

For example, with 30 input documents, a limit of 10 and an offset of 25 returns the 5 documents from position 26 through the end, while an offset of 35 returns an empty result set.

Related Stages

  • Sample - Random or stratified sampling
  • Deduplicate - Remove duplicates before limiting
  • Rerank - Re-score before limiting to ensure best results