[Figure: Score Normalize stage rescaling scores from a raw range to a normalized range]
The Score Normalize stage rescales document scores using statistical normalization methods, enabling meaningful comparison across different scoring sources and consistent downstream thresholding.
Stage Category: SORT (Rescales scores)
Transformation: N documents → N documents (same order, normalized scores)

When to Use

| Use Case | Description |
|---|---|
| Hybrid search fusion | Normalize text and vector scores before combining |
| Score thresholding | Set consistent cutoffs across different retrievers |
| Cross-model comparison | Make scores from different models comparable |
| Probability ranking | Convert scores to probability distribution |
| Multi-stage pipelines | Normalize between reranking stages |

When NOT to Use

| Scenario | Recommended Alternative |
|---|---|
| Reordering by relevance | sort_relevance |
| Reranking with cross-encoders | rerank |
| Filtering by score threshold | attribute_filter on score field |
| Single scoring source | None needed (scores are already comparable) |

Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| method | string | min_max | Normalization method: min_max, z_score, softmax, l2 |
| score_field | string | score | Field containing the score to normalize |
| output_field | string | null | Write normalized score to this field (preserves original) |
| min_value | float | null | Custom minimum for min_max (uses actual min if null) |
| max_value | float | null | Custom maximum for min_max (uses actual max if null) |

Normalization Methods

| Method | Formula | Output Range | Best For |
|---|---|---|---|
| min_max | (x - min) / (max - min) | [0, 1] | Bounded comparison |
| z_score | (x - mean) / std | (-∞, +∞) | Statistical thresholding |
| softmax | exp(x) / Σ exp(x) | (0, 1), sum = 1 | Probability distribution |
| l2 | x / ‖x‖₂ | [-1, 1] | Geometric comparison |
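
The formulas in the table above can be sketched in plain Python. This is a minimal illustration, not the stage's actual implementation: the function names are hypothetical, the z-score uses the population standard deviation (an assumption, since the source does not say which variant is used), and the zero-norm fallback for l2 is likewise an assumption. The constant-score fallbacks for min_max and z_score follow the Error Handling table.

```python
import math
from typing import List


def min_max(scores: List[float]) -> List[float]:
    """(x - min) / (max - min), mapped into [0, 1]."""
    if not scores:
        return []
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # Documented behavior: identical scores all become 1.0.
        return [1.0] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]


def z_score(scores: List[float]) -> List[float]:
    """(x - mean) / std, using the population std (assumption)."""
    if not scores:
        return []
    mean = sum(scores) / len(scores)
    std = math.sqrt(sum((s - mean) ** 2 for s in scores) / len(scores))
    if std == 0.0:
        # Documented behavior: identical scores all become 0.0.
        return [0.0] * len(scores)
    return [(s - mean) / std for s in scores]


def softmax(scores: List[float]) -> List[float]:
    """exp(x) / sum(exp(x)); subtracting the max avoids overflow."""
    if not scores:
        return []
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def l2(scores: List[float]) -> List[float]:
    """x / ||x||_2, so the score vector has unit length."""
    norm = math.sqrt(sum(s * s for s in scores))
    if norm == 0.0:
        # Assumption: an all-zero vector is returned unchanged.
        return [0.0] * len(scores)
    return [s / norm for s in scores]


if __name__ == "__main__":
    print(min_max([2.0, 4.0, 6.0]))  # [0.0, 0.5, 1.0]
```

All four methods are monotonic in the raw score, which is why the stage can leave document order unchanged.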

Configuration Examples

{
  "stage_type": "sort",
  "stage_id": "score_normalize",
  "parameters": {
    "method": "min_max",
    "score_field": "score"
  }
}
Use output_field to preserve the original score alongside the normalized value. This is useful for debugging or when you need both raw and normalized scores downstream.
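
To make the effect of output_field concrete, here is a small Python sketch of the idea under stated assumptions. The function `normalize_documents` and the field name `score_normalized` are hypothetical, and only min_max is shown; the stage's real implementation may differ.

```python
from typing import Any, Dict, List, Optional


def normalize_documents(
    docs: List[Dict[str, Any]],
    score_field: str = "score",
    output_field: Optional[str] = None,
) -> List[Dict[str, Any]]:
    """Min-max normalize scores, optionally writing to a separate field."""
    if not docs:
        return []
    scores = [float(d[score_field]) for d in docs]
    lo, hi = min(scores), max(scores)
    span = hi - lo
    # With no output_field, the normalized value replaces the raw score.
    target = output_field or score_field
    result = []
    for doc, raw in zip(docs, scores):
        updated = dict(doc)  # copy so the caller's documents are untouched
        updated[target] = 1.0 if span == 0 else (raw - lo) / span
        result.append(updated)
    return result


docs = [{"id": "a", "score": 10.0}, {"id": "b", "score": 20.0}]
kept = normalize_documents(docs, output_field="score_normalized")
# Each document in `kept` now carries both "score" and "score_normalized".
```

Writing to a separate field costs one extra value per document, which is usually a small price for being able to inspect raw scores while tuning thresholds.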

Performance

| Metric | Value |
|---|---|
| Latency | < 1ms |
| Memory | O(N) for score array |
| Cost | Free |
| Complexity | O(N) (two passes: stats + normalize) |

Common Pipeline Patterns

Hybrid Search Fusion

[
  {
    "stage_type": "filter",
    "stage_id": "feature_search",
    "parameters": {
      "feature_uris": [{"input": {"text": "{{INPUT.query}}"}, "uri": "mixpeek://text_extractor@v1/embedding"}],
      "limit": 50
    }
  },
  {
    "stage_type": "sort",
    "stage_id": "score_normalize",
    "parameters": {
      "method": "min_max",
      "score_field": "score"
    }
  },
  {
    "stage_type": "sort",
    "stage_id": "rerank",
    "parameters": {
      "inference_name": "baai_bge_reranker_v2_m3",
      "query": "{{INPUT.query}}",
      "document_field": "content"
    }
  }
]

Score Thresholding After Normalization

[
  {
    "stage_type": "filter",
    "stage_id": "feature_search",
    "parameters": {
      "feature_uris": [{"input": {"text": "{{INPUT.query}}"}, "uri": "mixpeek://text_extractor@v1/embedding"}],
      "limit": 100
    }
  },
  {
    "stage_type": "sort",
    "stage_id": "score_normalize",
    "parameters": {
      "method": "min_max"
    }
  },
  {
    "stage_type": "filter",
    "stage_id": "attribute_filter",
    "parameters": {
      "AND": [
        {"field": "score", "operator": "gte", "value": 0.5}
      ]
    }
  }
]
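
The normalize-then-filter pattern above can be sketched in Python as follows. The helper names are hypothetical and the code only mirrors the pipeline's logic. One consequence worth noting: when min_max uses the actual minimum and maximum, the top document always maps to 1.0 and the bottom one to 0.0, so a 0.5 cutoff is relative to each result set rather than an absolute relevance level.

```python
from typing import Any, Dict, List


def min_max(scores: List[float]) -> List[float]:
    """Rescale scores into [0, 1] using the observed min and max."""
    if not scores:
        return []
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [1.0] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]


def normalize_and_filter(
    docs: List[Dict[str, Any]], cutoff: float = 0.5
) -> List[Dict[str, Any]]:
    """Normalize the "score" field, then keep documents with score >= cutoff."""
    normalized = min_max([d["score"] for d in docs])
    return [
        {**doc, "score": score}
        for doc, score in zip(docs, normalized)
        if score >= cutoff
    ]


results = [
    {"id": "a", "score": 5.0},
    {"id": "b", "score": 3.0},
    {"id": "c", "score": 2.0},
    {"id": "d", "score": 1.0},
]
survivors = normalize_and_filter(results)
# Normalized scores are 1.0, 0.5, 0.25, 0.0, so "a" and "b" survive.
```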

Error Handling

| Error | Behavior |
|---|---|
| Single document | min_max returns 1.0; z_score returns 0.0 |
| All same scores | min_max returns 1.0 for all; z_score returns 0.0 for all |
| Score field missing | Treated as 0.0 |
| Non-numeric score | Treated as 0.0 |
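
The fallback rules in the table above can be illustrated with a short Python sketch. The helper `read_score` is hypothetical, and what counts as "non-numeric" is an assumption here: strings (even numeric-looking ones), booleans, and null are all coerced to 0.0. The stage itself may draw that line differently.

```python
from typing import Any, Dict, List


def read_score(doc: Dict[str, Any], score_field: str = "score") -> float:
    """Return the document's score, or 0.0 if it is missing or non-numeric."""
    value = doc.get(score_field)
    # bool is a subclass of int in Python; rejecting it is an assumption.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return 0.0
    return float(value)


def min_max(scores: List[float]) -> List[float]:
    """Min-max normalization with the documented degenerate-case fallback."""
    if not scores:
        return []
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # Single document or all-equal scores: every value becomes 1.0.
        return [1.0] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]


docs = [{"score": 0.8}, {"score": "high"}, {}, {"score": 0.4}]
raw = [read_score(d) for d in docs]   # [0.8, 0.0, 0.0, 0.4]
normalized = min_max(raw)             # [1.0, 0.0, 0.0, 0.5]
```

Because missing and non-numeric scores become 0.0, they can shift the observed minimum and therefore the normalized value of every other document, as the last entry in the example shows.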