Learned Rerank stage showing personalized feature-level reranking
The Learned Rerank stage uses machine learning to personalize search results based on historical user interactions. It learns which document features predict engagement for different user segments and reranks results accordingly.
Stage Category: SORT (Personalized reordering)
Transformation: N documents → top_n documents (personalized ranking)

When to Use

| Use Case | Description |
| --- | --- |
| Personalization | Rank based on user preferences |
| A/B optimization | Learn from click/conversion data |
| Marketplace ranking | Optimize for engagement metrics |
| Content recommendations | Personalize content feeds |

When NOT to Use

| Scenario | Recommended Alternative |
| --- | --- |
| No interaction data | rerank (static model) |
| Privacy-sensitive contexts | Non-personalized ranking |
| Cold-start users | Fall back to rerank |
| Explainability required | Standard scoring |

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| model_id | string | Required | Trained personalization model |
| user_id | string | {{INPUT.user_id}} | User for personalization |
| features | array | auto | Document features to consider |
| top_n | integer | 10 | Number of results to return |
| exploration_rate | float | 0.1 | Exploration vs exploitation (0-1) |
| fallback | string | rerank | Fallback for cold-start |
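
As a rough illustration of how the defaults in the parameter table fit together, the sketch below builds a stage definition in Python. The helper name `build_learned_rerank_stage` and the validation rules are assumptions made for this example, not part of the Mixpeek API; only the parameter names and default values come from the table.

```python
from typing import Any, Dict, List, Optional


def build_learned_rerank_stage(
    model_id: str,
    user_id: str = "{{INPUT.user_id}}",
    features: Optional[List[str]] = None,  # None keeps the documented "auto" default
    top_n: int = 10,
    exploration_rate: float = 0.1,
    fallback: str = "rerank",
) -> Dict[str, Any]:
    """Hypothetical helper: return a learned_rerank stage dict with table defaults."""
    if not model_id:
        raise ValueError("model_id is required")
    if top_n < 1:
        raise ValueError("top_n must be a positive integer")
    if not 0.0 <= exploration_rate <= 1.0:
        raise ValueError("exploration_rate must be between 0 and 1")

    parameters: Dict[str, Any] = {
        "model_id": model_id,
        "user_id": user_id,
        "top_n": top_n,
        "exploration_rate": exploration_rate,
        "fallback": fallback,
    }
    # Only send an explicit feature list when the caller provides one.
    if features is not None:
        parameters["features"] = features
    return {"stage_type": "sort", "stage_id": "learned_rerank", "parameters": parameters}
```

Calling `build_learned_rerank_stage("search_personalization_v1")` yields a stage with `top_n` of 10 and no explicit `features` key, which mirrors the basic configuration shown in the examples.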

Features

The learned rerank stage can use various document features:
| Feature Type | Examples |
| --- | --- |
| Content | Category, topic, length, sentiment |
| Metadata | Author, date, source, price |
| Engagement | Historical CTR, conversion rate |
| Contextual | Time of day, device, location |

Configuration Examples

{
  "stage_type": "sort",
  "stage_id": "learned_rerank",
  "parameters": {
    "model_id": "search_personalization_v1",
    "user_id": "{{INPUT.user_id}}",
    "top_n": 10
  }
}

How Learned Rerank Works

  1. Feature Extraction: Extract relevant features from each document
  2. User Profile Lookup: Retrieve learned preferences for the user
  3. Score Prediction: Predict engagement probability for each doc
  4. Exploration/Exploitation: Balance known preferences with discovery
  5. Rank: Order by predicted engagement score
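
The five steps above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Mixpeek's implementation: the one-hot `field:value` features, the per-user weight dictionary, the logistic scoring function, and the random exploration boost are all hypothetical choices made to keep the example small.

```python
import math
import random
from typing import Any, Dict, List, Optional

Doc = Dict[str, Any]


def extract_features(doc: Doc) -> Dict[str, float]:
    """Step 1: turn metadata into one-hot "field:value" features (assumed encoding)."""
    return {f"{field}:{value}": 1.0 for field, value in doc.get("metadata", {}).items()}


def lookup_profile(profiles: Dict[str, Dict[str, float]], user_id: str) -> Dict[str, float]:
    """Step 2: fetch learned feature weights; unknown users get an empty profile."""
    return profiles.get(user_id, {})


def predict_engagement(features: Dict[str, float], profile: Dict[str, float]) -> float:
    """Step 3: logistic score from a weighted sum of the features."""
    z = sum(profile.get(name, 0.0) * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


def learned_rerank(
    docs: List[Doc],
    profiles: Dict[str, Dict[str, float]],
    user_id: str,
    top_n: int = 10,
    exploration_rate: float = 0.1,
    rng: Optional[random.Random] = None,
) -> List[Doc]:
    rng = rng or random.Random()
    profile = lookup_profile(profiles, user_id)
    ranked = []
    for doc in docs:
        score = predict_engagement(extract_features(doc), profile)
        # Step 4: with probability exploration_rate, add a random boost.
        boost = rng.random() if rng.random() < exploration_rate else 0.0
        info = {"personalization_score": score, "exploration_boost": boost}
        ranked.append({**doc, "learned_rerank": info})
    # Step 5: order by predicted engagement plus any exploration boost.
    ranked.sort(
        key=lambda d: d["learned_rerank"]["personalization_score"]
        + d["learned_rerank"]["exploration_boost"],
        reverse=True,
    )
    return ranked[:top_n]
```

With `exploration_rate=0.0` the ordering is fully determined by the profile weights, which makes the sketch easy to check by hand before turning exploration on.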

Bandit Learning

The system uses contextual bandits to:
  • Exploit: Rank highly items the user is likely to engage with
  • Explore: Occasionally show diverse items to learn preferences
  • Update: Learn from user interactions (clicks, purchases, etc.)
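
To make the exploit/explore/update loop concrete, here is a minimal epsilon-greedy bandit keyed by context. The source does not say which bandit algorithm is used, so epsilon-greedy is swapped in here purely as a simple, well-known stand-in; the class name, the use of a user segment as the context, and the 0/1 reward are assumptions for illustration.

```python
import random
from collections import defaultdict
from typing import Dict, Hashable, Optional, Sequence, Tuple


class EpsilonGreedyBandit:
    """Toy contextual bandit: one running-mean reward estimate per (context, arm)."""

    def __init__(self, epsilon: float = 0.1, seed: Optional[int] = None) -> None:
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts: Dict[Tuple[Hashable, Hashable], int] = defaultdict(int)
        self.values: Dict[Tuple[Hashable, Hashable], float] = defaultdict(float)

    def select(self, context: Hashable, arms: Sequence[Hashable]) -> Hashable:
        # Explore: with probability epsilon, pick a random arm.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(arms)
        # Exploit: pick the arm with the highest estimated reward for this context.
        return max(arms, key=lambda arm: self.values[(context, arm)])

    def update(self, context: Hashable, arm: Hashable, reward: float) -> None:
        # Update: incremental running mean of observed rewards.
        key = (context, arm)
        self.counts[key] += 1
        self.values[key] += (reward - self.values[key]) / self.counts[key]
```

In this sketch a click could be logged as `bandit.update("tech_enthusiast", "category:technology", 1.0)` and an ignored impression as a reward of `0.0`; a higher `epsilon` plays the same role as a higher `exploration_rate`.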

Output Schema

{
  "document_id": "doc_123",
  "content": "Document content...",
  "score": 0.85,
  "learned_rerank": {
    "personalization_score": 0.92,
    "exploration_boost": 0.0,
    "user_segment": "tech_enthusiast",
    "top_features": ["category:technology", "source:verified"]
  }
}
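
A client consuming this output might map the `learned_rerank` block onto a small typed structure. The sketch below assumes the four fields shown in the schema are always present; the class and function names are hypothetical, and treating a positive `exploration_boost` as "this result was surfaced by exploration" is an interpretation, not documented behavior.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class LearnedRerankInfo:
    personalization_score: float
    exploration_boost: float
    user_segment: str
    top_features: List[str]


def parse_learned_rerank(result: Dict[str, Any]) -> LearnedRerankInfo:
    """Build a typed view of the learned_rerank block of one result."""
    block = result["learned_rerank"]
    return LearnedRerankInfo(
        personalization_score=float(block["personalization_score"]),
        exploration_boost=float(block["exploration_boost"]),
        user_segment=str(block["user_segment"]),
        top_features=list(block["top_features"]),
    )


def was_explored(info: LearnedRerankInfo) -> bool:
    # Assumption: a positive boost means exploration influenced the rank.
    return info.exploration_boost > 0.0
```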

Performance

| Metric | Value |
| --- | --- |
| Latency | 20-50ms |
| Model inference | Real-time |
| Cold-start fallback | Automatic |
| Update frequency | Near real-time |

Training the Model

Learned rerank models are trained on interaction data:
{
  "interaction_type": "click",
  "user_id": "user_123",
  "document_id": "doc_456",
  "query": "machine learning tutorials",
  "position": 3,
  "features": {
    "category": "education",
    "author_verified": true,
    "length": "long"
  }
}
Contact Mixpeek support to set up learned rerank model training for your use case.
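
To give a feel for how events of the shape above could feed a model, the following sketch tallies which `feature:value` pairs appear in click events. It is only an illustration of aggregating interaction data; the function name is hypothetical and the actual training procedure is not described in this page.

```python
from collections import Counter
from typing import Any, Dict, Iterable


def feature_engagement_counts(
    events: Iterable[Dict[str, Any]], interaction_type: str = "click"
) -> Counter:
    """Count feature:value pairs across events of the given interaction type."""
    counts: Counter = Counter()
    for event in events:
        if event.get("interaction_type") != interaction_type:
            continue
        for name, value in event.get("features", {}).items():
            # Values are stringified, so a JSON true becomes "True" in Python.
            counts[f"{name}:{value}"] += 1
    return counts
```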

Common Pipeline Patterns

[
  {
    "stage_type": "filter",
    "stage_id": "semantic_search",
    "parameters": {
      "query": "{{INPUT.query}}",
      "vector_index": "text_extractor_v1_embedding",
      "top_k": 100
    }
  },
  {
    "stage_type": "sort",
    "stage_id": "learned_rerank",
    "parameters": {
      "model_id": "search_personalization_v1",
      "user_id": "{{INPUT.user_id}}",
      "top_n": 20
    }
  }
]

Hybrid Personalization

[
  {
    "stage_type": "filter",
    "stage_id": "hybrid_search",
    "parameters": {
      "query": "{{INPUT.query}}",
      "vector_index": "text_extractor_v1_embedding",
      "top_k": 50
    }
  },
  {
    "stage_type": "sort",
    "stage_id": "rerank",
    "parameters": {
      "model": "bge-reranker-v2-m3",
      "top_n": 30
    }
  },
  {
    "stage_type": "sort",
    "stage_id": "learned_rerank",
    "parameters": {
      "model_id": "user_preferences_v2",
      "user_id": "{{INPUT.user_id}}",
      "exploration_rate": 0.15,
      "top_n": 10
    }
  }
]

E-Commerce Product Ranking

[
  {
    "stage_type": "filter",
    "stage_id": "semantic_search",
    "parameters": {
      "query": "{{INPUT.query}}",
      "vector_index": "product_embedding",
      "top_k": 100
    }
  },
  {
    "stage_type": "filter",
    "stage_id": "structured_filter",
    "parameters": {
      "conditions": {
        "field": "metadata.in_stock",
        "operator": "eq",
        "value": true
      }
    }
  },
  {
    "stage_type": "sort",
    "stage_id": "learned_rerank",
    "parameters": {
      "model_id": "product_ranker",
      "user_id": "{{INPUT.user_id}}",
      "features": [
        "metadata.category",
        "metadata.price",
        "metadata.brand",
        "engagement.purchase_rate"
      ],
      "top_n": 24
    }
  }
]

Comparison with Other Reranking

| Stage | Personalized | Learning | Latency |
| --- | --- | --- | --- |
| rerank | No | Static model | 50-100ms |
| learned_rerank | Yes | Online learning | 20-50ms |
| mmr | No | N/A | 10-50ms |

Privacy Considerations

| Aspect | Implementation |
| --- | --- |
| User data | Aggregated features only |
| Opt-out | Supported via user settings |
| Data retention | Configurable per organization |
| Anonymization | User IDs can be hashed |
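
One way to hash user IDs on the client before passing them as `user_id` is a keyed hash. The sketch below uses HMAC-SHA256 from the Python standard library; choosing a keyed hash and the function name `pseudonymize_user_id` are assumptions for this example, since the page only states that user IDs can be hashed.

```python
import hashlib
import hmac


def pseudonymize_user_id(user_id: str, secret_key: bytes) -> str:
    """Return a stable, keyed SHA-256 digest of the user ID as a hex string."""
    return hmac.new(secret_key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()
```

The same user and key always produce the same digest, so learned preferences can still be tied to a stable identifier, while the raw ID is not sent in the request.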

Error Handling

| Error | Behavior |
| --- | --- |
| Unknown user_id | Use fallback ranker |
| Model not found | Fail with error |
| Timeout | Return unranked results |
| Missing features | Use available features |
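
The behaviors in the table can be read as a small decision procedure. The sketch below is one hypothetical way to express it; the exception class, the `models` and `known_users` lookups, and returning the input list untouched on timeout are all assumptions, not the service's actual code.

```python
from typing import Any, Callable, Dict, List, Set

Doc = Dict[str, Any]
PersonalRanker = Callable[[str, List[Doc]], List[Doc]]
StaticRanker = Callable[[List[Doc]], List[Doc]]


class ModelNotFoundError(Exception):
    """Raised when model_id does not match a trained model."""


def rerank_with_fallbacks(
    docs: List[Doc],
    user_id: str,
    model_id: str,
    models: Dict[str, PersonalRanker],
    known_users: Set[str],
    fallback_ranker: StaticRanker,
    top_n: int = 10,
) -> List[Doc]:
    # Model not found: fail with an error.
    if model_id not in models:
        raise ModelNotFoundError(model_id)
    # Unknown user_id: use the fallback ranker.
    if user_id not in known_users:
        return fallback_ranker(docs)[:top_n]
    try:
        return models[model_id](user_id, docs)[:top_n]
    except TimeoutError:
        # Timeout: return the results unranked, in their incoming order.
        return list(docs)
```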

Related

  • Rerank - Static cross-encoder reranking
  • MMR - Diversity-aware ranking
  • Interactions - Logging user interactions