The Reranking stage refines search results by applying more sophisticated scoring models to reorder documents based on their relevance to the query.

Overview

Reranking improves search quality by applying a more computationally intensive model to reorder an initial set of search results. While initial retrieval stages like KNN or keyword search focus on recall (finding potentially relevant documents), reranking focuses on precision (sorting results by true relevance). This two-stage approach balances efficiency with accuracy.
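
The two-stage idea can be sketched in a few lines of Python. This is a toy illustration only: `keyword_recall`, `overlap_score`, and `rerank` are hypothetical helpers, and the term-overlap scorer merely stands in for a real reranking model.

```python
def keyword_recall(query: str, corpus: dict[str, str], limit: int) -> list[str]:
    """Stage 1 (cheap, recall-oriented): keep any document sharing a query term."""
    terms = set(query.lower().split())
    hits = [doc_id for doc_id, text in corpus.items()
            if terms & set(text.lower().split())]
    return hits[:limit]


def overlap_score(query: str, text: str) -> float:
    """Stand-in for a reranking model: fraction of query terms found in the text."""
    terms = query.lower().split()
    words = set(text.lower().split())
    return sum(1 for term in terms if term in words) / len(terms)


def rerank(query: str, corpus: dict[str, str], candidates: list[str], k: int):
    """Stage 2 (costlier, precision-oriented): score each candidate, keep the top k."""
    scored = [(doc_id, overlap_score(query, corpus[doc_id])) for doc_id in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


corpus = {
    "doc_a": "search engines index documents",
    "doc_b": "improve search relevance with reranking",
    "doc_c": "cooking pasta at home",
}
query = "improve search relevance"
candidates = keyword_recall(query, corpus, limit=25)  # broad candidate set
top = rerank(query, corpus, candidates, k=2)          # precise ordering
print(top)
```

The first stage only has to avoid missing relevant documents, so it can be cheap; the expensive scorer then runs on the short candidate list rather than on the whole corpus.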

Required Inputs

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| query | string | Yes | - | The search query text |
| documents | array | Yes | - | Initial set of document IDs or search results to rerank |
| model | string | No | "mixpeek/reranker-v1" | Reranking model to use |
| k | integer | No | 10 | Number of top results to return after reranking |
| score_threshold | float | No | 0.0 | Minimum relevance score threshold for results |
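
As a rough sketch of how `k` and `score_threshold` might interact, the hypothetical `select_results` helper below drops results under the threshold and then keeps the top `k`. The order of operations and the inclusive comparison are assumptions, not documented behavior.

```python
def select_results(scored: list[dict], k: int = 10,
                   score_threshold: float = 0.0) -> list[dict]:
    """Filter by minimum score, sort by score descending, then truncate to k."""
    # Assumption: a result scoring exactly at the threshold is kept.
    kept = [doc for doc in scored if doc["reranker_score"] >= score_threshold]
    kept.sort(key=lambda doc: doc["reranker_score"], reverse=True)
    return kept[:k]


scored = [
    {"document_id": "doc_1", "reranker_score": 0.91},
    {"document_id": "doc_2", "reranker_score": 0.42},
    {"document_id": "doc_3", "reranker_score": 0.77},
    {"document_id": "doc_4", "reranker_score": 0.50},
]
top_two = select_results(scored, k=2, score_threshold=0.5)
print([doc["document_id"] for doc in top_two])
```

Under these assumptions, a high threshold can return fewer than `k` results.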

Configurations

Reranker Types

| Type | Description | Use Case |
| --- | --- | --- |
| cross_encoder | Uses query-document pairs for direct relevance scoring | Highest precision needs |
| pointwise | Scores documents individually without comparing them | Faster processing |
| listwise | Considers entire result set for optimal ordering | Complex ranking needs |
| hybrid | Combines multiple reranking approaches | Balancing precision and recall |

Model Options

| Model | Description | Strength |
| --- | --- | --- |
| mixpeek/reranker-v1 | General-purpose cross-encoder model | Balanced performance |
| mixpeek/reranker-domain-v1 | Domain-optimized for specific content types | Industry/domain-specific content |
| mixpeek/reranker-multilingual | Optimized for cross-language search | Multi-language content |
| custom | User-provided custom reranker model | Specialized use cases |

Configuration Examples

Basic Reranking
{
  "model": "mixpeek/reranker-v1",
  "k": 10,
  "reranker_type": "cross_encoder",
  "score_threshold": 0.5
}

Advanced Configuration
{
  "model": "mixpeek/reranker-domain-v1",
  "k": 25,
  "reranker_type": "hybrid",
  "score_threshold": 0.4,
  "weights": {
    "cross_encoder": 0.7,
    "semantic_similarity": 0.3
  },
  "max_context_length": 512,
  "normalize_scores": true,
  "preserve_original_order_weight": 0.1
}
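
One plausible reading of the `weights` and `preserve_original_order_weight` settings is a weighted sum of component scores, mixed with a prior that favors the original ranking. The sketch below assumes that interpretation; the `hybrid_score` function, its linear rank prior, and the sample inputs are hypothetical and not taken from the service.

```python
def hybrid_score(cross_encoder: float, semantic_similarity: float,
                 original_rank: int, total: int, weights: dict[str, float],
                 preserve_original_order_weight: float = 0.0) -> float:
    """Blend component scores, then mix in a prior based on the original rank."""
    blended = (weights["cross_encoder"] * cross_encoder
               + weights["semantic_similarity"] * semantic_similarity)
    # Assumed prior: 1.0 for the original top result, falling linearly to 0.0.
    rank_prior = 1.0 if total <= 1 else 1.0 - (original_rank - 1) / (total - 1)
    p = preserve_original_order_weight
    return (1.0 - p) * blended + p * rank_prior


weights = {"cross_encoder": 0.7, "semantic_similarity": 0.3}
score = hybrid_score(cross_encoder=0.9, semantic_similarity=0.6,
                     original_rank=3, total=5, weights=weights,
                     preserve_original_order_weight=0.1)
print(round(score, 3))
```

With `preserve_original_order_weight` at 0.0 the original order has no influence; raising it pulls the final ordering back toward the first-stage ranking.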

Advanced Options

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| max_context_length | integer | 512 | Maximum token length for document context |
| normalize_scores | boolean | true | Whether to normalize final scores to 0-1 range |
| preserve_original_order_weight | float | 0.0 | Weight given to preserving original result order |
| batching | boolean | true | Whether to process documents in batches for efficiency |
| batch_size | integer | 16 | Number of documents to process in each batch |
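
To make `batch_size` and `normalize_scores` concrete, here is a minimal sketch of batched scoring followed by min-max normalization. Min-max scaling is only one way to reach a 0-1 range and is an assumption here, as are the helper names and the `length_scorer` stand-in for a model.

```python
from typing import Callable, Iterator


def batches(items: list, batch_size: int = 16) -> Iterator[list]:
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def min_max_normalize(scores: list[float]) -> list[float]:
    """Rescale scores linearly so the lowest maps to 0.0 and the highest to 1.0."""
    if not scores:
        return []
    low, high = min(scores), max(scores)
    if high == low:
        return [1.0] * len(scores)  # assumption: identical scores all map to 1.0
    return [(s - low) / (high - low) for s in scores]


def score_in_batches(docs: list[str],
                     score_batch: Callable[[list[str]], list[float]],
                     batch_size: int = 16,
                     normalize_scores: bool = True) -> list[float]:
    """Score documents batch by batch, optionally normalizing the final scores."""
    raw: list[float] = []
    for batch in batches(docs, batch_size):
        raw.extend(score_batch(batch))
    return min_max_normalize(raw) if normalize_scores else raw


def length_scorer(batch: list[str]) -> list[float]:
    """Toy scorer standing in for a model call: longer text scores higher."""
    return [float(len(doc)) for doc in batch]


docs = ["x" * n for n in range(1, 41)]  # 40 documents of length 1..40
scores = score_in_batches(docs, length_scorer, batch_size=16)
print([len(b) for b in batches(docs, 16)], scores[0], scores[-1])
```

Normalizing after all batches have been scored, rather than per batch, keeps scores comparable across the whole result set.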

Output Schema

{
  "results": [
    {
      "document_id": "doc_abc123",
      "collection_id": "col_xyz789",
      "reranker_score": 0.953,
      "original_score": 0.821,
      "original_rank": 3,
      "metadata": {
        "title": "Advanced Search Result Ranking",
        "timestamp": "2023-05-18T09:12:43Z"
      },
      "content": "Reranking search results is a powerful technique for improving relevance..."
    },
    {
      "document_id": "doc_def456",
      "collection_id": "col_xyz789",
      "reranker_score": 0.891,
      "original_score": 0.865,
      "original_rank": 1,
      "metadata": {
        "title": "Implementing Cross-Encoders for Search",
        "timestamp": "2023-06-02T14:35:22Z"
      },
      "content": "Cross-encoder models evaluate query-document pairs directly, providing more accurate..."
    }
    // Additional results...
  ],
  "metadata": {
    "query": "how to improve search relevance",
    "total_results": 2,
    "original_results_count": 25,
    "processing_time_ms": 205.3,
    "model": "mixpeek/reranker-v1",
    "reranker_type": "cross_encoder"
  }
}
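
A response shaped like the one above can be inspected to see how far each document moved during reranking. The sketch below assumes that `results` arrives sorted by `reranker_score`, so a result's position in the list is its new rank; `rank_changes` is a hypothetical helper.

```python
def rank_changes(response: dict) -> list[tuple[str, int, int]]:
    """Return (document_id, original_rank, new_rank) in reranked order."""
    return [
        (result["document_id"], result["original_rank"], new_rank)
        for new_rank, result in enumerate(response["results"], start=1)
    ]


# Trimmed copy of the example response, keeping only the fields used here.
response = {
    "results": [
        {"document_id": "doc_abc123", "reranker_score": 0.953, "original_rank": 3},
        {"document_id": "doc_def456", "reranker_score": 0.891, "original_rank": 1},
    ]
}

for doc_id, old_rank, new_rank in rank_changes(response):
    print(f"{doc_id}: rank {old_rank} -> {new_rank}")
```

Large gaps between `original_rank` and the new position indicate where the reranker disagreed most with the initial retrieval stage.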