Hybrid Search
Combined vector and keyword search for optimal retrieval results
The Hybrid Search retriever stage combines semantic vector search with keyword-based search to balance precision and recall.
Overview
Hybrid Search combines semantic (vector) and lexical (keyword) search. Vector embeddings capture context and meaning, so a query can match documents that share no exact terms with it, while keyword matching keeps precision on specific terms such as identifiers or product names. Merging the two result sets generally gives more robust results than either method alone.
Inputs
Parameter | Type | Required | Default | Description |
---|---|---|---|---|
query | string | Yes | - | The search query text |
k | integer | No | 10 | Number of results to retrieve |
feature_store_id | string | Yes | - | ID of the feature store containing vector embeddings |
index_id | string | Yes | - | ID of the keyword index |
vector_weight | float | No | 0.5 | Weight given to vector search results (0.0-1.0) |
keyword_weight | float | No | 0.5 | Weight given to keyword search results (0.0-1.0) |
Configurations
Search Weighting
Results from the two methods are combined using two weights, each between 0.0 and 1.0 and defaulting to 0.5:
Parameter | Description | Impact |
---|---|---|
vector_weight | Weight assigned to vector search results | Higher values favor semantic similarity |
keyword_weight | Weight assigned to keyword search results | Higher values favor exact keyword matches |
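To make the effect of the two weights concrete, here is a minimal Python sketch of a weighted score combination. It is not the stage's actual implementation: the helper names, the min-max normalization step, and the choice to score a document as 0.0 in a list where it does not appear are all assumptions made for illustration.

```python
def min_max_normalize(scores):
    """Rescale a {doc_id: score} mapping into the 0.0-1.0 range."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores are equal, so treat every document as a full match.
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def weighted_scores(vector_scores, keyword_scores,
                    vector_weight=0.5, keyword_weight=0.5):
    """Combine two score mappings into one list sorted best-first.

    Assumption: a document missing from one result list contributes
    0.0 for that method.
    """
    v = min_max_normalize(vector_scores)
    kw = min_max_normalize(keyword_scores)
    combined = {
        doc_id: vector_weight * v.get(doc_id, 0.0)
        + keyword_weight * kw.get(doc_id, 0.0)
        for doc_id in set(v) | set(kw)
    }
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


# Hypothetical raw scores: cosine similarity vs. an unbounded keyword score.
vector_scores = {"a": 1.0, "b": 0.5, "c": 0.0}
keyword_scores = {"b": 12.0, "d": 4.0}

semantic_first = weighted_scores(vector_scores, keyword_scores, 0.7, 0.3)
keyword_first = weighted_scores(vector_scores, keyword_scores, 0.3, 0.7)
```

With `vector_weight` at 0.7, document `a` (the strongest semantic match) ranks first; with `keyword_weight` at 0.7, document `b` (the strongest keyword match) takes the top spot.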
Merging Methods
Method | Description | Use Case |
---|---|---|
linear_combination | Weighted average of both search scores | General purpose, balanced approach |
reciprocal_rank_fusion | Combines result rankings rather than scores | When score scales differ significantly |
cross_encoder_reranking | Uses a model to rerank combined results | When highest precision is required |
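For comparison with score-based merging, the following Python sketch shows the general reciprocal rank fusion technique, which uses only each document's position in each result list. The function name and the smoothing constant `c=60` (a value commonly used for this technique) are assumptions; the constant the stage actually uses is not documented here.

```python
def reciprocal_rank_fusion(rankings, c=60):
    """Fuse ranked lists of document IDs, each ordered best-first.

    Every document earns 1 / (c + rank) from each list it appears in,
    so raw score scales never enter the calculation.
    """
    fused = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (c + rank)
    return sorted(fused, key=fused.get, reverse=True)


# Hypothetical result lists from the two search methods.
vector_ranking = ["a", "b", "c"]
keyword_ranking = ["b", "d", "a"]

fused_ranking = reciprocal_rank_fusion([vector_ranking, keyword_ranking])
```

Documents that appear in both lists accumulate two contributions, which is why `b` and `a` end up ahead of `d` and `c` in this example.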
Configuration Examples
Basic Hybrid Search
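A basic configuration only needs the required inputs; everything else falls back to the documented defaults. The sketch below expresses this as a Python dictionary. The parameter names come from the inputs table, but the flat dictionary shape, the query text, and the two placeholder IDs are assumptions for illustration.

```python
# Only the required inputs are set; k, vector_weight, and keyword_weight
# are left at their documented defaults (10, 0.5, and 0.5).
basic_stage = {
    "query": "how do I rotate API keys?",
    "feature_store_id": "fs_example",  # placeholder ID, not a real store
    "index_id": "idx_example",         # placeholder ID, not a real index
}

REQUIRED_INPUTS = ("query", "feature_store_id", "index_id")
missing = [name for name in REQUIRED_INPUTS if name not in basic_stage]
```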
Advanced Configuration
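An advanced configuration might adjust the weights, the merging method, and the candidate pool sizes. The Python dictionary below is a sketch under stated assumptions: `merge_method` is a hypothetical key name (this page lists the merging methods but not the parameter that selects one), the IDs are placeholders, and placing the advanced options at the top level rather than in a nested object is a guess.

```python
advanced_stage = {
    "query": "error code E1042 during checkout",
    "k": 5,
    "feature_store_id": "fs_example",     # placeholder ID
    "index_id": "idx_example",            # placeholder ID
    "vector_weight": 0.3,                 # de-emphasize semantic similarity
    "keyword_weight": 0.7,                # favor the exact term "E1042"
    "merge_method": "linear_combination", # hypothetical key name
    "min_score": 0.2,                     # stricter than the 0.1 default
    "vector_k": 25,                       # overrides the k * 3 default
    "keyword_k": 25,                      # overrides the k * 3 default
    "reranker_model": None,               # only relevant when reranking
}
```

The keyword-heavy weighting suits this query because it contains an exact identifier that a keyword index is likely to match more reliably than an embedding.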
Advanced Options
Option | Type | Default | Description |
---|---|---|---|
min_score | float | 0.1 | Minimum combined score threshold for results |
vector_k | integer | k * 3 | Number of candidates to retrieve from vector search |
keyword_k | integer | k * 3 | Number of candidates to retrieve from keyword search |
reranker_model | string | null | Model identifier for cross-encoder reranking |
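The defaults in this table depend on `k`, so it can help to see how they might be resolved and how `min_score` might trim the merged results. This Python sketch uses hypothetical helper names and assumes the threshold is inclusive (a score equal to `min_score` is kept); the page does not specify either detail.

```python
def resolve_advanced_options(k=10, min_score=0.1, vector_k=None,
                             keyword_k=None, reranker_model=None):
    """Fill in the documented defaults; candidate pools default to k * 3."""
    return {
        "min_score": min_score,
        "vector_k": vector_k if vector_k is not None else k * 3,
        "keyword_k": keyword_k if keyword_k is not None else k * 3,
        "reranker_model": reranker_model,
    }


def apply_threshold(scored, k, min_score):
    """Drop results below min_score, then keep the k best.

    Assumption: the comparison is inclusive (score >= min_score).
    """
    kept = [(doc_id, score) for doc_id, score in scored if score >= min_score]
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept[:k]


options = resolve_advanced_options(k=10)
merged = [("a", 0.8), ("b", 0.05), ("c", 0.4)]  # hypothetical combined scores
final = apply_threshold(merged, k=2, min_score=options["min_score"])
```

Retrieving more candidates than `k` from each method gives the merge step room to promote documents that rank moderately in both lists over documents that rank highly in only one.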
Processing Flow
Output Schema