KNN Search
Vector-based K-nearest neighbor search for semantic similarity
The KNN Search retriever stage performs vector similarity search to find the most semantically similar documents to a query.
Overview
KNN Search uses vector embeddings to find documents that are semantically similar to a query. It converts the query text to an embedding, measures the distance between that query vector and each document vector in a high-dimensional space, and returns the k nearest neighbors according to the specified distance metric.
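The idea can be illustrated with a brute-force sketch in plain Python. This is not the stage's implementation: the function names, the in-memory `documents` dictionary, and the toy two-dimensional vectors are assumptions made for illustration, and a production feature store would normally use an approximate index rather than scanning every vector.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def knn_search(query_vector, documents, k=10, score_threshold=0.75):
    """Score every document, drop low scores, and keep the k best.

    `documents` is a hypothetical mapping of document ID to embedding.
    """
    scored = [
        (doc_id, cosine_similarity(query_vector, vector))
        for doc_id, vector in documents.items()
    ]
    # Keep only results at or above the minimum similarity score.
    scored = [(doc_id, score) for doc_id, score in scored if score >= score_threshold]
    # Highest similarity first, then truncate to k neighbors.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy embeddings; real embeddings have hundreds of dimensions.
docs = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0]}
results = knn_search([1.0, 0.0], docs, k=2)
```

Here document `c` is orthogonal to the query, so its score falls below the threshold and only `a` and `b` are returned.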
Inputs
Parameter | Type | Required | Default | Description |
---|---|---|---|---|
query | string | Yes | - | The search query text that will be converted to a vector embedding |
k | integer | No | 10 | Number of nearest neighbors to retrieve |
feature_store_id | string | Yes | - | ID of the feature store containing the vector embeddings |
distance_metric | string | No | "cosine" | Distance metric for similarity calculation |
score_threshold | float | No | 0.75 | Minimum similarity score threshold for results |
Configurations
Distance Metrics
Metric | Description | Use Case |
---|---|---|
cosine | Measures the cosine of the angle between vectors | Text similarity, general purpose |
euclidean | Computes the Euclidean distance between vectors | Geometric embeddings |
dot_product | Calculates the dot product of vectors | When vectors are normalized |
manhattan | Computes Manhattan (L1) distance | Feature spaces with distinct dimensions |
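To make the differences between the four metrics concrete, the sketch below computes each one for the same pair of vectors. The helper names are illustrative and not part of the stage's API. Note that cosine is shown as a distance (one minus the similarity), and that the dot product equals the cosine similarity once both vectors are scaled to unit length.

```python
import math


def dot_product(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine_distance(a, b):
    # 1 - cos(angle); 0 means the vectors point in the same direction.
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    return 1.0 - dot_product(a, b) / (norm_a * norm_b)


def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan_distance(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def normalize(a):
    # Scale a vector to unit length.
    norm = math.sqrt(dot_product(a, a))
    return [x / norm for x in a]


a = [3.0, 4.0]
b = [4.0, 3.0]

metrics = {
    "cosine": cosine_distance(a, b),        # about 0.04
    "euclidean": euclidean_distance(a, b),  # sqrt(2)
    "dot_product": dot_product(a, b),       # 24.0
    "manhattan": manhattan_distance(a, b),  # 2.0
}

# After normalization, the dot product matches the cosine similarity (0.96).
normalized_dot = dot_product(normalize(a), normalize(b))
```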
Configuration Examples
Basic KNN Search
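A basic configuration supplies only the two required parameters and lets the rest fall back to their defaults. The sketch below is a hypothetical illustration: the key names and default values come from the parameter table above, but the dictionary shape, the `build_knn_config` helper, and the example feature store ID are assumptions, not the documented request format.

```python
# Defaults taken from the parameter table; the dict shape is an assumption.
DEFAULTS = {"k": 10, "distance_metric": "cosine", "score_threshold": 0.75}


def build_knn_config(query, feature_store_id, **overrides):
    """Hypothetical helper: merge required inputs with optional overrides."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    return {
        "query": query,
        "feature_store_id": feature_store_id,
        **DEFAULTS,
        **overrides,
    }


# "fs_example_123" is a placeholder ID, not a real feature store.
basic_config = build_knn_config("how do vector indexes work?", "fs_example_123")
```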
Advanced Configuration
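An advanced configuration sets every parameter explicitly. As a hedged sketch, the example below overrides all three optional values and runs a small validation pass before the configuration would be submitted. The dictionary layout, the `validate_knn_config` function, and the placeholder ID are assumptions for illustration; only the parameter names and the four metric names come from the tables above.

```python
ALLOWED_METRICS = {"cosine", "euclidean", "dot_product", "manhattan"}

# Hypothetical advanced configuration; values are illustrative only.
advanced_config = {
    "query": "contract clauses about early termination",
    "feature_store_id": "fs_example_123",  # placeholder ID
    "k": 25,
    "distance_metric": "dot_product",
    "score_threshold": 0.6,
}


def validate_knn_config(config):
    """Return a list of human-readable problems; empty means valid."""
    errors = []
    for name in ("query", "feature_store_id"):
        value = config.get(name)
        if not isinstance(value, str) or not value:
            errors.append(f"{name} is required and must be a non-empty string")
    k = config.get("k", 10)
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(k, bool) or not isinstance(k, int) or k <= 0:
        errors.append("k must be a positive integer")
    metric = config.get("distance_metric", "cosine")
    if metric not in ALLOWED_METRICS:
        errors.append(f"distance_metric must be one of {sorted(ALLOWED_METRICS)}")
    threshold = config.get("score_threshold", 0.75)
    if isinstance(threshold, bool) or not isinstance(threshold, (int, float)):
        errors.append("score_threshold must be a number")
    return errors


problems = validate_knn_config(advanced_config)
```

Collecting all problems in a list, rather than raising on the first one, lets a caller report every issue in a single pass.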
Performance Tuning
Option | Type | Default | Description |
---|---|---|---|
ef_search | integer | 100 | Size of the candidate list explored during an HNSW search (higher values: more accurate, slower queries) |
ef_construction | integer | 200 | Size of the candidate list used while building the HNSW index (higher values: more accurate index, slower build) |
m | integer | 16 | Maximum number of connections per node in each HNSW layer (higher values: better recall, more memory) |
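The tuning options can be grouped and checked together, as in the sketch below. The `HnswOptions` class and the `effective_ef_search` helper are hypothetical names, not part of the stage's API; the defaults are copied from the table above. The helper reflects a common property of HNSW implementations, where the search-time candidate list should be at least as large as the number of neighbors requested.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HnswOptions:
    """Hypothetical container for the tuning options in the table above."""

    ef_search: int = 100
    ef_construction: int = 200
    m: int = 16

    def __post_init__(self):
        # Every option must be a positive integer.
        for name in ("ef_search", "ef_construction", "m"):
            value = getattr(self, name)
            if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
                raise ValueError(f"{name} must be a positive integer, got {value!r}")


def effective_ef_search(k, options):
    """Raise ef_search to k when the caller asks for more neighbors."""
    return max(options.ef_search, k)


defaults = HnswOptions()
high_recall = HnswOptions(ef_search=400)
```

With the defaults, a request for `k=250` would search with a candidate list of 250 rather than 100, while `k=10` keeps the configured value.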
Processing Flow
Output Schema