The Document Enrich stage performs collection joins by looking up related documents from other Mixpeek collections. This enables cross-reference enrichment without external database calls.
Stage Category: APPLY (enriches documents)
Transformation: N documents → N documents (with joined data added)
When to Use
| Use Case | Description |
| --- | --- |
| Cross-collection joins | Link products to reviews, users to profiles |
| Reference resolution | Expand foreign keys to full documents |
| Denormalization | Flatten related data for display |
| Multi-index search | Combine results from different collections |
When NOT to Use
| Scenario | Recommended Alternative |
| --- | --- |
| External database joins | sql_lookup |
| Single collection search | No enrichment needed |
| Real-time external data | api_call |
Parameters
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| collection_id | string | Required | Target collection to join from |
| lookup_field | string | Required | Field in target to match against |
| source_field | string | Required | Field in source document to use as key |
| result_field | string | enriched_data | Field to store joined data |
| select_fields | array | null | Specific fields to return (null = all) |
| multiple | boolean | false | Return multiple matching documents |
| limit | integer | 10 | Max documents when multiple: true |
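The required parameters and defaults listed above can be checked client-side before a pipeline is submitted. The following is a minimal sketch, not part of any Mixpeek SDK: `build_document_enrich_stage`, `DEFAULTS`, and `REQUIRED` are hypothetical names, and whether the service applies the same defaults when a parameter is omitted is an assumption.

```python
from typing import Any

# Defaults and required names copied from the parameter table.
DEFAULTS = {
    "result_field": "enriched_data",
    "select_fields": None,
    "multiple": False,
    "limit": 10,
}
REQUIRED = ("collection_id", "lookup_field", "source_field")


def build_document_enrich_stage(**params: Any) -> dict:
    """Hypothetical helper: return a stage config with documented defaults filled in."""
    missing = [name for name in REQUIRED if not params.get(name)]
    if missing:
        raise ValueError(f"missing required parameters: {', '.join(missing)}")
    unknown = set(params) - set(REQUIRED) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown parameters: {', '.join(sorted(unknown))}")
    return {
        "stage_type": "apply",
        "stage_id": "document_enrich",
        # Explicit arguments override the documented defaults.
        "parameters": {**DEFAULTS, **params},
    }


stage = build_document_enrich_stage(
    collection_id="user_profiles",
    lookup_field="user_id",
    source_field="metadata.author_id",
    result_field="author",
)
```

Rejecting unknown parameter names early catches typos such as `select_field` before the request leaves the client.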
Configuration Examples
Basic Collection Join
Select Specific Fields
Multiple Related Documents
Nested Field Lookup
{
  "stage_type": "apply",
  "stage_id": "document_enrich",
  "parameters": {
    "collection_id": "user_profiles",
    "lookup_field": "user_id",
    "source_field": "metadata.author_id",
    "result_field": "author"
  }
}
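The source_field value "metadata.author_id" reads as a dot-separated path into the source document. A minimal sketch of that kind of path resolution follows; `resolve_path` is a hypothetical helper, and treating each dot as one level of object nesting is an assumption about how the stage interprets the path. The sketch returns a default for an absent path, whereas the actual stage may instead fail (see Error Handling).

```python
from typing import Any


def resolve_path(document: dict, path: str, default: Any = None) -> Any:
    """Follow a dot-separated path through nested dicts.

    Returns `default` as soon as a segment is absent or the current
    value is not a dict (an assumption made for this sketch).
    """
    current: Any = document
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


doc = {"document_id": "doc_123", "metadata": {"author_id": "user_42"}}
join_key = resolve_path(doc, "metadata.author_id")
print(join_key)
```

The resolved value is what would be compared against lookup_field ("user_id") in the target collection.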
Output Schema
Single Document Join
{
  "document_id": "doc_123",
  "content": "Product review content...",
  "metadata": {
    "product_id": "prod_456"
  },
  "product_details": {
    "name": "Wireless Headphones",
    "price": 199.99,
    "category": "Electronics",
    "image_url": "https://..."
  }
}
Multiple Documents Join
{
  "document_id": "prod_456",
  "content": "Product description...",
  "reviews": [
    {
      "document_id": "rev_1",
      "rating": 5,
      "text": "Great product!"
    },
    {
      "document_id": "rev_2",
      "rating": 4,
      "text": "Good value"
    }
  ]
}
No Match Found
{
  "document_id": "doc_123",
  "content": "...",
  "enriched_data": null
}
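The three output shapes above (single object, list, null) can be reproduced with a small in-memory model of the join. This is an illustrative sketch only: `enrich`, `project`, and `get_path` are hypothetical helpers, the target collection is a plain Python list, lookup_field is treated as a top-level field, and returning an empty list when multiple is true and nothing matches is an assumption that the source does not confirm.

```python
from typing import Any, Optional


def get_path(doc: dict, path: str) -> Any:
    """Follow a dot-separated path; return None if any segment is absent."""
    current: Any = doc
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def project(doc: dict, select_fields: Optional[list]) -> dict:
    """Keep only select_fields, or every field when select_fields is None."""
    if select_fields is None:
        return dict(doc)
    return {f: doc[f] for f in select_fields if f in doc}


def enrich(document: dict, target: list, *, lookup_field: str, source_field: str,
           result_field: str = "enriched_data", select_fields: Optional[list] = None,
           multiple: bool = False, limit: int = 10) -> dict:
    """Return a copy of `document` with the joined data stored under result_field."""
    key = get_path(document, source_field)
    matches = [t for t in target if key is not None and t.get(lookup_field) == key]
    if multiple:
        joined: Any = [project(m, select_fields) for m in matches[:limit]]
    else:
        joined = project(matches[0], select_fields) if matches else None
    return {**document, result_field: joined}


products = [{"product_id": "prod_456", "name": "Wireless Headphones", "price": 199.99}]
reviews = [
    {"document_id": "rev_1", "product_id": "prod_456", "rating": 5},
    {"document_id": "rev_2", "product_id": "prod_456", "rating": 4},
    {"document_id": "rev_3", "product_id": "prod_456", "rating": 3},
]

single = enrich({"document_id": "doc_123", "metadata": {"product_id": "prod_456"}},
                products, lookup_field="product_id",
                source_field="metadata.product_id", result_field="product_details",
                select_fields=["name", "price"])
product = {"document_id": "prod_456", "content": "Product description..."}
many = enrich(product, reviews, lookup_field="product_id",
              source_field="document_id", result_field="reviews",
              multiple=True, limit=2)
no_match = enrich({"document_id": "doc_999", "metadata": {"product_id": "unknown"}},
                  products, lookup_field="product_id",
                  source_field="metadata.product_id")
```

Each call returns one output document per input document, which matches the N documents → N documents transformation described at the top of the page.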
Performance
| Metric | Value |
| --- | --- |
| Latency | 5-20ms per document |
| Batch processing | Automatic batching |
| Index usage | Uses collection indexes |
| Parallel execution | Up to 10 concurrent lookups |
Ensure lookup_field is indexed in the target collection for optimal performance. Use select_fields to reduce payload size.
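To make the batching and concurrency figures concrete, here is a rough sketch of how lookups might be deduplicated, grouped into batches, and run with at most 10 in flight. It is not Mixpeek's implementation: `batched_lookup`, `chunked`, and `fetch_batch` are hypothetical, the batch size of 100 is an arbitrary assumption, and the in-memory `USERS` dict stands in for an indexed collection.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable

MAX_CONCURRENT_LOOKUPS = 10  # concurrency figure from the performance table


def chunked(items: list, size: int) -> Iterable[list]:
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def batched_lookup(keys: list, fetch_batch: Callable[[list], dict],
                   batch_size: int = 100) -> dict:
    """Fetch each distinct key once, in batches, with bounded concurrency."""
    unique_keys = list(dict.fromkeys(keys))  # dedupe while keeping order
    results: dict = {}
    with ThreadPoolExecutor(max_workers=MAX_CONCURRENT_LOOKUPS) as pool:
        for partial in pool.map(fetch_batch, chunked(unique_keys, batch_size)):
            results.update(partial)
    return results


# Stand-in for an indexed target collection, keyed by lookup_field.
USERS = {"u1": {"name": "Ada"}, "u2": {"name": "Lin"}}
calls: list = []


def fetch_batch(batch: list) -> dict:
    """Hypothetical batch fetch: return the documents found for these keys."""
    calls.append(list(batch))
    return {key: USERS[key] for key in batch if key in USERS}


found = batched_lookup(["u1", "u2", "u1", "u3"], fetch_batch, batch_size=2)
```

Deduplicating keys before fetching means that documents sharing the same author or product cost one lookup rather than one per document.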
Common Pipeline Patterns
Search + Author Enrichment
[
  {
    "stage_type": "filter",
    "stage_id": "semantic_search",
    "parameters": {
      "query": "{{INPUT.query}}",
      "vector_index": "text_extractor_v1_embedding",
      "top_k": 20
    }
  },
  {
    "stage_type": "apply",
    "stage_id": "document_enrich",
    "parameters": {
      "collection_id": "users",
      "lookup_field": "user_id",
      "source_field": "metadata.author_id",
      "result_field": "author",
      "select_fields": ["name", "avatar", "bio"]
    }
  }
]
Product Search with Reviews
[
  {
    "stage_type": "filter",
    "stage_id": "hybrid_search",
    "parameters": {
      "query": "{{INPUT.query}}",
      "vector_index": "text_extractor_v1_embedding",
      "top_k": 10
    }
  },
  {
    "stage_type": "apply",
    "stage_id": "document_enrich",
    "parameters": {
      "collection_id": "reviews",
      "lookup_field": "product_id",
      "source_field": "document_id",
      "result_field": "recent_reviews",
      "multiple": true,
      "limit": 3,
      "select_fields": ["rating", "text", "author_name", "created_at"]
    }
  }
]
Hierarchical Category Enrichment
[
  {
    "stage_type": "filter",
    "stage_id": "semantic_search",
    "parameters": {
      "query": "{{INPUT.query}}",
      "vector_index": "text_extractor_v1_embedding",
      "top_k": 50
    }
  },
  {
    "stage_type": "apply",
    "stage_id": "document_enrich",
    "parameters": {
      "collection_id": "categories",
      "lookup_field": "category_id",
      "source_field": "metadata.category_id",
      "result_field": "category"
    }
  },
  {
    "stage_type": "apply",
    "stage_id": "document_enrich",
    "parameters": {
      "collection_id": "categories",
      "lookup_field": "category_id",
      "source_field": "category.parent_id",
      "result_field": "parent_category"
    }
  }
]
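The hierarchical pattern works because the second enrich stage reads its key from the field written by the first ("category.parent_id"). A small in-memory sketch of that chaining follows; `enrich_one` and `get_path` are hypothetical helpers, the category records are invented sample data, and returning null when the path cannot be resolved is an assumption of the sketch rather than documented stage behavior.

```python
from typing import Any


def get_path(doc: dict, path: str) -> Any:
    """Follow a dot-separated path; return None if any segment is absent."""
    current: Any = doc
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def enrich_one(doc: dict, target: list, lookup_field: str,
               source_field: str, result_field: str) -> dict:
    """Single-match join: attach the first matching target document, or None."""
    key = get_path(doc, source_field)
    match = None
    if key is not None:
        match = next((t for t in target if t.get(lookup_field) == key), None)
    return {**doc, result_field: match}


# Invented sample data for illustration.
CATEGORIES = [
    {"category_id": "cat_audio", "name": "Audio", "parent_id": "cat_electronics"},
    {"category_id": "cat_electronics", "name": "Electronics", "parent_id": None},
]

doc = {"document_id": "doc_1", "metadata": {"category_id": "cat_audio"}}

# Stage 1: resolve the document's own category.
step1 = enrich_one(doc, CATEGORIES, "category_id", "metadata.category_id", "category")
# Stage 2: resolve the parent, keyed on the field that stage 1 just added.
step2 = enrich_one(step1, CATEGORIES, "category_id", "category.parent_id", "parent_category")
```

One consequence of this ordering: if the first stage restricted its output with select_fields, that list would presumably need to include parent_id for the second stage to find its key.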
Error Handling
| Error | Behavior |
| --- | --- |
| Collection not found | Stage fails |
| No matching document | result_field set to null |
| Invalid field path | Stage fails with error |
| Timeout | Continues with null result |
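The split between failing errors and null results can be modeled in a few lines. This is a sketch under stated assumptions, not the service's code: `StageError`, `enrich_with_policy`, `find_first`, and `find_slow` are hypothetical names, a field path with an empty segment is treated as "invalid" (the source does not define what counts as invalid), and Python's built-in `TimeoutError` stands in for a lookup timeout.

```python
from typing import Any, Callable


class StageError(Exception):
    """Hypothetical error type standing in for a failed stage."""


def enrich_with_policy(document: dict, collections: dict, *, collection_id: str,
                       lookup_field: str, source_field: str,
                       result_field: str = "enriched_data",
                       find: Callable[[list, str, Any], Any]) -> dict:
    # Collection not found: the stage fails.
    if collection_id not in collections:
        raise StageError(f"collection not found: {collection_id}")
    # Invalid field path (assumed here to mean an empty segment): the stage fails.
    segments = source_field.split(".")
    if any(not segment for segment in segments):
        raise StageError(f"invalid field path: {source_field!r}")
    key: Any = document
    for segment in segments:
        key = key.get(segment) if isinstance(key, dict) else None
    try:
        match = find(collections[collection_id], lookup_field, key)
    except TimeoutError:
        # Timeout: continue with a null result instead of failing.
        match = None
    # No matching document: result_field is set to None (JSON null).
    return {**document, result_field: match}


def find_first(docs: list, field: str, value: Any) -> Any:
    """Return the first document whose `field` equals `value`, else None."""
    if value is None:
        return None
    return next((d for d in docs if d.get(field) == value), None)


def find_slow(docs: list, field: str, value: Any) -> Any:
    """Simulate a lookup that exceeds its time budget."""
    raise TimeoutError("lookup timed out")


collections = {"users": [{"user_id": "u1", "name": "Ada"}]}
doc = {"metadata": {"author_id": "u1"}}
common = dict(collection_id="users", lookup_field="user_id",
              source_field="metadata.author_id", result_field="author")

ok = enrich_with_policy(doc, collections, find=find_first, **common)
timed_out = enrich_with_policy(doc, collections, find=find_slow, **common)
```

Because both a timeout and a missing match yield null, downstream stages that read result_field should be prepared to handle null in either case.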
vs Other Enrichment Stages
| Feature | document_enrich | sql_lookup | api_call |
| --- | --- | --- | --- |
| Data source | Mixpeek collections | SQL database | External API |
| Latency | 5-20ms | 10-100ms | 50-500ms |
| Best for | Cross-collection | External relational | REST APIs |
| Setup | None | Connection config | Endpoint config |