[Figure: Document Enrich stage showing collection joins and cross-reference lookups]
The Document Enrich stage performs collection joins by looking up related documents from other Mixpeek collections. This enables cross-reference enrichment without external database calls.
Stage Category: APPLY (enriches documents)
Transformation: N documents → N documents (with joined data added)

When to Use

| Use Case | Description |
| --- | --- |
| Cross-collection joins | Link products to reviews, users to profiles |
| Reference resolution | Expand foreign keys to full documents |
| Denormalization | Flatten related data for display |
| Multi-index search | Combine results from different collections |

When NOT to Use

| Scenario | Recommended Alternative |
| --- | --- |
| External database joins | sql_lookup |
| Single collection search | No enrichment needed |
| Real-time external data | api_call |

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| collection_id | string | Required | Target collection to join from |
| lookup_field | string | Required | Field in the target collection to match against |
| source_field | string | Required | Field in the source document to use as the key |
| result_field | string | enriched_data | Field in which to store the joined data |
| select_fields | array | null | Specific fields to return (null = all) |
| multiple | boolean | false | Return multiple matching documents |
| limit | integer | 10 | Maximum number of documents returned when multiple is true |
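
To make the parameter semantics concrete, the following is a minimal in-memory sketch of how such a join could behave. It is an illustration under stated assumptions, not the stage's actual implementation: the function names `get_path` and `enrich` are hypothetical, the target collection is a plain Python list, and returning an empty list when `multiple` is true and nothing matches is an assumption the parameter table does not confirm.

```python
from typing import Any, Optional


def get_path(doc: dict, path: str) -> Any:
    """Resolve a dotted path such as 'metadata.author_id'; return None if absent."""
    current: Any = doc
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def enrich(
    doc: dict,
    target_docs: list,
    lookup_field: str,
    source_field: str,
    result_field: str = "enriched_data",
    select_fields: Optional[list] = None,
    multiple: bool = False,
    limit: int = 10,
) -> dict:
    """Hypothetical stand-in for the join: match target docs on lookup_field."""
    key = get_path(doc, source_field)
    matches = [
        t for t in target_docs
        if key is not None and t.get(lookup_field) == key
    ]
    if select_fields is not None:
        # Keep only the requested fields from each matched document.
        matches = [{f: m[f] for f in select_fields if f in m} for m in matches]
    enriched = dict(doc)  # shallow copy so the input document is not mutated
    if multiple:
        # Assumption: an empty list (not null) when nothing matches.
        enriched[result_field] = matches[:limit]
    else:
        enriched[result_field] = matches[0] if matches else None
    return enriched


users = [{"user_id": "u1", "name": "Ada", "bio": "Engineer"}]
doc = {"document_id": "doc_123", "metadata": {"author_id": "u1"}}
result = enrich(
    doc, users,
    lookup_field="user_id",
    source_field="metadata.author_id",
    result_field="author",
    select_fields=["name"],
)
no_match = enrich(
    {"document_id": "doc_999", "metadata": {}}, users,
    lookup_field="user_id",
    source_field="metadata.author_id",
)
```

Here `result["author"]` holds only the `name` field because of `select_fields`, while `no_match` falls back to the default `enriched_data` field with a null value.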

Configuration Examples

{
  "stage_type": "apply",
  "stage_id": "document_enrich",
  "parameters": {
    "collection_id": "user_profiles",
    "lookup_field": "user_id",
    "source_field": "metadata.author_id",
    "result_field": "author"
  }
}
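
When stage configurations are generated programmatically, it can help to check required parameters and fill in defaults before sending them. The sketch below does this using the required fields and defaults listed in the Parameters table; the helper name `build_document_enrich_stage` is hypothetical and is not part of any Mixpeek SDK.

```python
import json

# Required parameters and defaults, as listed in the Parameters table.
REQUIRED = ("collection_id", "lookup_field", "source_field")
DEFAULTS = {
    "result_field": "enriched_data",
    "select_fields": None,
    "multiple": False,
    "limit": 10,
}


def build_document_enrich_stage(**parameters):
    """Hypothetical helper: validate parameters and return a stage config dict."""
    missing = [name for name in REQUIRED if name not in parameters]
    if missing:
        raise ValueError("missing required parameters: " + ", ".join(missing))
    unknown = sorted(set(parameters) - set(REQUIRED) - set(DEFAULTS))
    if unknown:
        raise ValueError("unknown parameters: " + ", ".join(unknown))
    return {
        "stage_type": "apply",
        "stage_id": "document_enrich",
        "parameters": {**DEFAULTS, **parameters},
    }


stage = build_document_enrich_stage(
    collection_id="reviews",
    lookup_field="product_id",
    source_field="document_id",
    result_field="recent_reviews",
    multiple=True,
    limit=3,
)
print(json.dumps(stage, indent=2))
```

Spelling out the defaults in the generated config is a design choice made here for readability; omitting them should be equivalent if the stage applies the same defaults.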

Output Schema

Single Document Join

{
  "document_id": "doc_123",
  "content": "Product review content...",
  "metadata": {
    "product_id": "prod_456"
  },
  "product_details": {
    "name": "Wireless Headphones",
    "price": 199.99,
    "category": "Electronics",
    "image_url": "https://..."
  }
}

Multiple Documents Join

{
  "document_id": "prod_456",
  "content": "Product description...",
  "reviews": [
    {
      "document_id": "rev_1",
      "rating": 5,
      "text": "Great product!"
    },
    {
      "document_id": "rev_2",
      "rating": 4,
      "text": "Good value"
    }
  ]
}

No Match Found

{
  "document_id": "doc_123",
  "content": "...",
  "enriched_data": null
}
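
Because unmatched documents stay in the result set with a null result field, downstream code usually needs to handle that case explicitly. A minimal sketch, assuming the pipeline results are available as a list of dictionaries (the function name `split_by_enrichment` is hypothetical):

```python
def split_by_enrichment(documents, result_field="enriched_data"):
    """Separate documents whose join succeeded from those left with null."""
    matched, unmatched = [], []
    for doc in documents:
        if doc.get(result_field) is None:
            unmatched.append(doc)
        else:
            matched.append(doc)
    return matched, unmatched


results = [
    {"document_id": "doc_123", "content": "...", "enriched_data": None},
    {"document_id": "doc_124", "content": "...", "enriched_data": {"name": "Ada"}},
]
matched, unmatched = split_by_enrichment(results)
```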

Performance

| Metric | Value |
| --- | --- |
| Latency | 5-20 ms per document |
| Batch processing | Automatic batching |
| Index usage | Uses collection indexes |
| Parallel execution | Up to 10 concurrent lookups |
Ensure lookup_field is indexed in the target collection for optimal performance. Use select_fields to reduce payload size.
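
For intuition about what a cap of 10 concurrent lookups means, the sketch below bounds parallel work with an `asyncio.Semaphore`. This illustrates the general technique only and makes no claim about how the stage is implemented internally; `fake_lookup` is a hypothetical stand-in that sleeps instead of querying a collection, and the `tracker` dictionary exists only to record peak concurrency.

```python
import asyncio

MAX_CONCURRENT_LOOKUPS = 10  # mirrors the limit quoted in the table above


async def fake_lookup(key, tracker):
    """Hypothetical lookup: records concurrency, then simulates I/O latency."""
    tracker["active"] += 1
    tracker["peak"] = max(tracker["peak"], tracker["active"])
    await asyncio.sleep(0.01)
    tracker["active"] -= 1
    return {"key": key}


async def bounded_lookups(keys, tracker):
    """Run one lookup per key, never more than the cap at the same time."""
    semaphore = asyncio.Semaphore(MAX_CONCURRENT_LOOKUPS)

    async def guarded(key):
        async with semaphore:
            return await fake_lookup(key, tracker)

    # gather returns results in the same order as the input keys.
    return await asyncio.gather(*(guarded(k) for k in keys))


tracker = {"active": 0, "peak": 0}
lookup_results = asyncio.run(
    bounded_lookups([f"user_{i}" for i in range(25)], tracker)
)
```

With 25 keys and a cap of 10, the lookups complete in roughly three waves, which is why per-document latency and result-set size both matter for total stage time.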

Common Pipeline Patterns

Search + Author Enrichment

[
  {
    "stage_type": "filter",
    "stage_id": "semantic_search",
    "parameters": {
      "query": "{{INPUT.query}}",
      "vector_index": "text_extractor_v1_embedding",
      "top_k": 20
    }
  },
  {
    "stage_type": "apply",
    "stage_id": "document_enrich",
    "parameters": {
      "collection_id": "users",
      "lookup_field": "user_id",
      "source_field": "metadata.author_id",
      "result_field": "author",
      "select_fields": ["name", "avatar", "bio"]
    }
  }
]

Product Search with Reviews

[
  {
    "stage_type": "filter",
    "stage_id": "hybrid_search",
    "parameters": {
      "query": "{{INPUT.query}}",
      "vector_index": "text_extractor_v1_embedding",
      "top_k": 10
    }
  },
  {
    "stage_type": "apply",
    "stage_id": "document_enrich",
    "parameters": {
      "collection_id": "reviews",
      "lookup_field": "product_id",
      "source_field": "document_id",
      "result_field": "recent_reviews",
      "multiple": true,
      "limit": 3,
      "select_fields": ["rating", "text", "author_name", "created_at"]
    }
  }
]

Hierarchical Category Enrichment

[
  {
    "stage_type": "filter",
    "stage_id": "semantic_search",
    "parameters": {
      "query": "{{INPUT.query}}",
      "vector_index": "text_extractor_v1_embedding",
      "top_k": 50
    }
  },
  {
    "stage_type": "apply",
    "stage_id": "document_enrich",
    "parameters": {
      "collection_id": "categories",
      "lookup_field": "category_id",
      "source_field": "metadata.category_id",
      "result_field": "category"
    }
  },
  {
    "stage_type": "apply",
    "stage_id": "document_enrich",
    "parameters": {
      "collection_id": "categories",
      "lookup_field": "category_id",
      "source_field": "category.parent_id",
      "result_field": "parent_category"
    }
  }
]
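
The second enrichment stage in this pattern reads `category.parent_id`, a field that exists only because the first stage wrote its result to `category`. The following in-memory sketch traces that dependency. It is an assumption-laden simulation, not the actual engine: `resolve` and `run_enrich_stages` are hypothetical names, and the sample categories are invented for illustration.

```python
def resolve(doc, path):
    """Follow a dotted path; return None if any segment is missing."""
    for key in path.split("."):
        if not isinstance(doc, dict):
            return None
        doc = doc.get(key)
    return doc


def run_enrich_stages(doc, stages, collections):
    """Apply single-match enrich stages in order, each seeing prior results."""
    out = dict(doc)
    for params in stages:
        key = resolve(out, params["source_field"])
        target = collections[params["collection_id"]]
        match = next(
            (t for t in target
             if key is not None and t.get(params["lookup_field"]) == key),
            None,
        )
        out[params["result_field"]] = match
    return out


# Invented sample data: a two-level category tree.
collections = {
    "categories": [
        {"category_id": "c1", "name": "Electronics", "parent_id": None},
        {"category_id": "c2", "name": "Headphones", "parent_id": "c1"},
    ]
}
stages = [
    {"collection_id": "categories", "lookup_field": "category_id",
     "source_field": "metadata.category_id", "result_field": "category"},
    {"collection_id": "categories", "lookup_field": "category_id",
     "source_field": "category.parent_id", "result_field": "parent_category"},
]
product = {"document_id": "prod_456", "metadata": {"category_id": "c2"}}
enriched = run_enrich_stages(product, stages, collections)
root = run_enrich_stages(
    {"document_id": "prod_789", "metadata": {"category_id": "c1"}},
    stages, collections,
)
```

In this sketch a top-level category has no parent, so `root["parent_category"]` ends up null, matching the no-match behavior described under Output Schema.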

Error Handling

| Error | Behavior |
| --- | --- |
| Collection not found | Stage fails |
| No matching document | result_field set to null |
| Invalid field path | Stage fails with error |
| Timeout | Continues with null result |

document_enrich vs Other Enrichment Stages

| Feature | document_enrich | sql_lookup | api_call |
| --- | --- | --- | --- |
| Data source | Mixpeek collections | SQL database | External API |
| Latency | 5-20 ms | 10-100 ms | 50-500 ms |
| Best for | Cross-collection | External relational | REST APIs |
| Setup | None | Connection config | Endpoint config |