Unwind


The Unwind stage decomposes array fields into separate documents, producing one output document per array element. This is the retriever pipeline equivalent of MongoDB’s $unwind, Snowflake’s LATERAL FLATTEN, and Spark’s explode().

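Stage Category: APPLY (expands documents)

Transformation: N documents → M documents, one per array element. M ≥ N as long as no document is dropped; documents with null, missing, or empty arrays are dropped unless `preserve_null_and_empty` is set, so M can be smaller than N.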

When to Use

| Use Case | Description |
| --- | --- |
| Tag expansion | Decompose multi-tag documents for per-tag analysis |
| Segment decomposition | Flatten video/audio segments into individual results |
| Author attribution | Expand author lists for per-author scoring |
| Chunk flattening | Convert grouped chunks back into individual documents |
| Category expansion | Expand multi-category items for faceted search |

When NOT to Use

| Scenario | Recommended Alternative |
| --- | --- |
| Filtering documents | `attribute_filter` or `llm_filter` |
| Restructuring without expansion | `json_transform` |
| Sorting documents | `sort_attribute` or `sort_relevance` |
| Grouping documents | `group_by` (inverse operation) |

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `field` | string | required | Dot-notation path to the array field to unwind |
| `preserve_null_and_empty` | boolean | `false` | Keep documents whose array is null, missing, or empty |
| `include_array_index` | string | `null` | Name of the field in which to store the element’s array index |
| `output_field` | string | `null` | Place the unwound element in this field instead of replacing the array field |

Configuration Examples

{
  "stage_type": "apply",
  "stage_id": "unwind",
  "parameters": {
    "field": "metadata.tags"
  }
}
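
To show the optional parameters together, here is a minimal Python sketch that assembles an Unwind stage configuration as a dictionary and prints it as JSON. The helper name `build_unwind_stage` is hypothetical and not part of any SDK; only the keys and defaults are taken from the Parameters table, and the sketch assumes that optional parameters left at their defaults can simply be omitted.

```python
import json


def build_unwind_stage(field, preserve_null_and_empty=False,
                       include_array_index=None, output_field=None):
    """Hypothetical helper: assemble an Unwind stage config as a plain dict."""
    if not field:
        raise ValueError("field is required")
    parameters = {"field": field}
    # Only emit optional parameters that differ from their documented defaults.
    if preserve_null_and_empty:
        parameters["preserve_null_and_empty"] = True
    if include_array_index is not None:
        parameters["include_array_index"] = include_array_index
    if output_field is not None:
        parameters["output_field"] = output_field
    return {"stage_type": "apply", "stage_id": "unwind", "parameters": parameters}


stage = build_unwind_stage(
    "content.segments",
    preserve_null_and_empty=True,
    include_array_index="segment_index",
    output_field="current_segment",
)
print(json.dumps(stage, indent=2))
```

The field names `segment_index` and `current_segment` are illustrative; any names that do not collide with existing document fields should work the same way.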

How It Works

1. For each input document, the stage extracts the value at the specified field path.
2. If the value is an array with K elements, it produces K output documents.
3. Each output document preserves all original fields, with the array field replaced by a single element (or with the element written to `output_field` when that parameter is set).
4. Documents with null, missing, or empty arrays are dropped or preserved according to `preserve_null_and_empty`.
5. Non-array values are passed through unchanged.

Use `include_array_index` when you need to reconstruct the original order later, such as when reassembling video segments after per-segment scoring.
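
The behavior described in this section can be approximated in a few lines of Python. This is an illustrative sketch, not the stage's actual implementation: the function names (`unwind`, `get_path`, `set_path`) are hypothetical, and it assumes that `output_field` leaves the original array in place and that a preserved empty or null array is emitted as null.

```python
import copy

_MISSING = object()  # sentinel meaning "the path does not exist"


def get_path(doc, path):
    """Return the value at a dot-notation path, or _MISSING if absent."""
    current = doc
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return _MISSING
        current = current[key]
    return current


def set_path(doc, path, value):
    """Set a value at a dot-notation path, creating nested dicts as needed."""
    keys = path.split(".")
    current = doc
    for key in keys[:-1]:
        current = current.setdefault(key, {})
    current[keys[-1]] = value


def unwind(docs, field, preserve_null_and_empty=False,
           include_array_index=None, output_field=None):
    out = []
    for doc in docs:
        value = get_path(doc, field)
        if value is _MISSING or value is None or value == []:
            if preserve_null_and_empty:
                kept = copy.deepcopy(doc)
                if value is not _MISSING:
                    set_path(kept, field, None)  # assumption: kept as null
                out.append(kept)
            continue  # otherwise the document is dropped
        if not isinstance(value, list):
            out.append(copy.deepcopy(doc))  # non-array values pass through
            continue
        for index, element in enumerate(value):
            new_doc = copy.deepcopy(doc)
            # Replace the array, or write to output_field if one is given.
            set_path(new_doc, output_field or field, copy.deepcopy(element))
            if include_array_index:
                set_path(new_doc, include_array_index, index)
            out.append(new_doc)
    return out


docs = [
    {"id": "a", "metadata": {"tags": ["red", "blue"]}},
    {"id": "b", "metadata": {"tags": []}},
    {"id": "c", "metadata": {"tags": "solo"}},
]
result = unwind(docs, "metadata.tags", include_array_index="tag_index")
for doc in result:
    print(doc)
```

With the default settings, document `a` yields two outputs, document `b` is dropped because its array is empty, and document `c` passes through unchanged because its value is not an array.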

Performance

| Metric | Value |
| --- | --- |
| Latency | < 5 ms |
| Memory | Proportional to output count |
| Cost | Free |
| Complexity | O(total array elements) |

Common Pipeline Patterns

Per-Tag Scoring

[
  {
    "stage_type": "filter",
    "stage_id": "feature_search",
    "parameters": {
      "feature_uris": [{"input": {"text": "{{INPUT.query}}"}, "uri": "mixpeek://text_extractor@v1/embedding"}],
      "limit": 50
    }
  },
  {
    "stage_type": "apply",
    "stage_id": "unwind",
    "parameters": {
      "field": "metadata.tags",
      "include_array_index": "tag_index"
    }
  },
  {
    "stage_type": "group",
    "stage_id": "group_by",
    "parameters": {
      "field": "metadata.tags"
    }
  }
]
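
To make the data flow of this pattern concrete, the following Python sketch mimics the unwind and group steps on a small in-memory list. The sample documents, the flattened `tag` key, and the grouping logic are all illustrative assumptions; in the real pipeline the documents come from `feature_search` and grouping is handled by the `group_by` stage.

```python
from collections import defaultdict

# Hypothetical search results; in the pipeline these come from feature_search.
results = [
    {"id": "doc1", "score": 0.9, "metadata": {"tags": ["sports", "news"]}},
    {"id": "doc2", "score": 0.7, "metadata": {"tags": ["news"]}},
    {"id": "doc3", "score": 0.4, "metadata": {"tags": []}},  # dropped by default
]

# Unwind: one row per (document, tag), recording the tag's position.
unwound = [
    {"id": doc["id"], "score": doc["score"], "tag": tag, "tag_index": i}
    for doc in results
    for i, tag in enumerate(doc["metadata"]["tags"])
]

# Group: collect document ids under each tag value.
groups = defaultdict(list)
for row in unwound:
    groups[row["tag"]].append(row["id"])

print(dict(groups))  # {'sports': ['doc1'], 'news': ['doc1', 'doc2']}
```

Note that `doc1` appears under both tags, which is the point of unwinding before grouping: a multi-tag document contributes to every group it belongs to.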

Segment-Level Retrieval

[
  {
    "stage_type": "filter",
    "stage_id": "feature_search",
    "parameters": {
      "feature_uris": [{"input": {"text": "{{INPUT.query}}"}, "uri": "mixpeek://text_extractor@v1/embedding"}],
      "limit": 20
    }
  },
  {
    "stage_type": "apply",
    "stage_id": "unwind",
    "parameters": {
      "field": "content.segments",
      "output_field": "current_segment"
    }
  },
  {
    "stage_type": "sort",
    "stage_id": "rerank",
    "parameters": {
      "inference_name": "baai_bge_reranker_v2_m3",
      "query": "{{INPUT.query}}",
      "document_field": "current_segment"
    }
  }
]
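
The sketch below illustrates the same flow in Python: each segment is copied into `current_segment` and the resulting rows are sorted by a relevance score. The scoring function `overlap_score` is a toy word-overlap stand-in introduced here for illustration only; the actual `rerank` stage calls a reranker model, and the sample documents are invented.

```python
import re


def overlap_score(query, text):
    """Toy stand-in for a reranker: fraction of query words found in the text."""
    query_words = set(re.findall(r"\w+", query.lower()))
    text_words = set(re.findall(r"\w+", text.lower()))
    return len(query_words & text_words) / len(query_words) if query_words else 0.0


# Hypothetical search results with per-video segment transcripts.
videos = [
    {"id": "v1", "content": {"segments": ["intro music", "goal scored in final minute"]}},
    {"id": "v2", "content": {"segments": ["weather report", "final score recap"]}},
]

# Unwind with output_field: copy each segment into `current_segment`
# (assumption: the original `content.segments` array stays on the document).
rows = [
    {**video, "current_segment": segment}
    for video in videos
    for segment in video["content"]["segments"]
]

# Sort segments, not whole videos, by relevance to the query.
query = "final goal"
ranked = sorted(
    rows,
    key=lambda row: overlap_score(query, row["current_segment"]),
    reverse=True,
)
for row in ranked:
    print(row["id"], row["current_segment"])
```

Because scoring runs against `current_segment`, the best-matching segment ranks first regardless of which video it came from.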

Error Handling

| Error | Behavior |
| --- | --- |
| Field path doesn’t exist | Document dropped (or preserved if `preserve_null_and_empty=true`) |
| Field is not an array | Document passed through unchanged |
| Empty array | Document dropped (or preserved with null if `preserve_null_and_empty=true`) |
| Null field value | Same behavior as an empty array |

Related

JSON Transform - Restructure document fields without expansion
Group By - Inverse operation: group documents by field value
Deduplicate - Remove duplicates after expansion
