Passthrough Extractor

The passthrough extractor copies source fields without any ML processing. By default, all source fields are included. To copy only a subset, list the fields in field_passthrough; include_all_source_fields controls whether every source field is copied. The extractor also supports vector passthrough from one collection to another.

View extractor details at api.mixpeek.com/v1/collections/features/extractors/passthrough_extractor_v1 or fetch programmatically with GET /v1/collections/features/extractors/{feature_extractor_id}.
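As a minimal sketch of the programmatic route, the following Python builds the GET request using only the standard library. The `https://` scheme, the bearer-token `Authorization` header, and the helper names `build_extractor_request` and `fetch_extractor` are assumptions that do not come from this page; check the API reference for the actual authentication scheme.

```python
import json
import urllib.request

BASE_URL = "https://api.mixpeek.com"  # assumption: HTTPS on the host named above


def build_extractor_request(feature_extractor_id: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for one extractor's details."""
    url = f"{BASE_URL}/v1/collections/features/extractors/{feature_extractor_id}"
    # Assumption: bearer-token auth; this page does not document the header.
    headers = {"Authorization": f"Bearer {api_key}", "Accept": "application/json"}
    return urllib.request.Request(url, headers=headers, method="GET")


def fetch_extractor(feature_extractor_id: str, api_key: str) -> dict:
    """Send the request and decode the JSON response body."""
    request = build_extractor_request(feature_extractor_id, api_key)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

Calling `fetch_extractor("passthrough_extractor_v1", api_key)` would then return the extractor description as a dictionary, assuming the endpoint responds with JSON.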

When to Use

Use Case	Description
Metadata propagation	Pass metadata fields from bucket objects to collection documents without transformation
Vector passthrough	Copy pre-computed embeddings from one collection to another
Schema normalization	Select specific fields to include in the output schema
Collection-to-collection pipelines	Route data through multi-tier processing without re-embedding

When NOT to Use

When you need to generate embeddings → Use text_extractor or multimodal_extractor
When you need to transform or enrich data → Use extractors with ML models
When you need to decompose content (chunking, video splitting) → Use appropriate extractors

Input Schema

The passthrough extractor accepts any input type and copies fields as-is.

{
  "type": "object",
  "properties": {},
  "description": "Accepts any fields - no required inputs"
}

Output Schema

The output mirrors the input, filtered according to the configuration:

Configuration	Behavior
Default (`include_all_source_fields: true`)	All source fields copied to output
`field_passthrough` specified	Only listed fields copied
`include_all_source_fields: false`	Must specify `field_passthrough`
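The selection rules in the table can be illustrated with a small local simulation. This is a sketch of the documented behavior, not the extractor's implementation: the function name `select_passthrough_fields`, the use of plain top-level field names, the silent skipping of missing fields, and the `ValueError` are all assumptions made for illustration.

```python
from typing import Any, Optional, Sequence


def select_passthrough_fields(
    source: dict[str, Any],
    field_passthrough: Optional[Sequence[str]] = None,
    include_all_source_fields: bool = True,
) -> dict[str, Any]:
    """Local simulation of the selection rules in the table above."""
    if field_passthrough:
        # Only the listed fields are copied.
        # Assumption: names missing from the source are skipped silently.
        return {name: source[name] for name in field_passthrough if name in source}
    if not include_all_source_fields:
        # Assumption: a missing field list is reported as an error.
        raise ValueError(
            "field_passthrough is required when include_all_source_fields is false"
        )
    # Default: every source field is copied unchanged.
    return dict(source)


doc = {"title": "Intro", "author": "Kim", "embedding": [0.1, 0.2]}
print(select_passthrough_fields(doc))
print(select_passthrough_fields(doc, field_passthrough=["title", "embedding"]))
```

The first call copies all three fields; the second keeps only `title` and `embedding`, which is the pattern for carrying a pre-computed vector forward alongside selected metadata.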

Parameters

The passthrough extractor has no required parameters. Configuration is handled through field_passthrough and input_mappings at the collection level.

Configuration Examples

{
  "feature_extractor": {
    "feature_extractor_name": "passthrough_extractor",
    "version": "v1",
    "input_mappings": {},
    "field_passthrough": [],
    "parameters": {}
  }
}
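When the configuration is generated from code, the same block can be assembled in Python and serialized to JSON. This is a sketch only: `build_passthrough_config` is a hypothetical helper, and because this page does not show the format of individual `field_passthrough` entries, they are passed through exactly as the caller supplies them.

```python
import json
from typing import Any, Optional, Sequence


def build_passthrough_config(
    field_passthrough: Optional[Sequence[Any]] = None,
) -> dict[str, Any]:
    """Return the feature_extractor block shown above as a Python dict."""
    return {
        "feature_extractor": {
            "feature_extractor_name": "passthrough_extractor",
            "version": "v1",
            "input_mappings": {},
            # Entries are kept opaque; their shape is not documented on this page.
            "field_passthrough": list(field_passthrough or []),
            "parameters": {},
        }
    }


# With no arguments the result matches the JSON example above.
print(json.dumps(build_passthrough_config(), indent=2))
```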

Performance & Costs

Metric	Value
Latency	< 1ms
Cost	Free
GPU Required	No
Max Throughput	Unlimited (no ML processing)

Vector Indexes

The passthrough extractor creates no vector indexes. If you pass through existing embeddings, they retain their original index configuration from the source collection.

Best Practices

Use for multi-tier pipelines – When downstream collections need upstream data without reprocessing
Minimize field selection – Only pass through fields you need to reduce storage and query overhead
Preserve lineage – The passthrough extractor maintains root_object_id and source_collection_id for data lineage tracking
Combine with other extractors – Use passthrough fields alongside ML extractors in the same collection to include metadata with generated features
