Passthrough extractor showing direct field copy from source to output
The passthrough extractor copies source fields to the output without any ML processing. By default, all source fields are included. Use field_passthrough to select specific fields, or set include_all_source_fields to control whether every source field is copied. The extractor also supports passing vectors through from one collection to another.
View extractor details at api.mixpeek.com/v1/collections/features/extractors/passthrough_extractor_v1 or fetch programmatically with GET /v1/collections/features/extractors/{feature_extractor_id}.
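As a rough illustration of the programmatic route, the sketch below builds the GET request for this extractor using only the Python standard library. The https scheme, the bearer-token header, and the helper name build_extractor_request are assumptions made for the example; this page does not show them, so check the API reference for the actual authentication scheme.

```python
import urllib.request

# Scheme is assumed; the page lists the host without one.
BASE_URL = "https://api.mixpeek.com"


def build_extractor_request(feature_extractor_id: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for one extractor's details."""
    url = f"{BASE_URL}/v1/collections/features/extractors/{feature_extractor_id}"
    # Bearer-token auth is an assumption, not something this page confirms.
    headers = {"Authorization": f"Bearer {api_key}", "Accept": "application/json"}
    return urllib.request.Request(url, headers=headers, method="GET")


request = build_extractor_request("passthrough_extractor_v1", api_key="YOUR_API_KEY")
print(request.get_method(), request.full_url)
```

To send the request, pass it to urllib.request.urlopen and decode the response body as JSON.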

When to Use

| Use Case | Description |
| --- | --- |
| Metadata propagation | Pass metadata fields from bucket objects to collection documents without transformation |
| Vector passthrough | Copy pre-computed embeddings from one collection to another |
| Schema normalization | Select specific fields to include in the output schema |
| Collection-to-collection pipelines | Route data through multi-tier processing without re-embedding |

When NOT to Use

  • When you need to generate embeddings → Use text_extractor or multimodal_extractor
  • When you need to transform or enrich data → Use extractors with ML models
  • When you need to decompose content (chunking, video splitting) → Use appropriate extractors

Input Schema

The passthrough extractor accepts any input type and copies fields as-is.
{
  "type": "object",
  "properties": {},
  "description": "Accepts any fields - no required inputs"
}

Output Schema

The output mirrors the input based on configuration:
| Configuration | Behavior |
| --- | --- |
| Default (include_all_source_fields: true) | All source fields copied to output |
| field_passthrough specified | Only listed fields copied |
| include_all_source_fields: false | Must specify field_passthrough |
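The three configurations listed above can be modeled locally as a small selection function. This is a minimal sketch of the described behavior, not Mixpeek's implementation: the function name select_output_fields is hypothetical, field names are assumed to be plain top-level keys, and silently skipping listed fields that are missing from the source is a guess.

```python
from typing import Any, Optional


def select_output_fields(
    source: dict[str, Any],
    field_passthrough: Optional[list[str]] = None,
    include_all_source_fields: bool = True,
) -> dict[str, Any]:
    """Local model of the configuration rules above; not Mixpeek code."""
    if field_passthrough:
        # Explicit list given: only the listed fields are copied.
        # Skipping fields absent from the source is an assumption.
        return {name: source[name] for name in field_passthrough if name in source}
    if include_all_source_fields:
        # Default: every source field is copied unchanged.
        return dict(source)
    # include_all_source_fields is False and no fields are listed.
    raise ValueError("field_passthrough is required when include_all_source_fields is false")


document = {"title": "Q3 report", "author": "dana", "embedding": [0.1, 0.2]}

print(select_output_fields(document))
print(select_output_fields(document, field_passthrough=["title", "embedding"]))
```

The first call copies the whole document; the second keeps only title and embedding, which is the shape of a vector passthrough that drops unneeded metadata.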

Parameters

The passthrough extractor has no required parameters. Configuration is handled through field_passthrough and input_mappings at the collection level.

Configuration Examples

{
  "feature_extractor": {
    "feature_extractor_name": "passthrough_extractor",
    "version": "v1",
    "input_mappings": {},
    "field_passthrough": [],
    "parameters": {}
  }
}
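The configuration above leaves field_passthrough empty. For a variant that selects specific fields, the payload could be assembled as in the sketch below. The helper passthrough_config and the field names are hypothetical, and representing each field_passthrough entry as a plain string is an assumption, since this page does not show the entry format.

```python
import json


def passthrough_config(fields: list[str]) -> dict:
    """Return a feature_extractor block shaped like the example above."""
    return {
        "feature_extractor": {
            "feature_extractor_name": "passthrough_extractor",
            "version": "v1",
            "input_mappings": {},
            # Entry format (plain field names) is assumed, not documented here.
            "field_passthrough": list(fields),
            "parameters": {},
        }
    }


config = passthrough_config(["title", "author"])
print(json.dumps(config, indent=2))
```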

Performance & Costs

| Metric | Value |
| --- | --- |
| Latency | < 1 ms |
| Cost | Free |
| GPU Required | No |
| Max Throughput | Unlimited (no ML processing) |

Vector Indexes

The passthrough extractor creates no vector indexes. If you pass through existing embeddings, they retain their original index configuration from the source collection.

Best Practices

  1. Use for multi-tier pipelines – When downstream collections need upstream data without reprocessing
  2. Minimize field selection – Only pass through fields you need to reduce storage and query overhead
  3. Preserve lineage – The passthrough extractor maintains root_object_id and source_collection_id for data lineage tracking
  4. Combine with other extractors – Use passthrough fields alongside ML extractors in the same collection to include metadata with generated features