Feature extractors transform raw content from buckets or collections into features stored inside documents. Availability varies by account; see the public catalog and contact support to enable additional extractors as needed.
Overview
- What they do: Run models to generate vectors and structured fields inside documents.
- Where they run: As part of ingestion pipelines for a target collection.
- Outputs: Vectors (dense, sparse, multi) and payload fields, with index definitions applied by the collection.
Discover extractors
- Browse catalog: mixpeek.com/extractors
- Availability: Some extractors may require enablement. Contact support: mixpeek.com/contact
- API: Use List Feature Extractors to view available extractors.
Configure in a collection
Attach extractors when creating a collection. Each entry declares the extractor name and version (plus optional parameters and mappings).
- Extractor outputs determine feature field names and vector index requirements for the collection.
- Use Describe Collection Features to see resolved addresses and metadata.
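As a rough illustration of the configuration described above, the sketch below builds a collection-creation payload with one extractor entry. It only assembles and validates a dictionary; it does not call the Mixpeek API. All key names (`feature_extractors`, `feature_extractor_name`, `field_mappings`, `source_bucket`), the extractor name `text_embedder`, and the parameter names are hypothetical placeholders, so check the API reference for the actual schema.

```python
import json


def extractor_entry(name, version, parameters=None, field_mappings=None):
    """Build one extractor declaration. Key names are illustrative, not the real schema."""
    if not name or not version:
        # Each entry declares the extractor name and version.
        raise ValueError("extractor entries need both a name and a version")
    entry = {"feature_extractor_name": name, "version": version}
    if parameters:
        entry["parameters"] = parameters          # optional extractor parameters
    if field_mappings:
        entry["field_mappings"] = field_mappings  # optional input field mappings
    return entry


def collection_payload(collection_name, source_bucket, extractors):
    """Assemble a hypothetical request body for creating a collection."""
    return {
        "collection_name": collection_name,
        "source_bucket": source_bucket,
        "feature_extractors": list(extractors),
    }


payload = collection_payload(
    "product_docs",
    "raw_uploads",
    [
        extractor_entry(
            "text_embedder",                       # hypothetical extractor name
            "v1",
            parameters={"chunk_size": 512},        # hypothetical parameter
            field_mappings={"text": "body"},       # hypothetical mapping
        )
    ],
)

print(json.dumps(payload, indent=2))
```

Keeping the version explicit in every entry makes it easier to see later which extractor version produced a given set of features.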
Behavior & availability
- Account‑dependent: Certain extractors are not enabled by default; request access if needed.
- Versioned: Changing model versions typically requires reprocessing to keep features consistent.
- Indexes: Required vector/payload indexes are applied by the engine based on extractor outputs.
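One way to reason about the versioning point above is to compare the extractor versions that produced the stored features against the versions currently configured. The helper below is a minimal client-side sketch of that comparison. The function name, the extractor names, and the idea of tracking versions in plain dictionaries are assumptions made for illustration, not part of the platform.

```python
def extractors_needing_reprocess(stored_versions, configured_versions):
    """Return extractor names whose configured version differs from the
    version that produced the features already stored in documents.

    Both arguments map extractor name -> version string (hypothetical shape).
    """
    return sorted(
        name
        for name, version in configured_versions.items()
        if name in stored_versions and stored_versions[name] != version
    )


# Hypothetical extractor names and versions.
stored = {"text_embedder": "v1", "image_embedder": "v2"}
configured = {"text_embedder": "v2", "image_embedder": "v2"}

stale = extractors_needing_reprocess(stored, configured)
print(stale)
```

Extractors that are newly added (present in the configuration but absent from the stored set) are deliberately not reported here, since they have no existing features that could become inconsistent.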
Used by
- Collections: Store the produced features in documents (Collections).
- Retrievers: Search across vectors and payloads (Retrievers).
- Taxonomies: Join and enrich using feature fields (Taxonomies).
- Clusters: Build similarity groups over feature vectors (Clusters).
Manage and inspect
- Describe Collection Features to see addresses and indexes.
- List Feature Extractors to view available extractors.