Feature URIs
mixpeek://text_extractor@v1/text_embedding
mixpeek://clip_vit_l_14@v1/image_embedding
mixpeek://splade_extractor@v1/splade_vector
mixpeek://colbert_extractor@v1/colbert_embeddings

Feature URIs are referenced when:

- Defining retriever stages (feature_address)
- Configuring taxonomy input mappings
- Building clustering jobs
- Inspecting collection output schemas
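
The example URIs above appear to follow the pattern mixpeek://{extractor}@{version}/{output}. The following is a minimal sketch of splitting and rebuilding such a URI, assuming that pattern holds for every feature. The names FeatureURI, parse_feature_uri, and format_feature_uri are hypothetical helpers for illustration, not part of any Mixpeek SDK.

```python
import re
from typing import NamedTuple


class FeatureURI(NamedTuple):
    """Parts of a feature URI (hypothetical helper type)."""
    extractor: str
    version: str
    output: str


# Assumed pattern, inferred from the example URIs: mixpeek://{extractor}@{version}/{output}
_FEATURE_URI = re.compile(
    r"^mixpeek://(?P<extractor>[A-Za-z0-9_]+)"
    r"@(?P<version>[A-Za-z0-9_.]+)"
    r"/(?P<output>[A-Za-z0-9_]+)$"
)


def parse_feature_uri(uri: str) -> FeatureURI:
    """Split a feature URI into extractor, version, and output name."""
    match = _FEATURE_URI.match(uri)
    if match is None:
        raise ValueError(f"not a feature URI: {uri!r}")
    return FeatureURI(**match.groupdict())


def format_feature_uri(feature: FeatureURI) -> str:
    """Rebuild the URI string from its parts."""
    return f"mixpeek://{feature.extractor}@{feature.version}/{feature.output}"


parsed = parse_feature_uri("mixpeek://text_extractor@v1/text_embedding")
print(parsed.extractor, parsed.version, parsed.output)
```

Keeping the version as a separate field makes it easy to spot which retrievers or jobs still point at an older extractor version.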
Feature Anatomy
Documents store features as regular fields plus vector payloads.

Inspect Available Features

- GET /v1/collections/{collection_id} – returns the deterministic output_schema.
- GET /v1/collections/{collection_id}/features – enumerates feature URIs, dimensions, and metadata.
- GET /v1/feature-extractors – lists available extractors, versions, and output fields.
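
As a rough sketch of calling the three inspection endpoints, the snippet below builds their URLs and fetches JSON with the Python standard library. The base URL (https://api.example.com), the bearer-token header, and the helper names inspection_urls and fetch_json are assumptions for illustration; check the API reference for the real host and authentication scheme.

```python
import json
import urllib.request
from urllib.parse import quote


def inspection_urls(base_url: str, collection_id: str) -> dict:
    """Build the three inspection endpoint URLs for one collection."""
    base = base_url.rstrip("/")
    cid = quote(collection_id, safe="")  # escape characters that are unsafe in a path
    return {
        "schema": f"{base}/v1/collections/{cid}",
        "features": f"{base}/v1/collections/{cid}/features",
        "extractors": f"{base}/v1/feature-extractors",
    }


def fetch_json(url: str, api_key: str):
    """GET a URL and decode the JSON body.

    Assumption: the API accepts a bearer token in the Authorization header.
    """
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {api_key}"}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


# Placeholder host and collection ID; nothing is sent over the network here.
urls = inspection_urls("https://api.example.com", "col_123")
print(urls["features"])
```

Calling fetch_json(urls["features"], api_key) would then return whatever the features endpoint responds with; the response shape is not specified here, so the sketch does not parse it further.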
Feature Types
| Extractor | Feature Type | Typical Use |
|---|---|---|
| text_extractor | Dense embedding (1024–1536 dims) | Semantic text search |
| splade_extractor | Sparse vector (indices + weights) | Lexical / hybrid search |
| colbert_extractor | Multi-vector (per-token) | Late interaction search |
| clip_vit_l_14 | Dense multimodal embedding | Image & text similarity |
| video_extractor | Scene embeddings + metadata | Video retrieval & analytics |
| whisper_large_v3 | Transcription + timestamps | Audio search & diarization |
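
The dense, sparse, and multi-vector shapes in the table are typically compared with different scoring functions. The sketch below shows the textbook versions (cosine similarity, sparse dot product, and ColBERT-style MaxSim late interaction) in plain Python. It illustrates the general techniques only and is not a description of how the Engine scores results; all function names are illustrative.

```python
import math


def dot(a, b):
    """Dot product of two equal-length dense vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity between two dense embeddings."""
    norm = math.sqrt(dot(a, a)) * math.sqrt(dot(b, b))
    return dot(a, b) / norm if norm else 0.0


def sparse_dot(query, doc):
    """Dot product of sparse vectors stored as {index: weight} dicts.

    Only indices present in both vectors contribute to the score.
    """
    return sum(weight * doc[i] for i, weight in query.items() if i in doc)


def maxsim(query_tokens, doc_tokens):
    """Late interaction: for each query token vector, take its best-matching
    document token vector, then sum those maxima."""
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)


print(cosine([1.0, 0.0], [1.0, 0.0]))
print(sparse_dot({1: 2.0, 5: 1.0}, {1: 0.5, 7: 3.0}))
print(maxsim([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.5, 0.5]]))
```

The difference in shape is why each feature type needs its own index and query path: a dense vector is a single fixed-length list, a sparse vector is a set of index and weight pairs, and a multi-vector feature is a list of per-token vectors.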
Working with Feature URIs
Best Practices
- Version carefully – upgrading an extractor version creates new feature URIs. Re-index collections or create new ones for breaking changes.
- Name consistently – stick to canonical URIs in retrievers and enrichment jobs to avoid mismatches.
- Store passthrough metadata – combine features with metadata fields (category, locale) for precise filters and joins.
- Monitor extractor performance – analytics endpoints (/v1/analytics/extractors/performance) help validate throughput and latency.
- Leverage inference caching – repeated calls to the same feature URI benefit from the Engine’s inference cache.
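
To make the versioning and naming advice concrete, here is a small sketch that flags feature URIs referenced in a configuration but not offered by a collection, for example after an extractor upgrade changes the version segment. The helper name unknown_feature_uris and the input lists are hypothetical; in practice the available URIs would come from the collection's features endpoint, whose response shape is not covered here.

```python
def unknown_feature_uris(used, available):
    """Return the URIs in `used` that are missing from `available`, sorted."""
    available_set = set(available)
    return sorted({uri for uri in used if uri not in available_set})


# Hypothetical data: the collection was re-indexed with text_extractor@v2,
# but a retriever stage still references the v1 URI.
available = [
    "mixpeek://text_extractor@v2/text_embedding",
    "mixpeek://clip_vit_l_14@v1/image_embedding",
]
used_in_retriever = [
    "mixpeek://text_extractor@v1/text_embedding",
    "mixpeek://clip_vit_l_14@v1/image_embedding",
]

for uri in unknown_feature_uris(used_in_retriever, available):
    print(f"stale or misspelled feature URI: {uri}")
```

Running a check like this before deploying a retriever or enrichment job catches mismatches early, since a URI that differs only in its version segment is a different feature.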

