curl --request POST \
  --url https://api.mixpeek.com/v1/taxonomies/execute/{taxonomy_identifier} \
  --header 'Authorization: <authorization>' \
  --header 'Content-Type: application/json' \
  --header 'X-Namespace: <x-namespace>' \
  --data '
{
  "batch_size": 1000,
  "join_mode": "on_demand",
  "source_collection_id": "col_catalog_v2",
  "target_collection_id": "col_catalog_enriched_v2",
  "taxonomy": {
    "config": {
      "input_mappings": [
        {
          "input_key": "image_vector",
          "path": "features.clip",
          "source_type": "vector"
        }
      ],
      "retriever_id": "ret_clip_v1",
      "source_collection": {
        "collection_id": "col_products_v1"
      },
      "taxonomy_type": "flat"
    },
    "input_mappings": [
      {
        "input_key": "image_vector",
        "path": "features.clip",
        "source_type": "vector"
      }
    ],
    "namespace_id": "ns_123",
    "retriever_id": "ret_clip_v1",
    "taxonomy_name": "product_tags"
  }
}
'
{
  "stats": {
    "processed_docs": 0,
    "batches": 0,
    "errors": 0
  },
  "results": [
    {}
  ]
}
⚠️ VALIDATION ENDPOINT ONLY - Not for production enrichment!
This endpoint validates taxonomy configuration with 1-5 sample documents. Results are returned immediately and NOT persisted to any collection.
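❌ DO NOT USE FOR: enriching an entire collection, batch processing via source_collection_id, or persisting results via target_collection_id.
✅ USE THIS FOR: testing a taxonomy configuration against a few sample documents.
📚 FOR PRODUCTION ENRICHMENT:
Automatic (during ingestion): attach taxonomies to collections via taxonomy_applications.
On-the-fly (during retrieval): add a taxonomy_join stage to retriever pipelines.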
See API documentation for Collections and Retrievers for details.
REQUIRED: Bearer token authentication using your API key. Format: 'Bearer sk_xxxxxxxxxxxxx'. You can create API keys in the Mixpeek dashboard under Organization Settings.
REQUIRED: Namespace identifier for scoping this request. All resources (collections, buckets, taxonomies, etc.) are scoped to a namespace. You can provide either the namespace name or namespace ID. Format: ns_xxxxxxxxxxxxx (ID) or a custom name like 'my-namespace'
Taxonomy ID or name to validate
Optional taxonomy version (defaults to latest)
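The headers and path parameter described above can be assembled in Python as follows. This is a minimal sketch, not an official client: the helper name build_execute_request is hypothetical, the body passed in is only a placeholder, and the optional version parameter is left out because this page does not show its name. Nothing is sent over the network.

```python
import json
from urllib.parse import quote

# Endpoint taken from the curl example on this page.
BASE_URL = "https://api.mixpeek.com/v1/taxonomies/execute"


def build_execute_request(taxonomy_identifier, api_key, namespace, body):
    """Return (url, headers, data) for the validation call without sending it.

    Hypothetical helper for illustration only.
    """
    # The identifier may be a taxonomy ID or a name, so escape it for the path.
    url = f"{BASE_URL}/{quote(taxonomy_identifier, safe='')}"
    headers = {
        "Authorization": f"Bearer {api_key}",  # format: 'Bearer sk_...'
        "Content-Type": "application/json",
        "X-Namespace": namespace,  # namespace ID or custom name
    }
    data = json.dumps(body).encode("utf-8")
    return url, headers, data


url, headers, data = build_execute_request(
    "product_tags", "sk_xxxxxxxxxxxxx", "ns_123", {"join_mode": "on_demand"}
)
print(url)
```

The returned tuple can be handed to whichever HTTP client you already use.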
Request model for on-demand taxonomy validation and testing ONLY.
⚠️ IMPORTANT: This endpoint is ONLY for testing taxonomy configuration with sample documents.
DO NOT USE THIS FOR BATCH ENRICHMENT:
❌ Do NOT use this to enrich an entire collection
❌ Do NOT use source_collection_id expecting batch processing
❌ Do NOT use target_collection_id expecting persistence
HOW TAXONOMY ENRICHMENT ACTUALLY WORKS:
✅ Automatic during ingestion: Attach taxonomies to collections via taxonomy_applications
✅ On-the-fly in retrieval: Add taxonomy_join stage to retriever pipelines
This endpoint validates:
For production enrichment, see:
taxonomy_applications field
taxonomy_join stage for on-the-fly enrichment
Full taxonomy model with configuration (fetched from DB by controller)
A unique name for the taxonomy within the namespace.
Configuration specific to the taxonomy type.
The retriever to use for matching against the source collection.
Input mappings defining how to construct retriever inputs.
Key used in the constructed inputs payload.
Source of the value (payload, literal, vector).
payload, literal, vector
Dot-notation path inside payload/vector when source_type is PAYLOAD or VECTOR.
Static value used when source_type is LITERAL. Overrides any path.
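To make the input mapping fields above concrete, here is a minimal sketch of how a mapping list could be resolved against one sample document. It is an illustration under assumptions, not the service's implementation: the key name value for the static literal is assumed (this page does not name that field), and payload and vector sources are both treated as plain dot-path lookups into the document.

```python
from typing import Any, Dict, List


def get_by_path(data: Dict[str, Any], path: str) -> Any:
    """Follow a dot-notation path such as 'features.clip' through nested dicts."""
    current: Any = data
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None  # the path does not resolve in this document
        current = current[part]
    return current


def build_inputs(mappings: List[Dict[str, Any]], document: Dict[str, Any]) -> Dict[str, Any]:
    """Construct the retriever inputs payload keyed by input_key."""
    inputs: Dict[str, Any] = {}
    for mapping in mappings:
        if mapping["source_type"] == "literal":
            # Assumed field name 'value'; a literal overrides any path.
            inputs[mapping["input_key"]] = mapping.get("value")
        else:  # "payload" or "vector"
            inputs[mapping["input_key"]] = get_by_path(document, mapping["path"])
    return inputs


document = {"features": {"clip": [0.1, 0.2, 0.3]}, "title": "Red shoe"}
mappings = [
    {"input_key": "image_vector", "path": "features.clip", "source_type": "vector"},
    {"input_key": "locale", "path": "title", "source_type": "literal", "value": "en"},
]
inputs = build_inputs(mappings, document)
print(inputs)
```

In the second mapping the path is ignored because the source type is literal, which mirrors the "overrides any path" note above.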
The single source collection for this flat taxonomy.
The ID of the source collection for the taxonomy.
Fields to copy from matched taxonomy node when enriching (append/replace semantics). If omitted, the full payload is copied.
Dot-notation path of the field to copy from the taxonomy node.
Optional target field name in the enriched document. If specified, the source field will be renamed to this name. If not specified, the field_path is used as the target name. Use this to rename fields during enrichment (e.g., label → visual_style).
"visual_style"
Whether to overwrite the target's value or append (for arrays).
replace, append
Discriminator identifying this as a flat taxonomy.
"flat"
{
  "input_mappings": [
    {
      "input_key": "image_vector",
      "path": "features.clip_vit_l_14",
      "source_type": "vector"
    }
  ],
  "retriever_id": "ret_clip_v1",
  "source_collection": {
    "collection_id": "col_products_v1",
    "enrichment_fields": [
      {
        "field_path": "metadata.tags",
        "merge_mode": "append"
      }
    ]
  },
  "taxonomy_type": "flat"
}
Unique identifier for the taxonomy
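The enrichment_fields semantics described earlier for the source collection (field_path, an optional rename, and merge_mode) can be sketched as below. This is a rough illustration under assumptions: target_field is a hypothetical name for the rename option, replace is assumed to be the default merge mode, and "the full payload is copied" is modelled as a shallow update. The real service may behave differently.

```python
import copy


def get_by_path(data, path):
    """Read a dot-notation path from nested dicts, or None if it is absent."""
    current = data
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def set_by_path(data, path, value):
    """Write a dot-notation path, creating intermediate dicts as needed."""
    *parents, leaf = path.split(".")
    current = data
    for part in parents:
        current = current.setdefault(part, {})
    current[leaf] = value


def as_list(value):
    return list(value) if isinstance(value, list) else [value]


def enrich(document, node, enrichment_fields=None):
    """Copy fields from a matched taxonomy node into a copy of the document."""
    enriched = copy.deepcopy(document)
    if not enrichment_fields:
        # No field list given: copy the whole node payload (assumed shallow merge).
        enriched.update(copy.deepcopy(node))
        return enriched
    for field in enrichment_fields:
        value = get_by_path(node, field["field_path"])
        if value is None:
            continue
        # 'target_field' is an assumed name; fall back to field_path.
        target = field.get("target_field") or field["field_path"]
        if field.get("merge_mode", "replace") == "append":
            existing = get_by_path(enriched, target)
            merged = ([] if existing is None else as_list(existing)) + as_list(value)
            set_by_path(enriched, target, merged)
        else:
            set_by_path(enriched, target, copy.deepcopy(value))
    return enriched


document = {"title": "Red shoe", "metadata": {"tags": ["shoe"]}}
node = {"label": "minimal", "metadata": {"tags": ["footwear", "red"]}}
fields = [
    {"field_path": "metadata.tags", "merge_mode": "append"},
    {"field_path": "label", "target_field": "visual_style", "merge_mode": "replace"},
]
enriched = enrich(document, node, fields)
print(enriched)
```

The second field entry reproduces the label to visual_style rename mentioned above.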
Monotonic version number of the taxonomy configuration
x >= 1
Optional human-readable description.
Optional taxonomy-level retriever (prefer per-layer).
Optional taxonomy-level inputs (prefer per-layer).
Key used in the constructed inputs payload.
Source of the value (payload, literal, vector).
payload, literal, vector
Dot-notation path inside payload/vector when source_type is PAYLOAD or VECTOR.
Static value used when source_type is LITERAL. Overrides any path.
Whether the taxonomy is ready for use. False for async inference (cluster/LLM) that needs processing. True for flat/explicit hierarchies.
Creation timestamp for this taxonomy record
Additional user-defined metadata for the taxonomy
Optional retriever configuration override for testing. If omitted, uses the retriever configured in the taxonomy.
Name of the retriever
Input schema for the retriever
Schema properties for retriever inputs
Schema field definition for bucket objects.
Supported data types for bucket schema fields.
Types fall into two categories:
Explicit Types (Type-Safe):
Automatic Type (Flexible):
string, number, integer, float, boolean, object, array, date, datetime, json, file, text, image, audio, video, pdf, document, spreadsheet, presentation, dense_vector, sparse_vector, int8_vector, automatic
OPTIONAL. List of example values for this field. Used by Apps to show example inputs in the UI. Provide multiple diverse examples when possible.
List of collection IDs
List of stage configurations
Stage implementation ID (overrides stage_name for lookups)
Stage parameters
Filters to apply to the documents before this stage is executed. These filters are combined with any global retriever filters.
Logical AND operation - all conditions must be true
Represents a single filter condition.
Attributes:
field: The field to filter on
operator: The comparison operator
value: The value to compare against
Field name to filter on
Comparison operator
eq, ne, gt, lt, gte, lte, in, nin, contains, starts_with, ends_with, regex, exists, is_null, text
[
  {
    "field": "name",
    "operator": "eq",
    "value": "John"
  },
  {
    "field": "age",
    "operator": "gte",
    "value": 30
  }
]
Logical OR operation - at least one condition must be true
Represents a single filter condition.
Attributes:
field: The field to filter on
operator: The comparison operator
value: The value to compare against
Field name to filter on
Comparison operator
eq, ne, gt, lt, gte, lte, in, nin, contains, starts_with, ends_with, regex, exists, is_null, text
[
  {
    "field": "status",
    "operator": "eq",
    "value": "active"
  },
  {
    "field": "role",
    "operator": "eq",
    "value": "admin"
  }
]
Logical NOT operation - all conditions must be false
Represents a single filter condition.
Attributes:
field: The field to filter on
operator: The comparison operator
value: The value to compare against
Field name to filter on
Comparison operator
eq, ne, gt, lt, gte, lte, in, nin, contains, starts_with, ends_with, regex, exists, is_null, text
[
  {
    "field": "department",
    "operator": "eq",
    "value": "HR"
  },
  {
    "field": "location",
    "operator": "eq",
    "value": "remote"
  }
]
Whether to perform case-sensitive matching
true
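The AND / OR / NOT filter groups described above can be read as the following evaluation rule. This is a minimal local sketch for reasoning about a filter, not the service's code: the dictionary keys AND, OR, NOT and case_sensitive are assumed names, only a subset of the listed operators is covered, fields are looked up at the top level only, and a missing field is assumed never to match.

```python
import operator

# Subset of the operators listed above; regex, exists, is_null and text are omitted.
OPS = {
    "eq": operator.eq,
    "ne": operator.ne,
    "gt": operator.gt,
    "lt": operator.lt,
    "gte": operator.ge,
    "lte": operator.le,
    "in": lambda actual, expected: actual in expected,
    "nin": lambda actual, expected: actual not in expected,
    "contains": lambda actual, expected: expected in actual,
    "starts_with": lambda actual, expected: actual.startswith(expected),
    "ends_with": lambda actual, expected: actual.endswith(expected),
}


def fold(value, case_sensitive):
    """Lower-case strings (and strings inside lists) when matching is case-insensitive."""
    if case_sensitive:
        return value
    if isinstance(value, str):
        return value.lower()
    if isinstance(value, list):
        return [fold(item, False) for item in value]
    return value


def condition_matches(condition, doc, case_sensitive):
    if condition["field"] not in doc:
        return False  # assumption: a missing field never matches
    actual = fold(doc[condition["field"]], case_sensitive)
    expected = fold(condition["value"], case_sensitive)
    return OPS[condition["operator"]](actual, expected)


def passes(filters, doc):
    """AND: all true. OR: at least one true. NOT: all false."""
    case_sensitive = filters.get("case_sensitive", True)

    def check(condition):
        return condition_matches(condition, doc, case_sensitive)

    and_ok = all(check(c) for c in filters.get("AND", []))
    or_ok = not filters.get("OR") or any(check(c) for c in filters["OR"])
    not_ok = not any(check(c) for c in filters.get("NOT", []))
    return and_ok and or_ok and not_ok


filters = {
    "AND": [
        {"field": "name", "operator": "eq", "value": "John"},
        {"field": "age", "operator": "gte", "value": 30},
    ],
    "OR": [
        {"field": "status", "operator": "eq", "value": "active"},
        {"field": "role", "operator": "eq", "value": "admin"},
    ],
    "NOT": [
        {"field": "department", "operator": "eq", "value": "HR"},
        {"field": "location", "operator": "eq", "value": "remote"},
    ],
}
doc = {"name": "John", "age": 34, "status": "inactive", "role": "admin",
       "department": "Sales", "location": "office"}
print(passes(filters, doc))
```

Changing location to "remote" in the sample document makes the NOT group fail, because one of its conditions becomes true.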
Filters to apply to the documents after this stage is executed. These filters are applied to the results of this stage before passing to the next.
Logical AND operation - all conditions must be true
Represents a single filter condition.
Attributes:
field: The field to filter on
operator: The comparison operator
value: The value to compare against
Field name to filter on
Comparison operator
eq, ne, gt, lt, gte, lte, in, nin, contains, starts_with, ends_with, regex, exists, is_null, text
[
  {
    "field": "name",
    "operator": "eq",
    "value": "John"
  },
  {
    "field": "age",
    "operator": "gte",
    "value": 30
  }
]
Logical OR operation - at least one condition must be true
Represents a single filter condition.
Attributes:
field: The field to filter on
operator: The comparison operator
value: The value to compare against
Field name to filter on
Comparison operator
eq, ne, gt, lt, gte, lte, in, nin, contains, starts_with, ends_with, regex, exists, is_null, text
[
  {
    "field": "status",
    "operator": "eq",
    "value": "active"
  },
  {
    "field": "role",
    "operator": "eq",
    "value": "admin"
  }
]
Logical NOT operation - all conditions must be false
Represents a single filter condition.
Attributes:
field: The field to filter on
operator: The comparison operator
value: The value to compare against
Field name to filter on
Comparison operator
eq, ne, gt, lt, gte, lte, in, nin, contains, starts_with, ends_with, regex, exists, is_null, text
[
  {
    "field": "department",
    "operator": "eq",
    "value": "HR"
  },
  {
    "field": "location",
    "operator": "eq",
    "value": "remote"
  }
]
Whether to perform case-sensitive matching
true
Performance statistics for this stage
Average execution time in milliseconds
Number of times executed
Number of errors encountered
Last time this stage was executed
Unique identifier for the retriever
Description of the retriever
Cache configuration for this retriever. If not provided, caching is disabled.
Whether caching is enabled for this retriever
Time-to-live for cached results in seconds. Default: 1 hour
x >= 0
List of stage names to cache results after. Stage names must match the stage_name field in the retriever's stages. If not specified, caches the final results after all stages. Examples: ['semantic_search'], ['semantic_search', 'rerank']
Fields to exclude from caching (e.g., PII fields)
Cache performance statistics
Number of cache hits
Number of cache misses
Cache hit rate (0.0 - 1.0)
0 <= x <= 1
Total size of cached data in bytes
Number of entries in cache
When the cache was last invalidated
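As a way to picture how the cache settings and statistics above relate, here is a small in-memory sketch. Everything in it is hypothetical: the class StageCache, the parameter names ttl_seconds and exclude_fields, and the reading of "fields to exclude from caching" as fields stripped from stored results are assumptions, not the service's behaviour. The hit rate is computed as hits / (hits + misses).

```python
import time


class StageCache:
    """Toy TTL cache that tracks hits and misses (illustrative only)."""

    def __init__(self, ttl_seconds=3600, exclude_fields=()):
        self.ttl_seconds = ttl_seconds  # default mirrors the 1 hour noted above
        self.exclude_fields = set(exclude_fields)
        self.hits = 0
        self.misses = 0
        self._entries = {}  # key -> (expires_at, results)

    def get(self, key, now=None):
        now = time.monotonic() if now is None else now
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            self.hits += 1
            return entry[1]
        self._entries.pop(key, None)  # drop an expired entry, if any
        self.misses += 1
        return None

    def put(self, key, results, now=None):
        now = time.monotonic() if now is None else now
        # Strip excluded fields (for example PII) before storing.
        cleaned = [
            {k: v for k, v in row.items() if k not in self.exclude_fields}
            for row in results
        ]
        self._entries[key] = (now + self.ttl_seconds, cleaned)

    @property
    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


cache = StageCache(ttl_seconds=10, exclude_fields={"email"})
first = cache.get("query-1", now=0)    # miss: nothing stored yet
cache.put("query-1", [{"id": 1, "email": "a@example.com"}], now=0)
second = cache.get("query-1", now=5)   # hit: within the TTL
third = cache.get("query-1", now=11)   # miss: entry expired
print(cache.hits, cache.misses, round(cache.hit_rate, 2))
```

The explicit now argument exists only so the example is deterministic; real use would rely on the clock.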
When the retriever was created
When the retriever was last modified
When the retriever was last executed
Whether the retriever is enabled (can be toggled on/off)
Current operational status
active, draft, disabled, error
Usage and performance statistics
Total number of queries executed
Number of queries in the last 24 hours
Average latency in milliseconds
Error rate as a fraction (0.0 - 1.0)
0 <= x <= 1
Most recent error message for debugging
Cache hit rate if caching is enabled (0.0 - 1.0)
0 <= x <= 1
Expanded collection details with names and metadata
Collection identifier
Human-readable collection name
Number of documents in the collection
Whether the collection is active
When the collection was last indexed
Custom key-value metadata
Tags for organization and filtering
Version number (increments on each update)
History of changes (optional, last N changes)
Version number
When this version was created
User who made the change
Description of changes made
Health status and diagnostics
Sample documents to test enrichment (typically 1-5 docs). Results are returned immediately, not persisted. ⚠️ Do NOT pass collection_id expecting batch processing!
⚠️ IGNORED IN ON_DEMAND MODE. This field exists for legacy compatibility only. To enrich collections, use taxonomy_applications on the collection.
⚠️ IGNORED IN ON_DEMAND MODE. This field exists for legacy compatibility only. Results are never persisted via this endpoint.
Must be 'on_demand'. BATCH mode is NOT supported via API. Batch enrichment is automatic (triggered by engine during ingestion).
on_demand, batch
Batch size for the scroll iterator
1 <= x <= 10000
Additional filters applied to the source collection prior to enrichment.
Logical AND operation - all conditions must be true
Represents a single filter condition.
Attributes:
field: The field to filter on
operator: The comparison operator
value: The value to compare against
Field name to filter on
Comparison operator
eq, ne, gt, lt, gte, lte, in, nin, contains, starts_with, ends_with, regex, exists, is_null, text
[
  {
    "field": "name",
    "operator": "eq",
    "value": "John"
  },
  {
    "field": "age",
    "operator": "gte",
    "value": 30
  }
]
Logical OR operation - at least one condition must be true
Represents a single filter condition.
Attributes:
field: The field to filter on
operator: The comparison operator
value: The value to compare against
Field name to filter on
Comparison operator
eq, ne, gt, lt, gte, lte, in, nin, contains, starts_with, ends_with, regex, exists, is_null, text
[
  {
    "field": "status",
    "operator": "eq",
    "value": "active"
  },
  {
    "field": "role",
    "operator": "eq",
    "value": "admin"
  }
]
Logical NOT operation - all conditions must be false
Represents a single filter condition.
Attributes:
field: The field to filter on
operator: The comparison operator
value: The value to compare against
Field name to filter on
Comparison operator
eq, ne, gt, lt, gte, lte, in, nin, contains, starts_with, ends_with, regex, exists, is_null, text
[
  {
    "field": "department",
    "operator": "eq",
    "value": "HR"
  },
  {
    "field": "location",
    "operator": "eq",
    "value": "remote"
  }
]
Whether to perform case-sensitive matching
true
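The request-level rules stated on this page (join_mode must be on_demand, batch_size must fall between 1 and 10000, and source_collection_id and target_collection_id are ignored) can be checked client-side before sending a request. The function below is a hypothetical pre-flight check written for illustration; it is not part of any Mixpeek SDK, and it only flags join_mode or batch_size when they are present, since this page does not say whether they are required.

```python
IGNORED_FIELDS = ("source_collection_id", "target_collection_id")


def check_execute_request(body):
    """Return (errors, warnings) for a request body, based on the rules above."""
    errors, warnings = [], []

    join_mode = body.get("join_mode")
    if join_mode is not None and join_mode != "on_demand":
        errors.append("join_mode must be 'on_demand'; batch mode is not supported via API")

    batch_size = body.get("batch_size")
    if batch_size is not None:
        if not isinstance(batch_size, int) or not 1 <= batch_size <= 10000:
            errors.append("batch_size must be an integer between 1 and 10000")

    for field in IGNORED_FIELDS:
        if field in body:
            warnings.append(f"{field} is ignored in on_demand mode")

    return errors, warnings


# Body fields taken from the curl example at the top of this page.
example_body = {
    "batch_size": 1000,
    "join_mode": "on_demand",
    "source_collection_id": "col_catalog_v2",
    "target_collection_id": "col_catalog_enriched_v2",
}
errors, warnings = check_execute_request(example_body)
print(errors)
print(warnings)
```

Run against the page's own curl body, the check reports no errors but two warnings, since both collection ID fields are ignored by this endpoint.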