Taxonomies specify information you want to add to each document. For each taxonomy field, you choose how it is populated (usually by running a retriever that searches a reference collection). During enrichment, Mixpeek finds the best match and writes the selected fields from the reference document into the target document or the retrieval result.
Overview
Flat vs Hierarchical
Single‑level joins or multi‑level trees with inheritance
Explicit & Implicit
Define nodes manually or infer structure (schema/cluster/LLM)
Execution Modes
Materialized (batch) or On‑Demand (query‑time)
Retriever‑powered
Matching is driven by a configured retriever and input mappings
Quick start
1
Create
Define a flat or hierarchical taxonomy with retriever and input mappings
2
Attach
- Materialize: add to a collection via
taxonomy_applications
- On‑Demand: add a taxonomy join stage in a retriever
3
Run
- Preview with Execute Taxonomy (on‑demand test)
- Or execute the retriever pipeline in production
4
Inspect
Verify enriched fields on documents or in results
What gets copied
- Selected enrichment fields are copied from the matched reference document into your target document or into the retrieval result, depending on execution mode.
- Use
merge_mode: "replace"
to overwrite a scalar or object field on the target document. - Use
merge_mode: "append"
to add values to an array field on the target document.
How it works
1
Define
Create a taxonomy (flat or hierarchical) and specify a retriever plus input mappings
2
Attach
Add to a collection (materialize) or a retriever stage (on‑demand)
3
Execute
Run batch enrichment after extraction, or enrich at query‑time during retrieval
Types and modes
- One source collection
- 1:1 join semantics (best match)
- Copy configured enrichment fields from the reference document into the target document (replace or append)
Create a taxonomy
- API: Create Taxonomy
- Method: POST
- Path:
/v1/taxonomies
- Reference: API Reference
Flat example
Hierarchical (explicit) example
Manage taxonomies
Versions
- Immutable snapshots: Create versioned snapshots for reproducible enrichment
- Manage: [Create Version]/api-reference/taxonomies/create-taxonomy-version, [List Versions]/api-reference/taxonomies/list-taxonomy-versions, [Get Taxonomy]/api-reference/taxonomies/get-taxonomy
- Usage: Pin a specific version when attaching to collections or executing on‑demand to ensure stable outputs; update the reference to roll forward or roll back as needed
Attach and execute
Attach to a collection (materialize)
Includetaxonomy_applications
when creating or updating a collection. The engine materializes enrichment after extraction completes.
Execute on‑demand (test/preview)
Use the execute endpoint to validate configuration and preview enrichment (on‑demand only).- API: Execute Taxonomy
- Method: POST
- Path:
/v1/taxonomies/execute/{taxonomy_identifier}
- Reference: API Reference
Best practices
1
Keep mappings precise
Ensure input mappings point to existing fields and types in target documents
2
Minimize copied fields
Copy only required enrichment fields; prefer
append
for arrays3
Choose the right mode
Use materialize for stable, high‑QPS paths; on‑demand for dynamic or exploratory flows
4
Iterate hierarchies
Start with explicit nodes, then add inference or overrides as needed
FAQ
Flat vs hierarchical – when to choose?
Flat vs hierarchical – when to choose?
Use flat when a single reference collection provides all needed fields. Choose hierarchical when enrichment depends on multi‑level concepts (e.g., Employee → Executive → Board).
Materialized vs on‑demand – which is better?
Materialized vs on‑demand – which is better?
Materialized is ideal for stable, hot paths and high QPS; on‑demand is better for dynamic reference data or exploratory flows where persistence is not required.
Can I mix explicit nodes with inferred structure?
Can I mix explicit nodes with inferred structure?
Yes. Hierarchical taxonomies are hybrid by design: infer a base tree via schema/cluster/LLM, then override or extend with explicit nodes as needed.