Retrieve a cluster by ID or name.
Returns cluster metadata including:
REQUIRED: Bearer token authentication using your API key. Format: 'Bearer sk_xxxxxxxxxxxxx'. You can create API keys in the Mixpeek dashboard under Organization Settings.
"Bearer YOUR_API_KEY"
"Bearer YOUR_STRIPE_API_KEY"
REQUIRED: Namespace identifier for scoping this request. All resources (collections, buckets, taxonomies, etc.) are scoped to a namespace. You can provide either the namespace name or namespace ID. Format: ns_xxxxxxxxxxxxx (ID) or a custom name like 'my-namespace'
"ns_abc123def456"
"production"
"my-namespace"
Cluster ID or name
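As a minimal sketch of how the two required request values above might be combined, the helper below builds a headers dictionary. The source only specifies the 'Bearer sk_...' token format and that a namespace name or ID must be supplied; the "X-Namespace" header name and the function name are assumptions made for illustration, so check them against the live API reference before use.

```python
def build_request_headers(api_key: str, namespace: str) -> dict[str, str]:
    """Return headers for a cluster lookup request.

    "Authorization: Bearer <key>" follows the format described above.
    "X-Namespace" is an ASSUMED header name used only for this sketch.
    """
    if not api_key:
        raise ValueError("api_key is required")
    if not namespace:
        raise ValueError("namespace is required")
    return {
        "Authorization": f"Bearer {api_key}",
        # The namespace may be an ID (ns_...) or a custom name.
        "X-Namespace": namespace,
    }


headers = build_request_headers("sk_xxxxxxxxxxxxx", "ns_abc123def456")
```

Validating both values up front keeps a missing key or namespace from surfacing later as an opaque authentication error.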
Successful Response
Cluster job metadata stored in the MongoDB clusters collection.
This is separate from cluster documents themselves. Tracks job-level configuration, status, and summary statistics.
Supports both vector and attribute clustering with appropriate metadata.
Human-readable cluster name
Namespace this cluster belongs to
Organization ID (internal_id)
Source collection IDs that were clustered
Type of clustering: vector (embedding-based) or attribute (metadata-based)
vector, attribute
Unique cluster job identifier
Source bucket IDs that the input collections originated from. Enables bucket lineage tracking.
Optional filters that were applied to pre-filter documents before clustering
Feature URIs that were clustered (mixpeek://{extractor}@{version}/{output}). Only for vector clustering.
Strategy used when multiple features are clustered (concatenate/independent/weighted). Only for vector clustering.
Automatically learned feature weights (when multi_feature_strategy='weighted'). Keys are feature URIs, values are learned weights. Only populated after clustering execution completes.
Clustering quality score from weight learning (e.g., silhouette score). Only populated when multi_feature_strategy='weighted' and weights were learned.
Method for calculating cluster centroids (mean/median/medoid). Only for vector clustering.
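The feature URI format given above, mixpeek://{extractor}@{version}/{output}, can be split into its three parts with a small parser. This is a sketch under the assumption that the extractor name contains no "@" or "/" and the version contains no "/"; the FeatureURI type, the function name, and the sample URI are hypothetical and not part of the API.

```python
import re
from typing import NamedTuple


class FeatureURI(NamedTuple):
    extractor: str
    version: str
    output: str


# Pattern mirrors mixpeek://{extractor}@{version}/{output}.
_FEATURE_URI = re.compile(
    r"^mixpeek://(?P<extractor>[^@/]+)@(?P<version>[^/]+)/(?P<output>.+)$"
)


def parse_feature_uri(uri: str) -> FeatureURI:
    """Split a feature URI into extractor, version, and output."""
    match = _FEATURE_URI.match(uri)
    if match is None:
        raise ValueError(f"not a feature URI: {uri!r}")
    return FeatureURI(**match.groupdict())


# Hypothetical URI, used only to exercise the parser.
parsed = parse_feature_uri("mixpeek://text_extractor@v1/embedding")
```

Because the learned feature weights are keyed by feature URI, a parser like this can help group weights by extractor or output when inspecting a weighted clustering result.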
Attribute field names that were clustered. Only for attribute clustering.
Whether hierarchical clustering was used. Only for attribute clustering.
Method for aggregating attributes (most_frequent/first/last). Only for attribute clustering.
Collection IDs where cluster documents are stored. For single output: list with one collection ID. For per-feature output: list with one collection ID per feature.
Names of output collections. Corresponds to output_collection_ids.
Clustering algorithm used (hdbscan, kmeans, attribute_based, etc.)
Algorithm-specific parameters (not used for attribute_based)
Whether source documents were enriched with cluster_id
Configuration for source enrichment (if enrich_source=True)
{
"field_mappings": [
{
"source_field": "cluster_id",
"target_field": "category_id"
},
{
"source_field": "cluster_label",
"target_field": "category_name"
},
{
"source_field": "distance_to_centroid",
"target_field": "category_confidence"
}
]
}
Configuration for LLM-based cluster labeling (applies to all cluster types)
{
"description": "Text-only labeling with multiple fields",
"enabled": true,
"include_keywords": true,
"include_summary": true,
"labeling_inputs": {
"input_mappings": [
{
"input_key": "title",
"path": "title",
"source_type": "payload"
},
{
"input_key": "description",
"path": "description",
"source_type": "payload"
},
{
"input_key": "text",
"path": "text",
"source_type": "payload"
}
]
},
"model_name": "gpt-4o-mini-2024-07-18",
"provider": "openai"
}
Number of clusters found (excludes noise/outliers, populated after execution)
Total documents processed
Time taken to complete clustering
Whether implicit hierarchy was detected (multi-feature independent) or created (hierarchical attributes)
For child clusters in hierarchy
For parent clusters
Parent-child relationships detected from cluster membership overlap
Cluster job status (propagated from TaskService)
PENDING, IN_PROGRESS, PROCESSING, COMPLETED, COMPLETED_WITH_ERRORS, FAILED, CANCELED, UNKNOWN, SKIPPED, DRAFT, ACTIVE, ARCHIVED, SUSPENDED
Most recent task ID for this cluster
When cluster was created
When cluster was last updated
Last execution timestamp
When clustering completed successfully
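A client that polls this endpoint needs to decide when to stop. The sketch below classifies the status values listed above. Which values count as terminal is not stated in the source, so the grouping here is an assumption, as are the function names.

```python
# ASSUMPTION: these statuses mean the clustering job will not progress further.
TERMINAL_STATUSES = frozenset(
    {"COMPLETED", "COMPLETED_WITH_ERRORS", "FAILED", "CANCELED", "SKIPPED"}
)

# ASSUMPTION: these statuses mean cluster output should be available.
SUCCESS_STATUSES = frozenset({"COMPLETED", "COMPLETED_WITH_ERRORS"})


def is_terminal(status: str) -> bool:
    """Return True when polling can stop for this status."""
    return status in TERMINAL_STATUSES


def has_results(status: str) -> bool:
    """Return True when the job finished and produced clusters."""
    return status in SUCCESS_STATUSES


still_running = not is_terminal("PROCESSING")
```

Treating COMPLETED_WITH_ERRORS as a success state fits the note that LLM labeling failures are recorded separately while fallback labels still apply, but callers may prefer to handle it as a warning.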
List of errors encountered during LLM labeling (if any). Stored in MongoDB cluster metadata only, NOT in Qdrant cluster documents. Used to track LLM failures while allowing fallback labels to work.
Additional user-defined metadata
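The source enrichment example earlier in this response pairs each source_field with a target_field. One plausible reading, assumed here rather than confirmed by the source, is that each value is copied from a cluster document onto the source document under the target name. The function name and sample values below are hypothetical.

```python
def apply_field_mappings(cluster_doc: dict, field_mappings: list[dict]) -> dict:
    """Build the fields to write onto a source document.

    ASSUMPTION: source_field names a key on the cluster document and
    target_field names the key to set on the enriched source document.
    """
    updates = {}
    for mapping in field_mappings:
        source = mapping["source_field"]
        if source in cluster_doc:  # skip mappings with no value to copy
            updates[mapping["target_field"]] = cluster_doc[source]
    return updates


# Mappings copied from the enrichment configuration example.
field_mappings = [
    {"source_field": "cluster_id", "target_field": "category_id"},
    {"source_field": "cluster_label", "target_field": "category_name"},
    {"source_field": "distance_to_centroid", "target_field": "category_confidence"},
]

# Hypothetical cluster document values.
cluster_doc = {"cluster_id": "cl_1", "cluster_label": "Shoes", "distance_to_centroid": 0.12}

enriched = apply_field_mappings(cluster_doc, field_mappings)
```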