This endpoint partially updates a cluster (PATCH operation). Only provided fields will be updated. At minimum, metadata can always be updated. Immutable fields like cluster_id, status, and computed fields cannot be modified.
REQUIRED: Bearer token authentication using your API key. Format: 'Bearer sk_xxxxxxxxxxxxx'. You can create API keys in the Mixpeek dashboard under Organization Settings.
"Bearer YOUR_API_KEY"
"Bearer YOUR_STRIPE_API_KEY"
REQUIRED: Namespace identifier for scoping this request. All resources (collections, buckets, taxonomies, etc.) are scoped to a namespace. You can provide either the namespace name or namespace ID. Format: ns_xxxxxxxxxxxxx (ID) or a custom name like 'my-namespace'
"ns_abc123def456"
"production"
"my-namespace"
Cluster ID or name
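The request parameters above can be put together as in the following minimal sketch. It only builds the request and does not send it. The base URL, route, and namespace header name are assumptions made for illustration (this reference does not state them), and `build_patch_request` is a hypothetical helper, not part of any Mixpeek SDK.

```python
import json
import urllib.request
from urllib.parse import quote

# Assumptions (not stated in this reference): base URL, route, header name.
BASE_URL = "https://api.mixpeek.com"        # hypothetical base URL
CLUSTER_PATH = "/v1/clusters/{identifier}"  # hypothetical route
NAMESPACE_HEADER = "X-Namespace"            # hypothetical header name

# Fields this reference names as immutable; computed fields are not listed here.
IMMUTABLE_FIELDS = {"cluster_id", "status"}


def build_patch_request(api_key, namespace, cluster, updates):
    """Build (but do not send) a PATCH request carrying only the fields to change."""
    blocked = IMMUTABLE_FIELDS.intersection(updates)
    if blocked:
        raise ValueError(f"immutable fields cannot be updated: {sorted(blocked)}")
    # The cluster may be addressed by ID or by name, so escape it for the path.
    url = BASE_URL + CLUSTER_PATH.format(identifier=quote(cluster, safe=""))
    return urllib.request.Request(
        url,
        data=json.dumps(updates).encode("utf-8"),
        method="PATCH",
        headers={
            "Authorization": f"Bearer {api_key}",
            NAMESPACE_HEADER: namespace,
            "Content-Type": "application/json",
        },
    )


# Update only the user-defined metadata; all other fields are left untouched.
req = build_patch_request(
    "YOUR_API_KEY", "my-namespace", "my-cluster", {"metadata": {"owner": "search-team"}}
)
```

Passing `req` to `urllib.request.urlopen` would send it. Rejecting `cluster_id` and `status` on the client side is a convenience in this sketch; the server remains the authority on which fields are immutable.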
Successful Response
Cluster metadata stored in MongoDB.
Collections to cluster together
Optional human-friendly name for the clustering job
Vector or attribute clustering
vector, attribute
Required when cluster_type is 'vector'
{
"algorithm_params": { "min_cluster_size": 10, "min_samples": 5 },
"clustering_method": "hdbscan",
"description": "HDBSCAN clustering with multimodal embeddings",
"feature_uri": "mixpeek://multimodal_extractor@v1/vertex_multimodal_embedding",
"sample_size": 1000
}
Required when cluster_type is 'attribute'
{
"attributes": ["category"],
"description": "Simple category clustering",
"hierarchical_grouping": false
}
Optional filters to pre-filter documents before clustering (same format as list documents). Applied during Qdrant scroll before parquet export. Useful for clustering subsets like: status='active', category='electronics', etc.
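The cluster_type rule above (one configuration block is required for 'vector', another for 'attribute') can be checked on the client before sending an update. This is a minimal sketch: the field names `vector_config` and `attribute_config` are assumptions, since this reference does not name the two configuration fields, and `missing_config` is a hypothetical helper.

```python
# Hypothetical field names for the two configuration blocks.
REQUIRED_CONFIG = {"vector": "vector_config", "attribute": "attribute_config"}


def missing_config(cluster):
    """Return the name of the required config block if it is absent, else None."""
    cluster_type = cluster.get("cluster_type")
    if cluster_type not in REQUIRED_CONFIG:
        raise ValueError(f"cluster_type must be one of {sorted(REQUIRED_CONFIG)}")
    key = REQUIRED_CONFIG[cluster_type]
    return None if cluster.get(key) else key


vector_cluster = {
    "cluster_type": "vector",
    "vector_config": {"clustering_method": "hdbscan", "sample_size": 1000},
}
attribute_cluster = {"cluster_type": "attribute"}  # config block left out on purpose
```

Here `missing_config(vector_cluster)` returns `None`, while `missing_config(attribute_cluster)` returns the name of the block that still has to be supplied.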
Optional configuration for LLM-based cluster labeling. When provided with enabled=True, clusters will have semantic labels generated by LLM instead of generic labels like 'Cluster 0'. When not provided or enabled=False, uses fallback labels.
{
"description": "Text-only labeling with multiple fields",
"enabled": true,
"include_keywords": true,
"include_summary": true,
"labeling_inputs": {
"input_mappings": [
{
"input_key": "title",
"path": "title",
"source_type": "payload"
},
{
"input_key": "description",
"path": "description",
"source_type": "payload"
},
{
"input_key": "text",
"path": "text",
"source_type": "payload"
}
]
},
"model_name": "gpt-4o-mini-2024-07-18",
"provider": "openai"
}
If True, cluster results are written back to source collection(s) in-place instead of creating new output collections. Documents will be enriched with cluster_id, cluster_label, distance_to_centroid, and optionally other metadata. Similar to taxonomy enrichment pattern.
Configuration for source collection enrichment (only used if enrich_source_collection=True). Controls which fields are added to source documents and field naming conventions.
{
"field_mappings": [
{
"source_field": "cluster_id",
"target_field": "category_id"
},
{
"source_field": "cluster_label",
"target_field": "category_name"
},
{
"source_field": "distance_to_centroid",
"target_field": "category_confidence"
}
]
}
Unique cluster identifier
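To illustrate the field_mappings example in the source enrichment configuration above, the sketch below renames cluster result fields to their target names before they are merged into a source document. It shows the idea only; how the service applies mappings internally is not described in this reference, and keeping unmapped fields under their original names is an assumption. `apply_field_mappings` is a hypothetical helper.

```python
def apply_field_mappings(cluster_fields, field_mappings):
    """Rename cluster result fields according to source_field -> target_field."""
    rename = {m["source_field"]: m["target_field"] for m in field_mappings}
    # Assumption: fields without a mapping keep their original names.
    return {rename.get(name, name): value for name, value in cluster_fields.items()}


field_mappings = [
    {"source_field": "cluster_id", "target_field": "category_id"},
    {"source_field": "cluster_label", "target_field": "category_name"},
    {"source_field": "distance_to_centroid", "target_field": "category_confidence"},
]

# Illustrative values only; not real output from the service.
cluster_fields = {
    "cluster_id": "cl_0",
    "cluster_label": "Shoes",
    "distance_to_centroid": 0.12,
}

enriched = apply_field_mappings(cluster_fields, field_mappings)
```

With these mappings, the enriched fields are stored as `category_id`, `category_name`, and `category_confidence` rather than under the default cluster field names.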
S3 path to parquet files with cluster data
S3 key to members.parquet (if saved)
Number of clusters found
Clustering quality metrics
Clustering job status
PENDING, IN_PROGRESS, PROCESSING, COMPLETED, COMPLETED_WITH_ERRORS, FAILED, CANCELED, UNKNOWN, SKIPPED, DRAFT, ACTIVE, ARCHIVED, SUSPENDED
Associated task ID for clustering job
Run ID of the most recent successful clustering execution. Used to retrieve execution results.
When the cluster was created
When the cluster was last updated
Additional user-defined metadata for the cluster
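A client reading the response may want to decide whether the clustering job has finished. The sketch below uses the status values listed above. Which of them count as terminal is an assumption, since this reference does not say; the response field name `status` and the helper `is_finished` are also illustrative.

```python
# Status values as listed in this reference.
STATUSES = (
    "PENDING", "IN_PROGRESS", "PROCESSING", "COMPLETED", "COMPLETED_WITH_ERRORS",
    "FAILED", "CANCELED", "UNKNOWN", "SKIPPED", "DRAFT", "ACTIVE", "ARCHIVED",
    "SUSPENDED",
)

# Assumption: these values mean the job will not change state again.
TERMINAL = frozenset({"COMPLETED", "COMPLETED_WITH_ERRORS", "FAILED", "CANCELED"})


def is_finished(response):
    """Return True if the response's status is one of the assumed terminal values."""
    status = response["status"]
    if status not in STATUSES:
        raise ValueError(f"unrecognized status: {status!r}")
    return status in TERMINAL


# Illustrative response fragments; not real output from the service.
done = {"status": "COMPLETED", "metadata": {"owner": "search-team"}}
running = {"status": "PROCESSING"}
```

A polling loop could call `is_finished` on each response and stop once it returns `True`, then use the run ID of the most recent successful execution to retrieve results.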