Authorizations
Bearer token authentication using your API key. Format: 'Bearer your_api_key'. To get an API key, create an account at mixpeek.com/start and generate a key in your account settings.
Headers
REQUIRED: Bearer token authentication using your API key. Format: 'Bearer sk_xxxxxxxxxxxxx'. You can create API keys in the Mixpeek dashboard under Organization Settings.
"Bearer sk_live_abc123def456"
"Bearer sk_test_xyz789"
REQUIRED: Namespace identifier for scoping this request. All resources (collections, buckets, taxonomies, etc.) are scoped to a namespace. You can provide either the namespace ID or the namespace name. Format: ns_xxxxxxxxxxxxx (ID) or a custom name such as 'my-namespace'.
"ns_abc123def456"
"production"
"my-namespace"
Body
Request to execute clustering on one or more collections.
IDs of the collections to cluster together
Clustering configuration including algorithm and parameters
{
"algorithm": "kmeans",
"algorithm_params": {
"max_iter": 300,
"n_clusters": 5,
"random_state": 42
},
"description": "Vector-based clustering with K-means",
"feature_vector": {
"feature_address": "mixpeek://text_extractor@v1/text_extractor_v1_embedding"
},
"llm_labeling": {
"enabled": true,
"model_name": "gpt-4o-mini",
"provider": "openai"
},
"normalize_features": true
}
{
"algorithm": "hdbscan",
"algorithm_params": { "min_cluster_size": 10, "min_samples": 5 },
"description": "Vector-based clustering with HDBSCAN",
"feature_vector": {
"feature_address": "mixpeek://image_extractor@v1/image_extractor_v1_embedding"
},
"normalize_features": false
}
{
"algorithm": "attribute_based",
"attribute_config": {
"attributes": ["category"],
"hierarchical_grouping": false
},
"description": "Attribute-based clustering (simple category)",
"llm_labeling": {
"enabled": true,
"include_keywords": true,
"include_summary": true,
"provider": "openai"
}
}
{
"algorithm": "attribute_based",
"attribute_config": {
"aggregation_method": "most_frequent",
"attributes": ["category", "brand"],
"hierarchical_grouping": true
},
"description": "Attribute-based clustering (hierarchical category → brand)"
}
{
"algorithm": "attribute_based",
"attribute_config": {
"attributes": ["metadata.status", "metadata.priority"],
"hierarchical_grouping": false
},
"description": "Attribute-based clustering (nested attributes)"
}
Namespace ID for the request
Internal ID for the request
Number of documents to sample for clustering
Whether to store clustering results
Whether to include cluster membership in results
Whether to compute clustering quality metrics
Whether to save clustering artifacts (e.g., parquet) to S3
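The headers and body described above can be assembled roughly as follows. This is a minimal sketch, not an official client: this reference describes the fields but does not show their names, so the header name `X-Namespace`, the body keys `collection_ids` and `config`, and the collection ID `col_example` are all assumptions to be replaced with the names from the API schema. The sketch only builds the request pieces and does not send anything.

```python
import json

# Assumed header name: the reference says a namespace header is required
# but does not show what it is called.
NAMESPACE_HEADER = "X-Namespace"


def build_headers(api_key: str, namespace: str) -> dict:
    """Return the two required headers: bearer auth and namespace scope."""
    if not api_key:
        raise ValueError("api_key is required")
    if not namespace:
        raise ValueError("namespace is required (ID or name)")
    return {
        "Authorization": f"Bearer {api_key}",
        NAMESPACE_HEADER: namespace,
        "Content-Type": "application/json",
    }


def build_body(collection_ids: list, config: dict) -> str:
    """Serialize the request body; key names here are hypothetical."""
    if not collection_ids:
        raise ValueError("clustering runs on one or more collections")
    return json.dumps({
        "collection_ids": list(collection_ids),  # assumed key name
        "config": config,                        # assumed key name
    })


# K-means configuration copied from the first example in this section.
kmeans_config = {
    "algorithm": "kmeans",
    "algorithm_params": {"max_iter": 300, "n_clusters": 5, "random_state": 42},
    "feature_vector": {
        "feature_address": "mixpeek://text_extractor@v1/text_extractor_v1_embedding"
    },
    "normalize_features": True,
}

headers = build_headers("sk_test_xyz789", "my-namespace")
body = build_body(["col_example"], kmeans_config)  # placeholder collection ID
```

The resulting `headers` and `body` can be handed to any HTTP client. The optional body fields (sample size, result storage, membership, metrics, artifact saving) are left out here because their names are not given in this reference.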
Response
Successful Response
Response from cluster execution.
Whether clustering was successful
Algorithm used for clustering
Possible values: kmeans, dbscan, hdbscan, agglomerative, spectral, gaussian_mixture, mean_shift, optics, attribute_based
Number of clusters found
Number of documents clustered
Cluster centroids with features
Total execution time in milliseconds
Unique identifier for this clustering run
Clustering quality metrics
S3 key path to parquet file with full results
S3 key to members.parquet (if saved)
Timestamp of clustering
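A caller might reduce the response to the handful of fields it needs, as in the sketch below. The response keys used here (`success`, `algorithm`, `num_clusters`, `num_points`, `execution_time_ms`) are assumptions, since this reference lists the field descriptions without their names, and the sample payload is made up purely for illustration. Only the list of algorithm values is taken from the reference.

```python
from dataclasses import dataclass

# Algorithm values as listed in the response description above.
ALGORITHMS = {
    "kmeans", "dbscan", "hdbscan", "agglomerative", "spectral",
    "gaussian_mixture", "mean_shift", "optics", "attribute_based",
}


@dataclass
class ClusterRunSummary:
    success: bool
    algorithm: str
    num_clusters: int
    num_documents: int
    execution_time_s: float


def summarize(response: dict) -> ClusterRunSummary:
    """Pick out the core fields; all key names are hypothetical."""
    algorithm = response["algorithm"]
    if algorithm not in ALGORITHMS:
        raise ValueError(f"unexpected algorithm: {algorithm!r}")
    return ClusterRunSummary(
        success=bool(response["success"]),
        algorithm=algorithm,
        num_clusters=int(response["num_clusters"]),
        num_documents=int(response["num_points"]),
        # The reference reports execution time in milliseconds.
        execution_time_s=response["execution_time_ms"] / 1000.0,
    )


# Illustrative payload only, not real API output.
sample_response = {
    "success": True,
    "algorithm": "kmeans",
    "num_clusters": 5,
    "num_points": 1200,
    "execution_time_ms": 1500,
}
summary = summarize(sample_response)
```

Checking the algorithm against the documented values is optional; it is included here so that a renamed or newly added algorithm shows up as an explicit error rather than passing through silently.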

