Authorizations
Bearer token authentication using your API key. Format: 'Bearer your_api_key'. To get an API key, create an account at mixpeek.com/start and generate a key in your account settings.
Headers
REQUIRED: Bearer token authentication using your API key. Format: 'Bearer sk_xxxxxxxxxxxxx'. You can create API keys in the Mixpeek dashboard under Organization Settings.
"Bearer sk_live_abc123def456"
"Bearer sk_test_xyz789"
REQUIRED: Namespace identifier for scoping this request. All resources (collections, buckets, taxonomies, etc.) are scoped to a namespace. You can provide either the namespace name or namespace ID. Format: ns_xxxxxxxxxxxxx (ID) or a custom name like 'my-namespace'
"ns_abc123def456"
"production"
"my-namespace"
Response
Successful Response
Complete results from a single clustering execution.
Represents the outcome of running a clustering algorithm on a collection's documents. Each execution creates a snapshot of clustering results at a point in time, including the clusters found, quality metrics, and semantic labels.
Use Cases:
- Display clustering execution history in UI
- Compare clustering quality across multiple runs
- Track execution status for long-running jobs
- Debug failed clustering attempts
- View cluster summaries and labels for analysis
Workflow:
1. Create cluster configuration → POST /clusters
2. Execute clustering → POST /clusters/{id}/execute
3. Poll execution status → GET /clusters/{id}/executions
4. View execution history → POST /clusters/{id}/executions/list
Status Lifecycle: pending → processing → completed (or failed)
Note: Execution results are immutable once completed. Re-running clustering creates a new execution result with a new run_id.
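The status lifecycle above lends itself to a simple polling loop. The following is a minimal sketch, not an official client: `wait_for_execution` and the `fetch_execution` callable are hypothetical names. The caller supplies a function that performs the status request (for example GET /clusters/{id}/executions) and returns the execution as a dict with a `status` field. The interval and timeout defaults are arbitrary.

```python
import time
from typing import Any, Callable, Dict

TERMINAL_STATUSES = {"completed", "failed"}
IN_PROGRESS_STATUSES = {"pending", "processing"}


def wait_for_execution(
    fetch_execution: Callable[[], Dict[str, Any]],
    interval_seconds: float = 5.0,
    timeout_seconds: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, Any]:
    """Poll until the execution reaches 'completed' or 'failed'.

    `fetch_execution` is a caller-supplied (hypothetical) function that
    returns the current execution result as a dict.
    """
    deadline = clock() + timeout_seconds
    while True:
        execution = fetch_execution()
        status = execution.get("status")
        if status in TERMINAL_STATUSES:
            return execution
        if status not in IN_PROGRESS_STATUSES:
            raise ValueError(f"unexpected status: {status!r}")
        if clock() >= deadline:
            raise TimeoutError("execution did not finish before the timeout")
        sleep(interval_seconds)
```

Injecting `sleep` and `clock` keeps the loop testable without real delays. On a `failed` result, inspect `error_message` in the returned dict.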
REQUIRED. Unique identifier for this specific clustering execution. Format: 'run_' prefix followed by random alphanumeric string. Used to retrieve specific execution artifacts and results. Each re-execution of the same cluster creates a new run_id. References execution artifacts in S3 and MongoDB.
"run_a8e270953254754b"
"run_b3f58210ab"
"run_xyz789"
REQUIRED. Parent cluster configuration that was executed. Format: 'clust_' prefix followed by random alphanumeric string. Links this execution back to the cluster definition. Multiple executions can share the same cluster_id.
"clust_ae3e28a429"
"clust_xyz789"
"clust_abc123"
REQUIRED. Current status of the clustering execution. Values: 'pending' = Job queued, waiting to start. 'processing' = Clustering algorithm running (may take minutes for large datasets). 'completed' = Clustering finished successfully, results available. 'failed' = Clustering failed, check error_message for details. Status changes: pending → processing → (completed OR failed). Poll this field to track job progress.
pending, processing, completed, failed
"completed"
"processing"
"pending"
"failed"
REQUIRED. Number of clusters found by the clustering algorithm. Range: 1 to num_points (though typically much lower). Interpretation: Too few clusters = overgeneralization, may need a higher n_clusters param. Too many clusters = overfitting, may need a lower n_clusters param. Optimal value depends on dataset and use case. Available immediately upon completion, even if metrics fail.
x >= 0
3
5
10
25
REQUIRED. Total number of documents/points that were clustered. Equals the count of documents in the collection at execution time. Note: This may differ across executions if documents were added/removed. Used to calculate metrics and validate clustering quality. At least 2 points are required for clustering; with fewer, each point forms its own cluster.
x >= 0
100
1000
50000
REQUIRED. Timestamp when the clustering execution started. ISO 8601 format with timezone (UTC). Used to: - Sort executions chronologically. - Calculate execution duration (completed_at - created_at). - Filter execution history by date range. Always present, even for failed executions.
"2025-11-13T13:20:40.122000Z"
"2025-11-13T10:00:00.000000Z"
OPTIONAL. Quality metrics evaluating clustering performance. Only present for successful executions. null if: - Execution is still pending/processing. - Execution failed. - Too few points to calculate metrics (need 2+ points). Contains silhouette_score, davies_bouldin_index, calinski_harabasz_score. Use to compare quality across multiple executions.
Provides statistical measures to assess the quality of the clustering results. Higher quality clusters have better cohesion (documents within clusters are similar) and separation (clusters are distinct from each other).
Use Cases: - Compare quality across multiple clustering executions - Determine optimal number of clusters for a dataset - Validate clustering algorithm performance - Track clustering quality over time - Debug clustering issues (poor metrics indicate problems)
Interpretation: - Use silhouette_score as primary quality indicator (0.5+ = good, 0.7+ = excellent) - Lower davies_bouldin_index indicates better-separated clusters - Higher calinski_harabasz_score indicates denser, better-separated clusters
Note: All metrics are OPTIONAL and only present if clustering completed successfully. Failed executions return null for all metrics.
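The interpretation guidance above can be turned into a small helper. This is a sketch under stated assumptions: the function names are hypothetical, the 0.5 and 0.7 thresholds come from the text above, and the "below good threshold" label for lower scores is my own wording rather than an API value.

```python
from typing import Any, Mapping, Optional, Sequence


def rate_silhouette(metrics: Optional[Mapping[str, Any]]) -> str:
    """Rate a metrics object by silhouette_score (0.5+ good, 0.7+ excellent)."""
    if not metrics or metrics.get("silhouette_score") is None:
        return "unavailable"  # metrics are null for failed or unfinished runs
    score = metrics["silhouette_score"]
    if score >= 0.7:
        return "excellent"
    if score >= 0.5:
        return "good"
    return "below good threshold"  # label chosen for this sketch


def best_by_silhouette(
    metrics_list: Sequence[Optional[Mapping[str, Any]]],
) -> Optional[Mapping[str, Any]]:
    """Return the metrics object with the highest silhouette_score, if any."""
    scored = [m for m in metrics_list
              if m and m.get("silhouette_score") is not None]
    if not scored:
        return None
    return max(scored, key=lambda m: m["silhouette_score"])
```

Comparing on silhouette_score alone is a simplification; davies_bouldin_index (lower is better) and calinski_harabasz_score (higher is better) can serve as tie-breakers.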
{
  "calinski_harabasz_score": 1234.56,
  "davies_bouldin_index": 0.42,
  "description": "Excellent clustering quality",
  "silhouette_score": 0.85
}
{
  "calinski_harabasz_score": 678.34,
  "davies_bouldin_index": 0.89,
  "description": "Good clustering quality",
  "silhouette_score": 0.67
}
{
  "calinski_harabasz_score": 45.12,
  "davies_bouldin_index": 2.45,
  "description": "Poor clustering quality (needs tuning)",
  "silhouette_score": 0.23
}
{
  "description": "Metrics not available (failed execution)"
}
OPTIONAL. List of cluster centroids with semantic labels. NOT REQUIRED - only present for completed executions with LLM labeling enabled. Length: equals num_clusters. Each centroid contains: - cluster_id: Identifier for the cluster (e.g., 'cl_0'). - num_members: Count of documents in this cluster. - label: Human-readable cluster name (e.g., 'Product Reviews'). - summary: Brief description of cluster content. - keywords: Array of representative terms. null if: - Execution pending/processing/failed. - LLM labeling not configured. Use for: Displaying cluster summaries in UI, filtering by cluster.
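For displaying cluster summaries, one possible approach is to sort centroids by size and render one line per cluster. The sketch below assumes each centroid is a dict with the fields listed above; `summarize_centroids` is a hypothetical helper, and the "(unlabeled)" placeholder is an illustrative choice.

```python
from typing import Any, List, Mapping, Optional, Sequence


def summarize_centroids(
    centroids: Optional[Sequence[Mapping[str, Any]]],
) -> List[str]:
    """Return one display line per centroid, largest cluster first."""
    if not centroids:
        # The list is null while pending/processing/failed or without LLM labeling
        return []
    ordered = sorted(centroids, key=lambda c: c.get("num_members", 0), reverse=True)
    lines = []
    for c in ordered:
        label = c.get("label") or "(unlabeled)"
        lines.append(f"{c['cluster_id']}: {label}, {c.get('num_members', 0)} documents")
    return lines
```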
OPTIONAL. Timestamp when the clustering execution finished. ISO 8601 format with timezone (UTC). NOT REQUIRED - only present for completed or failed executions. null if: status is 'pending' or 'processing'. Use to: - Calculate execution duration (completed_at - created_at). - Show when results became available. Present for both successful and failed executions.
"2025-11-13T13:25:40.122000Z"
"2025-11-13T10:05:32.456000Z"
null
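The duration calculation (completed_at - created_at) can be sketched with the standard library. The helper names below are hypothetical. The trailing `Z` is rewritten to `+00:00` because `datetime.fromisoformat` only accepts the `Z` suffix from Python 3.11 onward.

```python
from datetime import datetime, timedelta
from typing import Optional


def parse_utc(timestamp: str) -> datetime:
    """Parse an ISO 8601 UTC timestamp such as '2025-11-13T13:20:40.122000Z'."""
    if timestamp.endswith("Z"):
        timestamp = timestamp[:-1] + "+00:00"
    return datetime.fromisoformat(timestamp)


def execution_duration(created_at: str,
                       completed_at: Optional[str]) -> Optional[timedelta]:
    """Return completed_at - created_at, or None while the run is unfinished."""
    if completed_at is None:
        return None  # status is still 'pending' or 'processing'
    return parse_utc(completed_at) - parse_utc(created_at)
```

With the example timestamps above, `execution_duration("2025-11-13T13:20:40.122000Z", "2025-11-13T13:25:40.122000Z")` yields a duration of 300 seconds.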
OPTIONAL. Error message if the clustering execution failed. NOT REQUIRED - only present when status is 'failed'. null if: execution succeeded or is still in progress. Contains: - Human-readable error description. - Possible causes and suggested fixes. - Stack trace details (for debugging). Common errors: - 'Insufficient documents for clustering' (need 2+ docs). - 'Feature extractor not found' (invalid collection config). - 'Out of memory' (dataset too large for algorithm). Use for: Debugging failed executions and user error messages.
"Insufficient documents for clustering: need at least 2 documents"
"Feature extractor 'invalid_extractor' not found in collection"
"Clustering algorithm failed: NaN values in embeddings"
null

