GET /v1/clusters/{cluster_id}/executions/{run_id}
Get Specific Cluster Execution
curl --request GET \
  --url https://api.mixpeek.com/v1/clusters/{cluster_id}/executions/{run_id} \
  --header 'Authorization: Bearer <token>' \
  --header 'X-Namespace: <x-namespace>'
{
  "centroids": [
    {
      "cluster_id": "cl_0",
      "keywords": [
        "product",
        "review",
        "quality"
      ],
      "label": "Product Reviews",
      "num_members": 45,
      "summary": "Customer feedback about products"
    },
    {
      "cluster_id": "cl_1",
      "keywords": [
        "help",
        "issue",
        "support"
      ],
      "label": "Support Tickets",
      "num_members": 35,
      "summary": "Technical support requests"
    },
    {
      "cluster_id": "cl_2",
      "keywords": [
        "feature",
        "request",
        "suggestion"
      ],
      "label": "Feature Requests",
      "num_members": 20,
      "summary": "User feature suggestions"
    }
  ],
  "cluster_id": "clust_ae3e28a429",
  "completed_at": "2025-11-13T13:25:40.122000Z",
  "created_at": "2025-11-13T13:20:40.122000Z",
  "description": "Completed execution with excellent metrics",
  "metrics": {
    "calinski_harabasz_score": 1234.56,
    "davies_bouldin_index": 0.42,
    "silhouette_score": 0.85
  },
  "num_clusters": 3,
  "num_points": 100,
  "run_id": "run_a8e270953254754b",
  "status": "completed"
}
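The curl call above can also be issued from Python. The following is a minimal sketch using only the standard library; the function names `build_execution_request` and `get_execution` are illustrative and not part of any Mixpeek SDK, and the host, path, and header names are taken from the curl example.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.mixpeek.com"  # host taken from the curl example


def build_execution_request(cluster_id, run_id, api_key, namespace):
    """Build (but do not send) the GET request for one cluster execution."""
    # Percent-encode the path parameters so unusual characters cannot alter the path.
    cluster = urllib.parse.quote(cluster_id, safe="")
    run = urllib.parse.quote(run_id, safe="")
    url = f"{BASE_URL}/v1/clusters/{cluster}/executions/{run}"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "X-Namespace": namespace,
    }
    return urllib.request.Request(url, headers=headers, method="GET")


def get_execution(cluster_id, run_id, api_key, namespace, timeout=30):
    """Send the request and decode the JSON response body into a dict."""
    request = build_execution_request(cluster_id, run_id, api_key, namespace)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Splitting request construction from sending keeps the URL and header logic easy to inspect without network access. Non-2xx responses surface as `urllib.error.HTTPError` from `urlopen`.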

Authorizations

Authorization
string
header
required

Bearer token authentication using your API key. Format: 'Bearer your_api_key'. To get an API key, create an account at mixpeek.com/start and generate a key in your account settings.

Headers

Authorization
string
required

REQUIRED: Bearer token authentication using your API key. Format: 'Bearer sk_xxxxxxxxxxxxx'. You can create API keys in the Mixpeek dashboard under Organization Settings.

Examples:

"Bearer sk_live_abc123def456"

"Bearer sk_test_xyz789"

X-Namespace
string
required

REQUIRED: Namespace identifier for scoping this request. All resources (collections, buckets, taxonomies, etc.) are scoped to a namespace. You can provide either the namespace name or namespace ID. Format: ns_xxxxxxxxxxxxx (ID) or a custom name like 'my-namespace'

Examples:

"ns_abc123def456"

"production"

"my-namespace"

Path Parameters

cluster_id
string
required

Cluster ID

run_id
string
required

Run ID

Response

Successful Response

Complete results from a single clustering execution.

Represents the outcome of running a clustering algorithm on a collection's documents. Each execution creates a snapshot of clustering results at a point in time, including the clusters found, quality metrics, and semantic labels.

Use Cases:
- Display clustering execution history in UI
- Compare clustering quality across multiple runs
- Track execution status for long-running jobs
- Debug failed clustering attempts
- View cluster summaries and labels for analysis

Workflow:
1. Create cluster configuration → POST /clusters
2. Execute clustering → POST /clusters/{id}/execute
3. Poll execution status → GET /clusters/{id}/executions
4. View execution history → POST /clusters/{id}/executions/list

Status Lifecycle: pending → processing → completed (or failed)

Note: Execution results are immutable once completed. Re-running clustering creates a new execution result with a new run_id.
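The status lifecycle above lends itself to a simple polling loop. This is a sketch under stated assumptions: `wait_for_execution` is a hypothetical helper, `fetch` is any zero-argument callable that returns the execution as a dict (for example, a wrapper around the GET request for this endpoint), and the interval and attempt limits are arbitrary defaults rather than documented recommendations.

```python
import time

# Statuses after which the execution no longer changes.
TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_execution(fetch, interval_seconds=5.0, max_attempts=60, sleep=time.sleep):
    """Call fetch() until the execution reaches a terminal status.

    Returns the final execution dict, or raises TimeoutError if the
    execution is still pending/processing after max_attempts polls.
    """
    for attempt in range(max_attempts):
        execution = fetch()
        if execution["status"] in TERMINAL_STATUSES:
            return execution
        # Do not sleep after the final attempt; fail straight away instead.
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    raise TimeoutError(f"execution still in progress after {max_attempts} attempts")
```

Because results are immutable once completed, the returned dict can be cached by `run_id` without re-polling.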

run_id
string
required

REQUIRED. Unique identifier for this specific clustering execution. Format: 'run_' prefix followed by random alphanumeric string. Used to retrieve specific execution artifacts and results. Each re-execution of the same cluster creates a new run_id. References execution artifacts in S3 and MongoDB.

Examples:

"run_a8e270953254754b"

"run_b3f58210ab"

"run_xyz789"

cluster_id
string
required

REQUIRED. Parent cluster configuration that was executed. Format: 'clust_' prefix followed by random alphanumeric string. Links this execution back to the cluster definition. Multiple executions can share the same cluster_id.

Examples:

"clust_ae3e28a429"

"clust_xyz789"

"clust_abc123"

status
enum<string>
required

REQUIRED. Current status of the clustering execution. Poll this field to track job progress.

Values:
- 'pending': Job queued, waiting to start.
- 'processing': Clustering algorithm running (may take minutes for large datasets).
- 'completed': Clustering finished successfully; results available.
- 'failed': Clustering failed; check error_message for details.

Status changes: pending → processing → (completed OR failed).

Available options:
pending,
processing,
completed,
failed
Examples:

"completed"

"processing"

"pending"

"failed"

num_clusters
integer
required

REQUIRED. Number of clusters found by the clustering algorithm. Range: 1 to num_points (though typically much lower). Available immediately upon completion, even if metrics fail.

Interpretation:
- Too few clusters suggests overgeneralization; consider a higher n_clusters param.
- Too many clusters suggests overfitting; consider a lower n_clusters param.
- The optimal value depends on the dataset and use case.

Required range: x >= 0
Examples:

3

5

10

25

num_points
integer
required

REQUIRED. Total number of documents/points that were clustered. Equals the count of documents in the collection at execution time. Note: This may differ across executions if documents were added/removed. Used to calculate metrics and validate clustering quality. Minimum 2 points required for clustering (1 cluster per point otherwise).

Required range: x >= 0
Examples:

100

1000

50000

created_at
string<date-time>
required

REQUIRED. Timestamp when the clustering execution started. ISO 8601 format with timezone (UTC). Always present, even for failed executions.

Used to:
- Sort executions chronologically.
- Calculate execution duration (completed_at - created_at).
- Filter execution history by date range.

Examples:

"2025-11-13T13:20:40.122000Z"

"2025-11-13T10:00:00.000000Z"

metrics
object | null

OPTIONAL. Quality metrics evaluating clustering performance. Only present for successful executions; use them to compare quality across multiple executions. Contains silhouette_score, davies_bouldin_index, and calinski_harabasz_score.

null if:
- Execution is still pending/processing.
- Execution failed.
- Too few points to calculate metrics (need 2+ points).

Provides statistical measures to assess the quality of the clustering results. Higher quality clusters have better cohesion (documents within clusters are similar) and separation (clusters are distinct from each other).

Use Cases:
- Compare quality across multiple clustering executions
- Determine optimal number of clusters for a dataset
- Validate clustering algorithm performance
- Track clustering quality over time
- Debug clustering issues (poor metrics indicate problems)

Interpretation:
- Use silhouette_score as the primary quality indicator (0.5+ = good, 0.7+ = excellent)
- Lower davies_bouldin_index indicates better-separated clusters
- Higher calinski_harabasz_score indicates denser, better-separated clusters

Note: All metrics are OPTIONAL and only present if clustering completed successfully. Failed executions return null for all metrics.

Examples:
{
  "calinski_harabasz_score": 1234.56,
  "davies_bouldin_index": 0.42,
  "description": "Excellent clustering quality",
  "silhouette_score": 0.85
}
{
  "calinski_harabasz_score": 678.34,
  "davies_bouldin_index": 0.89,
  "description": "Good clustering quality",
  "silhouette_score": 0.67
}
{
  "calinski_harabasz_score": 45.12,
  "davies_bouldin_index": 2.45,
  "description": "Poor clustering quality (needs tuning)",
  "silhouette_score": 0.23
}
{
  "description": "Metrics not available (failed execution)"
}
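The interpretation guidance for silhouette_score (0.5+ = good, 0.7+ = excellent) can be applied in code when comparing runs. The sketch below is illustrative only: `rate_silhouette` and `best_execution` are hypothetical helpers, and labelling scores under 0.5 as "poor" is an assumption based on the example objects above rather than a documented threshold.

```python
def rate_silhouette(metrics):
    """Map a metrics object (or None) to a coarse quality label."""
    if not metrics or metrics.get("silhouette_score") is None:
        return "unavailable"  # pending, processing, failed, or too few points
    score = metrics["silhouette_score"]
    if score >= 0.7:
        return "excellent"
    if score >= 0.5:
        return "good"
    return "poor"  # assumption: the source only defines the 0.5 and 0.7 cut-offs


def best_execution(executions):
    """Return the execution with the highest silhouette_score, or None."""
    scored = [
        e for e in executions
        if e.get("metrics") and e["metrics"].get("silhouette_score") is not None
    ]
    return max(scored, key=lambda e: e["metrics"]["silhouette_score"], default=None)
```

Ranking on silhouette_score alone follows the advice to treat it as the primary indicator; davies_bouldin_index and calinski_harabasz_score could serve as tie-breakers if needed.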
centroids
ClusterExecutionCentroid · object[] | null

OPTIONAL. List of cluster centroids with semantic labels. Only present for completed executions with LLM labeling enabled. Length: equals num_clusters. Use for: displaying cluster summaries in UI, filtering by cluster.

Each centroid contains:
- cluster_id: Identifier for the cluster (e.g., 'cl_0').
- num_members: Count of documents in this cluster.
- label: Human-readable cluster name (e.g., 'Product Reviews').
- summary: Brief description of cluster content.
- keywords: Array of representative terms.

null if:
- Execution pending/processing/failed.
- LLM labeling not configured.
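As a sketch of how centroids might feed a UI summary, the hypothetical helper below converts each centroid's num_members into a share of num_points and sorts the largest cluster first. It assumes the field names shown in the example response and treats a null centroids value as an empty list.

```python
def cluster_shares(execution):
    """Return (cluster_id, label, share_of_points) tuples, largest cluster first."""
    centroids = execution.get("centroids") or []  # null when labeling is unavailable
    total = execution.get("num_points") or 0
    rows = []
    for centroid in sorted(centroids, key=lambda c: c["num_members"], reverse=True):
        # Guard against division by zero when num_points is 0.
        share = centroid["num_members"] / total if total else 0.0
        rows.append((centroid["cluster_id"], centroid.get("label"), share))
    return rows
```

For the example response at the top of this page, the three clusters account for 45%, 35%, and 20% of the 100 points.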

completed_at
string<date-time> | null

OPTIONAL. Timestamp when the clustering execution finished. ISO 8601 format with timezone (UTC). Present for both successful and failed executions; null if status is 'pending' or 'processing'.

Use to:
- Calculate execution duration (completed_at - created_at).
- Show when results became available.

Examples:

"2025-11-13T13:25:40.122000Z"

"2025-11-13T10:05:32.456000Z"

null
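The duration calculation mentioned above (completed_at - created_at) can be sketched as follows. The helper names are hypothetical; the trailing "Z" is rewritten to "+00:00" because `datetime.fromisoformat` only accepts the "Z" suffix from Python 3.11 onward.

```python
from datetime import datetime


def parse_timestamp(value):
    """Parse an ISO 8601 UTC timestamp such as '2025-11-13T13:20:40.122000Z'."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def execution_duration_seconds(execution):
    """Return the run time in seconds, or None while the execution is unfinished."""
    if execution.get("completed_at") is None:
        return None  # status is still 'pending' or 'processing'
    started = parse_timestamp(execution["created_at"])
    finished = parse_timestamp(execution["completed_at"])
    return (finished - started).total_seconds()
```

With the timestamps from the example response (13:20:40.122 to 13:25:40.122), the result is 300.0 seconds.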

error_message
string | null

OPTIONAL. Error message if the clustering execution failed. Only present when status is 'failed'; null if the execution succeeded or is still in progress. Use for: debugging failed executions and user error messages.

Contains:
- Human-readable error description.
- Possible causes and suggested fixes.
- Stack trace details (for debugging).

Common errors:
- 'Insufficient documents for clustering' (need 2+ docs).
- 'Feature extractor not found' (invalid collection config).
- 'Out of memory' (dataset too large for algorithm).

Examples:

"Insufficient documents for clustering: need at least 2 documents"

"Feature extractor 'invalid_extractor' not found in collection"

"Clustering algorithm failed: NaN values in embeddings"

null
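A client will typically branch on status before reading the optional fields. The sketch below shows one way to turn an execution into a one-line message; `describe_execution` is a hypothetical helper, and the message wording is illustrative.

```python
def describe_execution(execution):
    """Summarize an execution dict as a short human-readable string."""
    run_id = execution["run_id"]
    status = execution["status"]
    if status == "failed":
        # error_message is optional, so fall back to a generic reason.
        reason = execution.get("error_message") or "no error message provided"
        return f"Run {run_id} failed: {reason}"
    if status == "completed":
        return f"Run {run_id} completed with {execution['num_clusters']} clusters"
    return f"Run {run_id} is {status}"  # 'pending' or 'processing'
```

Only required fields (run_id, status, num_clusters) are read unconditionally; error_message is accessed with a fallback because it may be null or absent.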