GET /v1/clusters/{cluster_id}/executions
Get Latest Cluster Execution
curl --request GET \
  --url https://api.mixpeek.com/v1/clusters/{cluster_id}/executions \
  --header 'Authorization: <authorization>' \
  --header 'X-Namespace: <x-namespace>'
{
  "run_id": "<string>",
  "cluster_id": "<string>",
  "status": "pending",
  "num_clusters": 1,
  "num_points": 1,
  "created_at": "2023-11-07T05:31:56Z",
  "metrics": {
    "calinski_harabasz_score": 1234.56,
    "davies_bouldin_index": 0.42,
    "description": "Excellent clustering quality",
    "silhouette_score": 0.85
  },
  "centroids": [
    {
      "cluster_id": "<string>",
      "num_members": 1,
      "label": "Product Reviews",
      "summary": "This cluster contains documents related to product reviews and customer feedback.",
      "keywords": [
        "reviews",
        "products",
        "feedback"
      ]
    }
  ],
  "completed_at": "2025-11-13T13:25:40.122000Z",
  "error_message": "Insufficient documents for clustering: need at least 2 documents",
  "llm_labeling_errors": [
    "{\"error\": \"LLM API timeout\", \"clusters\": [\"cl_3\", \"cl_5\"]}",
    "{\"error\": \"No representative documents\", \"clusters\": [\"cl_1\"]}"
  ]
}
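
For reference, a minimal Python sketch of the same request (assumes the requests library; the API key, namespace, and cluster ID values are placeholders):

import requests

# Placeholder values: substitute your own API key, namespace, and cluster ID.
API_KEY = "sk_xxxxxxxxxxxxx"
NAMESPACE = "my-namespace"
CLUSTER_ID = "clust_abc123"

resp = requests.get(
    f"https://api.mixpeek.com/v1/clusters/{CLUSTER_ID}/executions",
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "X-Namespace": NAMESPACE,
    },
)
resp.raise_for_status()
execution = resp.json()
print(execution["status"], execution["num_clusters"])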

Headers

Authorization
string
required

REQUIRED: Bearer token authentication using your API key. Format: 'Bearer sk_xxxxxxxxxxxxx'. You can create API keys in the Mixpeek dashboard under Organization Settings.

X-Namespace
string
required

REQUIRED: Namespace identifier for scoping this request. All resources (collections, buckets, taxonomies, etc.) are scoped to a namespace. You can provide either the namespace name or namespace ID. Format: ns_xxxxxxxxxxxxx (ID) or a custom name like 'my-namespace'

Path Parameters

cluster_id
string
required

Cluster ID

Response

Successful Response

Complete results from a single clustering execution.

Represents the outcome of running a clustering algorithm on a collection's documents. Each execution creates a snapshot of clustering results at a point in time, including the clusters found, quality metrics, and semantic labels.

Use Cases:
- Display clustering execution history in the UI
- Compare clustering quality across multiple runs
- Track execution status for long-running jobs
- Debug failed clustering attempts
- View cluster summaries and labels for analysis

Workflow:
1. Create cluster configuration → POST /clusters
2. Execute clustering → POST /clusters/{id}/execute
3. Poll execution status → GET /clusters/{id}/executions
4. View execution history → POST /clusters/{id}/executions/list

Status Lifecycle: pending → processing → completed (or failed)

Note: Execution results are immutable once completed. Re-running clustering creates a new execution result with a new run_id.
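
As a rough illustration of step 3 in the workflow above, a polling sketch that follows this status lifecycle (Python; assumes the requests library, the headers shown in the example request, and an arbitrary 10-second retry interval, none of which are mandated by the API):

import time
import requests

def wait_for_execution(cluster_id, headers, interval_s=10, timeout_s=600):
    """Poll the latest execution until it reaches a terminal status."""
    url = f"https://api.mixpeek.com/v1/clusters/{cluster_id}/executions"
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        resp = requests.get(url, headers=headers)
        resp.raise_for_status()
        execution = resp.json()
        if execution["status"] == "completed":
            return execution
        if execution["status"] == "failed":
            raise RuntimeError(f"Clustering failed: {execution.get('error_message')}")
        # Still 'pending' or 'processing'; wait before the next poll.
        time.sleep(interval_s)
    raise TimeoutError("Clustering did not reach a terminal status before the timeout")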

run_id
string
required

REQUIRED. Unique identifier for this specific clustering execution. Format: 'run_' prefix followed by random alphanumeric string. Used to retrieve specific execution artifacts and results. Each re-execution of the same cluster creates a new run_id. References execution artifacts in S3 and MongoDB.

cluster_id
string
required

REQUIRED. Parent cluster configuration that was executed. Format: 'clust_' prefix followed by random alphanumeric string. Links this execution back to the cluster definition. Multiple executions can share the same cluster_id.

status
enum<string>
required

REQUIRED. Current status of the clustering execution. Values: 'pending' = Job queued, waiting to start. 'processing' = Clustering algorithm running (may take minutes for large datasets). 'completed' = Clustering finished successfully, results available. 'failed' = Clustering failed, check error_message for details. Status changes: pending → processing → (completed OR failed). Poll this field to track job progress.

Available options:
pending,
processing,
completed,
failed
num_clusters
integer
required

REQUIRED. Number of clusters found by the clustering algorithm. Range: 1 to num_points (though typically much lower). Interpretation: too few clusters suggests over-generalization and may call for a higher n_clusters param; too many clusters suggests overfitting and may call for a lower n_clusters param. The optimal value depends on the dataset and use case. Available immediately upon completion, even if metrics fail.

Required range: x >= 0
num_points
integer
required

REQUIRED. Total number of documents/points that were clustered. Equals the count of documents in the collection at execution time. Note: this may differ across executions if documents were added or removed. Used to calculate metrics and validate clustering quality. A minimum of 2 points is required for clustering (with fewer, each point is trivially its own cluster).

Required range: x >= 0
created_at
string<date-time>
required

REQUIRED. Timestamp when the clustering execution started. ISO 8601 format with timezone (UTC). Used to: - Sort executions chronologically. - Calculate execution duration (completed_at - created_at). - Filter execution history by date range. Always present, even for failed executions.
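
A small sketch of the chronological-sort use case (Python standard library; assumes a list of execution dicts such as the history endpoint mentioned in the workflow would return):

from datetime import datetime

def latest_first(executions):
    """Order execution results newest-first by their created_at timestamp."""
    # fromisoformat() on Python < 3.11 does not accept a trailing 'Z',
    # so it is rewritten as an explicit UTC offset first.
    def created(e):
        return datetime.fromisoformat(e["created_at"].replace("Z", "+00:00"))
    return sorted(executions, key=created, reverse=True)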

metrics
ClusterExecutionMetrics · object

OPTIONAL. Quality metrics evaluating clustering performance. NOT REQUIRED - only present for successful executions. null if: - Execution is still pending/processing. - Execution failed. - Too few points to calculate metrics (need 2+ points). Contains silhouette_score, davies_bouldin_index, calinski_harabasz_score. Use to compare quality across multiple executions.

Example:
{
  "calinski_harabasz_score": 1234.56,
  "davies_bouldin_index": 0.42,
  "description": "Excellent clustering quality",
  "silhouette_score": 0.85
}
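
A sketch of comparing quality across runs with these metrics (Python; the weighting below is an illustrative assumption, not an API-defined ranking):

def pick_better_run(run_a, run_b):
    """Prefer the execution with better clustering-quality metrics.

    Higher silhouette_score and calinski_harabasz_score are better;
    lower davies_bouldin_index is better.
    """
    def score(run):
        m = run.get("metrics") or {}
        return (
            m.get("silhouette_score", 0.0)
            - m.get("davies_bouldin_index", 0.0)
            + m.get("calinski_harabasz_score", 0.0) / 1000.0  # rough rescaling (assumption)
        )
    return run_a if score(run_a) >= score(run_b) else run_b
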
centroids
ClusterExecutionCentroid · object[] | null

OPTIONAL. List of cluster centroids with semantic labels. NOT REQUIRED - only present for completed executions with LLM labeling enabled. Length: equals num_clusters. Each centroid contains:
- cluster_id: Identifier for the cluster (e.g., 'cl_0').
- num_members: Count of documents in this cluster.
- label: Human-readable cluster name (e.g., 'Product Reviews').
- summary: Brief description of cluster content.
- keywords: Array of representative terms.
null if the execution is pending/processing/failed, or if LLM labeling is not configured. Use for: displaying cluster summaries in the UI, filtering by cluster.
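
A short sketch of rendering these centroid fields for display (Python; field names follow the documentation above, the output format itself is an assumption):

def format_centroids(execution):
    """Build one human-readable line per labeled cluster centroid."""
    lines = []
    for centroid in execution.get("centroids") or []:
        keywords = ", ".join(centroid.get("keywords") or [])
        lines.append(
            f"{centroid['label']} ({centroid['num_members']} members): "
            f"{centroid['summary']} [keywords: {keywords}]"
        )
    return "\n".join(lines)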

completed_at
string<date-time> | null

OPTIONAL. Timestamp when the clustering execution finished. ISO 8601 format with timezone (UTC). NOT REQUIRED - only present for completed or failed executions. null if: status is 'pending' or 'processing'. Use to: - Calculate execution duration (completed_at - created_at). - Show when results became available. Present for both successful and failed executions.

Example:

"2025-11-13T13:25:40.122000Z"

error_message
string | null

OPTIONAL. Error message if the clustering execution failed. NOT REQUIRED - only present when status is 'failed'. null if the execution succeeded or is still in progress. Contains a human-readable error description, possible causes and suggested fixes, and stack trace details (for debugging). Common errors:
- 'Insufficient documents for clustering' (need 2+ docs).
- 'Feature extractor not found' (invalid collection config).
- 'Out of memory' (dataset too large for algorithm).
Use for: debugging failed executions and user-facing error messages.

Example:

"Insufficient documents for clustering: need at least 2 documents"

llm_labeling_errors
string[] | null

OPTIONAL. List of errors encountered during LLM labeling. NOT REQUIRED - only present when LLM labeling was attempted and encountered errors. null if LLM labeling was not enabled, LLM labeling succeeded for all clusters, or the execution is still in progress. Each error is a JSON string containing:
- 'error': Human-readable error message.
- 'clusters': List of cluster IDs affected by this error.
Common errors:
- 'LLM API timeout for 2 clusters' (network/API issues).
- 'OpenAI rate limit exceeded' (quota exhausted).
- 'Invalid model name: gpt-3.5' (config error).
- 'No representative documents for cluster cl_3' (empty cluster).
Use for: debugging why some clusters have fallback labels, identifying LLM API issues without failing the entire clustering run, and warning users about partial labeling success.

Example:
[
  "{\"error\": \"LLM API timeout\", \"clusters\": [\"cl_3\", \"cl_5\"]}",
  "{\"error\": \"No representative documents\", \"clusters\": [\"cl_1\"]}"
]
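
Because each entry is itself a JSON string, a small decoding sketch (Python standard library; each decoded entry carries the documented 'error' and 'clusters' keys):

import json

def parse_labeling_errors(execution):
    """Decode llm_labeling_errors entries into dicts for easier inspection."""
    entries = execution.get("llm_labeling_errors") or []
    return [json.loads(entry) for entry in entries]

# Example result:
# [{'error': 'LLM API timeout', 'clusters': ['cl_3', 'cl_5']},
#  {'error': 'No representative documents', 'clusters': ['cl_1']}]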