Create and run clustering jobs

  • Create: Click New Cluster, select collections, pick vector or attribute clustering, and configure the algorithm parameters. API: Create Cluster.
  • Execute: Run real-time clustering on the Engine, or submit it as a job for asynchronous processing. API: Execute Clustering and Submit Job.
  • Inspect: Review centroids, metrics, and (if they were saved) members. Downloadable artifacts, such as Parquet file paths, are listed under Artifacts. API: Get Artifacts.
  • List/Get/Delete: Manage clustering configurations and results. API: List, Get, Delete.
  • Stream data: Browse cluster centroids and members directly. API: Stream Data.
  • Apply enrichment: Attach cluster labels back to a source or target collection at scale. API: Apply Enrichment.

Tips

  • Start with a small sample size to validate parameters before launching full runs.
  • Use LLM labeling for human-friendly labels when vectors are dense and unlabeled.
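
The first tip can be tried locally before any platform run: draw a fixed-size random sample and compare candidate parameter values on it. The following is a minimal sketch in plain Python. The toy k-means and the names `validate_on_sample`, `candidate_ks`, and `sample_size` are illustrative assumptions, not part of the product's API.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_inertia(points, k, iterations=20, seed=0):
    """Run a small Lloyd's k-means and return the final inertia."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iterations):
        # Assign each point to its nearest centroid.
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centroids[i]))
            groups[nearest].append(p)
        # Move each centroid to its group mean; keep it if the group is empty.
        centroids = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)


def validate_on_sample(points, candidate_ks, sample_size, seed=0):
    """Score each candidate k on a reproducible random sample of the data."""
    rng = random.Random(seed)
    sample = rng.sample(points, min(sample_size, len(points)))
    return {k: kmeans_inertia(sample, k, seed=seed) for k in candidate_ks}


# Two well-separated groups of 2-D points (toy data for illustration).
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
scores = validate_on_sample(points, candidate_ks=[1, 2], sample_size=8)
print(scores)
```

A sharp drop in inertia between two candidate values suggests the larger value captures real structure; once the parameters look reasonable on the sample, the same settings can be used for the full run.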

1. Create a cluster job

Choose collections and configure algorithm parameters; optionally set dimensionality reduction.
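
The choices in this step can be pictured as a single request body. The sketch below only assembles and validates such a body locally; it does not call any service. Every field name (`collection_ids`, `clustering_type`, `algorithm`, `algorithm_params`, `dimensionality_reduction`) and every example value is a hypothetical placeholder, so check the Create Cluster API reference for the actual schema.

```python
import json


def build_cluster_config(collection_ids, clustering_type, algorithm,
                         algorithm_params, dimensionality_reduction=None):
    """Assemble a request body for a hypothetical Create Cluster call."""
    if not collection_ids:
        raise ValueError("at least one collection is required")
    if clustering_type not in ("vector", "attribute"):
        raise ValueError("clustering_type must be 'vector' or 'attribute'")
    config = {
        "collection_ids": list(collection_ids),
        "clustering_type": clustering_type,
        "algorithm": algorithm,
        "algorithm_params": dict(algorithm_params),
    }
    # Dimensionality reduction is optional, so only include it when set.
    if dimensionality_reduction is not None:
        config["dimensionality_reduction"] = dict(dimensionality_reduction)
    return config


# Hypothetical example values, for illustration only.
payload = build_cluster_config(
    ["col_products"],
    "vector",
    "kmeans",
    {"n_clusters": 8},
    dimensionality_reduction={"method": "umap", "n_components": 16},
)
print(json.dumps(payload, indent=2))
```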

2. Execute or submit

Run in real-time or submit as an asynchronous job and track via Tasks.
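
Tracking an asynchronous job usually comes down to polling its task until it reaches a terminal state. The sketch below shows that loop with the status lookup injected as a function, so it runs without network access. The status strings (`PENDING`, `PROCESSING`, `COMPLETED`, `FAILED`, `CANCELED`), the task ID, and the `get_status` callable are assumptions for illustration; the real Tasks API may use different names.

```python
import time


def wait_for_task(get_status, task_id, poll_interval=2.0, timeout=600.0,
                  sleep=time.sleep):
    """Poll get_status(task_id) until a terminal status or the timeout."""
    terminal = {"COMPLETED", "FAILED", "CANCELED"}  # assumed status names
    deadline = time.monotonic() + timeout
    while True:
        status = get_status(task_id)
        if status in terminal:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} still {status} after {timeout}s")
        sleep(poll_interval)


# Stand-in for a real status lookup: yields a fixed sequence of statuses.
fake_statuses = iter(["PENDING", "PROCESSING", "COMPLETED"])
final = wait_for_task(lambda _task_id: next(fake_statuses), "task_123",
                      sleep=lambda _seconds: None)
print(final)
```

In real use, `get_status` would wrap whatever call returns the task's current state, and a `FAILED` result should be handled before moving on to inspection.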

3. Inspect and enrich

Review centroids and metrics, then apply enrichment back to collections if desired.
Artifacts such as Parquet file paths support downstream analytics and reproducible exploration.
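
Conceptually, enrichment is a join between cluster membership and the documents in a collection. The sketch below illustrates that join on in-memory data. The document shape, the `id` key, the `cluster_label` field, and the fallback label format are all assumptions for illustration; the Apply Enrichment API performs this at scale and may represent it differently.

```python
def apply_cluster_labels(documents, assignments, labels, field="cluster_label"):
    """Return copies of documents with a cluster label attached where known."""
    enriched = []
    for doc in documents:
        updated = dict(doc)  # copy so the input documents are not mutated
        cluster_id = assignments.get(doc["id"])
        if cluster_id is not None:
            # Fall back to a generic name when no readable label exists.
            updated[field] = labels.get(cluster_id, f"cluster_{cluster_id}")
        enriched.append(updated)
    return enriched


# Toy data: document "c" was never assigned to a cluster.
documents = [{"id": "a"}, {"id": "b"}, {"id": "c"}]
assignments = {"a": 0, "b": 1}
labels = {0: "running shoes"}

enriched = apply_cluster_labels(documents, assignments, labels)
print(enriched)
```

Documents without a cluster assignment are passed through unchanged, which keeps the enrichment safe to run on a target collection that only partly overlaps the clustered data.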