Webhooks notify your systems when ingestion, enrichment, or retrieval events occur. The Engine writes each event to MongoDB, Celery Beat dispatches it, and a Celery worker delivers it as an HTTP POST request (or a channel-specific message), retrying until your endpoint acknowledges it.

Event Flow

Engine → MongoDB (webhook_events) → Celery Beat → Celery Worker → HTTP POST → Your endpoint
  • Events persist until the worker receives a 2xx response.
  • Retries use exponential backoff; failures remain in MongoDB for inspection.
  • Delivery includes cache scopes so you can invalidate selectively.
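
To make the retry behavior concrete, here is a minimal sketch of how an exponential backoff schedule grows. The base delay, growth factor, cap, and attempt count are illustrative assumptions; this page does not state the values Mixpeek actually uses.

```python
def backoff_delays(base_seconds: float = 1.0, factor: float = 2.0,
                   max_delay: float = 300.0, attempts: int = 6) -> list[float]:
    """Return the wait before each retry: base * factor**n, capped at max_delay.

    All defaults are hypothetical and chosen only to show the shape of the curve.
    """
    return [min(base_seconds * factor ** n, max_delay) for n in range(attempts)]


if __name__ == "__main__":
    # Each retry waits twice as long as the previous one until the cap is hit.
    print(backoff_delays(attempts=10))
```

The practical consequence is that an endpoint that stays down for a while may see its next delivery attempt several minutes later, so handlers should not assume events arrive close to their occurred_at time.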

Common Event Types

| Event | Trigger | Payload Highlights |
| --- | --- | --- |
| collection.documents.written | Engine finishes writing documents to Qdrant | collection_id, document_ids, index_signature |
| object.created | Object registered in a bucket | bucket_id, object_id, metadata snapshot |
| batch.completed | Batch processing succeeded | batch_id, collection_ids, counts |
| cluster.completed | Clustering run finished | cluster_id, run_id, artifact URIs |
| taxonomy.materialized | Taxonomy enrichment completed | taxonomy_id, collection_id, updated_document_count |

Use /api-reference/webhooks/list-webhooks to see the full catalog.

Create a Webhook

curl -sS -X POST "$MP_API_URL/v1/organizations/webhooks" \
  -H "Authorization: Bearer $MP_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "webhook_name": "prod-notifications",
    "event_types": [
      "collection.documents.written",
      "cluster.completed"
    ],
    "channels": [
      {
        "channel": "webhook",
        "configs": {
          "url": "https://hooks.example.com/mixpeek",
          "headers": { "X-Mixpeek-Secret": "super-secret"},
          "timeout": 10
        }
      }
    ]
  }'
Mixpeek also supports Slack, email, and SMS channels via channel-specific configs.
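
If you provision webhooks from a script rather than the shell, the same request can be built in Python. This is a sketch using only the standard library; the function name build_create_webhook_request and its parameters are hypothetical, while the path, headers, and body fields mirror the curl example above.

```python
import json
import urllib.request


def build_create_webhook_request(api_url: str, api_key: str,
                                 endpoint_url: str, secret: str) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    payload = {
        "webhook_name": "prod-notifications",
        "event_types": ["collection.documents.written", "cluster.completed"],
        "channels": [
            {
                "channel": "webhook",
                "configs": {
                    "url": endpoint_url,
                    # Shared secret your endpoint checks on every delivery.
                    "headers": {"X-Mixpeek-Secret": secret},
                    "timeout": 10,
                },
            }
        ],
    }
    return urllib.request.Request(
        f"{api_url}/v1/organizations/webhooks",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Pass the returned object to urllib.request.urlopen to send it. The response format is not described on this page, so the sketch stops at building the request.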

Payload Structure

{
  "event_id": "evt_123",
  "event_type": "collection.documents.written",
  "occurred_at": "2025-10-28T10:03:22Z",
  "namespace_id": "ns_prod",
  "subject": {
    "type": "collection",
    "id": "col_products"
  },
  "metadata": {
    "document_count": 100,
    "collection_id": "col_products",
    "index_signature": "sig_xyz789"
  },
  "cache_scope": {
    "scope": "collection",
    "collection_id": "col_products"
  }
}
Use event_id for deduplication and store payloads for auditing.
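
A minimal sketch of deduplication by event_id follows. The EventLog class and its in-memory dictionary are assumptions for illustration; a production handler would more likely rely on a database table with a unique constraint on event_id so that duplicates are rejected across processes and restarts.

```python
import json


class EventLog:
    """Hypothetical in-memory audit store keyed by event_id."""

    def __init__(self) -> None:
        self._events: dict[str, dict] = {}

    def record(self, raw_body: str) -> bool:
        """Store the payload and return True, or return False for a duplicate."""
        event = json.loads(raw_body)
        event_id = event["event_id"]
        if event_id in self._events:
            # Same event delivered again (for example after a retry): skip it.
            return False
        self._events[event_id] = event
        return True

    def get(self, event_id: str) -> dict:
        """Return the stored payload for auditing."""
        return self._events[event_id]
```

Call record first and run side effects only when it returns True, so a retried delivery is acknowledged without being processed twice.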

Security & Reliability

  • Require HTTPS endpoints; reject plaintext URLs.
  • Include a shared secret header (X-Mixpeek-Secret) and verify before processing.
  • Respond quickly (<10s). Offload heavy work to background jobs and return 200.
  • Use idempotent handlers; Mixpeek may retry on failure or timeout.
  • Monitor webhook delivery with your logging pipeline; correlate by event_id.
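
The checklist above can be sketched as a framework-agnostic handler: verify the shared secret, parse the body, queue the work, and return a status code right away. The function name handle_delivery, the queue-based hand-off, and the 401 and 400 status choices are assumptions; adapt them to your web framework and job system.

```python
import hmac
import json
import queue

SECRET_HEADER = "x-mixpeek-secret"


def handle_delivery(headers: dict, body: bytes, expected_secret: str,
                    jobs: queue.Queue) -> int:
    """Return the HTTP status code to send back for one webhook delivery."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    supplied = normalized.get(SECRET_HEADER, "")
    # Constant-time comparison avoids leaking the secret through timing.
    if not hmac.compare_digest(supplied.encode(), expected_secret.encode()):
        return 401
    try:
        event = json.loads(body)
    except ValueError:
        # Covers malformed JSON and undecodable bytes.
        return 400
    # Hand the event to a background worker and acknowledge immediately.
    jobs.put(event)
    return 200
```

Because the handler only validates and enqueues, it stays well inside the delivery timeout, and the background worker can apply event_id deduplication before doing the heavy work.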

Operational Tips

  1. Subscribe only to the events you need to reduce noise.
  2. Combine webhook notifications with the Tasks API for full status context.
  3. Use cache scopes to invalidate retriever caches efficiently (collection, namespace, or key).
  4. Store webhook definitions in infrastructure-as-code so environments stay consistent.
  5. Alert on sustained non-2xx responses—Mixpeek will keep retrying, but you should fix endpoint issues quickly.
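
Tip 3 can be illustrated with a small sketch of scope-based invalidation. The cache layout (a dictionary keyed by namespace, collection, and key tuples) and the function name invalidate_for_event are hypothetical. The payload example on this page shows only the collection scope, so the namespace_id and key field names used for the other two scopes are assumptions.

```python
def invalidate_for_event(cache: dict, event: dict) -> int:
    """Drop cache entries matched by the event's cache_scope; return how many.

    Cache keys are assumed to be (namespace_id, collection_id, key) tuples.
    """
    scope = event.get("cache_scope", {})
    kind = scope.get("scope")

    def is_stale(cache_key: tuple) -> bool:
        namespace_id, collection_id, key = cache_key
        if kind == "collection":
            return collection_id == scope.get("collection_id")
        if kind == "namespace":
            # Assumed field name; falls back to the event's top-level namespace_id.
            return namespace_id == scope.get("namespace_id", event.get("namespace_id"))
        if kind == "key":
            # Assumed field name for key-level scopes.
            return key == scope.get("key")
        return False

    stale = [cache_key for cache_key in cache if is_stale(cache_key)]
    for cache_key in stale:
        del cache[cache_key]
    return len(stale)
```

Unknown or missing scopes remove nothing, which keeps the handler safe if new scope types appear later.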
