Mixpeek consists of an API service and an Engine service (Ray). It depends on MongoDB, Qdrant, Redis, and S3‑compatible storage. This guide outlines common topologies, configuration, health, and scaling.

Overview

  • API: FastAPI app with auth, tenancy, routes, and orchestration
  • Engine: Ray Serve app for extractors, clustering, and taxonomy execution
  • Datastores: MongoDB (metadata), Qdrant (vectors), Redis (rate limiting and tasks), S3 (artifacts)
  • Observability: Health checks, logs, metrics, and task status

Architecture

Configuration

  • ENV, DEBUG, LOG_LEVEL
  • QDRANT_URL, QDRANT_API_KEY
  • MONGO_URI, MONGODB_API_DB
  • REDIS_URL
  • AWS_BUCKET, AWS_REGION
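As a rough illustration of how a service might read the variables listed above at startup, here is a minimal sketch. The variable names come from the list; everything else is an assumption made for the example: the `Settings` class, the `load_settings` helper, the default values, and the choice of which variables are treated as required. It is not Mixpeek's actual configuration loader.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Assumption: these variables are treated as required in this sketch.
# The real services may apply different rules or defaults.
REQUIRED = (
    "QDRANT_URL",
    "MONGO_URI",
    "MONGODB_API_DB",
    "REDIS_URL",
    "AWS_BUCKET",
    "AWS_REGION",
)


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the documented environment variables."""

    env: str
    debug: bool
    log_level: str
    qdrant_url: str
    qdrant_api_key: Optional[str]
    mongo_uri: str
    mongodb_api_db: str
    redis_url: str
    aws_bucket: str
    aws_region: str


def load_settings(environ: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from a mapping and fail fast if required ones are missing."""
    missing = [name for name in REQUIRED if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing required variables: " + ", ".join(missing))
    return Settings(
        env=environ.get("ENV", "development"),  # assumed default
        debug=environ.get("DEBUG", "false").strip().lower() in ("1", "true", "yes"),
        log_level=environ.get("LOG_LEVEL", "INFO"),  # assumed default
        qdrant_url=environ["QDRANT_URL"],
        qdrant_api_key=environ.get("QDRANT_API_KEY"),  # optional in this sketch
        mongo_uri=environ["MONGO_URI"],
        mongodb_api_db=environ["MONGODB_API_DB"],
        redis_url=environ["REDIS_URL"],
        aws_bucket=environ["AWS_BUCKET"],
        aws_region=environ["AWS_REGION"],
    )
```

Failing at startup on a missing variable tends to surface misconfiguration earlier than a failed datastore connection would, which is why the sketch validates everything in one place.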

Health and readiness

  • Route: /v1/health — checks Redis, MongoDB, Qdrant, Celery, Engine
  • Reference: Health API
curl -s https://api.mixpeek.com/v1/health | jq
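For automated readiness gating, the same route can be polled from a script. The sketch below is one possible approach and rests on an assumption the source does not confirm: that /v1/health answers with HTTP 200 only when its dependencies are healthy. The helper names `http_health_check` and `wait_until_healthy`, and the retry defaults, are hypothetical.

```python
import time
import urllib.request
from typing import Callable


def http_health_check(url: str, timeout: float = 5.0) -> bool:
    """Return True if the URL answers with HTTP 200, False on any network error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError, HTTPError, and socket timeouts all derive from OSError.
        return False


def wait_until_healthy(
    check: Callable[[], bool],
    attempts: int = 10,
    delay: float = 3.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call `check` up to `attempts` times, pausing `delay` seconds between tries."""
    for attempt in range(attempts):
        if check():
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

A deploy script could call `wait_until_healthy(lambda: http_health_check("https://api.mixpeek.com/v1/health"))` and stop the rollout if it returns `False`.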

Deploy steps (typical)

1. Provision dependencies: managed MongoDB, Qdrant, Redis, and an S3 bucket with IAM
2. Deploy Engine: start Ray Serve with configured workers and resource limits
3. Deploy API: start the FastAPI app; point it to the Engine and datastores via env vars
4. Verify: call /v1/health, then run a smoke test (create namespace → collection → retriever → execute)

Scaling and upgrades

  • Scale Engine: Add Ray workers for extractor and clustering throughput
  • Scale Qdrant: Increase memory/IO; shard by namespace if needed
  • API Autoscale: Horizontally scale API; rate limit via Redis to protect backends
  • Zero‑downtime: Blue/green or rolling restarts; confirm /v1/health before cutover
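To make the Redis-backed rate limiting mentioned above more concrete, here is a minimal fixed-window counter sketch. It is not Mixpeek's implementation: the algorithm choice, the key format, and the names `allow_request` and `InMemoryStore` are all assumptions for illustration. The store only needs `incr` and `expire` methods, which a redis-py client provides; the in-memory stand-in lets the example run without a Redis server.

```python
import time
from typing import Any, Dict, Optional


class InMemoryStore:
    """Stand-in for a Redis client so the sketch runs without a server."""

    def __init__(self) -> None:
        self.counts: Dict[str, int] = {}
        self.ttls: Dict[str, int] = {}

    def incr(self, name: str) -> int:
        # Mirrors Redis INCR: create the key at 0 if absent, then add 1.
        self.counts[name] = self.counts.get(name, 0) + 1
        return self.counts[name]

    def expire(self, name: str, seconds: int) -> bool:
        # Records the TTL only; this stand-in never evicts keys.
        self.ttls[name] = seconds
        return True


def allow_request(
    store: Any,
    client_id: str,
    limit: int,
    window_seconds: int,
    now: Optional[float] = None,
) -> bool:
    """Fixed-window limiter: allow at most `limit` calls per client per window."""
    current = time.time() if now is None else now
    window_index = int(current // window_seconds)
    key = f"ratelimit:{client_id}:{window_index}"  # hypothetical key format
    count = store.incr(key)
    if count == 1:
        # First hit in this window: let the key clean itself up afterwards.
        store.expire(key, window_seconds)
    return count <= limit
```

Because the counter lives in a shared store rather than in process memory, every horizontally scaled API replica sees the same count. With a real Redis, the separate `incr` and `expire` calls are not atomic; a pipeline or Lua script would close that gap.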

Operations references