Mixpeek consists of an API service and an Engine service (Ray). It depends on MongoDB, Qdrant, Redis, and S3‑compatible storage. This guide outlines common topologies, configuration, health, and scaling.
Overview
API
FastAPI app with auth, tenancy, routes, and orchestration
Engine
Ray Serve app for extractors, clustering, and taxonomy execution
Datastores
MongoDB (metadata), Qdrant (vectors), Redis (rate limit/tasks), S3 (artifacts)
Observability
Health checks, logs, metrics, and task status
Architecture
Configuration
- ENV, DEBUG, LOG_LEVEL
- QDRANT_URL, QDRANT_API_KEY
- MONGO_URI, MONGODB_API_DB
- REDIS_URL
- AWS_BUCKET, AWS_REGION
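A minimal sketch of reading these variables at startup and failing fast when one is missing. The `Settings` class, the `load_settings` helper, the default values, and the choice of which variables count as required are assumptions for illustration; this guide does not say how Mixpeek itself parses them.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    env: str
    debug: bool
    log_level: str
    qdrant_url: str
    qdrant_api_key: Optional[str]
    mongo_uri: str
    mongodb_api_db: str
    redis_url: str
    aws_bucket: str
    aws_region: str


# Assumption: these are treated as required here; the guide does not say
# which variables are optional.
REQUIRED = (
    "QDRANT_URL", "MONGO_URI", "MONGODB_API_DB",
    "REDIS_URL", "AWS_BUCKET", "AWS_REGION",
)


def load_settings(environ: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from a mapping of environment variables."""
    missing = [name for name in REQUIRED if not environ.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return Settings(
        env=environ.get("ENV", "development"),  # assumed default
        debug=environ.get("DEBUG", "false").strip().lower() in ("1", "true", "yes"),
        log_level=environ.get("LOG_LEVEL", "INFO").upper(),  # assumed default
        qdrant_url=environ["QDRANT_URL"],
        qdrant_api_key=environ.get("QDRANT_API_KEY") or None,
        mongo_uri=environ["MONGO_URI"],
        mongodb_api_db=environ["MONGODB_API_DB"],
        redis_url=environ["REDIS_URL"],
        aws_bucket=environ["AWS_BUCKET"],
        aws_region=environ["AWS_REGION"],
    )
```

Calling `load_settings()` with no arguments reads the process environment; passing a plain dict makes the same validation usable in tests.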
Health and readiness
- Route: /v1/health — checks Redis, MongoDB, Qdrant, Celery, Engine
- Reference: Health API
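As a sketch of how a deploy script might wait on this route, the helper below polls /v1/health until it returns HTTP 200. Treating 200 as "healthy" is an assumption, as are the function names, the retry count, and the delay; the response body is not inspected because its shape is not described here (see the Health API reference for that).

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code of a GET request, or 0 if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (for example 503).
        return exc.code
    except OSError:
        # Connection refused, DNS failure, timeout (URLError is an OSError).
        return 0


def wait_until_healthy(
    base_url: str,
    attempts: int = 10,
    delay: float = 2.0,
    get_status: Callable[[str], int] = http_status,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll <base_url>/v1/health until it returns 200 or attempts run out."""
    url = base_url.rstrip("/") + "/v1/health"
    for attempt in range(attempts):
        if get_status(url) == 200:
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

`get_status` and `sleep` are injectable so the loop can be exercised without a running API.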
Deploy steps (typical)
1. Provision dependencies: managed MongoDB, Qdrant, Redis, and an S3 bucket with IAM
2. Deploy Engine: start Ray Serve with configured workers and resource limits
3. Deploy API: start the FastAPI app; point it to the Engine and datastores via env vars
4. Verify: call /v1/health, then run a smoke test: create namespace → collection → retriever → execute
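The smoke sequence in the Verify step can be sketched as an ordered runner that stops at the first failure. Everything named here (`run_smoke`, the step functions, the context keys) is hypothetical, and the step bodies are placeholders: the actual API routes and payloads are not given in this guide, so each placeholder would need to be replaced with a real call.

```python
from typing import Any, Callable, Dict, List, Tuple

# Each step receives a shared context dict so later steps can reuse earlier IDs.
Step = Tuple[str, Callable[[Dict[str, Any]], None]]


def run_smoke(steps: List[Step]) -> Tuple[bool, List[str]]:
    """Run steps in order and stop at the first failure.

    Returns (all_passed, names_of_completed_steps).
    """
    ctx: Dict[str, Any] = {}
    completed: List[str] = []
    for name, action in steps:
        try:
            action(ctx)
        except Exception as exc:
            print(f"smoke FAILED at {name}: {exc!r}")
            return False, completed
        completed.append(name)
    return True, completed


# Hypothetical placeholders: replace each body with the real API call.
def create_namespace(ctx: Dict[str, Any]) -> None:
    ctx["namespace_id"] = "ns_placeholder"


def create_collection(ctx: Dict[str, Any]) -> None:
    ctx["collection_id"] = "col_in_" + ctx["namespace_id"]


def create_retriever(ctx: Dict[str, Any]) -> None:
    ctx["retriever_id"] = "ret_on_" + ctx["collection_id"]


def execute_retriever(ctx: Dict[str, Any]) -> None:
    ctx["executed"] = ctx["retriever_id"]


SMOKE_STEPS: List[Step] = [
    ("create namespace", create_namespace),
    ("create collection", create_collection),
    ("create retriever", create_retriever),
    ("execute retriever", execute_retriever),
]
```

Because each step reads the ID written by the previous one, a failure early in the chain stops the run and the returned list shows how far it got.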
Scaling and upgrades
Scale Engine
Add Ray workers for extractor and clustering throughput
Scale Qdrant
Increase memory/IO; shard by namespace if needed
API Autoscale
Horizontally scale API; rate limit via Redis to protect backends
Zero‑downtime
Blue/green or rolling restarts; confirm /v1/health before cutover
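To illustrate the Redis-backed rate limiting mentioned under API Autoscale, here is a minimal fixed-window counter. It is a sketch of one common technique, not a description of Mixpeek's limiter: the class name, key format, and limits are assumptions. The client only needs `incr` and `expire`, which redis-py's `Redis` client provides; a small in-memory stand-in is included so the example runs without Redis.

```python
import time
from typing import Callable, Dict


class FixedWindowLimiter:
    """Allow at most `limit` requests per key in each fixed time window."""

    def __init__(self, client, limit: int, window_seconds: int,
                 clock: Callable[[], float] = time.time):
        self.client = client
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock

    def allow(self, key: str) -> bool:
        # All API replicas share the counter because it lives in Redis.
        window = int(self.clock() // self.window_seconds)
        redis_key = f"ratelimit:{key}:{window}"  # assumed key format
        count = self.client.incr(redis_key)
        if count == 1:
            # First hit in this window: let the counter expire on its own.
            self.client.expire(redis_key, self.window_seconds)
        return count <= self.limit


class InMemoryCounter:
    """Stand-in exposing the two commands used above (INCR, EXPIRE)."""

    def __init__(self) -> None:
        self.values: Dict[str, int] = {}

    def incr(self, name: str) -> int:
        self.values[name] = self.values.get(name, 0) + 1
        return self.values[name]

    def expire(self, name: str, seconds: int) -> bool:
        return True  # TTL is not simulated in this stand-in
```

With a real deployment the stand-in would be swapped for a client built from REDIS_URL. Note that INCR followed by EXPIRE is two commands; if the process dies between them the key keeps no TTL, so a production limiter would typically wrap both in a transaction or a Lua script.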
Operations references
- Observability: Logs, Metrics, Tasks
- Security: Auth, tenancy, rate limiting
- Tasks: Monitor jobs