This guide spins up the end-to-end Mixpeek workflow: create an isolated namespace, register raw objects, materialize features through the Engine, and query results with a stage-based retriever. Every request matches the current OpenAPI specification.

Prerequisites

  • A Mixpeek account and API key (obtain one at mixpeek.com/start)
  • curl (or an HTTP client of your choice)
  • Basic familiarity with JSON payloads
Export the API base URL and your key so the examples below can reference them:
export MP_API_URL="https://api.mixpeek.com"
export MP_API_KEY="sk_live_replace_me"
Every example below sends an Authorization header; all requests after namespace creation also send X-Namespace:
-H "Authorization: Bearer $MP_API_KEY"
-H "X-Namespace: ns_quickstart"   # replace with your namespace id once created
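Since these headers repeat in every request, a small wrapper can cut down on typing. The following is a minimal sketch, not part of any Mixpeek tooling: the function name mp_api is hypothetical, and it assumes MP_API_URL and MP_API_KEY are exported as above. It adds X-Namespace only when MP_NAMESPACE is set, and Content-Type only when a JSON body is supplied.

```shell
# Hypothetical helper (not a Mixpeek CLI): wraps curl with the headers used in this guide.
# Usage: mp_api METHOD PATH [JSON_BODY]
mp_api() {
  method="$1"
  path="$2"
  body="${3:-}"

  # Start with the arguments every request needs.
  set -- -sS -X "$method" "$MP_API_URL$path" -H "Authorization: Bearer $MP_API_KEY"

  # Add X-Namespace only once a namespace id has been exported.
  if [ -n "${MP_NAMESPACE:-}" ]; then
    set -- "$@" -H "X-Namespace: $MP_NAMESPACE"
  fi

  # Add the JSON content type and body only when a body was passed.
  if [ -n "$body" ]; then
    set -- "$@" -H "Content-Type: application/json" -d "$body"
  fi

  curl "$@"
}
```

With this in place, a call such as mp_api GET "/v1/tasks/<task_id>" would send the same request as the longer curl form used in the rest of the guide.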

1. Create (or Choose) a Namespace

Namespaces guarantee tenant isolation across MongoDB, Qdrant, Redis, and task execution. If you already have one, skip to step 2.
curl -sS -X POST "$MP_API_URL/v1/namespaces" \
  -H "Authorization: Bearer $MP_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "namespace_name": "quickstart",
    "description": "Docs quickstart namespace",
    "feature_extractors": [
      { "feature_extractor_name": "text_extractor", "version": "v1" }
    ]
  }'
Copy the returned namespace_id and export it:
export MP_NAMESPACE="ns_quickstart"
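If you would rather not copy the id by hand, it can be pulled out of the response body in the shell. The sketch below is one rough way to do that with sed; the helper name json_str_field is hypothetical, the sample response body is made up for illustration, and the match is line-based rather than a real JSON parse, so a dedicated JSON tool is likely the safer choice for anything beyond a quickstart.

```shell
# Hypothetical helper (not part of the Mixpeek API or CLI): print the first
# string value of a named field from JSON on stdin. This is a crude line-based
# match, not a JSON parser; it assumes the field holds a plain string.
json_str_field() {
  sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Made-up response body for illustration; real responses will carry more fields.
sample='{"namespace_id": "ns_example", "namespace_name": "quickstart"}'

MP_NAMESPACE="$(printf '%s\n' "$sample" | json_str_field namespace_id)"
export MP_NAMESPACE
echo "$MP_NAMESPACE"
```

The same helper could be reused for the bucket_id and object_id returned in later steps, under the same assumption that each is a top-level string field.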

2. Create a Bucket

Buckets validate object shape and track blobs in S3-compatible storage.
curl -sS -X POST "$MP_API_URL/v1/buckets" \
  -H "Authorization: Bearer $MP_API_KEY" \
  -H "X-Namespace: $MP_NAMESPACE" \
  -H "Content-Type: application/json" \
  -d '{
    "bucket_name": "quickstart-bucket",
    "description": "Sample product descriptions",
    "schema": {
      "properties": {
        "product_text": { "type": "text", "required": true }
      }
    }
  }'
Set an environment variable for the bucket_id returned above.

3. Define a Collection with a Feature Extractor

Collections map bucket objects into documents by running feature extractors on the Engine. In v2 the feature_extractor field is singular.
curl -sS -X POST "$MP_API_URL/v1/collections" \
  -H "Authorization: Bearer $MP_API_KEY" \
  -H "X-Namespace: $MP_NAMESPACE" \
  -H "Content-Type: application/json" \
  -d '{
    "collection_name": "quickstart-docs",
    "description": "Embeddings for product text",
    "source": {
      "type": "bucket",
      "bucket_id": "<bucket_id>"
    },
    "feature_extractor": {
      "feature_extractor_name": "text_extractor",
      "version": "v1",
      "input_mappings": {
        "text": "product_text"
      },
      "field_passthrough": [
        { "source_path": "metadata.category" }
      ],
      "parameters": {
        "model": "multilingual-e5-large-instruct"
      }
    }
  }'
Collections immediately expose their deterministic output_schema, so you can build integrations before any documents are processed. For all available feature extractors, see Feature Extractors.

4. Register an Object

Registering an object records its blobs and metadata in the bucket. Processing happens later, when a batch is submitted.
curl -sS -X POST "$MP_API_URL/v1/buckets/<bucket_id>/objects" \
  -H "Authorization: Bearer $MP_API_KEY" \
  -H "X-Namespace: $MP_NAMESPACE" \
  -H "Content-Type: application/json" \
  -d '{
    "key_prefix": "/catalog",
    "metadata": { "category": "headphones" },
    "blobs": [
      {
        "property": "product_text",
        "type": "text",
        "data": "Lightweight wireless headphones with active noise cancellation."
      }
    ]
  }'
Store the returned object_id.

5. Create and Submit a Batch

A batch flattens objects into per-extractor artifacts and dispatches them to the Engine. Create one from the object IDs you registered:
curl -sS -X POST "$MP_API_URL/v1/buckets/<bucket_id>/batches" \
  -H "Authorization: Bearer $MP_API_KEY" \
  -H "X-Namespace: $MP_NAMESPACE" \
  -H "Content-Type: application/json" \
  -d '{
    "object_ids": ["<object_id>"]
  }'
Submit the batch for processing (note the returned task_id):
curl -sS -X POST "$MP_API_URL/v1/buckets/<bucket_id>/batches/<batch_id>/submit" \
  -H "Authorization: Bearer $MP_API_KEY" \
  -H "X-Namespace: $MP_NAMESPACE" \
  -H "Content-Type: application/json" \
  -d '{ "include_processing_history": true }'
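The two calls above can be chained so the batch_id never has to be copied manually. This is a sketch under assumptions: the function name submit_object_batch is hypothetical, and it assumes the create response contains a string field named batch_id and the submit response a string field named task_id. The field extraction is a rough sed match, not a full JSON parse.

```shell
# Hypothetical helper: create a batch for one object, submit it, and print the task_id.
# Assumes "batch_id" and "task_id" appear as plain string fields in the responses.
# Usage: submit_object_batch BUCKET_ID OBJECT_ID
submit_object_batch() {
  bucket_id="$1"
  object_id="$2"
  base="$MP_API_URL/v1/buckets/$bucket_id/batches"

  # Step 1: create the batch and pull batch_id out of the response.
  batch_id="$(curl -sS -X POST "$base" \
    -H "Authorization: Bearer $MP_API_KEY" \
    -H "X-Namespace: $MP_NAMESPACE" \
    -H "Content-Type: application/json" \
    -d "{ \"object_ids\": [\"$object_id\"] }" |
    sed -n 's/.*"batch_id"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1)"

  if [ -z "$batch_id" ]; then
    echo "no batch_id found in the create response" >&2
    return 1
  fi

  # Step 2: submit the batch and print the task_id for polling.
  curl -sS -X POST "$base/$batch_id/submit" \
    -H "Authorization: Bearer $MP_API_KEY" \
    -H "X-Namespace: $MP_NAMESPACE" \
    -H "Content-Type: application/json" \
    -d '{ "include_processing_history": true }' |
    sed -n 's/.*"task_id"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}
```

A typical use would be task_id="$(submit_object_batch "<bucket_id>" "<object_id>")", after which the task can be polled as described in the next step.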

6. Track Task Progress

Task metadata lives in Redis with MongoDB persistence. Poll until status is COMPLETED; if the task record ages out after 24h, fall back to the batch resource.
curl -sS -X GET "$MP_API_URL/v1/tasks/<task_id>" \
  -H "Authorization: Bearer $MP_API_KEY" \
  -H "X-Namespace: $MP_NAMESPACE"
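Polling by hand gets tedious, so here is a minimal sketch of a loop that repeats the request above until the task finishes. The function name wait_for_task is hypothetical. Only the COMPLETED status comes from this guide; the FAILED and CANCELED values are assumptions about possible terminal states and may need adjusting to what the API returns. The status field is read with a rough sed match rather than a JSON parser.

```shell
# Hypothetical helper: poll a task until it reaches a terminal status.
# Usage: wait_for_task TASK_ID [MAX_ATTEMPTS] [DELAY_SECONDS]
# Returns 0 on COMPLETED, 1 on an assumed failure status, 2 when attempts run out.
wait_for_task() {
  task_id="$1"
  max_attempts="${2:-60}"
  delay="${3:-5}"
  attempt=1

  while [ "$attempt" -le "$max_attempts" ]; do
    status="$(curl -sS -X GET "$MP_API_URL/v1/tasks/$task_id" \
      -H "Authorization: Bearer $MP_API_KEY" \
      -H "X-Namespace: $MP_NAMESPACE" |
      sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1)"

    case "$status" in
      COMPLETED)
        echo "task $task_id completed"
        return 0
        ;;
      FAILED|CANCELED)  # assumed terminal statuses, not confirmed by this guide
        echo "task $task_id ended with status $status" >&2
        return 1
        ;;
    esac

    sleep "$delay"
    attempt=$((attempt + 1))
  done

  echo "task $task_id did not complete after $max_attempts attempts" >&2
  return 2
}
```

For example, wait_for_task "<task_id>" 120 5 would check every five seconds for up to ten minutes. A fixed delay keeps the sketch short; a backoff schedule may be kinder to the API for long-running batches.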

7. Inspect Documents

curl -sS -X POST "$MP_API_URL/v1/collections/<collection_id>/documents/list" \
  -H "Authorization: Bearer $MP_API_KEY" \
  -H "X-Namespace: $MP_NAMESPACE" \
  -H "Content-Type: application/json" \
  -d '{
    "limit": 10,
    "filters": {
      "field": "metadata.category",
      "operator": "eq",
      "value": "headphones"
    },
    "return_url": false
  }'
Every document includes lineage back to the root object (root_object_id) and feature URIs you can query later.

8. Create a Retriever

Retrievers combine stage-based pipelines with cache-aware execution. Stages fall into categories like filter, sort, and apply (for enrichment and transformation).

Basic Retriever

A simple semantic search retriever:
curl -sS -X POST "$MP_API_URL/v1/retrievers" \
  -H "Authorization: Bearer $MP_API_KEY" \
  -H "X-Namespace: $MP_NAMESPACE" \
  -H "Content-Type: application/json" \
  -d '{
    "retriever_name": "quickstart-search",
    "description": "Semantic search over product descriptions",
    "input_schema": {
      "properties": {
        "query_text": { "type": "text", "required": true }
      }
    },
    "collection_ids": ["<collection_id>"],
    "stages": [
      {
        "stage_type": "filter",
        "stage_id": "feature_search",
        "parameters": {
          "feature_uri": "urn:embedding:text:multilingual_e5_large:1",
          "input": { "text": "{{INPUT.query_text}}" },
          "limit": 20
        }
      }
    ],
    "cache_config": {
      "enabled": true,
      "ttl_seconds": 300
    }
  }'

Advanced Retriever with Apply Stages

Retrievers support powerful apply stages for enrichment and transformation:
curl -sS -X POST "$MP_API_URL/v1/retrievers" \
  -H "Authorization: Bearer $MP_API_KEY" \
  -H "X-Namespace: $MP_NAMESPACE" \
  -H "Content-Type: application/json" \
  -d '{
    "retriever_name": "enriched-product-search",
    "description": "Search with external enrichment and transformation",
    "input_schema": {
      "properties": {
        "query_text": { "type": "text", "required": true },
        "customer_id": { "type": "string", "required": false }
      }
    },
    "collection_ids": ["<collection_id>"],
    "stages": [
      {
        "stage_type": "filter",
        "stage_id": "feature_search",
        "parameters": {
          "feature_uri": "urn:embedding:text:multilingual_e5_large:1",
          "input": { "text": "{{INPUT.query_text}}" },
          "limit": 20
        }
      },
      {
        "stage_type": "apply",
        "stage_id": "document_enrich",
        "parameters": {
          "target_collection_id": "<catalog_collection_id>",
          "source_field": "metadata.product_id",
          "target_field": "product_id",
          "fields_to_merge": ["price", "inventory_count", "description"],
          "output_field": "catalog_data"
        }
      },
      {
        "stage_type": "apply",
        "stage_id": "api_call",
        "parameters": {
          "url": "https://api.stripe.com/v1/customers/{{INPUT.customer_id}}",
          "method": "GET",
          "allowed_domains": ["api.stripe.com"],
          "auth": {
            "type": "bearer",
            "secret_ref": "stripe_api_key"
          },
          "output_field": "metadata.customer_data",
          "on_error": "skip"
        }
      },
      {
        "stage_type": "apply",
        "stage_id": "json_transform",
        "parameters": {
          "template": "{\"id\": \"{{DOC.document_id}}\", \"title\": \"{{DOC.metadata.title}}\", \"price\": {{DOC.catalog_data.price}}, \"score\": {{DOC.score}}}",
          "fail_on_error": false
        }
      }
    ]
  }'
For all available retriever stages, see Retriever Pipelines. Execute the retriever with values for the inputs declared in its input_schema:
curl -sS -X POST "$MP_API_URL/v1/retrievers/<retriever_id>/execute" \
  -H "Authorization: Bearer $MP_API_KEY" \
  -H "X-Namespace: $MP_NAMESPACE" \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": {
      "query_text": "wireless headphones with noise cancelling",
      "customer_id": "cus_abc123"
    },
    "limit": 5,
    "return_urls": false
  }'
Responses include execution telemetry (stage_statistics, budget, execution_id) so you can troubleshoot latency or cache behavior.
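When the query text comes from user input, it needs escaping before it is placed inside the JSON body. The sketch below shows one way to wrap the execute call; the names json_escape and search_retriever are hypothetical, and the escaping covers only backslashes and double quotes, not control characters such as newlines, so treat it as a starting point rather than a complete encoder.

```shell
# Hypothetical helper: escape backslashes and double quotes for use inside a JSON string.
# Control characters (newlines, tabs) are NOT handled by this sketch.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Hypothetical helper: execute a retriever with a single query_text input.
# Usage: search_retriever RETRIEVER_ID QUERY_TEXT [LIMIT]
search_retriever() {
  retriever_id="$1"
  query="$(json_escape "$2")"
  limit="${3:-5}"

  curl -sS -X POST "$MP_API_URL/v1/retrievers/$retriever_id/execute" \
    -H "Authorization: Bearer $MP_API_KEY" \
    -H "X-Namespace: $MP_NAMESPACE" \
    -H "Content-Type: application/json" \
    -d "{ \"inputs\": { \"query_text\": \"$query\" }, \"limit\": $limit, \"return_urls\": false }"
}
```

For instance, search_retriever "<retriever_id>" "wireless headphones with noise cancelling" 5 would send the same kind of request as the example above, minus the optional customer_id input.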

9. (Optional) Enrich with a Taxonomy

Taxonomies reuse retrievers under the hood to enrich documents via JOIN stages.
curl -sS -X POST "$MP_API_URL/v1/taxonomies" \
  -H "Authorization: Bearer $MP_API_KEY" \
  -H "X-Namespace: $MP_NAMESPACE" \
  -H "Content-Type: application/json" \
  -d '{
    "taxonomy_name": "product-categories",
    "taxonomy_type": "flat",
    "retriever_id": "<retriever_id>",
    "input_mappings": {
      "query_embedding": "mixpeek://text_extractor@v1/text_embedding"
    },
    "source_collection": {
      "collection_id": "<collection_id>",
      "enrichment_fields": [
        { "field_path": "metadata.category", "merge_mode": "replace" }
      ]
    }
  }'
Attach the taxonomy to your collection’s taxonomy_applications for materialized enrichment, or add a taxonomy stage to the retriever for on-demand enrichment.

Need help? Click “Talk to Engineers” in the top bar and we’ll assist with deployment, scaling, or integration design.