Execute Retriever (Auto-Optimized)

curl --request POST \
  --url https://api.mixpeek.com/v1/retrievers/{retriever_id}/execute \
  --header 'Authorization: Bearer YOUR_API_KEY' \
  --header 'X-Namespace: my-namespace' \
  --header 'Content-Type: application/json' \
  --data '
{
  "inputs": {},
  "pagination": {
    "method": "offset",
    "page_size": 10,
    "page_number": 1
  },
  "stream": false
}
'

{
  "status": 123,
  "error": {
    "message": "<string>",
    "type": "<string>",
    "details": {}
  },
  "success": false
}

Execute a retriever and return matching documents. The pipeline is automatically optimized before execution for best performance.

Automatic Optimization: Your pipeline stages are automatically transformed for optimal performance:

Filters pushed down to reduce expensive operations
Redundant stages merged or eliminated
Grouping operations pushed to database layer (10-100x faster)
Operations reordered for efficiency
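To make the first two rules concrete, here is a toy sketch of filter push-down and merging. It is not Mixpeek's optimizer: the stage dictionaries, the `type` and `conditions` keys, and the `push_down_filters` function are all hypothetical, and the sketch assumes every filter depends only on stored document fields, so running it earlier does not change the final result set.

```python
from typing import Any

Stage = dict[str, Any]


def push_down_filters(stages: list[Stage]) -> list[Stage]:
    """Merge all filter stages into one and run it before the other stages."""
    conditions = [c for s in stages if s["type"] == "filter" for c in s["conditions"]]
    others = [s for s in stages if s["type"] != "filter"]
    if not conditions:
        return others
    return [{"type": "filter", "conditions": conditions}] + others


# Hypothetical five-stage pipeline with two separate filters.
pipeline = [
    {"type": "search"},
    {"type": "filter", "conditions": ["status == 'active'"]},
    {"type": "rerank"},
    {"type": "filter", "conditions": ["category == 'blog'"]},
    {"type": "limit"},
]
optimized = push_down_filters(pipeline)
print([s["type"] for s in optimized])
```

In this toy case the two filters collapse into one stage that runs first, so the later, more expensive stages see fewer documents.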

Streaming Support: Set stream=true in the request body to receive real-time stage updates via SSE:

Response uses text/event-stream content type
Each stage emits stage_start and stage_complete events
Final event contains complete results and pagination
Useful for progress tracking and debugging

Response Includes (when stream=false):

documents: Final matching documents
pagination: Pagination metadata
stage_statistics: Per-stage execution metrics
budget: Credit/time consumption
optimization_applied: Whether optimizations were applied
optimization_summary: Details about transformations (when applied)
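A client might read these fields as in the sketch below. The field names come from the list above; the `summarize_execution` helper, the sample payload, and its `document_id` key are hypothetical and only illustrate that `optimization_summary` should be treated as optional.

```python
from typing import Any


def summarize_execution(response: dict[str, Any]) -> str:
    """Build a one-line summary from a non-streaming execution response."""
    documents = response.get("documents", [])
    parts = [f"{len(documents)} documents"]
    if response.get("optimization_applied"):
        # optimization_summary is only present when optimizations were applied.
        summary = response.get("optimization_summary", {})
        rules = ", ".join(summary.get("rules_applied", []))
        parts.append(f"optimized with: {rules}")
    else:
        parts.append("no optimization applied")
    return "; ".join(parts)


# Hypothetical response, trimmed to the fields used above.
sample = {
    "documents": [{"document_id": "doc_1"}, {"document_id": "doc_2"}],
    "optimization_applied": True,
    "optimization_summary": {"rules_applied": ["push_down_filters", "group_by_push_down"]},
}
print(summarize_execution(sample))
```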

Optimization Summary Example:

{
  "optimization_applied": true,
  "optimization_summary": {
    "original_stage_count": 5,
    "optimized_stage_count": 3,
    "optimization_time_ms": 8.2,
    "rules_applied": ["push_down_filters", "group_by_push_down"],
    "stage_reduction_pct": 40.0
  }
}
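The `stage_reduction_pct` value in this example matches the share of stages removed: going from 5 stages to 3 removes 2 of 5, or 40%. The helper below reproduces that arithmetic; that the API computes the field exactly this way is an assumption, and `stage_reduction_pct` as a function name is only illustrative.

```python
def stage_reduction_pct(original_stage_count: int, optimized_stage_count: int) -> float:
    """Percentage of stages removed by optimization, rounded to one decimal."""
    if original_stage_count <= 0:
        raise ValueError("original_stage_count must be positive")
    removed = original_stage_count - optimized_stage_count
    return round(100 * removed / original_stage_count, 1)


print(stage_reduction_pct(5, 3))  # 40.0
```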

Use the /explain endpoint to see the optimized execution plan before running.

POST /v1/retrievers/{retriever_id}/execute

Headers

Authorization

string

REQUIRED: Bearer token authentication using your API key. Format: 'Bearer sk_xxxxxxxxxxxxx'. You can create API keys in the Mixpeek dashboard under Organization Settings.

Examples:

"Bearer YOUR_API_KEY"

"Bearer sk_xxxxxxxxxxxxx"

X-Namespace

string

REQUIRED: Namespace identifier for scoping this request. All resources (collections, buckets, taxonomies, etc.) are scoped to a namespace. You can provide either the namespace name or namespace ID. Format: ns_xxxxxxxxxxxxx (ID) or a custom name like 'my-namespace'

Examples:

"ns_abc123def456"

"production"

"my-namespace"

Path Parameters

retriever_id

string

required

Retriever ID or name. Pipeline will be automatically optimized before execution.

Query Parameters

return_presigned_urls

boolean

default:false

return_vectors

boolean

default:false
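The headers, path parameter, and query parameters above can be assembled as in the following sketch. The helper names are hypothetical, and two details are assumptions rather than documented behavior: that boolean query flags are sent as lowercase `true`, and that a retriever name containing special characters should be percent-encoded in the path.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://api.mixpeek.com/v1"


def build_headers(api_key: str, namespace: str) -> dict[str, str]:
    """Headers the reference marks as required, plus the JSON content type."""
    return {
        "Authorization": f"Bearer {api_key}",
        "X-Namespace": namespace,
        "Content-Type": "application/json",
    }


def execute_url(
    retriever_id: str,
    return_presigned_urls: bool = False,
    return_vectors: bool = False,
) -> str:
    """Build the execute URL for a retriever ID or name."""
    # Retriever names may contain characters that need percent-encoding.
    path = f"{BASE_URL}/retrievers/{quote(retriever_id, safe='')}/execute"
    # Only send flags that differ from their default of false.
    flags = {"return_presigned_urls": return_presigned_urls, "return_vectors": return_vectors}
    params = {name: "true" for name, enabled in flags.items() if enabled}
    return f"{path}?{urlencode(params)}" if params else path


print(execute_url("ret_123"))
print(execute_url("my retriever", return_vectors=True))
print(build_headers("YOUR_API_KEY", "production")["Authorization"])
```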

Body

application/json

Execution request with inputs, filters, pagination, and optional stream parameter. Set stream=true to receive real-time stage updates via Server-Sent Events.

inputs

Inputs · object

Runtime inputs for the retriever mapped to the input schema. Keys must match the retriever's input_schema field names. Values depend on field types (text, vector, filters, etc.). REQUIRED unless all retriever inputs have defaults.

Common input keys:

'query': Text search query
'embedding': Pre-computed vector for search
'top_k': Number of results to return
'min_score': Minimum relevance threshold
Any custom fields defined in input_schema

Template Syntax (Jinja2):

Namespaces (uppercase or lowercase):

INPUT / input: Query inputs (e.g., {{INPUT.query}})
DOC / doc: Document fields (e.g., {{DOC.payload.title}})
CONTEXT / context: Execution context
STAGE / stage: Stage configuration
SECRET / secret: Vault secrets (e.g., {{SECRET.api_key}})

Accessing Data:

Dot notation: {{DOC.payload.metadata.title}}
Bracket notation: {{DOC.payload['special-key']}}
Array index: {{DOC.items[0]}}, {{DOC.tags[2]}}
Array first/last: {{DOC.items | first}}, {{DOC.items | last}}

Array Operations:

Iterate: {% for item in DOC.tags %}{{item}}{% endfor %}
Extract key: {{DOC.items | map(attribute='name') | list}}
Join: {{DOC.tags | join(', ')}}
Length: {{DOC.items | length}}
Slice: {{DOC.items[:5]}}

Conditionals:

If: {% if DOC.status == 'active' %}...{% endif %}
If-else: {% if DOC.score > 0.8 %}high{% else %}low{% endif %}
Ternary: {{'yes' if DOC.enabled else 'no'}}
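The expressions above can be tried locally with the open-source `jinja2` package, as in this sketch. It uses a plain `Environment` and passes `INPUT` and `DOC` as ordinary dictionaries, which is only a stand-in for the context the retriever supplies at runtime; the sample document is hypothetical, and Mixpeek-specific filters and functions are not available in this setup.

```python
from jinja2 import Environment

env = Environment()

# Hypothetical context standing in for the retriever's runtime namespaces.
context = {
    "INPUT": {"query": "red shoes"},
    "DOC": {
        "payload": {"title": "Trail Runner", "special-key": "x1"},
        "tags": ["sale", "outdoor", "new"],
        "score": 0.91,
    },
}


def render(template: str) -> str:
    """Render a template string against the sample context."""
    return env.from_string(template).render(**context)


title = render("{{DOC.payload.title}}")                 # dot notation
special = render("{{DOC.payload['special-key']}}")      # bracket notation
first_tag = render("{{DOC.tags | first}}")              # array first
joined = render("{{DOC.tags | join(', ')}}")            # join
count = render("{{DOC.tags | length}}")                 # length
sliced = render("{{DOC.tags[:2] | join('|')}}")         # slice
looped = render("{% for t in DOC.tags %}[{{t}}]{% endfor %}")
label = render("{% if DOC.score > 0.8 %}high{% else %}low{% endif %}")
ternary = render("{{'yes' if DOC.score > 0.8 else 'no'}}")

print(title, special, first_tag, joined, count, sliced, looped, label, ternary)
```

One caveat of this local setup: with plain Python dictionaries, dot notation on a key named `items` resolves to the dictionary's own `items` method, so the sketch uses `tags` instead. Whether the hosted `DOC` namespace behaves the same way is not something this page states.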

Built-in Functions: max, min, abs, round, ceil, floor

Custom Filters: slugify (URL-safe), bool (truthy coercion), tojson (JSON encode)

S3 URLs: Internal S3 URLs (s3://bucket/key) are automatically presigned when accessed via DOC namespace.

Examples:

{
  "query": "artificial intelligence",
  "top_k": 25
}

{
  "min_score": 0.7,
  "query": "customer feedback",
  "top_k": 50
}

{
  "category": "blog",
  "embedding": [0.1, 0.2, 0.3],
  "top_k": 10
}
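Because input keys must match the retriever's `input_schema` and inputs without defaults are required, a client-side check before sending can catch mistakes early. The sketch below assumes a simplified schema shape (a mapping of field name to a dictionary with an optional `default` key); the real `input_schema` format is not shown on this page, so treat both the shape and the `check_inputs` helper as hypothetical.

```python
from typing import Any


def check_inputs(
    input_schema: dict[str, dict[str, Any]], inputs: dict[str, Any]
) -> dict[str, list[str]]:
    """Report inputs that are missing (no default) or not declared in the schema."""
    missing = sorted(
        name
        for name, field in input_schema.items()
        if name not in inputs and "default" not in field
    )
    unknown = sorted(name for name in inputs if name not in input_schema)
    return {"missing": missing, "unknown": unknown}


# Hypothetical schema: `query` has no default, `top_k` and `min_score` do.
schema = {
    "query": {"type": "text"},
    "top_k": {"type": "number", "default": 10},
    "min_score": {"type": "number", "default": 0.0},
}
print(check_inputs(schema, {"query": "customer feedback", "top_k": 50}))
print(check_inputs(schema, {"top_k": 50, "limit": 5}))
```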

pagination

OffsetPaginationParams · object

Offset-based pagination using a page number and a page size.

Best for: Traditional page UIs with page number navigation

How it works:

Uses page numbers (1, 2, 3...) and page size
Calculates offset as: (page_number - 1) * page_size
Simple and familiar for users
Can jump to any page directly
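The offset formula is easy to check by hand. The helper below (a hypothetical name, not part of the API) applies it for a page size of 25.

```python
def offset_for(page_number: int, page_size: int) -> int:
    """Number of rows skipped before the first row of the requested page."""
    if page_number < 1 or page_size < 1:
        raise ValueError("page_number and page_size must be at least 1")
    return (page_number - 1) * page_size


for page in (1, 2, 3):
    print(page, offset_for(page, 25))  # offsets 0, 25, 50
```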

Tradeoffs:

Can have "page drift" if data changes between requests
Example: items added or deleted between requests cause duplicates or gaps across pages
Less efficient for large offsets (database must skip N rows)
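Page drift can be reproduced with an in-memory list standing in for the result set. This is a simulation only: `fetch_page` is a hypothetical helper that slices a Python list and does not call the API.

```python
def fetch_page(rows: list[str], page_number: int, page_size: int) -> list[str]:
    """Return one page of rows using offset pagination."""
    offset = (page_number - 1) * page_size
    return rows[offset:offset + page_size]


# Insertion between requests: an item from page 1 shows up again on page 2.
rows = ["d", "c", "b", "a"]  # newest first
page_1 = fetch_page(rows, 1, 2)
rows.insert(0, "e")  # a new item arrives before page 2 is requested
page_2 = fetch_page(rows, 2, 2)
duplicates = set(page_1) & set(page_2)
print(page_1, page_2, duplicates)

# Deletion between requests: an item is skipped entirely.
rows = ["d", "c", "b", "a"]
seen = fetch_page(rows, 1, 2)
rows.remove("d")  # an item from page 1 is deleted
seen += fetch_page(rows, 2, 2)
skipped = [r for r in ["c", "b", "a"] if r not in seen]
print(seen, skipped)
```

In the first case "c" is returned twice; in the second, "b" is never returned even though it still exists.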

Use when:

Building traditional page-numbered UIs
Users need to jump to specific pages
Result set is relatively stable
Working with smaller datasets

Example:
Page 1: {"method": "offset", "page_size": 25, "page_number": 1}
Page 2: {"method": "offset", "page_size": 25, "page_number": 2}
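When walking through several pages, only `page_number` changes between requests. The generator below builds the request bodies without sending them; `page_requests` is a hypothetical helper, and the body layout follows the request example at the top of this page.

```python
import json
from typing import Any, Iterator


def page_requests(
    inputs: dict[str, Any], page_size: int, pages: int
) -> Iterator[dict[str, Any]]:
    """Yield one request body per page, starting at page 1."""
    for page_number in range(1, pages + 1):
        yield {
            "inputs": inputs,
            "pagination": {
                "method": "offset",
                "page_size": page_size,
                "page_number": page_number,
            },
            "stream": False,
        }


bodies = list(page_requests({"query": "artificial intelligence"}, page_size=25, pages=2))
for body in bodies:
    print(json.dumps(body["pagination"]))
```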

Available pagination schemas: OffsetPaginationParams, CursorPaginationParams, ScrollPaginationParams, KeysetPaginationParams

stream

boolean

default:false

Enable streaming execution to receive real-time stage updates via Server-Sent Events (SSE). Optional; defaults to false for standard execution.

When stream=true:

Response uses text/event-stream content type
Each stage completion emits a StreamStageEvent
Events include: stage_start, stage_complete, stage_error, execution_complete
Clients receive intermediate results and statistics as stages execute
Useful for progress tracking, debugging, and partial result display
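The event types listed above can be handled with a small dispatcher. The sketch below runs offline against a hand-written stream: the `event_type` and `stage_name` fields appear in this page's client examples, but the sample events are hypothetical and real events carry more data. It also assumes each event arrives as a single `data:` line, which the SSE format does not guarantee in general.

```python
import json
from typing import Any, Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[dict[str, Any]]:
    """Yield the JSON payload of each `data:` line in an SSE stream."""
    for line in lines:
        if line.startswith("data:"):
            yield json.loads(line[len("data:"):].strip())


def track_progress(events: Iterable[dict[str, Any]]) -> list[str]:
    """Turn stage events into human-readable progress messages."""
    messages = []
    for event in events:
        kind = event.get("event_type")
        stage = event.get("stage_name", "?")
        if kind == "stage_start":
            messages.append(f"{stage}: started")
        elif kind == "stage_complete":
            messages.append(f"{stage}: done")
        elif kind == "stage_error":
            messages.append(f"{stage}: failed")
        elif kind == "execution_complete":
            messages.append("execution complete")
    return messages


# Hypothetical stream; blank lines separate events as in SSE.
stream = [
    'data: {"event_type": "stage_start", "stage_name": "search"}',
    "",
    'data: {"event_type": "stage_complete", "stage_name": "search"}',
    "",
    'data: {"event_type": "execution_complete"}',
]
progress = track_progress(parse_sse(stream))
print(progress)
```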

When stream=false (default):

Response returns after all stages complete
Returns a single RetrieverExecutionResponse with final results
Lower overhead for simple queries

Use streaming when:

You want to show real-time progress to users
You need to display intermediate results
Pipeline has many stages or long-running operations
Debugging or monitoring pipeline performance

Example streaming client (JavaScript):

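// EventSource only issues GET requests without custom headers, so POST with fetch instead.
const response = await fetch('https://api.mixpeek.com/v1/retrievers/ret_123/execute', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json', 'Authorization': 'Bearer YOUR_API_KEY', 'X-Namespace': 'my-namespace' },
  body: JSON.stringify({ inputs: { query: 'artificial intelligence' }, stream: true }),
});
const reader = response.body.pipeThrough(new TextDecoderStream()).getReader();
let buffer = '';
for (let r = await reader.read(); !r.done; r = await reader.read()) {
  const lines = (buffer + r.value).split('\n');
  buffer = lines.pop(); // keep any partial line for the next chunk
  for (const line of lines.filter((l) => l.startsWith('data: '))) {
    const stageEvent = JSON.parse(line.slice(6));
    if (stageEvent.event_type === 'stage_complete') console.log(`Stage ${stageEvent.stage_name} completed, documents: ${stageEvent.documents.length}`);
  }
}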

Example streaming client (Python):

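import json
import requests

# Use the full API URL and send the required auth and namespace headers.
response = requests.post(
    'https://api.mixpeek.com/v1/retrievers/ret_123/execute',
    headers={'Authorization': 'Bearer YOUR_API_KEY', 'X-Namespace': 'my-namespace'},
    json={'inputs': {'query': 'artificial intelligence'}, 'stream': True},
    stream=True,  # read events as they arrive instead of buffering the body
)
for line in response.iter_lines():
    if line.startswith(b'data: '):  # SSE data lines carry the JSON payload
        event = json.loads(line[6:])
        print(f"Stage {event.get('stage_name')}: {event['event_type']}")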

Examples:

false

true

Response

Execution results with documents, pagination, statistics, and optimization details. When stream=true, returns Server-Sent Events. When stream=false, returns JSON response.
