POST /v1/retrievers/execute
Execute Adhoc Retriever
curl --request POST \
  --url https://api.mixpeek.com/v1/retrievers/execute \
  --header 'Authorization: Bearer <api-key>' \
  --header 'X-Namespace: <namespace>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "input_schema": {},
  "stages": [
    {
      "stage_name": "<string>",
      "config": {},
      "stage_type": "filter",
      "batch_size": "<string>",
      "description": "<string>"
    }
  ],
  "inputs": {},
  "collection_identifiers": [
    "<string>"
  ],
  "budget_limits": {
    "max_credits": 100,
    "max_time_ms": 60000
  },
  "stream": false
}
'
{
  "status": 123,
  "error": {
    "message": "<string>",
    "type": "<string>",
    "details": {}
  },
  "success": false
}

Headers

Authorization
string

REQUIRED: Bearer token authentication using your API key. Format: 'Bearer sk_xxxxxxxxxxxxx'. You can create API keys in the Mixpeek dashboard under Organization Settings.

X-Namespace
string

REQUIRED: Namespace identifier for scoping this request. All resources (collections, buckets, taxonomies, etc.) are scoped to a namespace. Provide either the namespace ID (format: ns_xxxxxxxxxxxxx) or a custom name such as 'my-namespace'.
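
As a rough sketch, both required headers can be assembled in one place. The helper name `build_headers` and the placeholder values are illustrative and not part of the Mixpeek API; only the header names and the 'Bearer ...' format come from the descriptions above.

```python
def build_headers(api_key: str, namespace: str) -> dict:
    """Return the two required headers plus the JSON content type.

    `build_headers` is a hypothetical helper, not a Mixpeek SDK function.
    """
    if not api_key:
        raise ValueError("api_key is required")
    if not namespace:
        raise ValueError("namespace is required")
    return {
        "Authorization": f"Bearer {api_key}",
        "X-Namespace": namespace,  # namespace ID (ns_...) or a custom name
        "Content-Type": "application/json",
    }


# Placeholder credentials, copied from the formats shown above.
headers = build_headers("sk_xxxxxxxxxxxxx", "my-namespace")
```

Failing fast on an empty key or namespace is a client-side convenience in this sketch; the server performs its own validation.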

Body

application/json

Request to execute a retriever ad-hoc without persistence.

This combines retriever creation parameters with execution inputs to allow one-time retrieval without saving the retriever configuration.

Use Cases:
- One-time queries without polluting the retriever registry
- Testing retriever configurations before persisting them
- Dynamic retrieval with varying stage configurations
- Temporary search operations

Behavior:
- Retriever is NOT saved to the database
- Execution history is logged but marked as ad-hoc
- Response includes the X-Execution-Mode: adhoc header
- execution_metadata.retriever_persisted = False
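
The two ad-hoc markers described above could be checked on the client roughly as follows. This is a minimal sketch: `is_adhoc_execution` is a hypothetical helper, and it assumes `execution_metadata` is a top-level key of the JSON response body, which this page does not state explicitly.

```python
def is_adhoc_execution(response_headers: dict, response_body: dict) -> bool:
    """Return True when both ad-hoc markers are present in a response.

    Assumption: `execution_metadata` sits at the top level of the body.
    """
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {key.lower(): value for key, value in response_headers.items()}
    mode = normalized.get("x-execution-mode")
    metadata = response_body.get("execution_metadata") or {}
    return mode == "adhoc" and metadata.get("retriever_persisted") is False


# Made-up sample values, for illustration only.
sample_headers = {"X-Execution-Mode": "adhoc"}
sample_body = {"execution_metadata": {"retriever_persisted": False}}
adhoc = is_adhoc_execution(sample_headers, sample_body)
```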

Streaming Execution (stream=True): When streaming is enabled, the response uses Server-Sent Events (SSE) format with Content-Type: text/event-stream. Each stage emits events as it executes:

Event Types:
- stage_start: Emitted when a stage begins execution
- stage_complete: Emitted when a stage finishes with results
- stage_error: Emitted if a stage encounters an error
- execution_complete: Emitted after all stages finish successfully
- execution_error: Emitted if the entire execution fails

Each event is a StreamStageEvent containing:
- event_type: The type of event
- execution_id: Unique execution identifier
- stage_name: Human-readable stage name
- stage_index: Zero-based stage position
- total_stages: Total number of stages
- documents: Intermediate results (for stage_complete)
- statistics: Stage metrics (duration_ms, input_count, output_count, etc.)
- budget_used: Cumulative resource consumption (credits, time, tokens)

Response Headers (streaming):
- Content-Type: text/event-stream
- Cache-Control: no-cache
- Connection: keep-alive
- X-Execution-Mode: adhoc

Example streaming request:
```python
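import json

import requests

response = requests.post(
    'https://api.mixpeek.com/v1/retrievers/execute',
    headers={
        'Authorization': 'Bearer sk_xxxxxxxxxxxxx',  # your API key
        'X-Namespace': 'my-namespace',
    },
    json={
        'collection_identifiers': ['my_collection'],
        'input_schema': {'query': {'type': 'text', 'required': True}},
        'stages': [...],  # elided here: supply your stage configurations
        'inputs': {'query': 'machine learning'},
        'stream': True,
    },
    stream=True,  # read the body incrementally instead of buffering it
)
# Each SSE event arrives as a line of the form: data: {json}
for line in response.iter_lines():
    if line.startswith(b'data: '):
        event = json.loads(line[6:])
        print(f"{event['event_type']}: {event.get('stage_name')}")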
```
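
The event types listed earlier could be handled with a small parser that is independent of the HTTP client. The sketch below is illustrative only: `iter_stream_events` and `summarize` are hypothetical names, the sample lines are invented, and stopping at `execution_complete` or `execution_error` assumes those events end the stream.

```python
import json
from typing import Iterable, Iterator

# Assumption: these two event types mark the end of the stream.
TERMINAL_EVENTS = {"execution_complete", "execution_error"}


def iter_stream_events(lines: Iterable[bytes]) -> Iterator[dict]:
    """Yield decoded events from raw SSE lines, stopping at a terminal event."""
    prefix = b"data: "
    for line in lines:
        if not line.startswith(prefix):
            continue  # skip blank separators and non-data lines
        event = json.loads(line[len(prefix):])
        yield event
        if event.get("event_type") in TERMINAL_EVENTS:
            break


def summarize(events: Iterable[dict]) -> list:
    """Build one human-readable line per completed or failed stage."""
    summary = []
    for event in events:
        kind = event.get("event_type")
        if kind == "stage_complete":
            count = len(event.get("documents") or [])
            summary.append(f"{event.get('stage_name')}: {count} documents")
        elif kind in ("stage_error", "execution_error"):
            summary.append(f"{kind}: {event.get('stage_name')}")
    return summary


# Invented sample lines, shaped like the `data: {json}` format.
sample_lines = [
    b'data: {"event_type": "stage_start", "stage_name": "search"}',
    b"",
    b'data: {"event_type": "stage_complete", "stage_name": "search", "documents": [{}, {}]}',
    b"",
    b'data: {"event_type": "execution_complete"}',
    b'data: {"event_type": "stage_start", "stage_name": "never reached"}',
]
events = list(iter_stream_events(sample_lines))
report = summarize(events)
```

In practice, `response.iter_lines()` from the request above could be passed in place of `sample_lines`.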

Standard Execution (stream=False, default): Returns a single ExecuteRetrieverResponse with final documents, pagination, and aggregate statistics after all stages complete.

Examples:

Simple ad-hoc search:
{
  "collection_identifiers": ["col_123"],
  "input_schema": {"query": {"type": "text", "required": true}},
  "stages": [
    {
      "stage_name": "search",
      "stage_type": "filter",
      "config": {
        "stage_id": "feature_search",
        "parameters": {
          "feature_uris": [
            {
              "uri": "urn:embedding:text:bge_base_en_v1_5:1",
              "input": {"text": "{{inputs.query}}"}
            }
          ],
          "limit": 10
        }
      }
    }
  ],
  "inputs": {"query": "machine learning"},
  "stream": false
}
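
For reference, the same simple search could be prepared from Python using only the standard library. This is a sketch under assumptions: `build_request` is a hypothetical helper, the API key and namespace are placeholders, and the request is constructed but deliberately not sent.

```python
import json
import urllib.request

API_URL = "https://api.mixpeek.com/v1/retrievers/execute"

# The example body from above, written as a Python dict.
payload = {
    "collection_identifiers": ["col_123"],
    "input_schema": {"query": {"type": "text", "required": True}},
    "stages": [
        {
            "stage_name": "search",
            "stage_type": "filter",
            "config": {
                "stage_id": "feature_search",
                "parameters": {
                    "feature_uris": [
                        {
                            "uri": "urn:embedding:text:bge_base_en_v1_5:1",
                            "input": {"text": "{{inputs.query}}"},
                        }
                    ],
                    "limit": 10,
                },
            },
        }
    ],
    "inputs": {"query": "machine learning"},
    "stream": False,
}


def build_request(body: dict, api_key: str, namespace: str) -> urllib.request.Request:
    """Prepare (but do not send) the POST request for an ad-hoc execution."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "X-Namespace": namespace,
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_request(payload, "sk_xxxxxxxxxxxxx", "my-namespace")
# Sending is left to the caller, for example:
# with urllib.request.urlopen(request) as resp:
#     result = json.load(resp)
```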

input_schema
Input Schema · object
required

REQUIRED. Input schema defining the expected inputs. Each key is an input name and each value is a BucketSchemaField.

stages
StageConfig · object[]
required

REQUIRED. Ordered list of stage configurations. At least one stage is required for execution.

Minimum array length: 1

inputs
Inputs · object
required

REQUIRED. Input values matching the input_schema. These values are passed to stages for parameterization.
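
A client could catch mismatches between `inputs` and `input_schema` before sending the request. The sketch below assumes each schema field carries a boolean `required` key, as in the example body on this page; `missing_required_inputs` is a hypothetical helper, and the server's actual validation rules may be stricter.

```python
def missing_required_inputs(input_schema: dict, inputs: dict) -> list:
    """Return the sorted names of required schema fields absent from inputs.

    Assumption: a field is required when its schema entry has a truthy
    `required` key.
    """
    return sorted(
        name
        for name, field in input_schema.items()
        if field.get("required") and name not in inputs
    )


schema = {
    "query": {"type": "text", "required": True},
    "language": {"type": "text", "required": False},  # invented optional field
}
missing = missing_required_inputs(schema, {"language": "en"})
complete = missing_required_inputs(schema, {"query": "machine learning"})
```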

collection_identifiers
string[]

Collection identifiers to query, given as names or IDs. Names are resolved automatically. May be empty for query-only inference mode (e.g., LLM query analysis without documents).

budget_limits
BudgetLimits · object

OPTIONAL. Budget limits for execution.

Example:
{ "max_credits": 100, "max_time_ms": 60000 }

stream
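
One way to keep the budget and the client-side HTTP timeout consistent is to derive the timeout from `max_time_ms`. This is only a sketch: both helper names and the 5-second margin are assumptions, and the positivity checks are a client-side sanity check rather than a documented rule. This page does not say whether either field may be omitted, so the sketch always sends both.

```python
def make_budget_limits(max_credits: float, max_time_ms: int) -> dict:
    """Build a budget_limits object using the two field names shown above."""
    if max_credits <= 0:
        raise ValueError("max_credits must be positive")
    if max_time_ms <= 0:
        raise ValueError("max_time_ms must be positive")
    return {"max_credits": max_credits, "max_time_ms": max_time_ms}


def client_timeout_seconds(budget_limits: dict, margin_s: float = 5.0) -> float:
    """Suggest an HTTP timeout slightly above the server-side time budget.

    `margin_s` is an arbitrary allowance for network overhead.
    """
    return budget_limits["max_time_ms"] / 1000 + margin_s


limits = make_budget_limits(100, 60000)
timeout = client_timeout_seconds(limits)
```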
boolean
default:false

Enable streaming execution to receive real-time stage updates via Server-Sent Events (SSE). OPTIONAL. Defaults to False (standard execution).

When stream=True:

  • Response Content-Type: text/event-stream
  • Events emitted: stage_start, stage_complete, stage_error, execution_complete, execution_error
  • Each event is formatted as: data: {json}\n\n
  • StreamStageEvent contains: event_type, execution_id, stage_name, stage_index, total_stages, documents (intermediate), statistics, budget_used

When to use streaming:

  • Progress tracking for multi-stage pipelines
  • Displaying intermediate results as stages complete
  • Real-time budget and performance monitoring
  • Debugging pipeline execution

When to skip streaming:

  • Single-stage or fast pipelines (<100ms)
  • No need for intermediate results
  • Minimizing overhead is critical

Response

Successful Response