Authorizations
Bearer token authentication using your API key. Format: 'Bearer your_api_key'. To get an API key, create an account at mixpeek.com/start and generate a key in your account settings.
Headers
REQUIRED: Bearer token authentication using your API key. Format: 'Bearer sk_xxxxxxxxxxxxx'. You can create API keys in the Mixpeek dashboard under Organization Settings.
"Bearer sk_live_abc123def456"
"Bearer sk_test_xyz789"
REQUIRED: Namespace identifier for scoping this request. All resources (collections, buckets, taxonomies, etc.) are scoped to a namespace. You can provide either the namespace name or namespace ID. Format: ns_xxxxxxxxxxxxx (ID) or a custom name like 'my-namespace'
"ns_abc123def456"
"production"
"my-namespace"
Path Parameters
The ID of the collection.
Body
Request model for creating a document.
ID of the collection the document belongs to.
"collection_123"
Optional denormalized root object identifier provided during creation.
Optional denormalized bucket identifier provided during creation.
Optional immediate parent type for the document.
bucket, collection Optional parent collection identifier when sourced from a collection.
Optional parent document identifier when sourced from a collection.
Optional parent object identifier when sourced directly from a bucket.
Optional materialized lineage path to set during creation.
Processing steps from root object to this document. Recommended for decomposition trees.
Features to associate with the document
Response
Successful Response
Response model for a single document.
This is the standard response format when fetching documents via API endpoints. Contains all document data plus optional presigned URLs for S3 blobs.
Key Fields: - document_id: Application ID for queries/references - lineage: Complete processing history and source tracking - User-defined fields: Any fields from source passed via field_passthrough - source_blobs: Which blobs from source object were processed - document_blobs: Artifacts generated by extractor (thumbnails, etc.) - enrichment fields: Flat taxonomy/cluster fields (e.g., taxonomy_*_label, cluster_id)
Query Parameters Affecting Response: - return_url=true: Adds presigned_url to each document_blobs entry - return_vectors=true: Includes embedding arrays in response
Use Cases: - Display document details in UI - Download source files or generated artifacts - Understand document provenance and processing - Access enrichment fields (flat) for filtering/display
REQUIRED. Unique identifier for the document. Format: 'doc_' prefix + alphanumeric characters. Use for: API queries, references, filtering.
"doc_f8966ff29c18e20c6b45e053"
"doc_abc123"
REQUIRED. ID of the collection this document belongs to. Format: 'col_' prefix + alphanumeric characters. Use for: Collection-scoped queries, filtering.
"col_articles"
"col_video_frames"
Denormalized root object identifier for the document.
"obj_video123"
Denormalized bucket identifier for the document's root object.
"bkt_marketing"
Immediate parent type that produced this document (bucket or collection).
bucket, collection "bucket"
"collection"
Collection identifier of the immediate parent when sourced from a collection.
"col_frames"
Document identifier of the immediate parent when sourced from a collection.
"doc_frame_050"
Bucket object identifier of the immediate parent when sourced from a bucket.
"obj_video_123"
Materialized lineage path for fast lookups (e.g., 'bkt_123/col_frames/col_scenes').
"bkt_marketing/col_frames/col_scenes"
Processing steps from root object to this document. Contains: collection_id, feature_extractor_id, document_id (if intermediate), timestamp. Use for: Visualizing pipeline, debugging, audit trail.
Lightweight references to source object's blobs (blob_id, blob_property, blob_type). For full details, fetch: GET /buckets/{bucket_id}/objects/{object_id}
System metadata (ingestion_status, feature_extractor_config_hash, etc.)
Vector embedding for the document (only included when return_vectors=true)
NOT REQUIRED - only populated when return_url=true query parameter is used. Single presigned URL for the primary source blob (for backward compatibility). For multiple blobs, use presigned_urls array or document_blobs[].presigned_url instead.
Artifacts generated during feature extraction (thumbnails, processed outputs). When return_url=true, each entry includes a presigned_url field.
Aggregated presigned URLs for all blobs. Only populated when return_url=true query parameter is provided.

