Create a document by ID.
REQUIRED: Bearer token authentication using your API key. Format: 'Bearer sk_xxxxxxxxxxxxx'. You can create API keys in the Mixpeek dashboard under Organization Settings.
"Bearer YOUR_API_KEY"
"Bearer YOUR_STRIPE_API_KEY"
REQUIRED: Namespace identifier for scoping this request. All resources (collections, buckets, taxonomies, etc.) are scoped to a namespace. You can provide either the namespace name or namespace ID. Format: ns_xxxxxxxxxxxxx (ID) or a custom name like 'my-namespace'
"ns_abc123def456"
"production"
"my-namespace"
The ID of the collection.
Request model for creating a document.
ID of the collection the document belongs to.
"collection_123"
Optional denormalized root object identifier provided during creation.
Optional denormalized bucket identifier provided during creation.
Optional immediate parent type for the document.
bucket, collection Optional parent collection identifier when sourced from a collection.
Optional parent document identifier when sourced from a collection.
Optional parent object identifier when sourced directly from a bucket.
Optional materialized lineage path to set during creation.
Processing steps from root object to this document. Recommended for decomposition trees.
Optional document schema version (v1 or v2). If not provided, uses system default.
Optional metadata dictionary for user-defined fields and custom attributes.
Features to associate with the document
Successful Response
Response model for a single document.
This is the standard response format when fetching documents via API endpoints. Contains all document data plus optional presigned URLs for S3 blobs.
The document payload structure follows native Qdrant format:
- System fields are stored in _internal (lineage, metadata, blobs, etc.)
- User fields are at root level (brand_name, thumbnail_url, etc.)
- Only document_id and collection_id are Mixpeek IDs at root level
- No duplication between root and _internal
Query Parameters Affecting Response: - return_url=true: Adds presigned_url to each document_blobs entry - return_vectors=true: Includes embedding arrays in response
Use Cases: - Display document details in UI - Download source files or generated artifacts - Understand document provenance and processing - Access enrichment fields (flat) for filtering/display
REQUIRED. Unique identifier for the document. Format: 'doc_' prefix + alphanumeric characters. Use for: API queries, references, filtering.
"doc_f8966ff29c18e20c6b45e053"
"doc_abc123"
REQUIRED. ID of the collection this document belongs to. Format: 'col_' prefix + alphanumeric characters. Use for: Collection-scoped queries, filtering.
"col_articles"
"col_video_frames"
System-managed internal fields. Contains all Mixpeek-managed metadata including lineage, processing info, timestamps, and blob references. User-defined fields appear at root level alongside document_id and collection_id.
{
"collection_id": "col_articles",
"created_at": "2025-10-31T10:00:00Z",
"document_id": "doc_f8966ff29c",
"internal_id": "org_abc123",
"lineage": {
"path": "bkt_content/col_articles",
"root_bucket_id": "bkt_content",
"root_object_id": "obj_article_001",
"source_object_id": "obj_article_001",
"source_type": "bucket"
},
"metadata": { "ingestion_status": "COMPLETED" },
"modality": "text",
"namespace_id": "ns_xyz789",
"updated_at": "2025-10-31T10:00:00Z"
}