Buckets

Buckets are the foundation of Mixpeek’s storage architecture. They serve as containers for raw objects and their associated files before processing. They are your entry point for all multimodal processing, search and analysis.

Overview

Buckets accept objects, which are composed of blobs (related files or JSON data). Collections then process those objects into documents.

Define Bucket Schema

First, create a bucket with a schema that defines what types of files your objects will contain. This schema validation ensures data consistency and proper processing.

Select Blobs

Gather the files (blobs) you want to include in your object. These can be images, videos, documents, audio files, or JSON data that are related to each other.

Upload Blobs as Object

Bundle your selected blobs into an object and upload it to your bucket. You can include object metadata and organize with prefixes for better structure.

Storage Containers

Securely store raw objects and their associated files in logically grouped containers

Object Organization

Organize objects based on use case, content type, or processing requirements

Key Concepts

Schema Validation

Object Prefixes

Creating a Bucket

Python

from mixpeek import Mixpeek

mp = Mixpeek(api_key="YOUR_API_KEY")

# Create a bucket
bucket = mp.buckets.create(
    namespace="ns_abc123",
    bucket_name="product-images",
    description="Product images for e-commerce catalog",
    schema={
      "type": "object",
      "properties": {
        "video_1": {
          "type": "video"
        },
        "pdf_1": {
          "type": "pdf"
        }
      }
    }
)

bucket_id = bucket["bucket_id"]
print(f"Created bucket: {bucket_id}")
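
Because a bucket schema cannot be modified after creation (see Limitations), it can help to sanity-check the schema dictionary locally before sending it. The following is a minimal sketch, not part of the Mixpeek SDK: `check_bucket_schema` is a hypothetical helper, and the set of allowed blob types is an assumption beyond the `video` and `pdf` types used in the example above.

```python
# Hypothetical local helper; not part of the Mixpeek SDK.
# ASSUMPTION: only "video" and "pdf" are confirmed by the example above;
# the remaining type names are illustrative guesses.
ASSUMED_BLOB_TYPES = {"video", "pdf", "image", "audio", "text", "json"}


def check_bucket_schema(schema):
    """Return a list of problems found in a bucket schema dict."""
    if not isinstance(schema, dict):
        return ["schema must be a dict"]
    problems = []
    if schema.get("type") != "object":
        problems.append('top-level "type" should be "object"')
    properties = schema.get("properties")
    if not isinstance(properties, dict) or not properties:
        problems.append('"properties" should be a non-empty dict')
        return problems
    for name, spec in properties.items():
        blob_type = spec.get("type") if isinstance(spec, dict) else None
        if blob_type not in ASSUMED_BLOB_TYPES:
            problems.append(f"property {name!r} has unexpected type {blob_type!r}")
    return problems


schema = {
    "type": "object",
    "properties": {
        "video_1": {"type": "video"},
        "pdf_1": {"type": "pdf"},
    },
}
print(check_bucket_schema(schema) or "schema looks well-formed")
```

An empty list means the schema has the expected shape; the server-side validation remains the source of truth.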

Objects

Once you’ve created a bucket, you can add objects to it. Objects are collections of related blobs that represent a single entity in your domain, and each object can contain multiple related files.

# Create an object with multiple files
mp.objects.create(
  bucket=bucket_id,
  prefix="/files",
  # metadata is passed down to each document
  metadata={
    "name": "red-sneaker-product"
  },
  blobs=[
      {
          "url": "https://example.com/images/red-sneaker-front.jpg",
          "mimetype": "image/jpeg"
      },
      {
          "url": "https://example.com/images/red-sneaker-side.jpg",
          "mimetype": "image/jpeg"
      },
      {
          "url": "https://example.com/data/red-sneaker-specs.txt",
          "mimetype": "text/plain"
      }
  ]
)

Best Practices

Logical Grouping

Group objects in buckets based on logical collections, such as product categories, content types, or processing requirements

Naming Conventions

Use consistent naming patterns for buckets to make them easily identifiable and manageable

Metadata Usage

Metadata from your objects is passed down to all associated documents in your destination collections.
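
To make the inheritance concrete, the sketch below imitates it locally with plain dictionaries. It is an illustration only: `inherit_metadata` is a hypothetical function, the document fields are made up, and letting a document's own fields win over inherited ones is an assumption, not documented Mixpeek behavior.

```python
def inherit_metadata(object_metadata, documents):
    """Copy object-level metadata onto each document dict.

    ASSUMPTION: fields already present on a document take precedence.
    """
    return [{**object_metadata, **doc} for doc in documents]


object_metadata = {"name": "red-sneaker-product"}
# Hypothetical documents derived from the object's blobs.
documents = [
    {"source": "red-sneaker-front.jpg"},
    {"source": "red-sneaker-side.jpg"},
]

for doc in inherit_metadata(object_metadata, documents):
    print(doc)
```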

Resource Optimization

Monitor bucket usage and distribute objects across multiple buckets if needed for performance optimization

Common Use Cases

E-commerce Products

Store product images, videos, specs, and descriptions for catalog processing

Media Assets

Organize images, videos, and audio files for media libraries

Documentation

Manage PDFs, technical documents, and related assets

Blobs

Blobs represent individual files within Objects. While Objects group related files together, Blobs are the actual raw file content or JSON data that feature extractors process.

Once your blobs are processed into objects, they keep the prefix structure you assigned on upload, so you can navigate them like a standard file system.
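
Since prefixes behave like paths, ordinary path tools work on them. The sketch below uses `pathlib.PurePosixPath` from the Python standard library on a hypothetical list of object records to select everything stored under a given prefix. The record shape and the `objects_under` helper are assumptions for illustration, not SDK output.

```python
from pathlib import PurePosixPath


def objects_under(records, prefix):
    """Return records whose prefix equals `prefix` or is nested below it."""
    root = PurePosixPath(prefix)
    selected = []
    for record in records:
        path = PurePosixPath(record["prefix"])
        if path == root or root in path.parents:
            selected.append(record)
    return selected


# Hypothetical records; a real listing may have a different shape.
records = [
    {"name": "red-sneaker-product", "prefix": "/files/shoes"},
    {"name": "blue-jacket-product", "prefix": "/files/outerwear"},
    {"name": "brand-guidelines", "prefix": "/docs"},
]

for record in objects_under(records, "/files"):
    print(record["name"])
```

Comparing path components rather than raw strings avoids false matches such as `/filesystem` being treated as a child of `/files`.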

Supported File Types

Format	MIME Type	Max Size	Notes
JPEG	image/jpeg	50MB	RGB color space
PNG	image/png	50MB	Transparency supported
WebP	image/webp	50MB	Modern format
GIF	image/gif	50MB	Animated GIFs supported
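
The table can be turned into a client-side check that runs before upload. This is a sketch under stated assumptions: `check_blob` is a hypothetical helper, and "50MB" is read as 50 MiB because the table does not specify the unit. It covers only the four image formats listed; other types used elsewhere on this page, such as `text/plain`, would need their own entries.

```python
MB = 1024 * 1024  # ASSUMPTION: "50MB" is read as 50 MiB

# Limits copied from the table above.
MAX_SIZE_BY_MIME = {
    "image/jpeg": 50 * MB,
    "image/png": 50 * MB,
    "image/webp": 50 * MB,
    "image/gif": 50 * MB,
}


def check_blob(mimetype, size_bytes):
    """Return a list of problems for one blob; empty means it passes."""
    limit = MAX_SIZE_BY_MIME.get(mimetype)
    if limit is None:
        return [f"unsupported MIME type: {mimetype}"]
    if size_bytes > limit:
        return [f"{mimetype} blob is {size_bytes} bytes, limit is {limit}"]
    return []


print(check_blob("image/jpeg", 2 * MB))   # within the limit
print(check_blob("image/png", 60 * MB))   # too large
print(check_blob("image/bmp", 1 * MB))    # not in the table
```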

Best Practices for Blob Management

File Organization

Group related blobs into objects for better organization and processing efficiency

Metadata Usage

Add descriptive metadata to blobs to improve searchability and organization

Size Optimization

Compress large files when possible to improve upload and processing speed

Format Selection

Use recommended formats for each content type to ensure optimal processing

Limitations

Bucket Limitations

Storage Quotas: Each namespace has limits on total bucket storage capacity based on your plan
Bucket Naming: Bucket names must be unique within a namespace and follow naming conventions
Rate Limits: API operations on buckets are subject to rate limiting based on your account tier
Schema Changes: Bucket schemas cannot be modified after creation; a new bucket must be created

Object Limitations

Size Restrictions: Objects have a maximum combined blob size of 10GB per object
Metadata Size: Object metadata is limited to 100KB in size
Immutability: Object structure cannot be modified after creation (blobs cannot be added or removed)
Prefix Depth: Object prefixes are limited to a maximum of 20 levels of nesting

Blob Limitations

Size Constraints: Maximum blob size varies by file type (see Supported File Types above)
Quantity Limits: Maximum number of blobs per object: 10
Format Restrictions: Supported MIME types are limited to those listed above
Content Immutability: Blob content is immutable once uploaded
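
Because objects and blobs cannot be changed after upload, it may be worth checking a prospective object against these limits first. The sketch below is a hypothetical pre-flight helper (`check_object` is not an SDK function). It assumes "100KB" means 100 KiB, that metadata size is measured as UTF-8 encoded JSON, and that prefix depth is the number of non-empty path segments. The 10GB combined size limit is not checked here, since blob sizes are not known from URLs alone.

```python
import json

# Values taken from the limits listed above.
MAX_BLOBS_PER_OBJECT = 10
MAX_METADATA_BYTES = 100 * 1024  # ASSUMPTION: "100KB" read as 100 KiB
MAX_PREFIX_DEPTH = 20


def check_object(prefix, metadata, blobs):
    """Return a list of limit violations for a prospective object."""
    problems = []
    if len(blobs) > MAX_BLOBS_PER_OBJECT:
        problems.append(f"{len(blobs)} blobs exceeds {MAX_BLOBS_PER_OBJECT}")
    # ASSUMPTION: metadata size is measured as UTF-8 encoded JSON.
    metadata_bytes = len(json.dumps(metadata).encode("utf-8"))
    if metadata_bytes > MAX_METADATA_BYTES:
        problems.append(
            f"metadata is {metadata_bytes} bytes, limit is {MAX_METADATA_BYTES}"
        )
    # ASSUMPTION: depth is the number of non-empty "/"-separated segments.
    depth = len([part for part in prefix.split("/") if part])
    if depth > MAX_PREFIX_DEPTH:
        problems.append(f"prefix depth {depth} exceeds {MAX_PREFIX_DEPTH}")
    return problems


blobs = [
    {"url": "https://example.com/images/red-sneaker-front.jpg", "mimetype": "image/jpeg"},
    {"url": "https://example.com/images/red-sneaker-side.jpg", "mimetype": "image/jpeg"},
]
print(check_object("/files", {"name": "red-sneaker-product"}, blobs))
```

An empty list means none of the checked limits is exceeded.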
