Buckets
Store and organize raw multimodal data objects for processing
Buckets are the foundation of Mixpeek’s storage architecture. They hold raw objects and their associated files before processing, and they are your entry point for all multimodal processing, search, and analysis.
Overview
Buckets accept objects, which are composed of blobs (related files or JSON values). Collections then process these objects into documents.
Define Bucket Schema
First, create a bucket with a schema that defines what types of files your objects will contain. This schema validation ensures data consistency and proper processing.
Select Blobs
Gather the files (blobs) you want to include in your object. These can be images, videos, documents, audio files, or JSON data that are related to each other.
Upload Blobs as Object
Bundle your selected blobs into an object and upload it to your bucket. You can attach object metadata and organize objects with prefixes for a clearer structure.
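The three steps above can be pictured with a small local model. This is only a sketch and not the Mixpeek SDK: the `BucketSchema`, `Blob`, and `BucketObject` classes, their field names, and the idea that a schema maps a property name to a blob type are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical local model. Class names, field names, and the schema shape
# are illustrative assumptions, not the Mixpeek API.


@dataclass
class Blob:
    name: str     # schema property this blob fills, e.g. "hero_image"
    type: str     # blob type, e.g. "image", "video", "json"
    data: object  # a file path, URL, or JSON-compatible value


@dataclass
class BucketObject:
    blobs: list
    metadata: dict = field(default_factory=dict)
    prefix: str = ""


@dataclass
class BucketSchema:
    properties: dict  # property name -> expected blob type

    def validate(self, obj: BucketObject) -> list:
        """Return a list of human-readable schema violations (empty if valid)."""
        errors = []
        for blob in obj.blobs:
            expected = self.properties.get(blob.name)
            if expected is None:
                errors.append(f"unknown property: {blob.name}")
            elif expected != blob.type:
                errors.append(f"{blob.name}: expected {expected}, got {blob.type}")
        return errors


# Step 1: define the schema. Step 2: select blobs. Step 3: bundle them.
schema = BucketSchema({"hero_image": "image", "specs": "json"})
product = BucketObject(
    blobs=[
        Blob("hero_image", "image", "shoe.jpg"),
        Blob("specs", "json", {"size": 42}),
    ],
    metadata={"sku": "A1"},
    prefix="/catalog/shoes",
)
problems = schema.validate(product)
```

Validating locally before upload is a design choice in this sketch; the service performs its own schema validation regardless.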
Storage Containers
Securely store raw objects and their associated files in logically grouped containers
Object Organization
Organize objects based on use case, content type, or processing requirements
Key Concepts
Creating a Bucket
Objects
Once you’ve created a bucket, you can add objects to it. Objects are collections of related blobs that represent a single entity in your domain.
Create an object in your bucket. Each object can contain multiple related files.
Best Practices
Logical Grouping
Group objects in buckets based on logical collections, such as product categories, content types, or processing requirements
Naming Conventions
Use consistent naming patterns for buckets to make them easily identifiable and manageable
Metadata Usage
Metadata from your objects is passed down to all associated documents in your destination collections.
Resource Optimization
Monitor bucket usage and distribute objects across multiple buckets if needed for performance optimization
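The metadata note above says that object metadata is passed down to every document derived from the object. A minimal sketch of that behavior follows. The `derive_documents` function is hypothetical, and the rule that a document's own fields win on a key collision is an assumption, since the page does not say how collisions are resolved.

```python
def derive_documents(object_metadata: dict, extracted: list) -> list:
    """Attach the parent object's metadata to each derived document."""
    documents = []
    for features in extracted:
        # Assumption: on a key collision the document's own field wins.
        documents.append({**object_metadata, **features})
    return documents


docs = derive_documents(
    {"sku": "A1", "category": "shoes"},
    [{"source": "hero_image"}, {"source": "specs", "category": "footwear"}],
)
```

Because every document inherits these fields, metadata set once at the object level can later be used to filter documents in the destination collection.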
Common Use Cases
E-commerce Products
Store product images, videos, specs, and descriptions for catalog processing
Media Assets
Organize images, videos, and audio files for media libraries
Documentation
Manage PDFs, technical documents, and related assets
Blobs
Blobs represent individual files within Objects. While Objects group related files together, Blobs are the actual raw file content or JSON values that get processed by feature extractors.
Once your blobs are processed into objects, they keep the prefix structure you assigned on upload, so you can navigate them like a standard file system.
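Because prefixes behave like directory paths, ordinary path logic applies to them. The sketch below uses Python's standard `pathlib` to list everything stored under a prefix. The `list_under` helper and the sample paths are hypothetical and only illustrate the file-system analogy.

```python
from pathlib import PurePosixPath


def list_under(paths: list, prefix: str) -> list:
    """Return the paths that sit anywhere below the given prefix."""
    root = PurePosixPath(prefix)
    # Comparing whole path segments avoids matching "/catalog/shoes-old"
    # when the prefix is "/catalog/shoes".
    return sorted(p for p in paths if root in PurePosixPath(p).parents)


stored = [
    "/catalog/shoes/a.jpg",
    "/catalog/shoes/red/b.png",
    "/catalog/shoes-old/c.jpg",
    "/media/d.gif",
]
shoes = list_under(stored, "/catalog/shoes/")
```

Matching on path segments rather than raw string prefixes is the point of the example: a plain `startswith` check would also pick up sibling prefixes that merely share leading characters.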
Supported File Types
Format | MIME Type | Max Size | Notes |
---|---|---|---|
JPEG | image/jpeg | 50MB | RGB color space |
PNG | image/png | 50MB | Transparency supported |
WebP | image/webp | 50MB | Modern format |
GIF | image/gif | 50MB | Animated GIFs supported |
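The table above can be turned into a pre-upload check. This is a sketch under stated assumptions: `check_blob` is a hypothetical helper, and "50MB" is interpreted as 50 × 1024 × 1024 bytes, which the page does not specify.

```python
# MIME types taken from the table above.
SUPPORTED_IMAGE_TYPES = {"image/jpeg", "image/png", "image/webp", "image/gif"}

# Assumption: "50MB" means 50 * 1024 * 1024 bytes.
MAX_IMAGE_BYTES = 50 * 1024 * 1024


def check_blob(mime_type: str, size_bytes: int) -> list:
    """Return a list of problems for one blob (empty if it passes)."""
    problems = []
    if mime_type not in SUPPORTED_IMAGE_TYPES:
        problems.append(f"unsupported MIME type: {mime_type}")
    elif size_bytes > MAX_IMAGE_BYTES:
        problems.append(f"{mime_type} exceeds {MAX_IMAGE_BYTES} bytes")
    return problems


ok = check_blob("image/png", 1024)
too_big = check_blob("image/gif", MAX_IMAGE_BYTES + 1)
wrong_type = check_blob("image/tiff", 1024)
```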
Best Practices for Blob Management
File Organization
Group related blobs into objects for better organization and processing efficiency
Metadata Usage
Add descriptive metadata to blobs to improve searchability and organization
Size Optimization
Compress large files when possible to improve upload and processing speed
Format Selection
Use recommended formats for each content type to ensure optimal processing
Limitations
Bucket Limitations
- Storage Quotas: Each namespace has limits on total bucket storage capacity based on your plan
- Bucket Naming: Bucket names must be unique within a namespace and follow naming conventions
- Rate Limits: API operations on buckets are subject to rate limiting based on your account tier
- Schema Changes: Bucket schemas cannot be modified after creation; a new bucket must be created
Object Limitations
- Size Restrictions: The combined size of all blobs in an object cannot exceed 10GB
- Metadata Size: Object metadata is limited to 100KB in size
- Immutability: Object structure cannot be modified after creation (blobs cannot be added or removed)
- Prefix Depth: Object prefixes are limited to a maximum of 20 levels of nesting
Blob Limitations
- Size Constraints: Maximum blob size varies by file type (see Supported File Types above)
- Quantity Limits: Each object can contain at most 10 blobs
- Format Restrictions: Supported MIME types are limited to those listed above
- Content Immutability: Blob content is immutable once uploaded
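Since object structure and blob content are immutable after upload, it can help to check the limits listed above before creating an object. The following sketch does that locally. The `check_object_limits` helper is hypothetical, and three details are assumptions: sizes use binary units, metadata size is measured as UTF-8 encoded JSON, and prefix depth is the number of non-empty path segments.

```python
import json

# Limits taken from the lists above. Binary units are an assumption.
MAX_BLOBS_PER_OBJECT = 10
MAX_OBJECT_BYTES = 10 * 1024 ** 3
MAX_METADATA_BYTES = 100 * 1024
MAX_PREFIX_DEPTH = 20


def check_object_limits(blob_sizes: list, metadata: dict, prefix: str) -> list:
    """Return a list of limit violations for one object (empty if none)."""
    problems = []
    if len(blob_sizes) > MAX_BLOBS_PER_OBJECT:
        problems.append(f"too many blobs: {len(blob_sizes)}")
    if sum(blob_sizes) > MAX_OBJECT_BYTES:
        problems.append(f"combined blob size too large: {sum(blob_sizes)} bytes")
    # Assumption: metadata size is measured as UTF-8 encoded JSON.
    metadata_bytes = len(json.dumps(metadata).encode("utf-8"))
    if metadata_bytes > MAX_METADATA_BYTES:
        problems.append(f"metadata too large: {metadata_bytes} bytes")
    # Assumption: depth counts non-empty segments of the prefix.
    depth = len([part for part in prefix.split("/") if part])
    if depth > MAX_PREFIX_DEPTH:
        problems.append(f"prefix too deep: {depth} levels")
    return problems


valid = check_object_limits([1024, 2048], {"sku": "A1"}, "/catalog/shoes")
invalid = check_object_limits([1] * 11, {}, "/" + "/".join(["x"] * 21))
```

Here `valid` is an empty list, while `invalid` reports both the blob count and the prefix depth.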