AWS S3
Setting Up an AWS S3 Connection in Mixpeek
When you create an S3 connection in Mixpeek and attach a pipeline, Mixpeek provisions the backend infrastructure needed to ingest and process your data. Here's how it works:
Automated Infrastructure Setup
Upon creating an S3 connection and attaching a pipeline, Mixpeek automatically sets up the following components:
- Amazon SNS (Simple Notification Service) Topic: This acts as a pub/sub system. Every new object added to your S3 bucket that matches the pre-filter logic defined in your pipeline configuration triggers a notification to this SNS topic.
- Amazon SQS (Simple Queue Service) Queue: Notifications from the SNS topic are forwarded to this SQS queue. This decouples receiving events from processing them, so bursts of uploads are buffered rather than pushed straight at the processing layer.
- Serverless Functions (Create and Delete): These functions are triggered by activity in your S3 bucket:
  - Create Function: Activated when a new object is added to the S3 bucket. It processes the object according to the pipeline logic you define and inserts the relevant data into the designated database.
  - Delete Function: Activated when an object is deleted from the S3 bucket. It removes the corresponding data so that your database and the S3 bucket remain synchronized.
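To make the flow through these components concrete, here is a minimal sketch of how a consumer might unwrap one queue message and route it to create or delete logic. It is not Mixpeek's implementation. It assumes the SNS subscription delivers the standard SNS envelope (raw message delivery disabled), so the S3 event sits as a JSON string in the envelope's `Message` field. The names `matches_prefilter`, `dispatch`, `sample_body`, and the in-memory `index` dict standing in for the database are hypothetical, and the prefix/suffix check is only one possible form of pre-filter.

```python
import json
from urllib.parse import unquote_plus


def matches_prefilter(key, prefix="", suffix=""):
    """Hypothetical pre-filter: keep keys with the given prefix and suffix."""
    return key.startswith(prefix) and key.endswith(suffix)


def unwrap_s3_records(sqs_body):
    """Yield (event_name, bucket, key) from an SNS-wrapped S3 event."""
    envelope = json.loads(sqs_body)              # SNS envelope as delivered to SQS
    s3_event = json.loads(envelope["Message"])   # S3 event is a JSON string inside it
    # S3 test events carry no "Records" list, hence the default.
    for record in s3_event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded (a space becomes "+").
        key = unquote_plus(record["s3"]["object"]["key"])
        yield record["eventName"], bucket, key


def dispatch(sqs_body, index, prefix="", suffix=""):
    """Route each record to create or delete logic.

    `index` is a plain dict standing in for the designated database.
    """
    for event_name, bucket, key in unwrap_s3_records(sqs_body):
        if not matches_prefilter(key, prefix, suffix):
            continue
        if event_name.startswith("ObjectCreated:"):
            # Placeholder for the pipeline's real processing output.
            index[(bucket, key)] = {"status": "processed"}
        elif event_name.startswith("ObjectRemoved:"):
            index.pop((bucket, key), None)


def sample_body(event_name, bucket, key):
    """Build a fake SQS message body for local experimentation."""
    s3_event = {
        "Records": [
            {
                "eventName": event_name,
                "s3": {"bucket": {"name": bucket}, "object": {"key": key}},
            }
        ]
    }
    return json.dumps({"Type": "Notification", "Message": json.dumps(s3_event)})
```

Calling `dispatch(sample_body("ObjectCreated:Put", "demo-bucket", "videos/clip.mp4"), index, prefix="videos/", suffix=".mp4")` adds an entry to `index`, and a later `ObjectRemoved:Delete` event for the same key removes it, which mirrors how the Create and Delete functions keep the database in step with the bucket.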
Workflow Diagram
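The Mermaid diagram for this workflow shows the following path:
S3 bucket (object created or deleted) -> SNS topic -> SQS queue -> Create or Delete function -> database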
Benefits
This setup allows Mixpeek to handle a large volume of object uploads from S3 efficiently. Because upload events are buffered in the SQS queue and processed asynchronously, a burst of new objects does not overload the inference servers or the indexing pipeline.
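The buffering effect can be illustrated with a small in-memory sketch. This is a stand-in for the idea, not Mixpeek's code: a `deque` plays the role of the SQS queue, and the hypothetical `drain_in_batches` helper consumes it in bounded batches. The default batch size of 10 matches the maximum number of messages a single SQS `ReceiveMessage` call can return.

```python
from collections import deque


def drain_in_batches(pending, process, max_batch=10):
    """Consume queued messages in bounded batches; return the batch sizes used."""
    batch_sizes = []
    while pending:
        # Take at most `max_batch` messages, however large the backlog is.
        batch = [pending.popleft() for _ in range(min(max_batch, len(pending)))]
        for message in batch:
            process(message)
        batch_sizes.append(len(batch))
    return batch_sizes


uploads = deque(f"object-{i}" for i in range(25))  # a burst of 25 upload events
processed = []
sizes = drain_in_batches(uploads, processed.append)
print(sizes)  # [10, 10, 5]
```

However large the burst, the processing side never sees more than `max_batch` messages at a time; the backlog simply waits in the queue.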