Quickstart
A Simple Example
Say you upload an image, dress.png, to the S3 bucket for your apparel ecommerce store, and you want to extract a description and keywords, and create an embedding of the dress itself.
You'd create a Mixpeek pipeline that combines ML models and is invoked whenever a new object lands in your bucket.
Your first pipeline
Here's an example pipeline that extracts a description and keywords, and creates an embedding.
```python
from mixpeek import Mixpeek

def function(event, context):
    mixpeek = Mixpeek("API_KEY")

    # extract a description of the uploaded image
    description = mixpeek.extract.text(model_id="model_id_1")

    # extract keywords
    keywords = mixpeek.extract.text(model_id="model_id_2")

    # create an embedding of the image
    embedding = mixpeek.embed.image(model_id="model_id_3")

    return {
        "file_url": event.object_url,
        "description": description,
        "keywords": keywords,
        "embedding": embedding
    }
```
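Before wiring the pipeline up, you can sanity-check the handler's control flow locally by invoking it with stand-in objects. Everything in this sketch is hypothetical: `FakeMixpeek` and the `SimpleNamespace` event are stubs invented for illustration, not part of the Mixpeek SDK.

```python
from types import SimpleNamespace

# Hypothetical stubs that mimic the shape of the Mixpeek client,
# used only to exercise the handler locally without an API key.
class FakeExtract:
    def text(self, model_id):
        return f"output-from-{model_id}"

class FakeEmbed:
    def image(self, model_id):
        return [0.0, 1.0, 2.0]

class FakeMixpeek:
    def __init__(self, api_key):
        self.extract = FakeExtract()
        self.embed = FakeEmbed()

def function(event, context, client=None):
    # In production this would be the real Mixpeek client.
    mixpeek = client or FakeMixpeek("API_KEY")
    description = mixpeek.extract.text(model_id="model_id_1")
    keywords = mixpeek.extract.text(model_id="model_id_2")
    embedding = mixpeek.embed.image(model_id="model_id_3")
    return {
        "file_url": event.object_url,
        "description": description,
        "keywords": keywords,
        "embedding": embedding,
    }

# Simulate an S3 trigger event with a fake object URL.
event = SimpleNamespace(object_url="s3://apparel/dress.png")
result = function(event, context=None)
```

The `client` parameter lets you swap the stub for the real SDK client later without changing the handler body.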
Once we've created this pipeline, we connect our S3 bucket, attach our MongoDB collection, and enable it.
Your first connection
Every new account comes with a preconfigured S3 and MongoDB connection so you can get started right away. Alternatively, you can bring your own.
Create connection documentation
Set up pipelines
Two options for pipeline creation
GitHub for CI/CD
Integrate a private GitHub repository directly. We'll update the pipeline every time you commit.
API Post Directly
Create and configure your pipeline by sending the pipeline as a string.
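If you take the API route, the request body carries your pipeline source as a string. The following is a minimal sketch of building that payload; the field names (`pipeline_name`, `code`) and the endpoint mentioned in the comment are assumptions for illustration, so check the API reference for the actual schema.

```python
import json

# The pipeline-as-code, held as a plain string.
pipeline_code = '''
def function(event, context):
    mixpeek = Mixpeek("API_KEY")
    ...
'''

# Hypothetical payload shape -- consult the API reference for real field names.
payload = json.dumps({
    "pipeline_name": "apparel_enrichment",
    "code": pipeline_code,
})

# You would then POST `payload` to Mixpeek's pipeline-creation endpoint,
# e.g. with requests.post(..., data=payload, headers={"Authorization": ...}).
```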
During setup, we created an AWS IAM role that opens a listener on your S3 bucket, apparel. Every new object is sent through the pipeline-as-code defined above, and the result is written to your MongoDB collection, apparel_items (which was also instantiated during setup).
That's really it! Mixpeek is designed to be "set and forget": you never have to think about processing your S3 bucket again.
Sample output
Here's an example output from the S3 object dress.png that gets sent into your MongoDB collection:
```json
{
  "object_url": "s3://dress.png",
  "description": "Elegant summer dress",
  "keywords": ["summer", "dress", "casual"],
  "embedding": [0, 1, 2, 3],
  "metadata": {
    "pipeline_version": "v1"
  }
}
```
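Documents in this shape are immediately useful for product discovery. In production you'd query MongoDB directly (e.g. a `find` filter on the `keywords` field); this sketch shows the same idea over plain dictionaries, with sample data shaped like the output above.

```python
# Sample documents shaped like the pipeline output above.
items = [
    {"object_url": "s3://dress.png",
     "keywords": ["summer", "dress", "casual"]},
    {"object_url": "s3://jacket.png",
     "keywords": ["winter", "jacket", "formal"]},
]

def by_keyword(docs, keyword):
    """Return documents whose keywords list contains `keyword`.
    The equivalent MongoDB filter would be {"keywords": keyword}."""
    return [d for d in docs if keyword in d["keywords"]]

matches = by_keyword(items, "summer")
```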
Use the Methods directly
You can also use the extract, embed, and generate methods outside of a pipeline.
```python
from mixpeek import Mixpeek

mixpeek = Mixpeek("API_KEY")

output = mixpeek.embed.text(
    input="lorem ipsum",
    model_id="mixedbread-ai/mxbai-embed-large-v1"
)
```
To use the API, you'll need to register for an API key; an engineer will then contact you.
What does it enable?
With fresh vectors, metadata, and extracted content from your apparel images, you can design hyper-targeted marketing campaigns and improve product discovery on your ecommerce platform:
- Personalized Recommendations: Suggest products based on user preferences and past interactions.
- Visual Search: Allow users to search for products using images instead of text.
- Trend Analysis: Identify and capitalize on emerging fashion trends by analyzing frequently occurring keywords and styles.
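Visual search, for instance, reduces to nearest-neighbor lookup over the stored embeddings. This is a minimal sketch using cosine similarity over a toy in-memory catalog; real vectors would come from `mixpeek.embed`, and at scale you'd use a vector index rather than a linear scan.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy catalog mapping product image -> embedding (illustrative values).
catalog = {
    "dress.png": [0.9, 0.1, 0.0],
    "jacket.png": [0.1, 0.9, 0.0],
}

def visual_search(query_embedding, catalog, top_k=1):
    """Rank catalog items by similarity to the query embedding."""
    ranked = sorted(catalog.items(),
                    key=lambda kv: cosine(query_embedding, kv[1]),
                    reverse=True)
    return [name for name, _ in ranked[:top_k]]

# A query image whose embedding is closest to the dress.
best = visual_search([1.0, 0.0, 0.0], catalog)
```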
All without having to think about data prep again. You can even modify your pipeline, and the new version will be recorded in the metadata.pipeline_version key, so you can filter output data by the pipeline version that produced it.