Fine-tuning allows you to optimize Mixpeek’s embedding models for your specific content and search patterns, improving relevance and accuracy for your unique use case.

Why Fine-Tune?

Domain Adaptation

  • Optimize for industry-specific terminology
  • Understand your content's context more accurately
  • Improve matching for technical/specialized content

Search Quality

  • Boost relevance scores for important matches
  • Reduce false positives
  • Better handle edge cases

Fine-Tuning Process

1. Send Annotated Data

Upload your training data to a designated S3 bucket. Data should include:

  • Positive examples (good matches)
  • Negative examples (poor matches)
  • Relevance scores
  • Content pairs with similarity annotations

Our team will provide detailed specifications for your use case.
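
The exact schema will be confirmed by our team, but as a rough sketch, annotated pairs are often stored as JSON Lines. The field names below (query, document, relevance) are illustrative placeholders, not the required format:

import json

# Illustrative annotated pairs; field names are hypothetical --
# follow the specification provided by the Mixpeek team for your use case.
examples = [
    # Positive example: a query and a document that should match closely.
    {"query": "reset API key", "document": "How to rotate your API credentials", "relevance": 1.0},
    # Negative example: a pair that should not match.
    {"query": "reset API key", "document": "Quarterly billing overview", "relevance": 0.0},
    # Graded pair: partially relevant content with an intermediate score.
    {"query": "reset API key", "document": "Managing account security settings", "relevance": 0.4},
]

# Write one JSON object per line (JSONL), a common format for training data.
with open("training-data.jsonl", "w") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")

The resulting file can then be uploaded to the designated bucket, for example with the AWS CLI: aws s3 cp training-data.jsonl s3://your-bucket/training-data/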

2. Initiate Fine-Tuning

Use the Mixpeek Dashboard to schedule a fine-tuning job:

{
  "base_model_id": "multimodal",
  "training_data": "s3://your-bucket/training-data",
  "specs": {
    "epochs": 10,
    "batch_size": 32,
    "learning_rate": 2e-5,
    "loss_function": "contrastive"
  }
}

This returns a new model_id (e.g., model_1askdh2390) for use in your API calls.
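
Once the job completes, you can pin requests to the fine-tuned model by passing its model_id. The snippet below is only a sketch using the Python requests library; the endpoint path and payload field names are placeholders, so check the API reference for the exact request shape:

import requests

API_KEY = "YOUR_API_KEY"

# Hypothetical search request pinned to the fine-tuned model.
# The endpoint URL and payload fields are illustrative placeholders.
response = requests.post(
    "https://api.mixpeek.com/search",  # placeholder endpoint
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "query": "troubleshooting connection timeouts",
        "model_id": "model_1askdh2390",  # fine-tuned model returned by the job
    },
)
response.raise_for_status()
print(response.json())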

3. Version Control

Track model versions in your metadata for reproducibility:

{
  "metadata": {
    "model_version": "model_1askdh2390",
    "base_model": "multimodal",
    "training_date": "2024-03-15",
    "description": "Optimized for technical documentation"
  }
}
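
One lightweight way to keep this history is to append each fine-tuned model's metadata to a local registry file. This is just a convention for reproducibility, not a Mixpeek requirement; a minimal sketch:

import json
from pathlib import Path

REGISTRY = Path("model_registry.json")

def register_model(entry: dict) -> None:
    """Append a model-version record to a local JSON registry."""
    records = json.loads(REGISTRY.read_text()) if REGISTRY.exists() else []
    records.append(entry)
    REGISTRY.write_text(json.dumps(records, indent=2))

register_model({
    "model_version": "model_1askdh2390",
    "base_model": "multimodal",
    "training_date": "2024-03-15",
    "description": "Optimized for technical documentation",
})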

Performance Monitoring

Metrics

  • Mean Average Precision (MAP)
  • Recall
  • Query latency
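
As a rough illustration, MAP and recall@k can be computed offline from your own labeled queries. The sketch below assumes you already have ranked result IDs per query plus the set of IDs judged relevant; it does not call any Mixpeek API:

def average_precision(ranked_ids, relevant_ids):
    """Average precision for a single query's ranked results."""
    hits, score = 0, 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            hits += 1
            score += hits / rank
    return score / len(relevant_ids) if relevant_ids else 0.0

def recall_at_k(ranked_ids, relevant_ids, k=10):
    """Fraction of relevant documents retrieved in the top k results."""
    retrieved = set(ranked_ids[:k]) & set(relevant_ids)
    return len(retrieved) / len(relevant_ids) if relevant_ids else 0.0

# Example: two labeled queries with ranked results from the fine-tuned model.
evaluation = [
    (["d3", "d1", "d9"], {"d1", "d3"}),
    (["d7", "d2", "d4"], {"d2"}),
]
map_score = sum(average_precision(r, rel) for r, rel in evaluation) / len(evaluation)
print(f"MAP: {map_score:.3f}")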

Validation

  • A/B testing support
  • Automated regression testing
  • Performance comparison with base model
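
A simple way to A/B test is to route a deterministic share of queries to the fine-tuned model and compare both arms on the metrics above. The sketch below uses a placeholder search function, since the exact client call depends on your integration:

import hashlib

BASE_MODEL = "multimodal"
FINE_TUNED_MODEL = "model_1askdh2390"
FINE_TUNED_SHARE = 0.5  # fraction of traffic sent to the fine-tuned arm

def choose_model(user_id: str) -> str:
    """Deterministically assign a user to the base or fine-tuned arm."""
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 100
    return FINE_TUNED_MODEL if bucket < FINE_TUNED_SHARE * 100 else BASE_MODEL

def run_search(query: str, model_id: str):
    """Placeholder: wire in your actual Mixpeek search call here."""
    raise NotImplementedError

model_id = choose_model("user_42")
# results = run_search("connection timeouts", model_id)
# Log model_id alongside relevance feedback so the two arms can be compared.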

Available to Enterprise customers only. To trial fine-tuning capabilities, contact our team at info@mixpeek.com.

Fine-tuned models maintain the same API interface as base models, ensuring seamless integration with existing code.