Why Reverse Search?
Traditional text search requires describing visual content in words, which often fails when:
- You can’t articulate what you’re looking for (“I know it when I see it”)
- Visual features are hard to describe (“find videos with this exact color palette”)
- Content is identical but metadata differs (re-uploads, duplicates)
- You need pixel-level similarity (copyright detection, brand monitoring)
Search Flow Diagram
Input Methods Comparison
| Method | Use Case | Example | Performance |
|---|---|---|---|
| Direct Embedding | Pre-computed vectors, high-volume checks | {"embedding": [0.1, 0.2, ...]} | Fastest (no inference) |
| Image URL | Search using hosted images | {"url": "https://cdn.example.com/img.jpg"} | Fast |
| Video URL | Search using video files | {"url": "s3://bucket/video.mp4"} | Medium (extracts keyframe) |
| Base64 Image | Upload image data directly | {"base64": "data:image/jpeg;base64,..."} | Fast |
| Base64 Video | Upload video data directly | {"base64": "data:video/mp4;base64,..."} | Medium |
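The payload shapes in the table can be built with a few small helpers. This is a minimal sketch: the function names and the `expected_dim` parameter are hypothetical, and only the `embedding`, `url`, and `base64` keys come from the table above. Restricting URLs to the `https://` and `s3://` schemes is also an assumption based on the two examples shown.

```python
import base64
from typing import Optional, Sequence


def embedding_query(vector: Sequence[float], expected_dim: Optional[int] = None) -> dict:
    """Direct embedding input: the vector is sent as-is, so no inference runs."""
    if expected_dim is not None and len(vector) != expected_dim:
        # Catch a dimension mismatch locally instead of at query time.
        raise ValueError(
            f"embedding dimension [{len(vector)}] does not match expected [{expected_dim}]"
        )
    return {"embedding": [float(x) for x in vector]}


def url_query(url: str) -> dict:
    """Image or video URL input (allowed schemes here are an assumption)."""
    if not url.startswith(("https://", "s3://")):
        raise ValueError(f"unsupported URL scheme: {url!r}")
    return {"url": url}


def base64_query(data: bytes, mime_type: str) -> dict:
    """Base64 input, encoded as a data URI such as data:image/jpeg;base64,..."""
    encoded = base64.b64encode(data).decode("ascii")
    return {"base64": f"data:{mime_type};base64,{encoded}"}
```

Passing `expected_dim` lets a caller reject a 512-dimensional vector before it reaches a collection indexed at 768 dimensions.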
Implementation Steps
1. Create a Bucket for Visual Content
2. Create a Collection with Visual Embeddings
For Images:
3. Ingest Visual Assets
4. Create a Reverse Search Retriever
Method 1: Image URL Search
5. Execute Reverse Search
Search with Image URL:
Model Evolution & A/B Testing
Test different vision models without rebuilding your entire visual catalog.
Compare Vision Models
Test Scene Detection Thresholds
Measure Impact
- Precision@10: v1 (0.78) vs v2 (0.86) → about 10% relative improvement (+0.08 absolute)
- Scene count: v1 (15/video) vs v2 (42/video) → better granularity
- Processing time: v1 (45s) vs v2 (68s) → about 51% longer
- User engagement: v1 (38% CTR) vs v2 (44% CTR) → improved results
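Metrics like these can be computed offline from labeled queries. The sketch below assumes a hypothetical evaluation set in which each query has a ranked list of retrieved IDs and a set of known-relevant IDs. The function names are illustrative and not part of any product API.

```python
from typing import Iterable, Sequence


def precision_at_k(retrieved_ids: Sequence[str], relevant_ids: Iterable[str], k: int = 10) -> float:
    """Fraction of the top-k retrieved items that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    relevant = set(relevant_ids)
    hits = sum(1 for doc_id in retrieved_ids[:k] if doc_id in relevant)
    # Dividing by k (not by the number returned) penalizes short result lists.
    return hits / k


def relative_change(old: float, new: float) -> float:
    """Relative change from old to new, e.g. 0.5 means 50% higher."""
    if old == 0:
        raise ValueError("old value must be non-zero")
    return (new - old) / old
```

With the figures listed above, `relative_change(0.78, 0.86)` comes to roughly 0.10 and `relative_change(45, 68)` to roughly 0.51.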
Advanced Patterns
Cross-Modal Search (Image → Video)
Search videos using a reference image:
Multi-Collection Federated Search
Search across images, videos, and segments simultaneously:
Combine with Metadata Filters
Enhance reverse search with structured filters:
Batch Similarity Check
Find duplicates across your entire catalog:
Group Results by Root Object
For segmented videos, group results by parent video:
Similarity Threshold Guidelines
| Similarity Score | Interpretation | Action |
|---|---|---|
| 0.95 - 1.00 | Near-duplicate or same content | Flag as duplicate, merge |
| 0.85 - 0.94 | Very similar creative concept | Highly related, consider grouping |
| 0.70 - 0.84 | Similar theme/style | Related content, suggest to user |
| 0.50 - 0.69 | Loosely related | Weak match, filter out |
| < 0.50 | Different content | Not similar |
Recommended minimum thresholds by use case:
- Copyright detection: 0.90+ (high precision)
- Product discovery: 0.70+ (balanced)
- Creative inspiration: 0.60+ (high recall)
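The guidelines above translate directly into a lookup. This sketch copies the bands and per-use-case minimums from this section. The result shape (a dict with a `score` key) and the function names are assumptions for illustration, and scores that fall between two listed bands (for example 0.945) are assigned to the lower band.

```python
from typing import Dict, List

# (lower bound, interpretation), taken from the threshold table
BANDS = [
    (0.95, "Near-duplicate or same content"),
    (0.85, "Very similar creative concept"),
    (0.70, "Similar theme/style"),
    (0.50, "Loosely related"),
]

# Minimum score to keep a match, per use case
USE_CASE_THRESHOLDS: Dict[str, float] = {
    "copyright_detection": 0.90,
    "product_discovery": 0.70,
    "creative_inspiration": 0.60,
}


def interpret(score: float) -> str:
    """Map a similarity score to the table's interpretation."""
    for lower, label in BANDS:
        if score >= lower:
            return label
    return "Different content"


def keep_matches(results: List[dict], use_case: str) -> List[dict]:
    """Drop results that score below the threshold for the given use case."""
    threshold = USE_CASE_THRESHOLDS[use_case]
    return [r for r in results if r["score"] >= threshold]
```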
Use Case Examples
Brand Safety & Copyright Detection
Detect unauthorized use of brand visual assets by extracting embeddings from all brand assets, monitoring incoming content with reverse search, and flagging matches above 0.90 similarity for review.
Visual Product Discovery
Enable “find products that look like this” by allowing users to upload photos, searching the product catalog with reverse image search, and filtering by price range, availability, and brand.
Content Deduplication
Clean up media libraries by processing the catalog to extract embeddings, searching for similar content, grouping near-duplicates (0.95+ similarity), and keeping the highest quality version.
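The grouping step can be sketched locally with a union-find over pairwise similarities. This is an illustration under stated assumptions, not the product's implementation: it assumes the similarity score is cosine similarity, that embeddings are already extracted into a dict keyed by asset ID, and that a hypothetical `quality` value (for example, vertical resolution) is available per asset. The pairwise loop is quadratic, so a real catalog would query the retriever per asset instead.

```python
import math
from collections import defaultdict
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def group_near_duplicates(embeddings: Dict[str, Sequence[float]], threshold: float = 0.95) -> List[List[str]]:
    """Return groups of asset IDs connected by similarity >= threshold."""
    ids = list(embeddings)
    parent = {asset_id: asset_id for asset_id in ids}

    def find(x: str) -> str:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            if cosine_similarity(embeddings[a], embeddings[b]) >= threshold:
                parent[find(a)] = find(b)

    groups = defaultdict(list)
    for asset_id in ids:
        groups[find(asset_id)].append(asset_id)
    # Singletons are not duplicates of anything, so they are dropped.
    return [sorted(members) for members in groups.values() if len(members) > 1]


def pick_best(group: List[str], quality: Dict[str, float]) -> str:
    """Keep the highest-quality asset in a duplicate group."""
    return max(group, key=lambda asset_id: quality[asset_id])
```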
Video Ad Performance Analysis
Analyze creative patterns by identifying top-performing ads, extracting visual embeddings from winning creatives, searching the ad library for similar concepts, and analyzing performance patterns.
Scene-Level Video Search
Find specific moments in video archives by segmenting videos into scenes with video_extractor, accepting screenshot queries, and returning matching segments with timestamps.
UGC Campaign Matching
Link user-generated content to campaigns by extracting embeddings from official campaign assets, processing incoming submissions, and matching via visual similarity.
Competitive Intelligence
Track competitor creative strategies by monitoring competitor ad libraries, searching your creative archive for similar concepts, and identifying overlap and gaps in visual strategies.
Performance Optimization
1. Cache Embeddings for Repeated Searches
2. Use Anonymous Retrievers for One-Time Searches
3. Batch Process Similarity Checks
4. Pre-Filter by Metadata
5. Tune Top-K Values
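Item 1 can be sketched as a content-addressed cache in front of whatever produces embeddings. The `EmbeddingCache` class and the `embed_fn` callable are hypothetical. In practice `embed_fn` would wrap the feature extractor, and the cached vector could then be sent as a direct embedding query, which the input methods table lists as the fastest option.

```python
import hashlib
from typing import Callable, Dict, List


class EmbeddingCache:
    """Caches embeddings by the SHA-256 of the raw image or video bytes."""

    def __init__(self, embed_fn: Callable[[bytes], List[float]]):
        self._embed_fn = embed_fn
        self._store: Dict[str, List[float]] = {}
        self.misses = 0  # number of times embed_fn actually ran

    def get(self, content: bytes) -> List[float]:
        key = hashlib.sha256(content).hexdigest()
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._embed_fn(content)
        return self._store[key]
```

Keying on a content hash rather than a filename means re-uploads of the same bytes under a different name still hit the cache. An unbounded dict is used for brevity, so a long-running service would want an eviction policy.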
Troubleshooting
Low Similarity Scores (<0.5)
Causes:
- Query and corpus are genuinely different
- Wrong feature extractor (using text embeddings instead of visual)
- Corrupted or low-quality images
Vector Dimension Mismatch
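Error: “Embedding dimension [512] does not match indexed dimension [768]”
Solution: Generate the query embedding with the same feature extractor (and therefore the same vector dimension) that was used to index the collection.
Slow Query Performance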
Causes:
- Large collection without pre-filtering
- Extracting embeddings on every query
- High top-k values
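Why pre-filtering and a small top-k help can be seen in a local brute-force sketch. It is illustrative only and does not reflect how the service executes queries. The document shape (`id`, `embedding`, `metadata`) is hypothetical, and the dot product is used as the score on the assumption that embeddings are unit-normalized.

```python
import heapq
from typing import Dict, List, Sequence, Tuple


def search(
    query_vec: Sequence[float],
    documents: List[dict],
    metadata_filter: Dict[str, object],
    top_k: int = 10,
) -> List[Tuple[float, str]]:
    """Filter by exact-match metadata first, then score only the survivors."""
    candidates = [
        doc for doc in documents
        if all(doc["metadata"].get(key) == value for key, value in metadata_filter.items())
    ]
    scored = (
        (sum(q * d for q, d in zip(query_vec, doc["embedding"])), doc["id"])
        for doc in candidates
    )
    # nlargest keeps only top_k items in memory instead of sorting everything.
    return heapq.nlargest(top_k, scored)
```

Scoring cost scales with the number of candidates that survive the filter, and the size of the result heap scales with `top_k`, which is why both settings affect latency.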
Feature Not Found
Error: “Feature URI not found in collection”
Solution:
Next Steps
- Explore Video Understanding for scene detection patterns
- Learn Collections for feature extractor configuration
- Review Retrievers for stage composition
- Check Analytics for performance monitoring

