Interactions capture how users engage with search results—clicks, purchases, skips, dwell time. This feedback automatically improves your retrievers through fine-tuning, personalization, and analytics.

Why Track Interactions

Automatic Quality Improvement: User behavior becomes training data for fine-tuning embedding models and ranking algorithms, with no manual labeling required.

Measure Real Performance: Traditional IR metrics (precision, recall) rely on static test sets. Interactions reveal actual user satisfaction through click-through rates, conversion, and engagement.

Personalization at Scale: Build user preference profiles from interaction history to deliver personalized results without managing separate recommendation systems.

Identify Blind Spots: Find queries with low engagement, results that users skip despite high rankings, and content gaps causing zero-result queries.

What You Get

Training Data Generation

Interactions format into contrastive pairs (query, clicked_doc) for embedding fine-tuning with automatic position bias correction.

Relevance Analytics

CTR by position, documents with high skip rates, queries needing tuning, time-to-first-click—all without custom infrastructure.

Result Deduplication

Exclude previously purchased, viewed, or consumed content automatically by filtering on past interactions.

Popularity Boosting

Rerank results based on recent interaction signals (clicks, conversions) to surface trending content.

Capture Interactions

Record user behavior with a single API call:
POST /v1/retrievers/interactions
{
  "feature_id": "doc_xyz789",        # Document ID from retriever results
  "interaction_type": ["click"],     # One or more signal types
  "position": 3,                      # 0-based position in results (critical for bias correction)
  "metadata": {
    "duration_ms": 4500,              # Optional: dwell time, device, query text
    "query": "wireless earbuds"
  },
  "user_id": "user_123",              # For personalization and session tracking
  "session_id": "sess_456"
}
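
For reference, a minimal Python sketch of this call using the requests library. The base URL, api.example.com host, and bearer-token header are assumptions; substitute your deployment's values:

```python
import requests

# Hypothetical base URL and auth scheme -- substitute your deployment's values.
BASE_URL = "https://api.example.com"
API_KEY = "YOUR_API_KEY"

event = {
    "feature_id": "doc_xyz789",          # document ID from retriever results
    "interaction_type": ["click"],       # one or more signal types
    "position": 3,                       # 0-based rank, needed for bias correction
    "metadata": {"duration_ms": 4500, "query": "wireless earbuds"},
    "user_id": "user_123",
    "session_id": "sess_456",
}

resp = requests.post(
    f"{BASE_URL}/v1/retrievers/interactions",
    json=event,
    headers={"Authorization": f"Bearer {API_KEY}"},
)
resp.raise_for_status()
```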

Signal Types

Type | Strength | Use Case
---- | -------- | --------
click | Moderate positive | User clicked a result
long_view | Strong positive | Sustained engagement (track via duration_ms)
purchase, add_to_cart | Conversion | E-commerce actions
positive_feedback / negative_feedback | Explicit | Thumbs up/down votes
skip, return_to_results | Negative | User ignored a result or bounced back to the results page
Custom types | Variable | Define domain-specific signals
Combine multiple types per event: ["click", "long_view"] when a user clicks and stays engaged.
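
In practice, signal types are often derived from raw click and dwell data at send time. A hedged helper sketch; the 10-second long-view and 2-second bounce thresholds are illustrative assumptions, not platform defaults:

```python
def derive_signal_types(clicked: bool, duration_ms: int | None) -> list[str]:
    """Map raw click/dwell data to interaction_type values.

    The 10s long-view and 2s bounce thresholds are assumptions --
    tune them against your own engagement distribution.
    """
    if not clicked:
        return ["skip"]
    types = ["click"]
    if duration_ms is not None:
        if duration_ms >= 10_000:
            types.append("long_view")          # sustained engagement
        elif duration_ms < 2_000:
            types.append("return_to_results")  # likely a bounce
    return types

# e.g. a click with 4.5s of dwell -> ["click"]
print(derive_signal_types(True, 4500))
```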

Outcomes & Use Cases

1. Fine-Tune Embedding Models

# Collect 30 days of interaction data as training pairs
training_data = analytics.get_query_document_pairs(
    date_range="last_30_days",
    min_interactions=2  # Only docs with real engagement
)

# System formats as contrastive pairs:
# Positive: (query, clicked_doc)
# Hard negative: (query, high_ranked_but_skipped_doc)
# Position bias automatically corrected

fine_tuned_model = training.fine_tune(
    base_model="text-embedding-3-small",
    training_data=training_data
)
Outcome: 10-30% improvement in relevance metrics from domain-specific fine-tuning using real user preferences.
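
To make the pair construction concrete, here is an illustrative sketch of how clicked and skipped events could be grouped into weighted contrastive pairs. The event field names mirror the capture payload above, and the 1/log2(position + 2) examination-propensity model is an assumption for illustration, not the platform's actual implementation:

```python
import math
from collections import defaultdict

def build_contrastive_pairs(events):
    """Turn raw interaction events into weighted contrastive training pairs.

    Assumptions (illustrative only): events are dicts with keys `query`,
    `feature_id`, `interaction_type`, and `position`; examination propensity
    at rank p is modeled as 1 / log2(p + 2), so clicks deep in the result
    list receive larger inverse-propensity weights.
    """
    by_query = defaultdict(lambda: {"clicked": [], "skipped": []})
    for e in events:
        bucket = "clicked" if "click" in e["interaction_type"] else "skipped"
        by_query[e["query"]][bucket].append(e)

    pairs = []
    for query, g in by_query.items():
        for pos in g["clicked"]:
            weight = math.log2(pos["position"] + 2)  # inverse propensity
            for neg in g["skipped"]:
                # Hard negative: ranked at or above the click, yet skipped.
                if neg["position"] <= pos["position"]:
                    pairs.append(
                        (query, pos["feature_id"], neg["feature_id"], weight)
                    )
    return pairs
```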

2. Identify & Fix Ranking Issues

# Find queries where top results are skipped
signals = analytics.get_retriever_signals(retriever_id="ret_123")
# Returns:
# - Documents ranked high but skipped (ranking issues)
# - Queries with <5% CTR (need tuning)
# - Average position of first click (relevance proxy)
Outcome: Data-driven decisions on which retrievers need adjustment, which taxonomy mappings to update, or which filters to refine.
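
If you export raw events, the same diagnostics can be computed by hand. A minimal sketch, assuming one logged event (click or skip) per shown result, so per-position event counts approximate impressions:

```python
from collections import Counter

def ctr_by_position(events):
    """Compute click-through rate per result position from raw events.

    Assumes each event dict has `position` and `interaction_type`, and
    that every shown result logs exactly one event (click or skip).
    """
    shown, clicked = Counter(), Counter()
    for e in events:
        shown[e["position"]] += 1
        if "click" in e["interaction_type"]:
            clicked[e["position"]] += 1
    return {p: clicked[p] / shown[p] for p in sorted(shown)}

# Positions with unusually low CTR relative to their rank are candidates
# for the "ranked high but skipped" bucket above.
```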

3. Personalize Without Recommendation Infrastructure

# Exclude content user already consumed
previous_purchases = interactions.list(
    user_id="user_123",
    interaction_type=["purchase"],
    days=90
)

retriever.execute(
    query="new arrivals",
    filters={
        "document_id": {"$nin": [i.feature_id for i in previous_purchases]}
    }
)
Outcome: Improved user experience (no duplicate suggestions) without building separate collaborative filtering or recommendation systems.

4. Boost Popular Content

Configure retrievers to automatically boost recently popular items:
{
  "stages": [
    {"type": "vector_search", "top_k": 100},
    {
      "type": "rerank",
      "boost_by_interaction_count": {
        "interaction_types": ["click", "purchase"],
        "time_window_days": 7,
        "boost_multiplier": 1.2
      }
    }
  ]
}
Outcome: Real-time popularity signals influence ranking without manual updates.
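
For intuition, the boost stage's effect can be approximated client-side. This sketch assumes results are {"document_id", "score"} dicts and applies the configured 1.2 multiplier once per document with recent interactions, which is a simplification of the actual rerank stage:

```python
def apply_popularity_boost(results, recent_counts, boost_multiplier=1.2):
    """Re-score results with a flat multiplier when a document has recent
    click/purchase interactions.

    `recent_counts` is assumed to be a {document_id: interaction_count}
    map for the trailing time window. Applying the multiplier once per
    boosted document is an illustrative simplification.
    """
    for r in results:
        if recent_counts.get(r["document_id"], 0) > 0:
            r["score"] *= boost_multiplier
    return sorted(results, key=lambda r: r["score"], reverse=True)
```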

Query & Export

Retrieve interactions for analysis or external ML pipelines:
# Get all interactions for a document
GET /v1/retrievers/interactions?feature_id=doc_xyz789

# Get user interaction history
GET /v1/retrievers/interactions?user_id=user_123&limit=50

# Export for training (bulk pagination)
GET /v1/retrievers/interactions?page=1&page_size=1000
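
A hedged sketch of the bulk-export loop; the `items` response key and the empty-page stop condition are assumptions about the response schema:

```python
import requests

def export_all_interactions(base_url, api_key, page_size=1000):
    """Page through /v1/retrievers/interactions until an empty page.

    Assumes the response body is JSON with an `items` list; adjust the
    key and stop condition to the actual response schema.
    """
    page, events = 1, []
    while True:
        resp = requests.get(
            f"{base_url}/v1/retrievers/interactions",
            params={"page": page, "page_size": page_size},
            headers={"Authorization": f"Bearer {api_key}"},
        )
        resp.raise_for_status()
        items = resp.json().get("items", [])
        if not items:
            return events
        events.extend(items)
        page += 1
```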

Privacy & Compliance

GDPR Deletion: Honor right-to-deletion requests by removing a user's interactions; list them by user_id, then delete each by ID.
DELETE /v1/retrievers/interactions/<interaction_id>
Anonymization: Hash user_id before sending if you need consistent tracking without PII.
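
One way to do this is a keyed hash, which yields a stable pseudonym without exposing the raw ID. A minimal sketch; the salt handling is an assumption, and the salt must stay secret and constant:

```python
import hashlib
import hmac

# The salt must stay secret and constant, or hashed IDs won't be
# consistent across sessions. Load it from a secrets manager.
SALT = b"load-from-your-secrets-manager"

def anonymize_user_id(user_id: str) -> str:
    """Return a stable, non-reversible pseudonym for user_id."""
    return hmac.new(SALT, user_id.encode(), hashlib.sha256).hexdigest()

event_user_id = anonymize_user_id("user_123")  # stable hex digest
```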

Best Practices

  1. Always capture position – essential for correcting position bias in learning-to-rank models
  2. Store query text in metadata – enables query-document pair export for fine-tuning
  3. Track dwell time – separates genuine engagement (long_view) from accidental clicks
  4. Batch when possible – reduce API calls by aggregating interactions and sending in bulk (see the sketch after this list)
  5. Monitor weekly – set up dashboards to track CTR trends and catch relevance regressions early
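
Since no bulk endpoint is documented here, a client-side buffer that flushes queued events is one batching option; this sketch flushes individual POSTs, so swap in a batch endpoint if your deployment exposes one:

```python
import requests

class InteractionBuffer:
    """Buffer interaction events client-side and flush them together.

    Flushing as individual POSTs is an assumption -- replace the loop
    with a single bulk call if your deployment provides one.
    """

    def __init__(self, base_url: str, api_key: str, flush_at: int = 50):
        self.base_url, self.api_key = base_url, api_key
        self.flush_at, self.events = flush_at, []

    def record(self, event: dict) -> None:
        self.events.append(event)
        if len(self.events) >= self.flush_at:
            self.flush()

    def flush(self) -> None:
        for event in self.events:
            requests.post(
                f"{self.base_url}/v1/retrievers/interactions",
                json=event,
                headers={"Authorization": f"Bearer {self.api_key}"},
            ).raise_for_status()
        self.events.clear()
```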
See the Analytics Overview for dashboards and alerting on interaction-driven metrics like CTR, engagement rates, and query quality.