Analytics gives you visibility into how your retrievers perform in production. Use it to track latency, stage-level bottlenecks, and cache efficiency, and to get automated recommendations for improvement.

Endpoints Overview

| Endpoint | Method | Returns |
| --- | --- | --- |
| /analytics/retrievers/{id}/performance | GET | Latency percentiles (P50/P95/P99), query counts, trends |
| /analytics/retrievers/{id}/stages | GET | Per-stage execution times and document flow |
| /analytics/retrievers/{id}/signals | GET | Operational signals (cache hits, rerank scores, filter reduction) |
| /analytics/retrievers/{id}/cache-performance | GET | Cache hit/miss rates, latency savings |
| /analytics/retrievers/{id}/slow-queries | GET | Slowest queries with stage-level breakdown |
| /analytics/retrievers/{id}/analyze-tuning | POST | AI-powered parameter tuning recommendations |

Performance Metrics

Get latency percentiles and query volume over time:
curl "$MP_API_URL/v1/analytics/retrievers/{retriever_id}/performance?group_by=hour&start_date=2025-01-01T00:00:00Z" \
  -H "Authorization: Bearer $MP_API_KEY" \
  -H "X-Namespace: $MP_NAMESPACE"
Query parameters:
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| start_date | datetime | | Start of time range (UTC) |
| end_date | datetime | | End of time range (UTC) |
| group_by | string | "hour" | Time grouping: hour, day, week |
Response includes: P50, P95, and P99 latency, query counts, result counts, and trends over the time range.
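The same request can be assembled programmatically. The sketch below builds the performance URL with Python's standard library and rejects a `group_by` value outside the three documented options. The helper name `build_performance_url`, the base URL, and the retriever ID are illustrative placeholders, not part of the API.

```python
from datetime import datetime, timezone
from urllib.parse import quote, urlencode

# Documented group_by values for the performance endpoint.
GROUP_BY_VALUES = ("hour", "day", "week")


def build_performance_url(base_url, retriever_id, start_date=None,
                          end_date=None, group_by="hour"):
    """Build a /performance URL (hypothetical helper, not an official client)."""
    if group_by not in GROUP_BY_VALUES:
        raise ValueError(f"group_by must be one of {GROUP_BY_VALUES}")
    params = {"group_by": group_by}
    for name, value in (("start_date", start_date), ("end_date", end_date)):
        if value is not None:
            # Convert to UTC and emit the Z-suffixed form used in the curl example.
            # Note: astimezone() treats a naive datetime as local time.
            utc = value.astimezone(timezone.utc)
            params[name] = utc.strftime("%Y-%m-%dT%H:%M:%SZ")
    path = f"/v1/analytics/retrievers/{quote(retriever_id, safe='')}/performance"
    return f"{base_url.rstrip('/')}{path}?{urlencode(params)}"


url = build_performance_url(
    "https://api.example.com",  # placeholder for $MP_API_URL
    "ret_123",                  # placeholder retriever ID
    start_date=datetime(2025, 1, 1, tzinfo=timezone.utc),
)
print(url)
```

Passing timezone-aware datetimes keeps the UTC conversion explicit, which matches the UTC requirement on `start_date` and `end_date`.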

Stage Breakdown

Understand which stages consume the most time:
curl "$MP_API_URL/v1/analytics/retrievers/{retriever_id}/stages?hours=24" \
  -H "Authorization: Bearer $MP_API_KEY" \
  -H "X-Namespace: $MP_NAMESPACE"
Query parameters:
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| hours | integer | 24 | Hours of history (1–720) |
Response includes: Per-stage execution time, the number of documents entering and exiting each stage, and stage-level latency distributions. Use this to identify bottlenecks: a rerank stage processing 500 documents will generally take longer than one processing 50.
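One way to act on the stage data is to compute each stage's share of total execution time and flag the largest. The sketch below assumes the response has already been parsed into a list of per-stage records; the field names (`name`, `avg_time_ms`, `docs_in`, `docs_out`) and the sample values are assumptions for illustration, not the documented response schema.

```python
def find_bottleneck(stages):
    """Return (stage_name, share_of_total_time) for the slowest stage."""
    total = sum(s["avg_time_ms"] for s in stages)
    if total <= 0:
        raise ValueError("no stage timing data")
    slowest = max(stages, key=lambda s: s["avg_time_ms"])
    return slowest["name"], slowest["avg_time_ms"] / total


# Made-up sample data; the field names are assumptions, not the real schema.
sample = [
    {"name": "filter", "avg_time_ms": 12.0, "docs_in": 10000, "docs_out": 500},
    {"name": "rerank", "avg_time_ms": 180.0, "docs_in": 500, "docs_out": 20},
    {"name": "limit", "avg_time_ms": 8.0, "docs_in": 20, "docs_out": 10},
]
name, share = find_bottleneck(sample)
print(f"{name} accounts for {share:.0%} of stage time")
```

In this sample the rerank stage receives 500 documents and dominates the total, which is the pattern the document-count fields are meant to expose.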

Slow Queries

Find the queries that take the longest to execute:
curl "$MP_API_URL/v1/analytics/retrievers/{retriever_id}/slow-queries?limit=10&hours=24" \
  -H "Authorization: Bearer $MP_API_KEY" \
  -H "X-Namespace: $MP_NAMESPACE"
Query parameters:
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| limit | integer | 10 | Number of slow queries to return |
| hours | integer | 24 | Hours of history (1–720) |
Response includes: Query text, total execution time, result count, and stage-by-stage breakdown for each slow query. Use this to find pathological queries that need optimization.

Cache Performance

Monitor how effectively caching reduces latency:
curl "$MP_API_URL/v1/analytics/retrievers/{retriever_id}/cache-performance?hours=24" \
  -H "Authorization: Bearer $MP_API_KEY" \
  -H "X-Namespace: $MP_NAMESPACE"
Response includes: Hit/miss rates, average latency for cache hits vs full execution, and hourly trends. A low hit rate may indicate your queries are too diverse for caching, or that cache TTL needs adjustment.
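To see why hit rate matters, it helps to work through the arithmetic. With hit rate h, the average latency is h × (hit latency) + (1 − h) × (full-execution latency), so the saving relative to no caching is h × (full − hit). The sketch below computes both; the function names and the numbers are illustrative and do not come from the API.

```python
def expected_latency_ms(hit_rate, hit_latency_ms, full_latency_ms):
    """Blended average latency for a given cache hit rate."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * hit_latency_ms + (1.0 - hit_rate) * full_latency_ms


def average_saving_ms(hit_rate, hit_latency_ms, full_latency_ms):
    """Average latency saved per query compared with no caching."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * (full_latency_ms - hit_latency_ms)


# Illustrative numbers: 25% hit rate, 5 ms on a hit, 205 ms for full execution.
blended = expected_latency_ms(0.25, 5.0, 205.0)
saving = average_saving_ms(0.25, 5.0, 205.0)
print(f"average latency {blended} ms, saving {saving} ms per query")
```

Because the saving scales linearly with the hit rate, doubling the hit rate in this example doubles the per-query saving.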

Retriever Signals

Get raw operational signals for debugging:
curl "$MP_API_URL/v1/analytics/retrievers/{retriever_id}/signals?signal_type=rerank_scores&limit=100&hours=24" \
  -H "Authorization: Bearer $MP_API_KEY" \
  -H "X-Namespace: $MP_NAMESPACE"
Signal types:
| Signal | Description |
| --- | --- |
| cache_hit | Query served from cache |
| cache_miss | Cache miss, full execution |
| rerank_scores | Score distribution from rerank stage |
| filter_reduction | How much the filter stage reduced the candidate set |
| expansion_results | Query expansion output |

AI-Powered Tuning

Get automated recommendations for improving your retriever:
curl -X POST "$MP_API_URL/v1/analytics/retrievers/{retriever_id}/analyze-tuning?days=7" \
  -H "Authorization: Bearer $MP_API_KEY" \
  -H "X-Namespace: $MP_NAMESPACE"
Query parameters:
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| days | integer | 7 | Days of history to analyze (1–90) |
Response includes: Parameter suggestions (e.g., “reduce top_k from 500 to 200”), cache optimization tips, and performance improvement estimates based on observed patterns.
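Unlike the other endpoints, this one is a POST. A minimal sketch of the equivalent request in Python, using only the standard library, might look like the following. The helper name `build_tuning_request` and the fallback values are hypothetical; the block only constructs the request and does not send it.

```python
import os
from urllib.parse import quote
from urllib.request import Request


def build_tuning_request(base_url, api_key, namespace, retriever_id, days=7):
    """Build (but do not send) the analyze-tuning POST request."""
    if not 1 <= days <= 90:
        raise ValueError("days must be between 1 and 90")
    url = (f"{base_url.rstrip('/')}/v1/analytics/retrievers/"
           f"{quote(retriever_id, safe='')}/analyze-tuning?days={days}")
    headers = {
        "Authorization": f"Bearer {api_key}",
        "X-Namespace": namespace,
    }
    return Request(url, method="POST", headers=headers)


req = build_tuning_request(
    os.environ.get("MP_API_URL", "https://api.example.com"),  # placeholder fallback
    os.environ.get("MP_API_KEY", "test-key"),                 # placeholder fallback
    os.environ.get("MP_NAMESPACE", "my-namespace"),           # placeholder fallback
    "ret_123",                                                # placeholder retriever ID
    days=30,
)
print(req.get_method(), req.full_url)
```

To send it, pass `req` to `urllib.request.urlopen` and decode the JSON body; validating `days` locally avoids a round trip for values outside the documented 1–90 range.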

Identifying Relevance Issues

Use analytics to spot relevance problems:
| Symptom | Analytics Signal | Likely Cause | Action |
| --- | --- | --- | --- |
| High latency, normal results | Stage breakdown shows slow rerank | Too many candidates entering rerank | Reduce top_k or add a limit stage before rerank |
| Low click-through rate | Interaction signals show high skip rate | Poor ranking or irrelevant results | Check fusion strategy, consider learned fusion |
| Cache hit rate dropping | Cache performance shows increasing misses | Query diversity increasing or TTL too short | Adjust cache strategy; review caching best practices |
| Inconsistent latency | Slow queries show specific patterns | Certain query types trigger expensive paths | Add pre-filters or query-specific optimization |
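The symptom table can be turned into a simple rule check over metrics you have already collected. The sketch below is one possible encoding: the `Snapshot` fields, the use of a P99/P50 ratio as a proxy for inconsistent latency, and every threshold value are placeholder assumptions to be tuned for your own workload, not values recommended by the API.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    rerank_time_share: float  # fraction of stage time spent in rerank (0-1)
    skip_rate: float          # fraction of returned results users skipped (0-1)
    hit_rate_change: float    # current cache hit rate minus the previous period's
    p99_to_p50_ratio: float   # tail latency divided by median latency


# Placeholder thresholds; these are assumptions, not documented defaults.
THRESHOLDS = {
    "rerank_time_share": 0.6,
    "skip_rate": 0.5,
    "hit_rate_drop": 0.1,
    "p99_to_p50_ratio": 10.0,
}


def triage(s, t=THRESHOLDS):
    """Map a metrics snapshot to the actions listed in the symptom table."""
    actions = []
    if s.rerank_time_share > t["rerank_time_share"]:
        actions.append("Reduce top_k or add a limit stage before rerank")
    if s.skip_rate > t["skip_rate"]:
        actions.append("Check fusion strategy, consider learned fusion")
    if s.hit_rate_change < -t["hit_rate_drop"]:
        actions.append("Adjust cache strategy")
    if s.p99_to_p50_ratio > t["p99_to_p50_ratio"]:
        actions.append("Add pre-filters or query-specific optimization")
    return actions


snapshot = Snapshot(rerank_time_share=0.9, skip_rate=0.2,
                    hit_rate_change=-0.02, p99_to_p50_ratio=4.0)
for action in triage(snapshot):
    print(action)
```

Keeping the thresholds in one dictionary makes it easy to adjust them as you learn what normal looks like for a given retriever.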

Monitoring Cadence

| Frequency | Check | Tools |
| --- | --- | --- |
| Daily | Slow queries, P95 latency | /slow-queries, /performance |
| Weekly | Stage breakdown, cache efficiency, interaction trends | /stages, /cache-performance, /signals |
| Monthly | AI tuning analysis, full evaluation run | /analyze-tuning, Evaluations |