Building Blocks
| Stage Type | Examples | Use in Research |
|---|---|---|
| Search | semantic_search, hybrid_search, late_interaction_search, web_search | Gather candidate documents across modalities and the open web |
| Filter | filter (structured/text/LLM/custom) | Narrow to relevant time ranges, entities, or sentiment |
| Enrich | join (direct/retriever), taxonomy | Attach structured context, e.g., taxonomy tags or related entities |
| Transform | llm_generation | Summarize, extract key facts, or generate structured notes |
| Compose | retriever, external_api_call | Chain sub-retrievers or call external services (e.g., fact-check APIs) |
Common Patterns
Literature Review
- Seed search using `hybrid_search` to retrieve recent papers.
- Structured filter by publication date and venue.
- Taxonomy join to classify by research area.
- LLM generation stage to summarize findings with citations.
- Store summaries alongside `feature_id` references for auditability.
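
The steps above can be sketched as an ordered list of stage descriptions. This is a minimal illustration only: the stage names come from the Building Blocks table, while every parameter key (`query`, `limit`, `published_after`, `venues`, `taxonomy`, `prompt`, `include_citations`) and the helper `build_literature_review_stages` are hypothetical placeholders rather than documented fields.

```python
from datetime import date


def build_literature_review_stages(query, since, venues):
    """Return an ordered list of stage descriptions for a literature review.

    Stage names follow the Building Blocks table; all parameter keys are
    hypothetical and only illustrate what each step would need to know.
    """
    return [
        # 1. Seed search for recent papers.
        {"stage": "hybrid_search", "params": {"query": query, "limit": 50}},
        # 2. Structured filter by publication date and venue.
        {
            "stage": "filter",
            "params": {
                "type": "structured",
                "published_after": since.isoformat(),
                "venues": list(venues),
            },
        },
        # 3. Taxonomy join to classify by research area.
        {"stage": "taxonomy", "params": {"taxonomy": "research_area"}},
        # 4. LLM generation to summarize findings with citations.
        {
            "stage": "llm_generation",
            "params": {
                "prompt": "Summarize the findings and cite each source.",
                "include_citations": True,
            },
        },
    ]


stages = build_literature_review_stages(
    "sparse retrieval", date(2024, 1, 1), ["ACL", "SIGIR"]
)
print([s["stage"] for s in stages])
```

Keeping the pipeline as plain data makes it easy to diff two research configurations or to store the exact configuration next to the summaries it produced.
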
Competitive Intelligence
- Use `web_search` + `web_lookup` stages to pull public announcements.
- Join with internal product docs via `join@v1` (retriever strategy) to compare specs.
- Apply a custom filter to spotlight price or feature gaps.
- Generate a briefing memo with the `llm_generation` stage.
Incident Investigation
- Collect relevant runbooks/logs via `semantic_search` over internal collections.
- Use `filter` stages to isolate the incident window.
- Enrich with taxonomy-based tags (`taxonomy@v1`) for impacted systems.
- Summarize timeline and root cause via `llm_generation`, keeping citations.
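
Isolating the incident window amounts to a time-range predicate over the retrieved documents. The sketch below shows that logic in plain Python, outside any retriever. The `timestamp` field, the padding default, and both helper names are assumptions made for illustration; a real `filter` stage would express the same bounds in its own configuration.

```python
from datetime import datetime, timedelta


def incident_window(start, end, padding_minutes=30):
    """Widen the reported incident interval so lead-up and aftermath are kept."""
    pad = timedelta(minutes=padding_minutes)
    return start - pad, end + pad


def in_window(docs, window):
    """Keep documents whose (hypothetical) `timestamp` falls inside the window."""
    lo, hi = window
    return [d for d in docs if lo <= d["timestamp"] <= hi]


docs = [
    {"id": "log-1", "timestamp": datetime(2024, 5, 1, 13, 45)},
    {"id": "log-2", "timestamp": datetime(2024, 5, 1, 12, 0)},
    {"id": "log-3", "timestamp": datetime(2024, 5, 1, 15, 30)},
    {"id": "log-4", "timestamp": datetime(2024, 5, 1, 16, 0)},
]
window = incident_window(datetime(2024, 5, 1, 14, 0), datetime(2024, 5, 1, 15, 0))
kept = [d["id"] for d in in_window(docs, window)]
print(kept)
```

Padding the window is a design choice: logs from shortly before the first alert often contain the triggering change, so a strict start time can hide the root cause.
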
Orchestrating Multi-Retriever Flows
Use the `retriever@v1` compose stage to call sub-retrievers based on the output of a previous stage.
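
One way to picture this routing is a small function that inspects the documents returned by an earlier stage and picks which sub-retriever to call next. The sketch below is hypothetical throughout: the retriever IDs, the `tags` field on each document, and the `retriever_id` parameter are assumptions, and only the stage name `retriever@v1` comes from this page.

```python
from collections import Counter

# Hypothetical mapping from a taxonomy tag to a sub-retriever ID.
ROUTES = {
    "networking": "ret_network_runbooks",
    "database": "ret_db_runbooks",
}
DEFAULT_RETRIEVER = "ret_general"


def pick_sub_retriever(documents):
    """Choose a sub-retriever from the most frequent routable tag."""
    tags = Counter(tag for doc in documents for tag in doc.get("tags", []))
    for tag, _count in tags.most_common():
        if tag in ROUTES:
            return ROUTES[tag]
    return DEFAULT_RETRIEVER


def compose_stage(documents):
    """Build a compose-stage description that points at the chosen retriever."""
    return {
        "stage": "retriever@v1",
        "params": {"retriever_id": pick_sub_retriever(documents)},
    }


previous_output = [
    {"id": "a", "tags": ["database", "latency"]},
    {"id": "b", "tags": ["database"]},
    {"id": "c", "tags": ["networking"]},
]
print(compose_stage(previous_output))
```

Falling back to a general retriever when no tag matches keeps the flow from dead-ending on documents the taxonomy has not classified yet.
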
Capturing Feedback
- Record user signals with the Interactions API (`click`, `long_view`, `positive_feedback`, etc.).
- Feed interactions back into rerankers or filter stages (“hide documents seen in this session”).
- Combine interactions with `analytics` endpoints to optimize parameter choices (e.g., increase `hybrid_search.limit` if users often tap beyond top 10).
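
The last point can be made concrete with a small calculation: measure how many clicks land beyond the top 10 results, and raise the search limit when that share is high. The record shape (`type`, 1-based `position`), the 20% threshold, and the step size are all assumptions chosen for the example, not values taken from the Interactions API.

```python
def share_beyond_top_k(interactions, k=10):
    """Fraction of click interactions whose 1-based result position exceeds k."""
    clicks = [i for i in interactions if i["type"] == "click"]
    if not clicks:
        return 0.0
    return sum(1 for i in clicks if i["position"] > k) / len(clicks)


def suggest_limit(current_limit, interactions, k=10, threshold=0.2, step=10):
    """Raise the limit by `step` when deep clicks exceed the threshold share."""
    if share_beyond_top_k(interactions, k) > threshold:
        return current_limit + step
    return current_limit


interactions = [
    {"type": "click", "position": 3},
    {"type": "click", "position": 14},
    {"type": "long_view", "position": 22},  # ignored: not a click
    {"type": "click", "position": 18},
    {"type": "click", "position": 1},
]
share = share_beyond_top_k(interactions)
new_limit = suggest_limit(25, interactions)
print(share, new_limit)
```
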
Operational Tips
- Persist execution IDs: each `execute` response includes `execution_id`; link it to your research session for audit trails.
- Monitor stage telemetry: `stage_statistics` identifies bottlenecks (e.g., LLM stages dominating latency).
- Budget controls: set `budget_limits` on retrievers to cap time or credit consumption for exploratory workflows.
- Cache intermediate results: use `cache_stage_names` for expensive discovery steps, especially when analysts rerun the same queries.
- Leverage tasks: schedule enrichment batches (clusters, taxonomies) ahead of time so research pipelines stay low-latency.
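
The first two tips can be sketched together: store each `execution_id` under the research session that produced it, and scan `stage_statistics` for the stage that dominates latency. The response shape used here (a list of entries with `stage` and `duration_ms`) is an assumption for illustration; the actual structure of `stage_statistics` may differ.

```python
def record_execution(audit_log, session_id, response):
    """Append the response's execution_id to the session's audit trail."""
    audit_log.setdefault(session_id, []).append(response["execution_id"])


def slowest_stage(response):
    """Return (stage name, share of total latency) for the slowest stage."""
    stats = response["stage_statistics"]
    total = sum(s["duration_ms"] for s in stats)
    worst = max(stats, key=lambda s: s["duration_ms"])
    return worst["stage"], worst["duration_ms"] / total


# Hypothetical execute response; only the two top-level key names are from the text.
response = {
    "execution_id": "exec_123",
    "stage_statistics": [
        {"stage": "hybrid_search", "duration_ms": 120},
        {"stage": "filter", "duration_ms": 30},
        {"stage": "llm_generation", "duration_ms": 850},
    ],
}

audit_log = {}
record_execution(audit_log, "session_42", response)
stage, latency_share = slowest_stage(response)
print(audit_log, stage, latency_share)
```
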
Suggested Architecture
Next Steps
- Review Retrievers for stage configuration details.
- Learn how Filters and Taxonomies contribute structure to exploratory pipelines.
- Use Operations → Observability to monitor research workloads in production.

