Generate content, summaries, or insights using large language models
The LLM Generation retriever stage uses large language models to generate new content based on retrieved documents or custom prompts. It can summarize documents, answer questions, or perform other text-generation tasks within a retrieval pipeline, enriching search results with AI-generated insights, explanations, or transformations of the retrieved content.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| prompt | string | Yes | - | The prompt template or instruction for the LLM |
| documents | array | No | [] | Array of document IDs or content to include in context |
| model | string | No | "mixpeek/llm-v1" | The LLM model to use for generation |
| max_tokens | integer | No | 1024 | Maximum number of tokens to generate |
| temperature | float | No | 0.7 | Temperature for generation (0.0-1.0) |
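For illustration, a minimal stage configuration built from these parameters might look like the sketch below. Only the field names and defaults come from the table above; the overall payload shape is an assumption, not the confirmed Mixpeek API.

```python
# Hypothetical LLM Generation stage configuration (a sketch, not the
# confirmed API shape). Field names and defaults come from the table above.
llm_generation_stage = {
    "prompt": "Summarize the key findings in the documents below.",
    "documents": ["doc_123", "doc_456"],  # document IDs to include in context
    "model": "mixpeek/llm-v1",            # default model
    "max_tokens": 1024,                   # cap on generated tokens
    "temperature": 0.7,                   # 0.0 = deterministic, 1.0 = most random
}
```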
The stage supports several generation modes:

| Mode | Description | Use Case |
|---|---|---|
| standalone | Generate content based only on the prompt | Creative content, initial responses |
| document_context | Use retrieved documents as context for generation | Question answering, summarization |
| rag | Retrieval-Augmented Generation with dynamically retrieved content | Knowledge-intensive tasks |
| agent | Run as an agent with tool-use capabilities | Complex reasoning, multi-step tasks |
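As a sketch of how a mode might be selected: the `mode` key below is an assumed field name for illustration, while the mode values and their semantics come from the table above.

```python
# Hypothetical configurations contrasting two modes. The "mode" key is an
# assumed field name; the mode values are from the table above.
summarize_stage = {
    "mode": "document_context",  # retrieved documents are passed as context
    "prompt": "Summarize the retrieved documents in three bullet points.",
    "documents": ["doc_123", "doc_456"],
}

creative_stage = {
    "mode": "standalone",  # prompt-only generation, no retrieved context
    "prompt": "Write a one-paragraph overview of vector search.",
}
```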
The system supports various prompt formats and structures:

| Template Type | Description |
|---|---|
| simple | Direct text prompt without special formatting |
| chat | JSON array of messages with role and content |
| jinja2 | Jinja2 template with variables for document content |
| handlebars | Handlebars template with document variables |
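For example, a chat prompt is a JSON array of role/content messages, and a Jinja2 template interpolates document content into the prompt text. The sketch below renders a Jinja2 template locally with the jinja2 library; the `documents` variable name and document fields are assumptions for illustration.

```python
import json
from jinja2 import Template

# "chat" format: a JSON array of messages, each with a role and content.
chat_prompt = json.dumps([
    {"role": "system", "content": "You are a concise research assistant."},
    {"role": "user", "content": "Summarize the retrieved documents."},
])

# "jinja2" format: a template with variables for document content.
# The `documents` variable and its fields are assumed names.
jinja_prompt = Template(
    "Answer using only these sources:\n"
    "{% for doc in documents %}- {{ doc.title }}: {{ doc.text }}\n{% endfor %}"
)
print(jinja_prompt.render(documents=[
    {"title": "Doc A", "text": "Vector search finds nearest neighbors."},
]))
```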
Generation behavior can be tuned with the following parameters:

| Parameter | Type | Default | Description |
|---|---|---|---|
| temperature | float | 0.7 | Controls randomness (0.0-1.0) |
| top_p | float | 0.95 | Nucleus sampling parameter |
| top_k | integer | 50 | Limits the candidate vocabulary for next-token selection |
| repetition_penalty | float | 1.0 | Penalizes repeated tokens (1.0 = no penalty) |
| max_tokens | integer | 1024 | Maximum length of the generated text, in tokens |
| stop_sequences | array | [] | Sequences that stop generation when encountered |
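To show how these knobs compose, here is a sketch of two common settings. The parameter names match the table above; grouping them into a single config dict is an assumption for illustration. Lower temperature with tighter sampling trades diversity for determinism, which suits extraction and summarization; higher values suit open-ended generation.

```python
# Deterministic extraction/summarization: low temperature, tight sampling.
precise_params = {
    "temperature": 0.1,
    "top_p": 0.9,
    "top_k": 40,
    "repetition_penalty": 1.1,   # >1.0 discourages repeated tokens
    "max_tokens": 512,
    "stop_sequences": ["\n\n"],  # stop at the first blank line
}

# Open-ended generation: higher temperature, broader sampling.
creative_params = {
    "temperature": 0.9,
    "top_p": 0.95,
    "top_k": 50,
    "repetition_penalty": 1.0,   # no penalty
    "max_tokens": 1024,
    "stop_sequences": [],        # run until max_tokens
}
```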