The context variable allows you to control how information from your documents (in the form of chunks) is presented to the LLM during completion requests. This helps you tailor the context to your specific needs and improve the accuracy of the LLM's responses.
- Prompt Section: You define the context variable within the Prompt section of the RAG Assistant.
- Automatic Population: The RAG Assistant automatically populates the context variable with information retrieved from your documents based on semantic similarity. This information is usually the textual content of the retrieved document chunks.
You can customize how the context variable is populated by using the template parameter within the Profile Metadata section, under the Retrieval settings of the RAG Assistant. This lets you include specific metadata elements from your documents, giving the LLM more context. Add a chunkDocument element that contains at least the template element, and reference metadata elements using curly brackets, for example:
{
  "chat": {
    "search": {
      ....
      "chunkDocument": {
        "separator": "\n---\n",
        "template": "{page_content}\nFile:{name}.{extension}\nDescription:{description}"
      }
    }
  }
}
| Parameter | Description |
| --- | --- |
| template | Prompt used to format each document chunk into a string. The template is applied to every retrieved chunk; by default it is {page_content}, which is the chunk's complete text. You can add plain text and metadata fields using the {metadata_element} notation. Note: This is particularly useful if you ingest documents with extra metadata. If not, you can still use the predefined ones. |
| separator | The separator placed between formatted document chunks; the default value is "\n\n". |
The following predefined metadata elements are available:
| Element | Description |
| --- | --- |
| name | The document name, matching the associated name on the backend |
| extension | Document extension |
| description | Document description, when available |
Suppose you have two documents:
- file1.pdf: Contains information about GeneXus development best practices.
- file2.pdf: Contains information about GeneXus security guidelines.
You want to include the document name, extension, and description in the context provided to the LLM.
Go to the Profile Metadata section under Retrieval and set the template parameter so that it includes the name, extension, and description metadata elements:
{
  "chat": {
    "search": {
      "chunkDocument": {
        "separator": "\n---\n",
        "template": "{page_content}\nFile:{name}.{extension}\nDescription:{description}"
      }
    }
  }
}
Without any customization, the context variable within the Prompt would look like this:
...
<context>
Chunk1Data
Chunk2Data
</context>
...
With the template parameter defined above, the context variable would look like this:
...
<context>
Chunk1Data
File:file1.pdf
Description:some description
---
Chunk2Data
File:file2.pdf
Description:some other description
</context>
...
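The substitution shown above can be approximated in a few lines of Python. This is only a sketch of the behavior described on this page, not the RAG Assistant's actual implementation: the `render_context` function and the `_EmptyOnMissing` helper are hypothetical names, and the split of each file name into `name` and `extension` values is an assumption made so that the output matches the example.

```python
class _EmptyOnMissing(dict):
    """Hypothetical helper: unknown metadata keys resolve to an empty string."""

    def __missing__(self, key):
        return ""


def render_context(chunks, template="{page_content}", separator="\n\n"):
    """Format every chunk with the template, then join them with the separator.

    The defaults mirror the documented ones: the chunk text alone,
    separated by a blank line.
    """
    return separator.join(
        template.format_map(_EmptyOnMissing(chunk)) for chunk in chunks
    )


# Assumed chunk metadata for the two example documents.
chunks = [
    {"page_content": "Chunk1Data", "name": "file1", "extension": "pdf",
     "description": "some description"},
    {"page_content": "Chunk2Data", "name": "file2", "extension": "pdf",
     "description": "some other description"},
]

# Once the JSON profile is parsed, "\n" in these values is a real newline.
template = "{page_content}\nFile:{name}.{extension}\nDescription:{description}"
context = render_context(chunks, template, separator="\n---\n")
print(context)
```

Running the sketch prints the same text that appears between the context tags in the customized example.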
Suppose you ingest files with additional metadata fields such as url, type, and year. You can update the template parameter within Profile Metadata to include these fields:
"chunkDocument": {
  "separator": "\n---\n",
  "template": "{page_content}\nFile: {name}.{extension}\nDescription: {description}\nDocument Type:{type}\nUrl: {url}\nPublish Year: {year}"
}
You can also try alternative formatting, such as wrapping each field in a tag. The following template carries the same fields as the previous example. Note that a JSON string cannot span several lines, so the template is written on a single line with \n marking each line break:
"chunkDocument": {
  "separator": "\n---\n",
  "template": "<pageContent>{page_content}</pageContent>\n<file>{name}.{extension}</file>\n<description>{description}</description>\n<type>{type}</type>\n<url>{url}</url>\n<year>{year}</year>"
}
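Long templates are easier to maintain when they are assembled line by line and then serialized, since the serializer takes care of escaping the line breaks. The following sketch uses Python's standard json module; the `lines` and `profile` variable names are illustrative, and the way you paste the result into Profile Metadata is up to you.

```python
import json

# One entry per line of the tag-based template.
lines = [
    "<pageContent>{page_content}</pageContent>",
    "<file>{name}.{extension}</file>",
    "<description>{description}</description>",
    "<type>{type}</type>",
    "<url>{url}</url>",
    "<year>{year}</year>",
]

profile = {
    "chat": {
        "search": {
            "chunkDocument": {
                "separator": "\n---\n",
                # Real newlines here become the two-character \n escape in JSON.
                "template": "\n".join(lines),
            }
        }
    }
}

profile_json = json.dumps(profile, indent=2)
print(profile_json)
```

The printed document is valid JSON, with the whole template on a single line.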
- Metadata Availability: Some metadata elements may not be available for all documents. If a field is not found, it is replaced with an empty string.
- Case Sensitivity: Make sure to reference metadata elements with the correct casing.
- Chunk Count: Check the Chunk Count to understand how the document was processed.
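Because a missing field is silently replaced with an empty string, a placeholder with the wrong casing can go unnoticed. A quick local check, sketched below with Python's standard string.Formatter, lists the placeholders in a template that do not match a known set of metadata keys. The `unknown_placeholders` function is a hypothetical helper, and the `available` set must be adjusted to the metadata you actually ingest.

```python
from string import Formatter


def unknown_placeholders(template, available):
    """Return the placeholders in the template that are not in `available`.

    The comparison is case-sensitive, so {Name} does not match "name".
    """
    used = [field for _, field, _, _ in Formatter().parse(template) if field]
    return [name for name in used if name not in available]


# The chunk text plus the predefined metadata elements listed on this page.
available = {"page_content", "name", "extension", "description"}

template = "{page_content}\nFile:{Name}.{extension}"
print(unknown_placeholders(template, available))  # ['Name']
```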