Image

Sign upLogin

Table of contents

Page Id

Configuration - Retrieval

This documentation is valid for:

This subsection allows you to specify how to obtain the augmented information sent to the context:

Retriever Type

It indicates the type of retriever used to obtain information; the default value is VectorStore.

The values it can take are:

HyDE

Specific retrieval method called Hypothetical Document Embeddings (HyDE). This method uses embedding techniques to take queries, generate hypothetical answers, embed the generated document, and use it as the final example.

By default, the HyDE class comes with some default queries, but you can also create custom queries that must have a single input variable {question}.

Contextual Compression

It tries to improve the answers returned from vector store document similarity searches by better taking into account the context from the query.

It wraps another retriever and uses a Document Compressor as an intermediate step after the initial similarity search that removes information irrelevant to the initial query from the retrieved documents. It aims to reduce the amount of distraction a subsequent chain has to deal with when parsing the retrieved documents and making its final judgments.

VectorStore

This is the default value, which directly uses the defined VectorStore without any further pre-processing.

Self Query

It first queries itself to retrieve filter information based on the natural language query. It parses the response and processes it as a JSON structure; then it executes a second query to the LLM with the query and filters applied based on the first one. As a prerequisite, each document (or chunk) needs to be ingested with associated filters applied, check a Self Query Use Case.
The score field is not returned to the client.

Multi Query

It automates the process of prompt tuning by using an LLM to generate multiple queries from different perspectives for a given user input query. For each query, it retrieves a set of relevant documents and takes the unique union across all queries to get a larger set of potentially relevant documents. By generating multiple perspectives on the same question, the MultiQueryRetriever might be able to overcome some of the limitations of the distance-based retrieval and get a richer set of results.
The score field is not returned to the client.

Score Threshold

It uses a feature called Recursive Similarity Search. With it, you can do a similarity search without having to rely solely on the “Document Count” value. The system will return all similar question matches based on the minimum score threshold configured (check below the associated parameter).
The score field is not returned to the client.

Graph

It enables the use of a graph-based information representation approach for retrieval, more information Graph Retrieval.
Amazon Knowledge Base

Check detail here.

Retriever prompt

It specifies the query that is sent to the retriever to search for information. This query could be a question or a specific request.

When the following values are configured, this parameter is not taken into account:

VectorStore
scoreThreshold
selfQuery

Clicking on the Set Default Prompt button automatically sets the Retriever Prompt to default values according to the option selected in "Retriever Type".

For example, if the value set in Retriever Type is graph, clicking on the Set Default Prompt button will populate the Retrieval Prompt.

Below are the default values of "Retriever prompt" according to the value set in Retriever Type.

Hypothetical Document Embeddings
```
{
"prompt": "Please write a minimal passage to answer the question only\nQuestion: {question}\nPassage:"
}
```
The specific query for HyDE requesting a minimal paragraph to answer the question.

Contextual Compression

{
"prompt": "Given the following question and context, extract any part of the context *AS IS* that is relevant to answer the question. If none of the context is relevant, return empty.\n> Question: {question}\n> Context:\n>>>\n{context}\n>>>\nExtracted relevant parts:"
}

The query for Contextual Compression looks for relevant parts of the context that answer the question.

Multi Query

You are an AI language model assistant. Your task is to generate {queryCount} different versions of the given user question to retrieve relevant documents from a vector database.

By generating multiple perspectives on the user question, your goal is to help the user overcome some of the limitations of distance-based similarity search.

Provide these alternative questions separated by newlines between XML tags. For example:
<questions>
Question 1
Question 2
Question 3
</questions>
Original question: {question}

Multi Query generates 5 additional queries from different perspectives. Each generated query is used to retrieve a set of relevant documents from the configured vectorStore, and then the single join of all document sets is performed to obtain a larger set of potentially relevant documents.

ScoreThreshold

It defines the minimal valid value to consider the information as valid when retrieved from the VectorStore; otherwise, it is discarded. If there are no valid Documents, no interaction takes place with the LLM. The default value is 0.0.

To select a correct threshold value you need to execute experiments on your data, type of expected questions and embeddings model. Notice that when changing any of these parameters you may need to retune the threshold parameter. Check these links that can guide you on how to get a good value for your use case

Profile Metadata

This field activates the advanced configuration. By default, it remains disabled. For more information, see RAG Profile Metadata configuration.

Endpoint

URL address pointing to the specific server or service where the models or retrieval methods are hosted; the value is optional.

Page Id

—

Created: 4 March 2024 - Last update: 17 March 2025 by lsilveira

Next: Index Configuration

Backlinks

See all