
Retrieval Augmented Generation (RAG) is an approach that combines information retrieval from unstructured data and text generation to improve performance on tasks such as question answering.

In the retrieval phase, a selective search is performed on a set of documents, identifying related information and efficiently reducing the search space. This approach ensures that focus is placed on the most relevant and meaningful information.

The retrieved information is then integrated into the prompt and expanded, providing deeper connections and insights into the unstructured data. This enrichment adds details and relationships, so the prompt captures not only the immediately relevant passages but also their broader contextual connections, which helps in interpreting the retrieved information.

In the subsequent text generation phase, this expanded data set is used to produce coherent and contextual responses. By working on the information previously retrieved and added to the prompt (in-context learning), the generative model produces accurate, relevant answers that incorporate the enriched details from the textual information, completing the cycle of the comprehensive RAG approach.
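The retrieve-augment-generate cycle described above can be sketched in a few lines. This is a toy illustration only: the word-overlap retriever and the prompt template are stand-ins, not GeneXus Enterprise AI APIs.

```python
# Minimal sketch of the RAG cycle: retrieve relevant chunks, then
# augment the prompt with them before generation (in-context learning).
# The corpus, scoring function, and template are illustrative stand-ins.

def retrieve(query, corpus, top_k=2):
    """Score documents by word overlap with the query (toy retriever)."""
    q = set(query.lower().split())
    scored = sorted(corpus,
                    key=lambda d: len(q & set(d.lower().split())),
                    reverse=True)
    return scored[:top_k]

def build_prompt(query, chunks):
    """Augment the prompt with the retrieved context."""
    context = "\n".join(f"- {c}" for c in chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

corpus = [
    "RAG combines retrieval and generation.",
    "Chunks are fragments of source documents.",
    "Bananas are yellow.",
]
chunks = retrieve("What does rag combine?", corpus)
prompt = build_prompt("What does rag combine?", chunks)
```

The augmented prompt would then be sent to a generative model, which answers from the supplied context rather than from its parametric memory alone.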

RAG Assistants in GeneXus Enterprise AI


GeneXus Enterprise AI makes it possible to use RAG assistants to chat with or search for information stored in documents (unstructured data). This functionality is available through the RAG Assistants API or the RAG Assistants section of the GeneXus Enterprise AI backend.

The different phases of this process are described below:

Data Ingestion

The initial phase, known as the Ingestion stage, involves loading various types of documents from multiple sources. It is not limited to data acquisition; it also includes the configuration of chunks.

Documents and other data sources

Data can be loaded in different formats and from different sources.

In addition, other data sources generated by end users are considered.

Configuration of Chunks

In parallel with document loading, the configuration of chunks is performed to optimize information management.

These chunks act as organizational fragments, enabling efficient data segmentation. This process goes beyond simple data partitioning, as it integrates with the Index Profile and its chunking strategy.

This connection ensures that data segmentation is consistent and aligned with the specific requirements of the index, thus optimizing data preparation for subsequent processing.

The default chunking strategy is as follows:

  • chunkSize: 1000 characters
  • chunkOverlap: 100 characters
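A fixed-size chunking strategy with overlap, such as the default above, can be sketched as follows. The helper below is an illustration of the general technique, not the product's internal implementation.

```python
def chunk_text(text, chunk_size=1000, chunk_overlap=100):
    """Split text into fixed-size character chunks with overlap,
    mirroring the default strategy (chunkSize=1000, chunkOverlap=100).
    Each chunk starts chunk_size - chunk_overlap characters after the
    previous one, so consecutive chunks share chunk_overlap characters."""
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

doc = "x" * 2500
chunks = chunk_text(doc)
# A 2500-character document yields chunks starting at 0, 900, and 1800.
```

The overlap preserves context at chunk boundaries, so a sentence split across two chunks still appears intact in at least one of them.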

Retrieval

In the Retrieval stage, the data retrieval process is started, leveraging the previously ingested and organized information.

During this phase, the vector database is accessed and the documents loaded during Ingestion are indexed and stored efficiently.

The main component of this stage is data extraction through embeddings and access to the vector store.

Embeddings

Embeddings are numeric arrays derived from the text that capture the contextual essence of both the document chunks and the user's query.
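Because embeddings are numeric arrays, "similar meaning" becomes "nearby vectors", typically measured with cosine similarity. The hand-picked vectors below are toy values for illustration; a real system obtains them from a trained embedding model.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

# Hand-picked toy vectors; real embeddings come from an embedding model.
query_vec = [0.9, 0.1, 0.0]
chunk_vecs = {
    "chunk about retrieval": [0.8, 0.2, 0.1],
    "chunk about cooking":   [0.0, 0.1, 0.9],
}
best = max(chunk_vecs, key=lambda k: cosine_similarity(query_vec, chunk_vecs[k]))
```

The chunk whose vector points in nearly the same direction as the query vector is ranked as the most relevant.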

Vector Store

The Vector Store, powered by embeddings and metadata, connects with providers, index parameters, and distance metrics to ensure accurate and contextualized retrieval of data chunks relevant to the given query.
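Conceptually, a vector store holds each chunk's embedding alongside its metadata, and a query ranks the stored chunks by a distance metric, optionally filtered on metadata. The in-memory list, field names, and Euclidean metric below are assumptions for illustration, not the actual store's schema or provider interface.

```python
import math

# Toy in-memory "vector store": each entry holds a chunk's text, its
# embedding, and metadata usable for filtering. Names are illustrative.
store = [
    {"text": "RAG overview",   "embedding": [1.0, 0.0], "source": "doc1"},
    {"text": "Chunk settings", "embedding": [0.0, 1.0], "source": "doc2"},
    {"text": "RAG retrieval",  "embedding": [0.9, 0.1], "source": "doc1"},
]

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def query_store(query_vec, top_k=2, source=None):
    """Return the top_k chunks nearest to query_vec, with an optional
    metadata filter on the 'source' field."""
    candidates = [c for c in store if source is None or c["source"] == source]
    return sorted(candidates,
                  key=lambda c: euclidean(query_vec, c["embedding"]))[:top_k]

hits = query_store([1.0, 0.05], top_k=2)
```

Production vector stores replace the brute-force scan with approximate nearest-neighbor indexes, which is what the index parameters and distance metrics mentioned above configure.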

Generation

The Generation stage marks the point at which the GeneXus Enterprise AI architecture focuses on generating relevant and contextually consistent responses.

In this process, the system uses the RAG Assistant configuration to know which model to access and with which parameters.

RAG Assistant

This assistant incorporates key elements such as Prompts, LLMs, and search retrieval parameters to define the search strategy. Prompts act as guides to contextualize the responses, while the LLM contributes to the consistency and relevance of the generated content.

In addition, the assistant makes it possible to add variable-based adjustments and filters, enhancing the customization of the generated responses. This adaptability allows for responses to be specific and relevant to the particular needs of the end user.
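One way to picture how prompts, retrieved chunks, and variable-based adjustments combine is a template fill like the one below. The template format and variable names are assumptions for illustration; they are not the actual GeneXus Enterprise AI prompt syntax.

```python
# Sketch of a RAG assistant assembling its final prompt from a template,
# the retrieved chunks, and user-supplied variables. All names here are
# hypothetical, not the product's real configuration format.

PROMPT_TEMPLATE = (
    "You are a helpful assistant for {audience}.\n"
    "Use only the context below to answer.\n\n"
    "Context:\n{context}\n\nQuestion: {question}"
)

def assemble_prompt(question, chunks, variables):
    """Number the retrieved chunks and substitute them, together with
    the variables, into the prompt template."""
    context = "\n".join(f"[{i + 1}] {c}" for i, c in enumerate(chunks))
    return PROMPT_TEMPLATE.format(context=context,
                                  question=question,
                                  **variables)

prompt = assemble_prompt(
    "What is chunk overlap?",
    ["chunkOverlap is the number of shared characters between chunks."],
    {"audience": "developers"},
)
```

Variables such as the audience placeholder let the same assistant produce differently tailored responses without changing its retrieval setup.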

End user interaction

Finally, the GeneXus Enterprise AI architecture enables interaction with the user through the API. This interface facilitates smooth and efficient communication between end users and RAG Assistants, completing the cycle and providing answers to queries in an efficient manner.
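Interaction with an assistant over the API typically means an authenticated HTTP request carrying the user's question. The host, path, header names, and payload fields in this sketch are placeholders; consult the RAG Assistants API reference for the real contract.

```python
import json

# Hypothetical sketch of preparing an HTTP request to a RAG assistant.
# Endpoint path, headers, and body fields are placeholders, NOT the
# actual GeneXus Enterprise AI API contract.

BASE_URL = "https://api.example.com"  # placeholder host
API_TOKEN = "YOUR_API_TOKEN"          # placeholder credential

def build_chat_request(assistant_name, question):
    """Build the URL, headers, and JSON body for a chat request."""
    url = f"{BASE_URL}/rag-assistants/{assistant_name}/chat"  # assumed path
    headers = {
        "Authorization": f"Bearer {API_TOKEN}",
        "Content-Type": "application/json",
    }
    body = json.dumps({"question": question})
    return url, headers, body

url, headers, body = build_chat_request("Docs", "What is RAG?")
# An HTTP client (e.g. urllib.request) would POST `body` to `url` with
# `headers` and read the generated answer from the response.
```

The response would carry the generated answer, closing the loop from user query through retrieval and generation back to the user.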

Last update: September 2024 | © GeneXus. All rights reserved. GeneXus, powered by Globant.