This log lists the most important fixes and features added to the platform.
- Support for Model Context Protocol (MCP) to integrate external tools.
- The GEAI proxy is a Python-based component that enables dynamic integration of external tools into Globant Enterprise AI (GEAI) via MCP. It acts as a bridge between GEAI and one or more MCP-compliant tool servers.
- Once the MCP servers are properly configured and connected through the GEAI proxy, the tools they expose become automatically available in the Lab → Tools section of GEAI, ready for use by any Agent without additional setup.
- See more information about this protocol at https://modelcontextprotocol.io/introduction
- See how to import tools into GEAI using MCP tool servers
- New /responses Endpoint for AI Interactions
- We’ve introduced a new /responses endpoint in Globant Enterprise AI (GEAI), which is fully compatible with the OpenAI Responses API. This addition allows developers to submit prompts as plain text, invoke functions, or pass files such as PDFs and images. The endpoint simplifies AI integration by supporting a familiar request/response structure, enabling a smoother transition for teams already using OpenAI-based workflows.
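Since the endpoint is OpenAI-Responses-compatible, a request body can be built the same way. A minimal sketch, assuming the endpoint mirrors the Responses API shape (`model` plus a plain-text `input`); the model name is only an example.

```python
# Sketch of a request body for the new /responses endpoint, assuming it
# mirrors the OpenAI Responses API shape. The model name is an example.
import json

def build_responses_payload(model, prompt):
    """Build a Responses-API-style request body from a plain-text prompt."""
    return {"model": model, "input": prompt}

payload = build_responses_payload("gpt-4.1", "Summarize this document.")
body = json.dumps(payload)  # serialized body, ready to POST to /responses
```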
- New Images API
- A new API is available that lets you generate images from text prompts. Supported providers: OpenAI, Vertex AI and xAI.
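A hedged sketch of what a client-side helper for the Images API could look like. Only the provider list (OpenAI, Vertex AI, xAI) comes from the release note; every field name and provider identifier below is an assumption, not the documented schema.

```python
# Illustrative helper for the new Images API. Field names and provider
# identifiers are assumptions; only the provider list is from the note.
SUPPORTED_PROVIDERS = {"openai", "vertex_ai", "xai"}

def build_image_request(prompt, provider="openai"):
    """Build a hypothetical text-to-image request body."""
    if provider not in SUPPORTED_PROVIDERS:
        raise ValueError("unsupported provider: " + provider)
    return {"provider": provider, "prompt": prompt}
```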
- LLMs:
- New Gemini models:
- Gemini 2.5 Pro Preview 'I/O edition': Built on its predecessor with significantly enhanced coding abilities and improved reasoning for complex tasks. Designed for developers and advanced users, this edition refines performance across benchmarks and expands its problem-solving reach. Release date: May 6th, 2025.
- Gemini 2.5 Flash: Google's latest model built for complex problem-solving. It allows users to activate thinking and set a thinking budget (1–24k tokens). Designed to balance reasoning and speed, it delivers better performance and accuracy by reasoning before responding.
- Updates in OpenAI's "o" series:
- o3: The most powerful reasoning model in the "o" family; it pushes the frontier across coding, math, science, visual perception, and more.
- o4-mini: A smaller model optimized for fast, cost-efficient reasoning; it achieves remarkable performance for its size and cost, particularly in math, coding, and visual tasks.
- o1-pro: Available through our Responses API, offering a faster, more flexible, and easier way to create agentic experiences.
- Over the next few weeks, the o1‑preview model will be migrated to the new o3 model, while o1‑mini will move to o4‑mini. More info in Deprecated Models.
- Refer to the LLMs with Reasoning Capabilities article for step-by-step guidance on how to use reasoning-enabled models through the API.
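For models that expose a configurable thinking budget, such as Gemini 2.5 Flash, a request might carry that budget as an extra field. A minimal sketch: the 1–24k token range comes from the note above, but the `thinking_budget` field name is an assumption, not the documented parameter.

```python
# Hypothetical sketch of attaching a thinking budget to a request for a
# reasoning-enabled model. The field name "thinking_budget" is an assumption;
# the 1-24k token range is quoted from the release note.
def with_thinking_budget(request, budget_tokens):
    """Return a copy of the request with a bounded thinking budget set."""
    if not 1 <= budget_tokens <= 24_000:
        raise ValueError("thinking budget must be between 1 and 24k tokens")
    out = dict(request)
    out["thinking_budget"] = budget_tokens
    return out
```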
- The new GPT-4.1 model series by OpenAI is now available in the production environment, featuring significant improvements in coding, instruction following, and long-context handling—along with their first-ever nano model.
- Grok 3 Model Family added, including two pairs of models:
- Lightweight Variants:
grok-3-mini-beta and grok-3-mini-fast-beta support function calling and enhanced reasoning (with configurable effort levels) for tasks like meeting scheduling and basic customer support. Both variants deliver identical response quality; the difference lies in response latency, with the "fast" version optimized for quicker responses.
- Flagship Variants:
grok-3-beta and grok-3-fast-beta are designed for enterprise use cases such as data extraction, coding, and text summarization. They bring deep domain expertise in fields like finance, healthcare, law, and science. Similar to the mini variants, these models have identical capabilities, with the "fast" version offering reduced response times at a higher cost.
- Llama 4 collection by Meta: We continue to expand our coverage of this model family. Recently added Llama 4 Scout and Maverick through Vertex AI's serverless API. Also available in Beta: Llama 4 Maverick via Groq and SambaNova, and Llama 4 Scout through the Cerebras provider, which offers this model with an inference speed of up to 2,600 tokens per second.
- Llama Nemotron Collection: The Llama Nemotron Ultra and Super models are now available in Beta as Nvidia NIM microservices. These are advanced reasoning models, post-trained to optimize performance on tasks such as RAG, tool calling, and alignment with human chat preferences. Both models support a context window of up to 128K tokens.
- Introducing the OpenRouter Provider (Beta):
- OpenRouter joins the GEAI model suite with its Auto Router meta-model, which analyzes each user query and dynamically routes it to the most suitable LLM. This workflow maximizes response quality while minimizing cost and latency, delivering the most efficient output possible.
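The routing idea can be illustrated with a toy rule-based dispatcher: inspect the query, then pick a model. Real routing happens server-side inside OpenRouter's meta-model; the rules and model names below are invented purely for illustration.

```python
# Toy illustration of the query-routing idea behind a meta-model like
# Auto Router. Rules and model names are invented for illustration only.
def route(query):
    q = query.lower()
    if any(k in q for k in ("prove", "integral", "derive")):
        return "reasoning-model"   # hypothetical: hard math goes to a reasoner
    if len(q) < 40:
        return "small-fast-model"  # hypothetical: short queries go to a cheap model
    return "general-model"
```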
- Qwen3 Family recently added: The latest generation in the Qwen large language model series features both dense and mixture-of-experts (MoE) architectures to excel in reasoning, multilingual support, and advanced agent tasks. Its unique ability to switch seamlessly between a thinking mode for complex reasoning and a non-thinking mode for efficient dialogue ensures versatile, high-quality performance. Significantly outperforming prior models like QwQ and Qwen2.5, Qwen3 delivers superior mathematics, coding, commonsense reasoning, creative writing, and interactive dialogue capabilities.
- Better processing of error messages, for example in cases where the LLMs return specific errors.
- New Python SDK for Globant Enterprise AI (PyGEAI). It's composed of libraries, tools, code samples, and other documentation that allows developers to interact with the platform more easily.
- New omni-parser API to get the content of different file types.
- RAG
- Support for new audio and video formats.
- New endpoints to reindex documents.
- New parameters available when ingesting documents.
- startPage and endPage to selectively process what is needed.
- media parameters such as mediaPrompt, dialogue, frameSamplingRate and so on.
- Fix: the truncate parameter is not supported when calling the cohere-rerank-3.5 model.
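The new ingestion parameters can be assembled client-side before the call. A sketch: the `startPage`, `endPage`, and `mediaPrompt` names come from the list above, but the overall request shape is an assumption.

```python
# Sketch of building ingestion options with the new parameters. The names
# startPage, endPage, and mediaPrompt come from the release note; the
# surrounding request shape is an assumption.
def build_ingest_options(start_page, end_page, media_prompt=None):
    """Assemble the optional ingestion parameters for a document."""
    if start_page < 1 or end_page < start_page:
        raise ValueError("invalid page range")
    opts = {"startPage": start_page, "endPage": end_page}
    if media_prompt is not None:
        opts["mediaPrompt"] = media_prompt
    return opts
```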
- Flows
- File support for Teams & Slack: You can now easily send documents, images, audio, and video files through Teams and Slack when you integrate a Flow into these conversational channels.
- Evaluation Module Enhancements
- New Metrics Introduced:
- Faithfulness: Assesses how factually consistent a response is with the retrieved context.
- Hallucination: Calculated as 1 - Faithfulness, indicating the level of fabricated information.
- Context Precision: Measures the proportion of relevant information within the retrieved contexts, compared against a reference answer for a given user input. (Note: Current calculation does not yet consider the position of retrieved chunks.)
- Noise Sensitivity: Analyzes the relationship between Assistant Accuracy and Context Precision across successive runs of an evaluation plan, varying the number of chunks retrieved. It examines how much, and in what way, the quality of the generated response changes when irrelevant content is added to the retrieved context.
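Two of these metrics can be written out directly. A worked sketch: Faithfulness is assumed to be a score in [0, 1], Hallucination is defined above as 1 - Faithfulness, and Context Precision is shown in its current position-agnostic form.

```python
# Worked sketch of two of the new evaluation metrics, per the definitions
# above. Faithfulness is assumed to be a score in [0, 1].
def hallucination(faithfulness):
    """Level of fabricated information: 1 - Faithfulness."""
    if not 0.0 <= faithfulness <= 1.0:
        raise ValueError("faithfulness must be in [0, 1]")
    return 1.0 - faithfulness

def context_precision(relevant_chunks, retrieved_chunks):
    """Proportion of retrieved chunks that are relevant (position-agnostic)."""
    if retrieved_chunks == 0:
        return 0.0
    return relevant_chunks / retrieved_chunks
```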
- The Lab Enhancements
- Flows Integration: The definition and management of Flows are now fully integrated into the Lab.
- Agentic Processes:
- New Conditional Gateway: Introduces the ability to define branching paths based on natural language prompts, enabling dynamic decision-making within processes.
- New Synchronization Gateway: Allows synchronization of multiple parallel paths. The process automatically waits at this point until all incoming paths are completed.
- Enhanced Task Flexibility: Tasks now support multiple inputs and outputs, significantly expanding the complexity and richness of the processes you can model.
- Meta-Agent Iris Improvements
- Enhanced LLM Selection Experience: When creating or editing an agent with Iris, users now benefit from a refined LLM selection flow, improving usability and model configuration accuracy.
- The Lab - Custom SSO not supported in this release.
- New Agents
- The Lab is designed for defining, managing, and orchestrating autonomous AI agents. It provides a standardized model for representing agents, their capabilities, and their interactions within complex workflows. The core components of the Lab include:
- Agents & Tools: This module allows for the definition and management of individual agents and their resources, such as skills and tools. It serves as a central hub for cataloging and managing the agent workforce.
- Agentic Processes: This component enables the definition of processes based on tasks executed by Agents. These workflows facilitate collaboration among agents to achieve larger objectives. More details at How to create an Agentic Process.
- Agent RunTime: This module provides the execution environment for agentic workflows, where agents perform tasks based on their skills and interact with artifacts, driven by events and the flow of knowledge.
- The Lab aims to meet the growing demand for intelligent, self-sufficient AI agents capable of collaborating and solving complex problems with minimal human intervention. It offers a flexible and adaptable model, allowing for the creation and management of a diverse range of agents, from co-pilots working alongside humans to fully automated agents executing complex tasks. Implemented as a module of Globant Enterprise AI, the Lab supports the development of intelligent agents that can work autonomously or in collaboration with humans and other agents.
- New features in Flows
- Agent Integration Component: You can now directly integrate agents created with the AI Lab into a Flow. These agents can be exposed through platforms like WhatsApp, Teams, or Slack.
- File Upload Support from WhatsApp: Flows now support receiving file attachments such as documents, images, audio, and video directly from WhatsApp interactions.
- Audio and Video Attachment in Web Chat: The Web Chat component now allows users to attach audio and video files, enhancing the interaction experience.
- New Features for the Data Analyst Agent
- Reduced Configuration Requirements: The setup needed to enable the assistant to respond to a wide range of questions has been minimized.
- Enhanced Analysis Module: An additional analysis module has been incorporated to complement the responses with relevant business conclusions and interpretations of the obtained data.
- New metrics to track processed tokens.
- LLMs:
- New Gemini 2.5 Pro (via providers Vertex AI and Gemini): Gemini 2.5 is Google’s latest reasoning model, engineered to tackle increasingly complex challenges. This model is designed for tasks that demand advanced analytical thinking and robust problem-solving capabilities. More details at LLM API.
- Migration to Gemini 2.0 series (Vertex AI): Based on recommendations from Vertex AI, we have migrated from the legacy Gemini 1.0 and 1.5 models to the more advanced Gemini 2.0 series, offering improved performance, scalability, and integration capabilities. For comprehensive information, please refer to the Deprecated Models section.
- New Azure OpenAI models (o1, o1-mini and o3-mini): We have expanded our model availability by introducing these models via Azure, providing the same high-quality capabilities as those offered through the OpenAI provider.
- DeepSeek-R1 via AWS Bedrock: Recently added through a Serverless API, DeepSeek-R1 offers reliable inference with a substantial 128K token context window and up to 32K maximum output tokens.
- OpenAI's new models with built-in web search tool: These specialized models integrate web search capabilities directly into the Chat Completions API, enabling them to both interpret and execute search queries in real time.
- openai/gpt-4o-search-preview
- openai/gpt-4o-mini-search-preview
- New models - Beta only:
- gemini/gemma-3-27b-it: Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models. Gemma 3 has a large, 128K context window, and multilingual support in over 140 languages.
- SambaNova:
- DeepSeek-R1: This provider offers the fastest performance for running DeepSeek, processing up to 198 tokens per second per user, with a 16K token context window. The model is hosted privately and securely in US data centers.
- DeepSeek-V3-0324: This model significantly outperforms its predecessor with enhanced reasoning benchmarks, improved code executability and refined web aesthetics, and superior Chinese writing aligned with the R1 style. It also offers better multi-turn interactive rewriting, translation quality, detailed report analysis, and more accurate function calling.
- QwQ-32B: SambaNova provides access to QwQ-32B-Preview, the best open source test-time compute model released by Alibaba.
- Llama 4 collection by Meta:
- Llama 4 Scout: A 17B-parameter multimodal MoE model with 16 experts that excels in text and image understanding. The model is currently in beta and is available via providers Nvidia, Groq and SambaNova. Via Groq, it supports a 128k tokens context window with fast inference at 460 tokens/sec, while SambaNova Cloud runs at 697 tokens/second/user.
- Llama 4 Maverick: Available via Nvidia, this 17-billion-parameter model featuring 128 experts supports a 32k-token context window.
- New LLMs:
- GPT-4.5
- Claude 3.7 Sonnet (Providers Anthropic, Vertex AI and AWS Bedrock)
- Updates in Gemini 2.0 series:
- vertex_ai/gemini-2.0-flash-lite-preview-02-05
- vertex_ai/gemini-2.0-flash-thinking-exp-01-21
- RAG Revision #6
- Support for o3-mini, gpt-4.5-preview, claude-3-7-sonnet-20250219, new DeepSeek, Gemini2* and sambanova LLM providers.
- New pinecone provider available for embeddings and rerankers.
- The CleanUp action message has been corrected to clearly specify that it will permanently delete the RAG Assistant files and update the information in the RDS.
- Added usage element on every response.
- Improvements when changing the LLM/Embeddings settings; all models and providers are normalized to be selected from standard combo-box items; use the override mechanism if you need other options.
- Support for guardrails.
- New documentAggregation property to decide how sources are grouped and returned.
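What a documentAggregation-style grouping does can be illustrated client-side: collapse per-chunk sources into one entry per document. The actual property is handled server-side; the chunk structure below is an assumption.

```python
# Client-side illustration of documentAggregation-style grouping: collapse
# per-chunk sources into one entry per document. The chunk structure here
# is an assumption, not the documented response schema.
from collections import defaultdict

def aggregate_sources(chunks):
    """Group retrieved source chunks by their originating document."""
    grouped = defaultdict(list)
    for chunk in chunks:
        grouped[chunk["document"]].append(chunk["text"])
    return {doc: " ".join(texts) for doc, texts in grouped.items()}
```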
- It is possible to provide feedback on the response of the Chat with Data Assistant in the Frontend.
- The new Evaluation APIs introduce key functionalities through three interconnected APIs: DataSet API, Evaluation Plan API, and Evaluation Result API. This version is primarily designed for users with a data science profile and is mainly accessed via APIs, complemented by a series of Jupyter notebooks that demonstrate their use. For a comprehensive guide on how to use these APIs, you can refer to How to evaluate an AI Assistant and the EvaluationAPITutorial.ipynb notebook, which provide practical examples and code for working through the evaluation process.
- File attachment support in Flows (version 0.9).
- Support for Full Story integration in the Workspace/Playground to generate user access statistics.
- In the LLM API, for models that have descriptions in the supported languages, the Response now includes a descriptions property containing the available translations, such as Spanish, English, and Japanese.
- Data Analyst Assistant 2.0 version presents important improvements, simplifying the interaction with the data by reducing the main components to just two: Dispatcher and Thinker. In addition, the metadata structure is automatically generated when loading the datasets, streamlining the setup process. For more information, see How to create a Data Analyst Assistant.
- The option to consult version-specific documentation is now available.
Articles with versions show the option “Other document versions” in the header. Clicking on “Other document versions” brings up a menu that allows you to choose between the most recent version (“Latest”) or earlier versions (e.g. “2025-02 or prior”). If you select a version other than “Latest”, a message appears: “This is not the latest version of this document; to access the latest version, click here”. This message provides a direct link to the most up-to-date documentation.
- Components Version Update
- New documentation with details about Supported Chart Types in Chat with Data Assistant.
- New Usage Limits API.
- Flows
- RAG Revision #5
- New endpoint GET /accessControl/apitoken/validate returns information about the organization and project associated with the provided apitoken.
- New LLMs:
- Already in Production
- Already in Beta
- DeepSeek:
- deepseek/deepseek-reasoner
- deepseek/deepseek-chat
- azure/deepseek-r1
- nvidia/deepseek-ai-deepseek-r1
- groq/deepseek-r1-distill-llama-70b
- sambanova/DeepSeek-R1-Distill-Llama-70B
- Updates in Gemini 2.0 series:
- gemini-2.0-flash-thinking-exp-01-21 (Via Providers Gemini and Vertex AI)
- gemini/gemini-2.0-flash-lite-preview
- gemini/gemini-2.0-pro-exp
- vertex_ai/gemini-2.0-flash-001
- sambanova/Llama-3.1-Tulu-3-405B
- Internationalization: Backoffice and frontend support for Japanese.
- Invitations now include information about the organization and project in the subject.
- New LLMs
- Already in Production
- OpenAI: o1 (2024-12-17 version)
- Already in Beta
- Guardrails configured by assistant.
- Rerank API to semantically order a list of document chunks given a query.
- New optional RAG Retrieve and Rerank adds an extra layer of precision to ensure that only the most relevant information reaches the model used in the generation step.
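The rerank step can be pictured as reordering candidate chunks by relevance to the query. A toy stand-in: the real Rerank API uses a semantic model, while the lexical overlap score below only shows the input/output shape of a retrieve-and-rerank stage.

```python
# Toy rerank step: order chunks by word overlap with the query. The real
# Rerank API uses a semantic model; this lexical overlap score is only a
# stand-in to show the input/output shape of a retrieve-and-rerank stage.
def rerank(query, chunks):
    """Return chunks sorted from most to least overlap with the query."""
    q_words = set(query.lower().split())
    def score(chunk):
        return len(q_words & set(chunk.lower().split()))
    return sorted(chunks, key=score, reverse=True)
```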
- Automatic Creation of Default Assistant
- Organization Usage Limits: It is possible to set quota limits to control organization expenses or usage.
- Chat with Data Assistant
- Show details about the generated query in the Playground.
- Support in Chat API to interact with Chat with Data Assistant.
- Flows
- Support for markdown when showing the response on the different channels supported by Flows (web, Slack, WhatsApp, and Teams).
- New component for connecting flows to the agent overflow console (Human-in-the-loop) via B2Chat. Please read How to connect a Flow to B2Chat.
- RAG
- Data Analyst Assistant
- Option to update metadata options.
- New version by default in new Data Analyst assistants.
- New LLMs
- OpenAI: gpt-4o-2024-11-20
- AWS Bedrock: Anthropic Claude 3.5 Haiku
- Amazon Nova models (Micro, Lite, and Pro)
- Llama 3.1 405B on Vertex AI
- Beta:
- Support for providers Cerebras, SambaNova and xAI (Grok models).
- All new Gemini Experimental models.
- Security
- Flows execution integrated into the Playground
- New LLMs support
- OpenAI: o1-preview and o1-mini
- Claude Sonnet 3.5 v2 - Providers: Anthropic, Vertex AI, and AWS Bedrock
- Llama 3.2 models - Providers: Vertex AI and AWS Bedrock
- Chat with data assistants
- Possibility to edit metadata, entities, and attribute descriptions.
- The Properties tab has been renamed to Settings along with the options that can be configured in it.
- RAG
- New returnSourceDocuments option to disable returning the documents section used to answer the question.
- New step option to use the assistant as a retrieval tool.
- Support for custom history in conversations using the chat_history variable.
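Custom history can be supplied alongside the question. A sketch of such a request: the `chat_history` variable name comes from the note above, but the (role, text) message structure is an assumption modeled on common chat APIs.

```python
# Sketch of supplying custom conversation history via the chat_history
# variable. The message structure is an assumption modeled on common
# chat APIs, not the documented GEAI schema.
def build_rag_request(question, history):
    """Build a request carrying a custom chat_history with the question."""
    return {
        "question": question,
        "chat_history": [
            {"role": role, "content": text} for role, text in history
        ],
    }
```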
- Stand-alone Frontend based on the new Playground UI
- Options to customize the Frontend to use the client logo, color palette, welcome message, and descriptions.
- Feature to collect feedback (thumbs up/down) in each response.
- Google Analytics support.
- Data Analyst Assistant
- Support to upload large CSV files.
- In the Organization API, the ability to set and manage usage limits on projects through the POST /project and GET /project/{id} endpoints has been added.
- Quota Limit now includes improvements such as highlighting the active quota in green, offering options to cancel active quotas, among others.
- Rebranding to Globant Enterprise AI
- Improvements in RAG
- Playground improvements
- File management improvements
- New LLMs supported
- NVIDIA provider with new models supported
- nvidia.nemotron-mini-4b-instruct
- meta.llama-3.1-8b-instruct
- meta.llama-3.1-70b-instruct
- meta.llama-3.1-405b-instruct
- meta.llama-3.2-3b-instruct
- Groq provider supported
- groq/llama-3.1-70b-versatile
- groq/llama-3.2-11b-vision-preview
- groq/llama-3.2-3b-preview
- groq/llama-3.2-1b-preview
- New embeddings models added
- Vertex AI:
- vertex_ai/textembedding-gecko
- vertex_ai/text-embedding-004
- vertex_ai/textembedding-gecko-multilingual
- Nvidia:
- nvidia/nvclip
- nvidia/nv-embed-v1
- nvidia/baai.bge-m3
- nvidia/snowflake.arctic-embed-l
- nvidia/nv-embedqa-mistral-7b-v2
- nvidia/embed-qa-4
- nvidia/nv-embedqa-e5-v5
- Support for file processing with prompt-based assistants. This enables many scenarios, such as uploading documents and then summarizing, extracting, or checking information. Depending on the model used by the assistant, it can also process audio, video, or images.
- Support for multi-modal LLMs allows processing docs, audio, video, and images in models like GPT-4o or Gemini Pro.
- Chat with data assistants
- The model used to build the queries was updated with GPT-4o, which improves the quality of the generated query.
- Configure the query builder server by organization and project. This means you can connect with different DBMS from each project when building Chat with data assistants.
- Show an explanation of how the query was built.
- New Playground Interface design
- New design
- Upload documents from the front end to chat with them.
- Flows builder
- There will be two types of Flows: one oriented toward building a conversational UI, and another for building assistant flows.
Access to these flows will only be available through Chat API or through the channels offered by Flows.
- New models hosted in AWS Bedrock added:
- Amazon Titan Express v1
- Amazon Titan Lite v1
- Anthropic Claude 3 Haiku
- Anthropic Claude 3 Sonnet
- Anthropic Claude 3.5 Sonnet
- Cohere Command
- Meta Llama 3 8B
- Meta LLama 3 70B
- It is now possible to provide clear guidance on the assistant's capabilities, allowing you to add information such as descriptions, features, and example prompts. This configuration can be done from the Backoffice, Start Page, or WelcomeData section of the Assistant API and RAG Assistants API endpoints.
- RAG Assistants
- Support of new models
- RAG Assistants
- A new option called CLEANUP allows deleting the documents associated with a RAG Assistant.
- When creating a new assistant, the following defaults are updated:
- Data Analyst Assistant
- Considerations
- Enterprise AI Proxy is deprecated. Use Chat API instead.
- Support for new LLMs
- OpenAI new model GPT-4o
- Models in Google Vertex
- Gemini 1.0 Pro
- Gemini 1.5 Flash preview-0514
- Gemini 1.5 Pro preview-0514
- Claude 3 Haiku
- Claude 3 Opus
- Claude 3 Sonnet
- RAG Improvements
- New option to initialize RAG Assistant based on another when creating a new RAG Assistant.
- New option to export document list in View Documents over a RAG Assistant.
- Added filter options when browsing Documents.
- SelfQuery RAG retriever partial support for a customized Prompt.
- Support for text-embedding-004 in Google models to generate the embeddings.
- Deprecated Assistant API endpoints.
- /assistant/text/begin
- /assistant/text
- Support to deploy in Google Cloud Platform.
- New Chat with Data Assistant.
- New Ingestion SDK to automate document ingestion in RAG assistants.
- New models hosted in NVIDIA platform supported. See Supported Chat Models for more details.
- New option to export information about projects and members available for the organization administrator.
- New API to extend dataset for Data Analyst Assistant 1.0.
- New filter by user email in Requests.
- Update default to use text-embedding-3-small OpenAI Embeddings for new RAG assistants.
- Support for gemini-1.5-pro-preview-0409 model added.
- GeneXus Identity Provider is implemented, expanding the login options in the Backoffice of the production environment. This allows for login not only with Google but also with Apple or GeneXus Account.
- It is possible to customize the icon for each assistant.
- Frontend improvements in UI/UX.
- Option to get feedback from end users when interacting with RAG Assistant.
- Gemini Pro LLM support.
- New Dashboard with user metrics.
- New Average Request Time metric added in the Project Dashboard.
- The option formerly known as 'Search Documents' has been improved and renamed to RAG Assistant (Retrieval Augmented Generation) to provide an optimized experience when searching and generating information.
- Feedback is provided during conversations with RAG Assistants, indicating where you are in the process.
- 'Response streaming' support for RAG Assistants.
- Settings are hidden when selecting an assistant, except when 'Chat with LLMs' is selected.
- Fixed: Too Many Redirects when accessing Playground using a browser in Spanish language.
- New backoffice design.
- Access to the Playground from the backoffice to chat with the assistants defined in the project.
- Upload images for analysis with GPT-4 Vision.
- Google Analytics support at the frontend.
- Keep a conversation thread when chatting with documents.
- An email notification is sent automatically when a new member is invited to join the organization or project.
- First version officially released!!
- The following OpenAI models are supported: GPT-4 Turbo (gpt-4-1106-preview), GPT-3.5 Turbo (gpt-3.5-turbo-1106), and GPT-4 Vision (gpt-4-vision-preview).
- AI-Driven Load Balancing: The platform automatically manages the Load Balancing process when you work with generative AI providers, efficiently addressing the limits imposed by LLM platforms.
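The core idea can be illustrated with a toy round-robin dispatcher: spread requests across providers so no single provider's rate limit is exhausted. The platform's actual strategy is internal and may differ; provider names below are examples only.

```python
# Toy round-robin over providers, illustrating the idea behind AI-driven
# load balancing. The platform's actual strategy is internal and may
# differ; provider names here are examples only.
import itertools

class RoundRobinBalancer:
    def __init__(self, providers):
        self._cycle = itertools.cycle(providers)

    def next_provider(self):
        """Pick the provider that should serve the next request."""
        return next(self._cycle)
```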