Evaluation APIs allow you to manage the entire evaluation lifecycle of your AI Assistants programmatically. This is achieved through three interconnected APIs: DataSet API, Evaluation Plan API, and Evaluation Result API.

These APIs are designed for data science-oriented users who prefer a code-driven approach to evaluation.

For a comprehensive guide on evaluating your AI Assistants, refer to How to evaluate an AI Assistant and the EvaluationAPITutorial.ipynb notebook.

DataSet API

Allows for the management of datasets used to evaluate Assistants. Users can create, retrieve, update, and delete datasets and manage the rows within them. It also supports managing expected sources and filter variables, and uploading files for datasets and their rows. This API is complemented by the DataSetAPI.ipynb notebook, which provides examples and code snippets for working with datasets and managing their content.
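
The following is a minimal sketch of how such calls might look from Python using the requests library. The base URL, endpoint paths, authentication header, and payload fields (for example expectedSources and filterVariables) are assumptions made for illustration only; refer to the DataSetAPI.ipynb notebook for the actual request shapes.

```python
# Hedged sketch: illustrative DataSet API calls with the `requests` library.
# Endpoint paths, payload fields, and the auth scheme are ASSUMPTIONS, not the
# documented contract; see DataSetAPI.ipynb for the real examples.
import os
import requests

BASE_URL = os.environ.get("EVAL_API_BASE_URL", "https://api.example.com")  # assumed env var
API_TOKEN = os.environ["EVAL_API_TOKEN"]                                   # assumed env var
HEADERS = {"Authorization": f"Bearer {API_TOKEN}"}

# Create a dataset (hypothetical endpoint and payload)
dataset = requests.post(
    f"{BASE_URL}/datasets",
    headers=HEADERS,
    json={"name": "billing-faq-v1", "description": "Regression set for the billing Assistant"},
).json()

# Add a row with the prompt, the expected answer, and evaluation hints (hypothetical fields)
requests.post(
    f"{BASE_URL}/datasets/{dataset['id']}/rows",
    headers=HEADERS,
    json={
        "input": "How do I download my invoice?",
        "expectedOutput": "Explain the Billing > Invoices download flow.",
        "expectedSources": ["billing-guide.pdf"],   # assumed field for expected sources
        "filterVariables": {"channel": "web"},      # assumed field for filter variables
    },
)
```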

Evaluation Plan API

Facilitates the definition of how an Assistant will be evaluated. Users can create, retrieve, update, and delete evaluation plans, associate system metrics with them, and execute them. This process is covered in the EvaluationPlanAPI.ipynb notebook, which includes examples for creating and managing evaluation plans and their associated metrics.
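
A minimal sketch of that flow is shown below, again under assumed endpoint paths, metric identifiers, and payload fields; the EvaluationPlanAPI.ipynb notebook documents the actual API.

```python
# Hedged sketch: creating an evaluation plan, associating metrics, and executing it.
# All paths, ids, and fields below are ASSUMPTIONS for illustration; see
# EvaluationPlanAPI.ipynb for the real request shapes.
import os
import requests

BASE_URL = os.environ.get("EVAL_API_BASE_URL", "https://api.example.com")
HEADERS = {"Authorization": f"Bearer {os.environ['EVAL_API_TOKEN']}"}

# Create a plan that links an Assistant to a dataset (hypothetical payload)
plan = requests.post(
    f"{BASE_URL}/evaluation-plans",
    headers=HEADERS,
    json={"name": "billing-faq-nightly", "assistantId": "asst-123", "dataSetId": "ds-456"},
).json()

# Associate system metrics with the plan (hypothetical endpoint and metric ids)
requests.post(
    f"{BASE_URL}/evaluation-plans/{plan['id']}/metrics",
    headers=HEADERS,
    json={"metricIds": ["answer-relevance", "groundedness"]},
)

# Execute the plan; an execution id is assumed to be returned for later retrieval
execution = requests.post(
    f"{BASE_URL}/evaluation-plans/{plan['id']}/execute",
    headers=HEADERS,
).json()
print("Execution id:", execution.get("id"))
```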

Evaluation Result API

Provides access to the results of executed evaluation plans. The API is accompanied by the EvaluationResultAPI.ipynb notebook, which provides examples for retrieving and analyzing evaluation results.
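
As a rough illustration, a client might poll for an execution's outcome and then summarize per-metric scores. The endpoint path, status values, and response structure used here are assumptions; the EvaluationResultAPI.ipynb notebook shows the actual result schema.

```python
# Hedged sketch: polling for an execution's results and printing metric scores.
# The path, status values, and response fields are ASSUMPTIONS; see
# EvaluationResultAPI.ipynb for the documented result schema.
import os
import time
import requests

BASE_URL = os.environ.get("EVAL_API_BASE_URL", "https://api.example.com")
HEADERS = {"Authorization": f"Bearer {os.environ['EVAL_API_TOKEN']}"}
EXECUTION_ID = "exec-789"  # placeholder: id returned when the plan was executed

# Poll until the execution reaches a terminal state (assumed status values)
while True:
    result = requests.get(
        f"{BASE_URL}/evaluation-results/{EXECUTION_ID}", headers=HEADERS
    ).json()
    if result.get("status") in ("completed", "failed"):
        break
    time.sleep(10)

# Print aggregate metric scores (assumed response structure)
for metric in result.get("metrics", []):
    print(f"{metric['name']}: {metric['score']:.2f}")
```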

Last update: March 2025 | © GeneXus. All rights reserved.