aimon.decorators.evaluate

Class: Dataset

A dataset is a collection of records that can be used for evaluations. The dataset should be a CSV file. The supported columns are:

  • "prompt": This is the system prompt used for the LLM.
  • "user_query": This is the query specified by the user.
  • "context_docs": These are context documents that are either retrieved by a RAG pipeline or supplied through other methods. For tasks like summarization, these documents could be directly specified by the user.
  • "output": This is the text generated by the LLM.
  • "instructions": These are the instructions provided to the LLM.
  • "metadata": This is a dictionary of additional metadata associated with the record.
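As a sketch of what such a file might look like, the snippet below writes a one-row CSV with the supported columns using only the standard library. The column names come from the list above; the row values are made-up placeholders.

```python
# Illustrative only: build a small dataset CSV with the supported columns.
# The column set matches the docs above; the row content is invented.
import csv
import json

rows = [
    {
        "prompt": "You are a helpful assistant.",
        "user_query": "Summarize the attached document.",
        "context_docs": "Acme Corp reported record revenue in Q3.",
        "output": "Acme Corp had record Q3 revenue.",
        "instructions": "Answer using only the provided context.",
        # "metadata" is a dictionary, serialized here as a JSON string.
        "metadata": json.dumps({"source": "demo"}),
    }
]

with open("evaluation_dataset.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
```

A file produced this way can then be uploaded with `aimon_client.datasets.create`, as shown below.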

aimon_client.datasets.create

This function creates a new dataset.

Args:

  • file (FileTypes): The CSV file that corresponds to the dataset.
  • json_data (str): JSON string containing dataset metadata. It should contain a name field (the dataset name) and a description field (a description of the dataset).
  • extra_headers (Headers | None, optional): Send extra headers.
  • extra_query (Query | None, optional): Add additional query parameters to the request.
  • extra_body (Body | None, optional): Add additional JSON properties to the request.
  • timeout (float | httpx.Timeout | None | NotGiven, optional): Override the client-level default timeout for this request, in seconds.

Returns:

  • Dataset: The created dataset.

Example:

from aimon import Client
import json

aimon_client = Client(auth_header="Bearer <AIMON API KEY>")

# Create a new dataset
file_path = "evaluation_dataset.csv"

dataset_args = json.dumps({
    "name": "evaluation_dataset.csv",
    "description": "This is a golden dataset"
})

with open(file_path, 'rb') as file1:
    aimon_dataset = aimon_client.datasets.create(
        file=file1,
        json_data=dataset_args
    )

Class: DatasetCollection

A dataset collection is a collection of one or more datasets that can be used for evaluations.

aimon_client.datasets.collection.create

This function creates a new dataset collection.

Args:

  • name (str): The name of the dataset collection.
  • description (str): The description of the dataset collection.
  • dataset_ids (List[str]): A list of dataset IDs to include in the collection.

Returns:

  • CollectionCreateResponse: The created dataset collection.

Example:

from aimon import Client

aimon_client = Client(auth_header="Bearer <AIMON API KEY>")

dataset_collection = aimon_client.datasets.collection.create(
    name="my_first_dataset_collection",
    dataset_ids=[aimon_dataset1.sha, aimon_dataset2.sha],
    description="This is a collection of two datasets."
)

Function: evaluate

Run an evaluation on a dataset collection using the Aimon API.

Signature

from aimon import evaluate

evaluate(
    application_name: str,
    model_name: str,
    dataset_collection_name: str,
    evaluation_name: str,
    headers: List[str],
    api_key: Optional[str] = None,
    aimon_client: Optional[Client] = None,
    config: Optional[Dict[str, Any]] = None
) -> List[EvaluateResponse]

Parameters:

  • application_name (str): The name of the application to run the evaluation on.
  • model_name (str): The name of the model to run the evaluation on.
  • dataset_collection_name (str): The name of the dataset collection to run the evaluation on.
  • evaluation_name (str): The name of the evaluation to be created.
  • headers (list): A list of column names in the dataset to be used for the evaluation. Must include 'context_docs'.
  • api_key (str, optional): The API key to use for the Aimon client. Required if aimon_client is not provided.
  • aimon_client (Client, optional): An instance of the Aimon client to use for the evaluation. If not provided, a new client will be created using the api_key.
  • config (dict, optional): A dictionary of configuration options for the evaluation.

Returns:

  • List[EvaluateResponse]: A list of EvaluateResponse objects containing the output and response for each record in the dataset collection.

Raises:

  • ValueError: If headers is empty or doesn't contain 'context_docs', or if required fields are missing from the dataset records.

Note:

The dataset records must contain 'context_docs' and all fields specified in the 'headers' argument. The 'prompt', 'output', and 'instructions' fields are optional.

Example:

from aimon import evaluate
import os

headers = ["context_docs", "user_query", "output"]
config = {
    "hallucination": {"detector_name": "default"},
    "instruction_adherence": {"detector_name": "default"}
}
results = evaluate(
    application_name="my_app",
    model_name="gpt-4o",
    # this dataset collection must exist in the Aimon platform
    dataset_collection_name="my_dataset_collection",
    evaluation_name="my_evaluation",
    headers=headers,
    api_key=os.getenv("AIMON_API_KEY"),
    config=config
)
for result in results:
    print(f"Output: {result.output}")
    print(f"Response: {result.response}")
    print("---")

Class: EvaluateResponse

Represents the response from an evaluation.

Constructor

EvaluateResponse(output, response)

Parameters:

  • output: The output of the evaluated function.
  • response: The response from the Aimon API analysis.