aimon.decorators.evaluate

Class: Dataset

A dataset is a collection of records that can be used for evaluations. The dataset should be a CSV file. The supported columns are:

"prompt": This is the system prompt used for the LLM
"user_query": This is the query specified by the user
"context_docs": These are context documents that are either retrieved by a RAG system or obtained through other methods. For tasks like summarization, these documents could be directly specified by the user.
"output": This is the text generated by the LLM
"instructions": These are the instructions provided to the LLM
"metadata": This is a dictionary of additional metadata associated with the record.
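
As a rough illustration of the column layout above, the following sketch builds a CSV with those six columns using only the Python standard library. The helper name `build_dataset_csv` and the sample record are hypothetical. How list and dictionary values ("context_docs", "metadata") should be encoded inside a CSV cell is not stated here, so JSON encoding is an assumption to verify against the platform before uploading.

```python
import csv
import io
import json

# Column names taken from the supported-columns list above.
COLUMNS = ["prompt", "user_query", "context_docs", "output", "instructions", "metadata"]


def build_dataset_csv(records):
    """Serialize records (dicts keyed by column name) into CSV text.

    Hypothetical helper, not part of the aimon package.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    for record in records:
        row = dict(record)
        # Assumption: nested values are stored as JSON strings in the cell.
        for key in ("context_docs", "metadata"):
            if key in row and not isinstance(row[key], str):
                row[key] = json.dumps(row[key])
        # Columns missing from a record are written as empty cells.
        writer.writerow(row)
    return buffer.getvalue()


# Made-up sample record for illustration only.
records = [
    {
        "prompt": "You are a helpful assistant.",
        "user_query": "What is the capital of France?",
        "context_docs": ["Paris is the capital of France."],
        "output": "The capital of France is Paris.",
        "instructions": "Answer in one sentence.",
        "metadata": {"source": "example"},
    }
]

csv_text = build_dataset_csv(records)

# Write the file so it can be opened in binary mode for upload.
with open("evaluation_dataset.csv", "w", newline="", encoding="utf-8") as f:
    f.write(csv_text)
```

The resulting file can then be opened with `open(file_path, 'rb')` and passed as the `file` argument when creating a dataset.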

`aimon_client.datasets.create`

This function creates a new dataset.

Args:

file (FileTypes): The CSV file containing the dataset.
name (str): Name of the dataset.
description (str, optional): Description of the dataset.
extra_headers (Headers | None, optional): Additional request headers.
extra_query (Query | None, optional): Add additional query parameters to the request.
extra_body (Body | None, optional): Add additional JSON properties to the request.
timeout (float | httpx.Timeout | None | NotGiven, optional): Override the client-level default timeout for this request, in seconds.

Returns:

Dataset: The created dataset.

Example:

from aimon import Client

aimon_client = Client(auth_header="Bearer <AIMON API KEY>")
# Create a new dataset
file_path = "evaluation_dataset.csv"

with open(file_path, 'rb') as file:
    aimon_dataset = aimon_client.datasets.create(
        file=file,
        name="evaluation_dataset.csv",
        description="This is a golden dataset"
    )

Class: DatasetCollection

A dataset collection is a collection of one or more datasets that can be used for evaluations.

`aimon_client.datasets.collection.create`

This function creates a new dataset collection.

Args:

name (str): The name of the dataset collection.
description (str): The description of the dataset collection.
dataset_ids (List[str]): A list of dataset IDs to include in the collection.

Returns:

CollectionCreateResponse: The created dataset collection.

Example:

from aimon import Client

aimon_client = Client(auth_header="Bearer <AIMON API KEY>")

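# aimon_dataset1 and aimon_dataset2 are Dataset objects returned by
# earlier calls to aimon_client.datasets.create
dataset_collection = aimon_client.datasets.collection.create(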
    name="my_first_dataset_collection",
    dataset_ids=[aimon_dataset1.sha, aimon_dataset2.sha],
    description="This is a collection of two datasets."
)

Function: evaluate

Run an evaluation on a dataset collection using the Aimon API.

Signature

from aimon import evaluate

evaluate(
    application_name: str,
    model_name: str,
    dataset_collection_name: str,
    evaluation_name: str,
    headers: List[str],
    api_key: Optional[str] = None,
    aimon_client: Optional[Client] = None,
    config: Optional[Dict[str, Any]] = None
) -> List[EvaluateResponse]

Parameters:

application_name (str): The name of the application to run the evaluation on.
model_name (str): The name of the model to run the evaluation on.
dataset_collection_name (str): The name of the dataset collection to run the evaluation on.
evaluation_name (str): The name of the evaluation to be created.
headers (list): A list of column names in the dataset to be used for the evaluation. Must include 'context_docs'.
api_key (str, optional): The API key to use for the Aimon client. Required if aimon_client is not provided.
aimon_client (Client, optional): An instance of the Aimon client to use for the evaluation. If not provided, a new client will be created using the api_key.
config (dict, optional): A dictionary of configuration options for the evaluation.

Returns:

List[EvaluateResponse]: A list of EvaluateResponse objects containing the output and response for each record in the dataset collection.

Raises:

ValueError: If headers is empty or doesn't contain 'context_docs', or if required fields are missing from the dataset records.

Note:

The dataset records must contain 'context_docs' and all fields specified in the 'headers' argument. The 'prompt', 'output', and 'instructions' fields are optional.
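
The requirements in the note above can be checked locally before an evaluation is started. The sketch below is a hypothetical pre-flight helper (`check_headers` is not part of the aimon package) that mirrors only the conditions documented here; the library's actual validation and error messages may differ.

```python
def check_headers(headers, records):
    """Raise ValueError if headers or records violate the documented rules.

    Hypothetical helper: it reproduces the documented conditions, not the
    library's internal implementation.
    """
    # headers must be non-empty and must include 'context_docs'.
    if not headers or "context_docs" not in headers:
        raise ValueError("headers must be non-empty and include 'context_docs'")
    # Every record must contain all fields named in headers.
    for index, record in enumerate(records):
        missing = [name for name in headers if name not in record]
        if missing:
            raise ValueError(f"record {index} is missing fields: {missing}")


headers = ["context_docs", "user_query", "output"]
good_record = {
    "context_docs": ["Paris is the capital of France."],
    "user_query": "What is the capital of France?",
    "output": "Paris.",
}

# Passes silently: all requested fields are present.
check_headers(headers, [good_record])

# A record without 'output' is rejected.
try:
    check_headers(headers, [{"context_docs": [], "user_query": "hi"}])
    rejected = False
except ValueError:
    rejected = True
```

Running a check like this on the parsed CSV rows surfaces missing columns before any API call is made.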

Example:

from aimon import evaluate
import os

headers = ["context_docs", "user_query", "output"]
config = {
    "hallucination": {"detector_name": "default"},
    "instruction_adherence": {"detector_name": "default"}
}
results = evaluate(
    application_name="my_app",
    model_name="gpt-4o",
    # this dataset collection must exist in the Aimon platform
    dataset_collection_name="my_dataset_collection",
    evaluation_name="my_evaluation",
    headers=headers,
    api_key=os.getenv("AIMON_API_KEY"),
    config=config
)
for result in results:
    print(f"Output: {result.output}")
    print(f"Response: {result.response}")
    print("---")

Class: EvaluateResponse

Represents the response from an evaluation.

Constructor

EvaluateResponse(output, response)

Parameters:

output: The output of the evaluated function.
response: The response from the Aimon API analysis.
