aimon.decorators.evaluate
Class: Dataset
A dataset is a collection of records that can be used for evaluations. The dataset should be a CSV file. The supported columns are:
- "prompt": This is the system prompt used for the LLM
- "user_query": This is the query specified by the user
- "context_docs": These are context documents that are either retrieved from a RAG or through other methods. For tasks like summarization, these documents could be directly specified by the user.
- "output": This is the text generated by the LLM
- "instructions": These are the instructions provided to the LLM
- "metadata": This is a dictionary of additional metadata associated with the record.
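As an illustration, a CSV file with these columns could be produced with Python's standard csv module. This is only a sketch: the sample record is made up, the helper name write_dataset_csv is hypothetical, and storing "context_docs" and "metadata" as JSON strings inside their cells is an assumption, not a format confirmed by this page.

```python
import csv
import io
import json

# Column names taken from the list of supported columns above.
COLUMNS = ["prompt", "user_query", "context_docs", "output", "instructions", "metadata"]


def write_dataset_csv(records, stream):
    """Write a list of dict records as CSV rows using the supported columns."""
    writer = csv.DictWriter(stream, fieldnames=COLUMNS)
    writer.writeheader()
    for record in records:
        row = dict(record)
        # Assumption: list and dict values are serialized as JSON strings.
        for key in ("context_docs", "metadata"):
            if key in row and not isinstance(row[key], str):
                row[key] = json.dumps(row[key])
        writer.writerow(row)


# Hypothetical sample record, for illustration only.
sample_records = [
    {
        "prompt": "You are a helpful assistant.",
        "user_query": "What is the capital of France?",
        "context_docs": ["Paris is the capital of France."],
        "output": "The capital of France is Paris.",
        "instructions": "Answer in one sentence.",
        "metadata": {"source": "example"},
    }
]

buffer = io.StringIO()
write_dataset_csv(sample_records, buffer)
csv_text = buffer.getvalue()
```

To write to disk instead of an in-memory buffer, pass a file opened with open(path, "w", newline="") as the stream.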
aimon_client.datasets.create
This function creates a new dataset.
Args:
- file (FileTypes): The CSV file that corresponds to the dataset.
- json_data (str): JSON string containing dataset metadata. This should contain the name (dataset name) and description (description of the dataset) fields.
- extra_headers (Headers | None, optional): Send extra headers.
- extra_query (Query | None, optional): Add additional query parameters to the request.
- extra_body (Body | None, optional): Add additional JSON properties to the request.
- timeout (float | httpx.Timeout | None | NotGiven, optional): Override the client-level default timeout for this request, in seconds.
Returns:
- Dataset: The created dataset.
Example:
from aimon import Client
import json
aimon_client = Client(auth_header="Bearer <AIMON API KEY>")
# Create a new dataset
file_path = "evaluation_dataset.csv"
dataset_args = json.dumps({
    "name": "evaluation_dataset.csv",
    "description": "This is a golden dataset"
})
with open(file_path, 'rb') as file1:
    aimon_dataset = aimon_client.datasets.create(
        file=file1,
        json_data=dataset_args
    )
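The json_data argument can also be built by a small helper that guarantees both documented fields are present before the upload. The sketch below is illustrative only: build_dataset_args is a hypothetical name, and rejecting empty values is this sketch's own choice rather than a documented API requirement.

```python
import json


def build_dataset_args(name: str, description: str) -> str:
    """Return a JSON string holding the name and description fields."""
    # This check is the sketch's own safeguard, not a documented API rule.
    if not name or not description:
        raise ValueError("both name and description must be non-empty")
    return json.dumps({"name": name, "description": description})


dataset_args = build_dataset_args(
    "evaluation_dataset.csv",
    "This is a golden dataset",
)
```

The resulting string can be passed as json_data in the datasets.create call shown in the example above.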
Class: DatasetCollection
A dataset collection is a collection of one or more datasets that can be used for evaluations.
aimon_client.datasets.collection.create
This function creates a new dataset collection.
Args:
- name (str): The name of the dataset collection.
- description (str): The description of the dataset collection.
- dataset_ids (List[str]): A list of dataset IDs to include in the collection.
Returns:
- CollectionCreateResponse: The created dataset collection.
Example:
from aimon import Client
aimon_client = Client(auth_header="Bearer <AIMON API KEY>")
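# aimon_dataset1 and aimon_dataset2 are assumed to be datasets returned by
# earlier aimon_client.datasets.create calls
dataset_collection = aimon_client.datasets.collection.create(
    name="my_first_dataset_collection",
    dataset_ids=[aimon_dataset1.sha, aimon_dataset2.sha],
    description="This is a collection of two datasets."
)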
Function: evaluate
Run an evaluation on a dataset collection using the Aimon API.
Signature
from aimon import evaluate
evaluate(
    application_name: str,
    model_name: str,
    dataset_collection_name: str,
    evaluation_name: str,
    headers: List[str],
    api_key: Optional[str] = None,
    aimon_client: Optional[Client] = None,
    config: Optional[Dict[str, Any]] = None
) -> List[EvaluateResponse]
Parameters:
- application_name (str): The name of the application to run the evaluation on.
- model_name (str): The name of the model to run the evaluation on.
- dataset_collection_name (str): The name of the dataset collection to run the evaluation on.
- evaluation_name (str): The name of the evaluation to be created.
- headers (list): A list of column names in the dataset to be used for the evaluation. Must include 'context_docs'.
- api_key (str, optional): The API key to use for the Aimon client. Required if aimon_client is not provided.
- aimon_client (Client, optional): An instance of the Aimon client to use for the evaluation. If not provided, a new client will be created using the api_key.
- config (dict, optional): A dictionary of configuration options for the evaluation.
Returns:
- List[EvaluateResponse]: A list of EvaluateResponse objects containing the output and response for each record in the dataset collection.
Raises:
- ValueError: If headers is empty or doesn't contain 'context_docs', or if required fields are missing from the dataset records.
Note:
The dataset records must contain 'context_docs' and all fields specified in the 'headers' argument. The 'prompt', 'output', and 'instructions' fields are optional.
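The conditions listed under Raises and Note can be checked locally before calling evaluate. The following is a minimal sketch of such a pre-check, not the library's actual implementation: the function name validate_headers_and_records, the error messages, and the sample record are all hypothetical.

```python
from typing import Any, Dict, List


def validate_headers_and_records(
    headers: List[str], records: List[Dict[str, Any]]
) -> None:
    """Raise ValueError under the conditions described in Raises and Note."""
    if not headers:
        raise ValueError("headers must be a non-empty list")
    if "context_docs" not in headers:
        raise ValueError("headers must include 'context_docs'")
    for index, record in enumerate(records):
        # Every field named in headers must be present in each record.
        missing = [name for name in headers if name not in record]
        if missing:
            raise ValueError(f"record {index} is missing fields: {missing}")


# Hypothetical record used only to exercise the checks.
example_records = [
    {
        "context_docs": ["Paris is the capital of France."],
        "user_query": "What is the capital of France?",
    }
]

# Passes silently because both headers are present in the record.
validate_headers_and_records(["context_docs", "user_query"], example_records)
```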
Example:
from aimon import evaluate
import os
headers = ["context_docs", "user_query", "output"]
config = {
    "hallucination": {"detector_name": "default"},
    "instruction_adherence": {"detector_name": "default"}
}
results = evaluate(
    application_name="my_app",
    model_name="gpt-4o",
    # this dataset collection must exist in the Aimon platform
    dataset_collection_name="my_dataset_collection",
    evaluation_name="my_evaluation",
    headers=headers,
    api_key=os.getenv("AIMON_API_KEY"),
    config=config
)
for result in results:
    print(f"Output: {result.output}")
    print(f"Response: {result.response}")
    print("---")
Class: EvaluateResponse
Represents the response from an evaluation.
Constructor
EvaluateResponse(output, response)
Parameters:
- output: The output of the evaluated function.
- response: The response from the Aimon API analysis.
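Code that post-processes evaluation results can be exercised without calling the API by using a stand-in object with the same two attributes. The sketch below assumes nothing about the real class beyond the output and response fields listed above: FakeEvaluateResponse, collect_outputs, and the sample values are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass
class FakeEvaluateResponse:
    """Hypothetical stand-in with the two documented fields; not the aimon class."""
    output: Any
    response: Any


def collect_outputs(results: List[Any]) -> List[Any]:
    """Gather the output attribute of each result, preserving order."""
    return [result.output for result in results]


# Made-up values; the shape of a real response is not specified on this page.
fake_results = [
    FakeEvaluateResponse(output="Paris", response={"note": "placeholder"}),
    FakeEvaluateResponse(output="Berlin", response={"note": "placeholder"}),
]
outputs = collect_outputs(fake_results)
```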