Evaluate
Dataset
A dataset is a collection of records that can be used for evaluations. The dataset should be a CSV file. The supported columns are:
- "prompt": This is the system prompt used for the LLM
- "user_query": This is the query specified by the user
- "context_docs": These are context documents that are either retrieved from a RAG or through other methods. For tasks like summarization, these documents could be directly specified by the user.
- "output": This is the generated text by the LLM
- "instructions": These are the instructions provided to the LLM
- "metadata": This is a dictionary of additional metadata associated with the record.
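The column layout above can be sketched as a small serializer that turns in-memory records into CSV text. This is only an illustration: `DatasetRecord`, `csvField`, and `toCsv` are hypothetical names that are not part of the aimon client, and storing "context_docs" and "metadata" as JSON strings inside a CSV cell is an assumption, since the encoding of list and dictionary columns is not specified here.

```typescript
// Column names taken from the supported-columns list above.
const COLUMNS = [
  "prompt",
  "user_query",
  "context_docs",
  "output",
  "instructions",
  "metadata",
] as const;

// Hypothetical record shape, for illustration only.
type DatasetRecord = {
  prompt?: string;
  user_query?: string;
  context_docs: string[];
  output?: string;
  instructions?: string;
  metadata?: Record<string, unknown>;
};

// Quote a CSV field: wrap it in double quotes and double any embedded quotes.
function csvField(value: string): string {
  return `"${value.replace(/"/g, '""')}"`;
}

// Assumption: non-string columns (lists, dictionaries) are written as JSON strings.
function toCsv(records: DatasetRecord[]): string {
  const header = COLUMNS.join(",");
  const rows = records.map((record) =>
    COLUMNS.map((column) => {
      const value = record[column];
      if (value === undefined) return ""; // leave missing optional columns empty
      return csvField(typeof value === "string" ? value : JSON.stringify(value));
    }).join(",")
  );
  return [header, ...rows].join("\n");
}

const exampleCsv = toCsv([
  {
    user_query: "What is AIMon?",
    context_docs: ["AIMon evaluates LLM output."],
    output: 'It says "hello".',
  },
]);
```

The resulting string could be written to disk and then passed to the dataset creation call shown in the next example.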
Create dataset:
import Client from "aimon";
import { fileFromPath } from "formdata-node/file-from-path";

const aimon = new Client({
  authHeader: `Bearer API_KEY`,
});

// Creates a new dataset from the local path csv file
const createDataset = async (
  path: string,
  datasetName: string,
  description: string
): Promise<Client.Dataset> => {
  const file = await fileFromPath(path);
  const json_data = JSON.stringify({
    name: datasetName,
    description: description,
  });

  const params = {
    file: file,
    json_data: json_data,
  };

  const dataset: Client.Dataset = await aimon.datasets.create(params);
  return dataset;
};

const dataset1 = await createDataset(
  "/path/to/file/filename_1.csv",
  "filename1.csv",
  "description"
);

const dataset2 = await createDataset(
  "/path/to/file/filename_2.csv",
  "filename2.csv",
  "description"
);
DatasetCollection
A dataset collection is a collection of one or more datasets that can be used for evaluations.
aimon.datasets.collection.create
This function creates a new dataset collection.
Args:
- name (str): The name of the dataset collection.
- description (str): The description of the dataset collection.
- dataset_ids (List[str]): A list of dataset IDs to include in the collection.
Returns:
- CollectionCreateResponse: The created dataset collection.
Example:
let datasetCollection: Client.Datasets.CollectionCreateResponse | undefined;

// Ensures that dataset1.sha and dataset2.sha are defined
if (dataset1.sha && dataset2.sha) {
  // Creates dataset collection
  datasetCollection = await aimon.datasets.collection.create({
    name: "my_first_dataset_collection",
    dataset_ids: [dataset1.sha, dataset2.sha],
    description: "This is a collection of two datasets.",
  });
} else {
  throw new Error("Dataset sha is undefined");
}
Function: evaluate
Run an evaluation on a dataset collection using the Aimon API.
Parameters:
- applicationName (str): The name of the application to run the evaluation on.
- modelName (str): The name of the model to run the evaluation on.
- datasetCollectionName (str): The name of the dataset collection to run the evaluation on.
- evaluationName (str): The name of the evaluation to be created.
- headers (list): A list of column names in the dataset to be used for the evaluation. Must include 'context_docs'.
- config (dict, optional): A dictionary of configuration options for the evaluation.
Returns:
- List[EvaluateResponse]: A list of EvaluateResponse objects containing the output and response for each record in the dataset collection.
Raises:
- ValueError: If headers is empty or doesn't contain 'context_docs', or if required fields are missing from the dataset records.
Note:
The dataset records must contain 'context_docs' and all fields specified in the 'headers' argument. The 'prompt', 'output', and 'instructions' fields are optional.
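The requirements in the Raises and Note entries above can be checked locally before starting an evaluation. The sketch below is a hypothetical pre-flight helper; `checkHeaders` and `DatasetRow` are illustrative names that are not part of the aimon client, and the error messages are made up for this example.

```typescript
// Hypothetical row shape: one parsed dataset record keyed by column name.
type DatasetRow = Record<string, unknown>;

// Mirrors the documented rules: headers must be non-empty, must include
// 'context_docs', and every record must contain every listed header.
function checkHeaders(headers: string[], records: DatasetRow[]): void {
  if (headers.length === 0) {
    throw new Error("headers must not be empty");
  }
  if (!headers.includes("context_docs")) {
    throw new Error("headers must include 'context_docs'");
  }
  records.forEach((record, index) => {
    const missing = headers.filter((header) => !(header in record));
    if (missing.length > 0) {
      throw new Error(`record ${index} is missing: ${missing.join(", ")}`);
    }
  });
}

// Passes silently: both listed headers are present in the record.
checkHeaders(
  ["context_docs", "user_query"],
  [{ context_docs: ["doc text"], user_query: "question" }]
);
```

Running a check like this first surfaces a malformed dataset before any request is sent, under the assumption that the records have already been parsed into objects.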
Example:
const headers = ["context_docs", "user_query", "output"];

const config = {
  hallucination: { detector_name: "default" },
  instruction_adherence: { detector_name: "default" },
};

const results = await aimon.evaluate(
  "my_application_name", // Application name
  "my_model_name", // Model name
  // This dataset collection must exist in the Aimon platform
  "my_first_dataset_collection",
  "my_evaluation_name", // Evaluation name
  headers,
  config
);