
Custom Metrics

Custom Metrics let you define your own evaluation criteria for AI-generated responses. These metrics work just like AIMon’s built-in ones (e.g., Conciseness, Toxicity), but are fully customizable by your team.

You can define:

  • The name of the metric (e.g., "financial_risk_accuracy")
  • A list of guideline instructions the model output should follow
  • Optionally, which applications the metric is assigned to

This allows for fine-grained evaluation of domain-specific needs like legal tone, medical caution, brand alignment, or competitive comparison.

Creating a new Custom Metric

Each metric includes:

  • name: The name of the metric (e.g., "Financial Risk Accuracy").
  • description: Optional longer explanation of the metric's purpose.
  • guidelines: A list of guideline instructions the model output should follow.

Steps to create the metric

  1. Go to the Metrics section in the application sidebar.

custom_metric_menu.png

  2. Click the + Create Metric button to open the creation modal.

custom_metric_view.png

  3. Fill in the metric name, guidelines, and (optionally) a description.

create_custom_metric_modal.png

Once the metric is created, it will appear in the list and can be opened to see the details.

custom_metric_detailed.png

Using the metric with the SDK client

Each custom metric can include multiple guidelines, written as natural-language instructions. For example:

  • “Response should not contain emojis.”
  • “Avoid using overly promotional language.”
  • “Mention at least one financial ratio.”

The evaluation will return:

  • instruction: the rule being evaluated
  • label: whether the output passed that instruction
  • follow_probability: confidence score (0 to 1)
  • explanation: a short, human-readable justification

Example Request

[
  {
    "context": "Company A's revenue rose 10% YoY. Debt levels were reduced, and free cash flow improved. EPS guidance was raised by 5%.",
    "user_query": "Summarize the financial health of Company A.",
    "generated_text": "Company A had strong financial performance, growing revenue and improving margins. The raised EPS guidance shows continued momentum.",
    "config": {
      "financial_metric": {
        "detector_name": "default",
        "explain": true
      }
    }
  }
]
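The request body above can also be assembled in code. The following is a minimal sketch in plain Python: `build_detect_payload` is a hypothetical helper (not part of the AIMon SDK), it only builds the JSON structure shown above without sending it anywhere, and it assumes the config key is the custom metric's name, as in the example.

```python
import json


def build_detect_payload(context, user_query, generated_text, metric_name, explain=True):
    """Build a one-item request list mirroring the example request.

    Hypothetical helper for illustration; not part of the AIMon SDK.
    """
    return [
        {
            "context": context,
            "user_query": user_query,
            "generated_text": generated_text,
            # Assumption: the config key is the custom metric's name.
            "config": {
                metric_name: {"detector_name": "default", "explain": explain}
            },
        }
    ]


payload = build_detect_payload(
    context="Company A's revenue rose 10% YoY.",
    user_query="Summarize the financial health of Company A.",
    generated_text="Company A grew revenue by 10% year over year.",
    metric_name="financial_metric",
)
print(json.dumps(payload, indent=2))
```

Keeping the payload construction in one function makes it easier to reuse the same metric configuration across many requests.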

Example Response

[
  {
    "financial_metric": {
      "instructions_list": [
        {
          "instruction": "Mention specific financial indicators like revenue, EPS, or debt.",
          "label": true,
          "follow_probability": 0.912,
          "explanation": "The output mentions revenue growth and EPS guidance."
        },
        {
          "instruction": "Avoid vague or generic praise like 'strong performance'.",
          "label": false,
          "follow_probability": 0.331,
          "explanation": "The phrase 'strong financial performance' is vague."
        },
        {
          "instruction": "Use a neutral and analytical tone.",
          "label": true,
          "follow_probability": 0.87,
          "explanation": "The response focuses on facts like revenue and EPS."
        }
      ],
      "score": 0.667
    }
  }
]
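A response like the one above can be post-processed to list the instructions that failed. The sketch below uses only the standard library and a trimmed copy of the example response; `summarize` is a hypothetical helper. In this example the reported score (0.667) matches the fraction of instructions whose label is true (2 of 3), but whether AIMon always computes the score this way is an assumption, not something the example confirms.

```python
import json

# Trimmed copy of the example response (explanations omitted for brevity).
response_json = """
[{"financial_metric": {"instructions_list": [
  {"instruction": "Mention specific financial indicators like revenue, EPS, or debt.",
   "label": true, "follow_probability": 0.912},
  {"instruction": "Avoid vague or generic praise like 'strong performance'.",
   "label": false, "follow_probability": 0.331},
  {"instruction": "Use a neutral and analytical tone.",
   "label": true, "follow_probability": 0.87}
], "score": 0.667}}]
"""


def summarize(metric_result):
    """Return the failed instructions and the fraction of instructions passed.

    Hypothetical helper for illustration; not part of the AIMon SDK.
    """
    items = metric_result["instructions_list"]
    failed = [item["instruction"] for item in items if not item["label"]]
    pass_rate = sum(1 for item in items if item["label"]) / len(items)
    return failed, round(pass_rate, 3)


metric = json.loads(response_json)[0]["financial_metric"]
failed, pass_rate = summarize(metric)
print("Failed instructions:", failed)
print("Pass rate:", pass_rate, "reported score:", metric["score"])
```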

Code Example

Python

from aimon import Detect
import os

# Configure the decorator to evaluate the "financial_metric" custom metric.
detect = Detect(
    config={"financial_metric": {"detector_name": "default"}},
    values_returned=["context", "generated_text"],
    publish=True,
    api_key=os.getenv("AIMON_API_KEY"),
    application_name="financial_analyzer",
    model_name="llm_v1"
)

@detect
def analyze_financials(context, query):
    # my_model is a placeholder for your own LLM call.
    generated = my_model(context, query)
    return context, generated

context = "EPS grew by 5%. Revenue was up 10%. Debt fell."
query = "How is the company performing?"
# The decorated function returns its own values followed by the evaluation result.
_, _, result = analyze_financials(context, query)

print(result)

TypeScript

import Client from "aimon";

const aimon = new Client({ authHeader: "Bearer API_KEY" });

const runDetect = async () => {
  const response = await aimon.detect(
    "The company grew earnings and lowered debt.",
    ["Revenue up 10%, EPS up 5%."],
    "Summarize financial performance",
    {
      financial_metric: { detector_name: "default" },
    }
  );

  console.log(response);
};

runDetect();

How to see metrics results in AIMon UI

To see your metric results, open an LLM app from either the Monitoring or Evals menu item. Under Monitoring, choose your application, click the Requests tab, and select a request. The request details include a Custom Metrics tab that shows your metric results.

custom_metric_result_ui.png

When to Use Custom Metrics

Use custom metrics when:

  • You need domain-specific evaluations (e.g., healthcare, finance, legal)
  • You want to enforce brand guidelines (e.g., tone, disclaimers)
  • You need a flexible internal QA checklist for LLM evaluations

Custom metrics integrate directly into your evaluation UI, dashboards, and logs — just like the built-in metrics.

Managing Custom Metrics via SDK

Create a Custom Metric

from aimon import Client

client = Client(api_key="YOUR_API_KEY")

response = client.metrics.create(
    name="financial_risk_accuracy",
    label="Financial Risk Accuracy",
    instructions='["Mention at least one financial indicator", "Avoid vague statements"]',
    description="Evaluates how accurately financial data is reported."
)

print(response)
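In the call above, `instructions` is passed as a JSON-encoded string rather than a Python list. If your guidelines live in a list, one way to produce that string is with the standard library's `json.dumps`. This is a small sketch; the variable names are illustrative only.

```python
import json

# Guidelines kept as an ordinary Python list.
guidelines = [
    "Mention at least one financial indicator",
    "Avoid vague statements",
]

# Serialize the list to a JSON string, matching the form used for `instructions`.
instructions = json.dumps(guidelines)
print(instructions)
```

Building the string this way avoids hand-escaping quotes when a guideline itself contains quotation marks.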

List Custom Metrics

response = client.metrics.list()
for metric in response:
    print(metric)

Delete a Custom Metric

uuid = "your-custom-metric-uuid"
response = client.metrics.delete(uuid)
print(response)