Metrics | AIMon Documentation

📄️ Metrics Overview

Evaluating LLM and RAG applications often involves trade-offs between accuracy, reliability, and speed. Several frameworks exist, such as RAGAs, TruLens, and DeepEval, but they can be overwhelming because of inconsistent metrics, complex setup, and reliance on subjective LLM-based evaluation.

🗃️ Output Quality

5 items

🗃️ Safety Metrics

9 items

🗃️ RAG & Data

2 items

🗃️ Tool Tracing

1 item

📄️ Custom Metrics

Custom Metrics let you define your own evaluation criteria for AI-generated responses. These metrics work just like AIMon’s built-in ones (e.g., Conciseness, Toxicity), but are fully customizable by your team.

📄️ Deprecated Metrics

This page lists AIMon metrics that have been deprecated in favor of more accurate or modular replacements. These metrics are no longer actively maintained but may still be available to legacy users.