Deprecated Metrics
This page lists AIMon metrics that have been deprecated in favor of more accurate or modular replacements. These are no longer actively maintained but may still be available for legacy users.
Hallucination (deprecated)
Deprecated July 2025: replaced by Groundedness for improved factual consistency and hallucination detection in RAG and QA settings.
The hallucination metric, previously powered by the HDM-2 model, scored generated text for factual errors based on how well it aligned with the provided context, and it annotated the hallucinated spans.
We now recommend using the groundedness metric, which offers improved explainability, instruction-level traces, and better support for real-time and retrieval-augmented applications.
Legacy model: GitHub · Hugging Face
Retrieval Relevance (deprecated)
Deprecated August 2025: replaced by Context Query Relevance for improved evaluation of context-query alignment in retrieval-augmented generation (RAG) pipelines.
The retrieval relevance metric evaluated how closely retrieved context matched the user query, producing per-document relevance scores and natural language explanations. It was commonly used to assess ranking quality in LLM applications and to tune retrieval systems for accuracy and efficiency.
We now recommend using the context_query_relevance metric, which offers robust scoring across diverse query types and improved accuracy in evaluating the alignment between context documents and user queries.
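For legacy users who keep their metric selection in a dictionary-style configuration, the two replacements above could be applied with a small helper like the one below. This is a minimal sketch, not part of the AIMon SDK: the function name migrate_metric_config, the deprecated key names hallucination and retrieval_relevance, the extra toxicity entry, and the per-metric settings are all assumptions made for illustration. Only groundedness and context_query_relevance are names taken from this page.

```python
# Mapping from deprecated metric keys to their replacements.
# "groundedness" and "context_query_relevance" come from this page;
# the deprecated key names on the left are assumed, so adjust them
# to whatever your existing configuration actually uses.
DEPRECATED_METRICS = {
    "hallucination": "groundedness",
    "retrieval_relevance": "context_query_relevance",
}


def migrate_metric_config(config):
    """Return a copy of `config` with deprecated metric keys renamed.

    `config` is assumed to be a dict that maps metric names to
    per-metric settings. The input dict is not modified.
    """
    migrated = {}
    for name, settings in config.items():
        new_name = DEPRECATED_METRICS.get(name, name)
        # If the replacement metric is already configured explicitly,
        # drop the deprecated entry and keep the explicit one.
        if new_name != name and new_name in config:
            continue
        migrated[new_name] = settings
    return migrated


# Placeholder settings for illustration only; they are not taken from this page.
legacy_config = {
    "hallucination": {"detector_name": "default"},
    "retrieval_relevance": {"detector_name": "default"},
    "toxicity": {"detector_name": "default"},
}

print(migrate_metric_config(legacy_config))
```

Metrics that are not listed in the mapping pass through unchanged, so the helper can be run over an existing configuration without affecting other entries. Check the settings each replacement metric accepts before reusing the old values as-is.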