Completeness

The Completeness API evaluates how thoroughly a generated response addresses a user query, given the provided context. The context typically includes the background documents and the original query that were passed to the language model. The goal is to determine whether the generated output covers all the necessary information, accurately reflects the context, and delivers a complete response without omissions or gaps. This is particularly useful for retrieval-augmented generation, customer support, and domain-specific QA, where missing or incomplete answers can significantly reduce user trust and task success.

Response Format

The API returns a structured object containing the completeness evaluation. Each result includes:

  • score (float, 0.0 – 1.0):

    A numerical indicator of how completely the response addresses the query:

    • 0.0–0.2: The response is incomplete or not relevant.
    • 0.2–0.7: The response is partially complete, missing key elements or details.
    • 0.7–1.0: The response is fully complete and covers all relevant information.
  • instructions_list (array):

    A list of specific completeness criteria used to evaluate the response. Each item includes:

    • instruction: A rule representing one aspect of completeness.
    • label: Indicates whether the response followed the instruction (true) or violated it (false).
    • follow_probability: A confidence score indicating the likelihood the instruction was followed.
    • explanation: A natural-language rationale explaining why the instruction was marked true or false.

Note: The score reflects the overall completeness, but detailed reasoning is broken down across these instruction-level explanations.
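
For example, a request payload that asks the Completeness detector to evaluate a generated answer against its context looks like the following: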

[
  {
    "context": "Paul Graham is an English-born computer scientist, entrepreneur, venture capitalist, author, and essayist. He is best known for his work on Lisp, his former startup Viaweb (later renamed Yahoo! Store), co-founding the influential startup accelerator and seed capital firm Y Combinator, his blog, and Hacker News.",
    "generated_text": "Paul Graham has worked in several key areas throughout his career: IBM 1401: He began programming on the IBM 1401 during his school years, specifically in 9th grade. In addition, he has also been involved in writing essays and sharing his thoughts on technology, startups, and programming.",
    "config": {
      "completeness": {
        "detector_name": "default"
      }
    }
  }
]
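
A result for a request like the one above follows the schema described earlier. The sketch below is purely illustrative: the score, follow probabilities, and instructions are hypothetical values chosen to match this example, not actual detector output, and the exact response envelope may differ.

{
  "score": 0.45,
  "instructions_list": [
    {
      "instruction": "The response should cover Paul Graham's co-founding of Y Combinator.",
      "label": false,
      "follow_probability": 0.08,
      "explanation": "The generated text never mentions Y Combinator or Paul Graham's role in startup investing."
    },
    {
      "instruction": "The response should reflect Paul Graham's work as an essayist.",
      "label": true,
      "follow_probability": 0.91,
      "explanation": "The generated text states that he writes essays on technology, startups, and programming."
    }
  ]
}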

Code Example

The example below demonstrates how to use the Completeness detector synchronously.

from aimon import Detect
import os

# This is a synchronous example.
# Set async_mode=True to run the detector asynchronously.
# Set publish=True to publish results to the AIMon UI.

detect = Detect(
    values_returned=['context', 'generated_text'],
    config={"completeness": {"detector_name": "default"}},
    publish=True,
    api_key=os.getenv("AIMON_API_KEY"),
    application_name="my_awesome_llm_app",
    model_name="my_awesome_llm_model"
)

@detect
def my_llm_app(context, query):
    # Stand-in for a real LLM call
    my_llm_model = lambda context, query: f'''I am an LLM trained to answer your questions.
But I often don't fully answer your question.
The query you passed is: {query}.
The context you passed is: {context}.'''
    generated_text = my_llm_model(context, query)
    return context, generated_text

# The decorated call returns the original return values plus the AIMon result
context, gen_text, aimon_res = my_llm_app("This is a context", "This is a query")

print(aimon_res)
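
Here, aimon_res contains the completeness evaluation described in the Response Format section above: an overall score plus the instruction-level breakdown. Because publish=True is set, the result is also published to the AIMon UI for the configured application.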