Instruction Adherence
The Instruction Adherence metric evaluates whether a model’s response follows a set of provided or implied instructions. It is especially useful when testing prompt engineering patterns, instruction-tuned models, or fine-grained control over generated output.
This metric analyzes both explicit instructions (manually provided) and implicit instructions (automatically extracted from the `user_query`, if enabled). For each instruction, it returns:
- Whether it was followed (`label`)
- A confidence score (`follow_probability`)
- An optional explanation (if enabled)

In addition, the metric computes an overall instruction adherence score.
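Concretely, each evaluated instruction yields a small record with these fields (an illustrative sketch; the values below are made up, not API output):

```python
# Illustrative per-instruction result (values are made up)
result = {
    "instruction": "Answer in English only",
    "label": True,               # was the instruction followed?
    "follow_probability": 0.99,  # confidence that it was followed
    "explanation": "The answer is entirely in English.",  # present only if enabled
}

# Every result carries at least these three fields
assert set(result) >= {"instruction", "label", "follow_probability"}
```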
Configuration Options
To enable or customize Instruction Adherence evaluation, pass the following fields inside the `instruction_adherence` object in the `config`:

- `detector_name` (string, required): Must be `"default"`.
- `explain` (boolean or `"negatives_only"`): Controls whether textual explanations are included for each instruction.
  - `false` (default): No explanations.
  - `true`: Explanations for all instructions.
  - `"negatives_only"`: Only for instructions that were not followed (`label: false`).
- `extract_from_system` (boolean): If `true`, the metric will try to extract implicit instructions from the `user_query` in addition to any explicitly provided instructions.
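Putting the options together, a config object might look like the following (a minimal sketch; the `explain` and `extract_from_system` values here are illustrative choices, not required settings):

```python
# Illustrative config for the Instruction Adherence metric.
# "detector_name" must be "default"; the other two fields are optional.
config = {
    "instruction_adherence": {
        "detector_name": "default",      # required
        "explain": "negatives_only",     # False | True | "negatives_only"
        "extract_from_system": True,     # also mine instructions from user_query
    }
}

# "explain" accepts exactly these three values:
assert config["instruction_adherence"]["explain"] in (False, True, "negatives_only")
```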
API Request & Response Example
Request:
[
    {
        "context": "Baking Class 101",
        "user_query": "How do I make chocolate chip cookies? Write up to 200 words.",
        "generated_text": "To make chocolate chip cookies, you need flour, sugar, butter, eggs, chocolate chips, and vanilla extract. Preheat oven to 375°F. Cream butter and sugar, add eggs and vanilla. Mix dry ingredients separately, then combine. Fold in chocolate chips. Drop spoonfuls onto baking sheet and bake for 9-11 minutes.",
        "instructions": ["Answer in English only"],
        "config": {
            "instruction_adherence": {
                "detector_name": "default",
                "explain": true,
                "extract_from_system": true
            }
        },
        "application_name": "ia_results",
        "model_name": "ia_results",
        "publish": true
    }
]
Response:
[
    {
        "avg_context_doc_length": 16.0,
        "instruction_adherence": {
            "extractions": [
                {
                    "explanation": "The response is 68 words long, well under the 200-word limit.",
                    "follow_probability": 0.7549,
                    "instruction": "Write up to 200 words.",
                    "label": true
                }
            ],
            "instructions_list": [
                {
                    "explanation": "The answer is entirely in English without any non-English elements.",
                    "follow_probability": 0.9987,
                    "instruction": "Answer in English only",
                    "label": true
                }
            ],
            "score": 1.0
        }
    }
]
Response Fields Explained
- `instruction_adherence`: Main field containing the adherence evaluation results.
  - `score`: A float between `0.0` and `1.0` indicating the fraction of instructions (explicit + extracted) that were followed (`label: true`). For example, a score of `1.0` means all instructions were followed.
  - `extractions`: A list of instructions automatically extracted from the `user_query`, if `extract_from_system` is `true`.
    - `instruction`: The extracted instruction as interpreted from the prompt.
    - `label`: `true` if the response adhered to the instruction.
    - `follow_probability`: A confidence score between `0.0` and `1.0`. Higher values indicate greater confidence that the instruction was followed.
    - `explanation` (optional): Explanation for the instruction, if enabled.
  - `instructions_list`: List of explicit instructions provided in the request under the `instructions` field. Each entry has the same `instruction`, `label`, `follow_probability`, and optional `explanation` fields as above.
- `avg_context_doc_length`: Average word length of `context` (if provided). If no documents are present, this is `NaN`.
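Since `score` is the fraction of `label: true` entries across both lists, it can be recomputed from any parsed response. A short sketch, using the field names from the example response above (the `recompute_score` helper itself is hypothetical, not part of the SDK):

```python
# Recompute the adherence score from a parsed response (hypothetical helper).
def recompute_score(ia: dict) -> float:
    results = ia.get("extractions", []) + ia.get("instructions_list", [])
    if not results:
        return 0.0
    return sum(1 for r in results if r["label"]) / len(results)

# Shape mirrors the example response above
ia = {
    "extractions": [
        {"instruction": "Write up to 200 words.", "label": True, "follow_probability": 0.7549}
    ],
    "instructions_list": [
        {"instruction": "Answer in English only", "label": True, "follow_probability": 0.9987}
    ],
    "score": 1.0,
}

assert recompute_score(ia) == ia["score"]  # 2 of 2 instructions followed -> 1.0
```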
Code Examples
Python (Sync):
# Synchronous example
import os
from aimon import Client
import json
# Initialize client
client = Client(auth_header=f"Bearer {os.environ['AIMON_API_KEY']}")
# Construct payload
payload = [{
    "user_query": "Say Hello!",
    "generated_text": "Bonjour!",
    "instructions": ["The response must be in English only."],
    "config": {
        "instruction_adherence": {
            "detector_name": "default",
            "explain": True
        }
    },
    "publish": False
}]
# Call sync detect
response = client.inference.detect(body=payload)
# Print result
print(json.dumps(response[0].instruction_adherence, indent=2))
Python (Async):
# Asynchronous example
import os
import json
from aimon import AsyncClient
aimon_api_key = os.environ["AIMON_API_KEY"]
aimon_payload = {
    "user_query": "Say Hello!",
    "generated_text": "Bonjour!",
    "instructions": ["The response must be in English only."],
    "config": {
        "instruction_adherence": {
            "detector_name": "default",
            "explain": True
        }
    },
    "publish": True,
    "async_mode": True,
    "application_name": "async_metric_example",
    "model_name": "async_metric_example"
}
data_to_send = [aimon_payload]
async def call_aimon():
    async with AsyncClient(auth_header=f"Bearer {aimon_api_key}") as aimon:
        return await aimon.inference.detect(body=data_to_send)
# Run the coroutine and print the result
# (asyncio.run works in a script; in a notebook you can simply `await call_aimon()`)
import asyncio
resp = asyncio.run(call_aimon())
print(json.dumps(resp, indent=2))
print("View results at: https://www.app.aimon.ai/llmapps?source=sidebar&stage=production")
Python (Decorator):
from aimon import Detect
import os
import json
# Configure the detector
detect = Detect(
    values_returned=['user_query', 'instructions', 'generated_text'],
    config={
        "instruction_adherence": {
            "detector_name": "default",
            "explain": True,
            "extract_from_system": False,
        }
    },
    api_key=os.getenv("AIMON_API_KEY"),
    application_name="my_llm_app_ia_example",
    model_name="my_model_v1"
)
# Decorate your LLM application function
@detect
def my_llm_app(query: str, explicit_instructions: list[str]):
    # Example: Generate a response based on query and instructions
    # Replace with your actual LLM call
    response_text = (f"Based on your request '{query}', and considering the instructions provided, here is a poem: A quick brown fox jumps high. So very fast it goes by now.")
    
    # Return values matching the order in 'values_returned'
    return query, explicit_instructions, response_text
# Example Usage
user_query = "Write a 10-word poem about a fox."
instructions_to_follow = [
    'The poem must not contain the letter "e".',
    "The poem must contain exactly 10 words."
]
# Call the decorated function
# The return value includes the original outputs plus the Aimon results
query_out, instructions_out, response_out, aimon_res = my_llm_app(user_query, instructions_to_follow)
print(aimon_res)
TypeScript:
import Client from "aimon";
import dotenv from "dotenv";
dotenv.config();
const aimon = new Client({
  authHeader: `Bearer ${process.env.AIMON_API_KEY}`,
});
const run = async () => {
  const response = await aimon.detect({
    userQuery: "Summarize the plot of Romeo and Juliet.",
    generatedText:
      "Romeo Montague and Juliet Capulet fall in love despite their families' bitter feud. Their passionate romance brings their families together.",
    instructions: [
      "Mention the names 'Romeo' and 'Juliet'.",
      "State the main conflict (feuding families).",
      "Mention the tragic ending explicitly (e.g., death).",
    ],
    config: {
      instruction_adherence: {
        detector_name: "default",
        explain: "negatives_only",
        extract_from_system: false,
      },
    },
  });
  console.log("AIMon response:", JSON.stringify(response, null, 2));
};
run();
Example notebook
https://colab.research.google.com/drive/1fXXqmGBVJeTnoBla_A-30gy8eUouOuOw
