AIMon Rerank

If you're opening this notebook on Colab, you will probably need to install LlamaIndex 🦙.

%%capture
!pip install llama-index
!pip install llama-index-postprocessor-aimon-rerank

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.core.response.pprint_utils import pprint_response

This notebook requires both an OpenAI API key and an AIMon API key. Import them from Colab Secrets.

import os

# Import Colab Secrets userdata module.
from google.colab import userdata
os.environ['AIMON_API_KEY'] = userdata.get('AIMON_API_KEY')
os.environ['OPENAI_API_KEY'] = userdata.get('OPENAI_API_KEY')
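
Outside Colab, `google.colab` cannot be imported, so the keys have to come from somewhere else. The following is a minimal sketch, assuming both keys were already exported as environment variables before the notebook was started. `require_env` is a hypothetical helper written for this example, not part of LlamaIndex or AIMon.

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name`; raise if it is unset or empty."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


# Hypothetical usage outside Colab (assumes the keys were exported beforehand):
# aimon_key = require_env("AIMON_API_KEY")
# openai_key = require_env("OPENAI_API_KEY")
```

Failing early with a clear message tends to be easier to debug than an authentication error raised later, in the middle of a query.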

Download data.

!mkdir -p 'data/paul_graham/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'

--2025-03-14 17:37:33-- https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 75042 (73K) [text/plain]
Saving to: ‘data/paul_graham/paul_graham_essay.txt’

data/paul_graham/pa 100%[===================>] 73.28K --.-KB/s in 0.02s

2025-03-14 17:37:33 (3.12 MB/s) - ‘data/paul_graham/paul_graham_essay.txt’ saved [75042/75042]

Generate documents and build an index.

# load documents
documents = SimpleDirectoryReader("./data/paul_graham/").load_data()

# build index
index = VectorStoreIndex.from_documents(documents=documents)

Write a task definition for the AIMon reranker and create an instance of the AIMonRerank class. The task definition is an explicit instruction that tells the reranker what its relevance evaluation should focus on.

import os
from llama_index.postprocessor.aimon_rerank import AIMonRerank

task_definition = "Your task is to assess the actions of an individual specified in the user query against the context documents supplied."

# Keep the 2 highest-ranked nodes after reranking.
# The API key is read from the environment variable set earlier in this notebook.
aimon_rerank = AIMonRerank(
    top_n=2,
    api_key=os.environ["AIMON_API_KEY"],
    task_definition=task_definition,
)

Directly retrieve the top 2 most similar nodes (i.e., without using a reranker).

query_engine = index.as_query_engine(similarity_top_k=2)
response = query_engine.query("What did Sam Altman do in this essay?")

pprint_response(response, show_source=True)

Final Response: Sam Altman was asked to become the president of Y Combinator, initially declined the offer to pursue starting a startup focused on nuclear reactors, but eventually agreed to take over starting with the winter 2014 batch.

Source Node 1/2
Node ID: 470fdd7a-dbe7-4e9c-a258-9fd1cb8e61f7
Similarity: 0.8305926707169754
Text: When I was dealing with some urgent problem during YC, there was about a 60% chance it had to do with HN, and a 40% chance it had do with everything else combined. [17] As well as HN, I wrote all of YC's internal software in Arc. But while I continued to work a good deal in Arc, I gradually stopped working on Arc, partly because I didn't have t...

Source Node 2/2
Node ID: 5223de79-4df4-46f2-97cf-067655aeca57
Similarity: 0.8239262662012308
Text: Up till that point YC had been controlled by the original LLC we four had started. But we wanted YC to last for a long time, and to do that it couldn't be controlled by the founders. So if Sam said yes, we'd let him reorganize YC. Robert and I would retire, and Jessica and Trevor would become ordinary partners. When we asked Sam if he wanted to...

Retrieve the top 10 most relevant nodes, then rerank them with the AIMon reranker.

[Figure: diagram depicting how the AIMon reranker works]

Explanation of the reranking process:

The diagram illustrates how a reranker refines document retrieval for a more accurate response.

Initial Retrieval (Vector DB):
- A query is sent to the vector database.
- The system retrieves the 10 most similar records, ranked by similarity score (similarity_top_k = 10).

Reranking with AIMon:
- Instead of passing the highest-scoring records straight to the LLM, these 10 records are reranked by the AIMon reranker.
- The reranker evaluates the documents on their actual relevance to the query rather than on raw similarity scores alone.
- During this step the task definition is applied as an explicit instruction that defines what the reranking evaluation should focus on.
- As a result, the selected records are not just statistically similar to the query but also contextually relevant to the intended task.

Final Selection (top_n = 2):
- After reranking, the system keeps the 2 most contextually relevant records for response generation.
- The task definition steers these records toward the query's intent, which is meant to produce a more precise and informative response.
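
The two stages described above can be sketched in plain Python. This is a toy illustration only. `retrieve_then_rerank` and `keyword_score` are hypothetical names, the sample texts and similarity values are made up for the example, and the keyword counter merely stands in for a reranker. It is not how AIMon scores documents.

```python
from typing import Callable, List, Tuple

Record = Tuple[str, float]  # (text, first-stage similarity score)


def retrieve_then_rerank(
    candidates: List[Record],
    rerank_score: Callable[[str], float],
    top_k: int = 10,
    top_n: int = 2,
) -> List[Record]:
    # Stage 1: shortlist the top_k candidates by raw similarity.
    shortlisted = sorted(candidates, key=lambda r: r[1], reverse=True)[:top_k]
    # Stage 2: re-score the shortlist with the reranker and keep the best top_n.
    rescored = [(text, rerank_score(text)) for text, _ in shortlisted]
    return sorted(rescored, key=lambda r: r[1], reverse=True)[:top_n]


def keyword_score(text: str) -> float:
    # Toy stand-in for a reranker: counts task-related keywords in the text.
    keywords = ("sam", "declined", "agreed")
    return float(sum(word in text.lower() for word in keywords))


# Made-up records with made-up similarity scores (assumption, for illustration).
docs = [
    ("YC was controlled by the original LLC.", 0.83),
    ("Sam declined at first, then agreed.", 0.80),
    ("HN took up most of the urgent problems.", 0.82),
]
top = retrieve_then_rerank(docs, keyword_score, top_k=3, top_n=1)
print(top)
```

In this toy data the record with the lowest similarity score ends up first, because the second-stage scorer looks at what the text says about the person in the query rather than at how close its embedding is.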

query_engine = index.as_query_engine(similarity_top_k=10, node_postprocessors=[aimon_rerank])
response = query_engine.query("What did Sam Altman do in this essay?")

pprint_response(response, show_source=True)

Final Response: Sam Altman was involved in the transition of leadership at Y Combinator.

Source Node 1/2
Node ID: 470fdd7a-dbe7-4e9c-a258-9fd1cb8e61f7
Similarity: 0.48260445005911023
Text: When I was dealing with some urgent problem during YC, there was about a 60% chance it had to do with HN, and a 40% chance it had do with everything else combined. [17] As well as HN, I wrote all of YC's internal software in Arc. But while I continued to work a good deal in Arc, I gradually stopped working on Arc, partly because I didn't have t...

Source Node 2/2
Node ID: 5d012644-c40e-44d1-a5e9-e88ad7fae488
Similarity: 0.48151918284717965
Text: As Jessica and I were walking home from dinner on March 11, at the corner of Garden and Walker streets, these three threads converged. Screw the VCs who were taking so long to make up their minds. We'd start our own investment firm and actually implement the ideas we'd been talking about. I'd fund it, and Jessica could quit her job and work for ...
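
To see exactly which source nodes the reranker changed, the node IDs from the two runs can be compared. The sketch below uses the IDs printed in the outputs above. `compare_sources` is a hypothetical helper written for this example.

```python
from typing import Dict, List, Set


def compare_sources(before: List[str], after: List[str]) -> Dict[str, Set[str]]:
    """Split node IDs into those kept, dropped, and added between two runs."""
    before_ids, after_ids = set(before), set(after)
    return {
        "kept": before_ids & after_ids,
        "dropped": before_ids - after_ids,
        "added": after_ids - before_ids,
    }


# Node IDs copied from the two outputs shown in this notebook.
without_rerank = [
    "470fdd7a-dbe7-4e9c-a258-9fd1cb8e61f7",
    "5223de79-4df4-46f2-97cf-067655aeca57",
]
with_rerank = [
    "470fdd7a-dbe7-4e9c-a258-9fd1cb8e61f7",
    "5d012644-c40e-44d1-a5e9-e88ad7fae488",
]

diff = compare_sources(without_rerank, with_rerank)
for label, ids in diff.items():
    print(label, sorted(ids))
```

In a live session the ID lists could be built from the response objects instead, for example with `[n.node.node_id for n in response.source_nodes]`. The Similarity values of the two runs differ in scale (about 0.83 versus about 0.48), presumably because the second run reports the reranker's scores, so they are best compared within a run rather than across runs.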

Conclusion

The AIMon reranker, guided by the task definition, shifted the retrieval focus from general YC leadership changes to Sam Altman’s specific actions. Initially, the high-similarity documents lacked details of his decision-making. After reranking, documents with lower similarity scores but greater contextual relevance highlighted his reluctance and timeline, giving a more accurate, task-aligned response than purely similarity-based retrieval.
