šŸ”Retrivers

AI components that efficiently fetch relevant data from knowledge bases in response to queries, supporting natural language tasks like question-answering and information retrieval.

1) AWS Bedrock Knowledge Base Retriever

Purpose: Retrieves relevant documents or chunks from a Knowledge Base created and managed in Amazon Bedrock.

How It Works:

  • Uses the Bedrock service's native vector store and retrieval tools.

  • Typically paired with a Bedrock-supported embedding model (like Titan or Cohere).

Use Case: Enterprise RAG applications hosted entirely on AWS infrastructure for scalability and security.

2) Custom Retriever

Purpose: Allows you to define your own logic for retrieving documents from a vector database or custom source.

How It Works:

  • You can implement custom search logic (e.g., specific filters, hybrid retrieval, advanced ranking).

  • Often used when built-in retrievers (like Pinecone or Qdrant) don't meet specific needs.

Use Case: When you need fine-grained control over how data is retrieved—e.g., combining metadata filtering, hybrid ranking, or integrating proprietary databases.
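A minimal pure-Python sketch of the idea: a retriever class combining custom scoring with metadata filtering. All class, function, and field names here are illustrative stand-ins, not a real library API.

```python
# Sketch of a custom retriever: simple term-overlap scoring plus a
# metadata filter. Stand-in for any bespoke retrieval logic.
from dataclasses import dataclass, field

@dataclass
class Doc:
    text: str
    metadata: dict = field(default_factory=dict)

class CustomRetriever:
    def __init__(self, docs):
        self.docs = docs

    def retrieve(self, query, metadata_filter=None, k=2):
        terms = set(query.lower().split())
        # Apply the metadata filter first (e.g., restrict by language).
        candidates = [
            d for d in self.docs
            if not metadata_filter
            or all(d.metadata.get(key) == val for key, val in metadata_filter.items())
        ]
        # Rank by term overlap -- replace with hybrid or proprietary scoring.
        scored = sorted(
            candidates,
            key=lambda d: len(terms & set(d.text.lower().split())),
            reverse=True,
        )
        return scored[:k]

docs = [
    Doc("python packaging guide", {"lang": "en"}),
    Doc("guide d'emballage python", {"lang": "fr"}),
    Doc("rust build systems", {"lang": "en"}),
]
retriever = CustomRetriever(docs)
results = retriever.retrieve("python guide", metadata_filter={"lang": "en"})
print(results[0].text)  # -> "python packaging guide"
```

The same shape extends naturally: swap the term-overlap scorer for embedding similarity, or add a re-ranking pass after the sort.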

3) Embeddings Filter Retriever

A document compressor that uses embeddings to drop documents unrelated to the query.

Purpose: Allows filtering of retrieved documents based on embedding similarity threshold from a vector store.

How It Works:

  • Uses embeddings to compare query similarity with stored documents.

  • Applies a similarity threshold to filter out less relevant results.

  • Returns only documents that meet the defined similarity score.

Use Cases:

  • Improving search accuracy by removing low-relevance results.

  • Fine-tuning retrieval quality in RAG systems.
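A toy sketch of the filtering step: already-retrieved documents whose embedding similarity to the query falls below a threshold are dropped. The 2-D vectors are hand-made stand-ins for real embeddings, and the function names are illustrative.

```python
# Drop retrieved documents whose cosine similarity to the query
# embedding is below a threshold.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def embeddings_filter(query_vec, docs, threshold=0.8):
    # Keep only documents whose embedding is similar enough to the query.
    return [text for text, vec in docs if cosine(query_vec, vec) >= threshold]

docs = [
    ("refund policy", (1.0, 0.1)),   # nearly parallel to the query vector
    ("office parking", (0.0, 1.0)),  # orthogonal -> filtered out
]
kept = embeddings_filter(query_vec=(1.0, 0.0), docs=docs, threshold=0.8)
print(kept)  # -> ['refund policy']
```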

4) HyDE Retriever

Use the HyDE retriever to retrieve documents from a vector store.

Purpose: Enhances retrieval by generating a hypothetical document using an LLM before performing the search.

How It Works:

  • Takes the user query and generates a hypothetical answer using a language model.

  • Converts this generated text into embeddings.

  • Uses these embeddings to retrieve more relevant documents from the vector store.

Use Cases:

  • Improving retrieval for vague or short queries.

  • Boosting performance in semantic search and RAG pipelines.
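The steps above can be sketched in pure Python with stand-ins: `fake_llm` replaces a real LLM, and `embed` is a toy bag-of-words embedder over a tiny fixed vocabulary. Everything here is illustrative, not a real library API.

```python
# HyDE sketch: embed a hypothetical ANSWER instead of the raw query,
# then retrieve by similarity to that embedding.
def fake_llm(query):
    # A real LLM would draft a full hypothetical answer; this stub just
    # expands the query with plausible answer vocabulary.
    return query + " reset password account settings email link"

def embed(text):
    # Toy bag-of-words embedding over a fixed vocabulary.
    vocab = ["password", "reset", "email", "parking", "invoice"]
    words = text.lower().split()
    return [words.count(w) for w in vocab]

def similarity(a, b):
    return sum(x * y for x, y in zip(a, b))

docs = ["how to reset your password via email", "monthly parking invoice"]

def hyde_retrieve(query):
    hypothetical = fake_llm(query)   # 1. draft a hypothetical answer
    qvec = embed(hypothetical)       # 2. embed the draft, not the query
    # 3. retrieve by similarity to the hypothetical answer's embedding
    return max(docs, key=lambda d: similarity(qvec, embed(d)))

best = hyde_retrieve("can't log in")
print(best)  # -> "how to reset your password via email"
```

Note the vague query "can't log in" shares no words with the best document; the hypothetical answer is what bridges the gap.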

5) LLM Filter Retriever

Iterates over the initially returned documents and extracts, from each, only the content that is relevant to the query.

Purpose: Filters retrieved documents using a language model to ensure only relevant results are returned.

How It Works:

  • Retrieves documents from a vector store.

  • Uses an LLM to evaluate and filter the results.

  • Keeps only documents that are contextually relevant to the query.

Use Cases:

  • Removing irrelevant or noisy results.

  • Improving answer quality in AI applications.
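A minimal sketch of the filtering pass, with `fake_llm_judge` standing in for a real model prompted with "is this document relevant to the query?" The stub and its word-overlap check are illustrative only.

```python
# LLM-filter sketch: keep only documents the (stubbed) judge deems relevant.
def fake_llm_judge(query, doc):
    # A real implementation would prompt an LLM for a yes/no verdict;
    # this stub just checks for shared words.
    return bool(set(query.lower().split()) & set(doc.lower().split()))

def llm_filter(query, retrieved_docs):
    return [d for d in retrieved_docs if fake_llm_judge(query, d)]

retrieved = ["shipping rates for europe", "cafeteria menu this week"]
kept = llm_filter("europe shipping cost", retrieved)
print(kept)  # -> ['shipping rates for europe']
```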

6) Multi Query Retriever

Purpose: Improves retrieval by generating multiple variations of a query to fetch more comprehensive results.

How It Works:

  • Uses an LLM to generate multiple query variations from a single input.

  • Executes each query against the vector store.

  • Combines results to provide a broader and more accurate context.

Use Cases:

  • Handling ambiguous or complex queries.

  • Increasing recall in retrieval-based systems.
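The flow above can be sketched as follows; `generate_variants` stands in for an LLM paraphrasing the query, and the search function is a toy term-overlap ranker. Results from each variant are unioned with de-duplication.

```python
# Multi-query sketch: run several rewrites of one query and merge results.
def generate_variants(query):
    # A real LLM would paraphrase the query; this rewrite is hard-coded.
    return [query, "rename network ssid"]

def search(query, docs, k=1):
    terms = set(query.lower().split())
    return sorted(
        docs,
        key=lambda d: len(terms & set(d.lower().split())),
        reverse=True,
    )[:k]

docs = ["change wifi name steps", "rename your network ssid", "printer setup"]

def multi_query_retrieve(query):
    seen, merged = set(), []
    for variant in generate_variants(query):
        for doc in search(variant, docs):
            if doc not in seen:  # de-duplicate across variants
                seen.add(doc)
                merged.append(doc)
    return merged

results = multi_query_retrieve("change wifi name")
print(results)  # -> ['change wifi name steps', 'rename your network ssid']
```

Each variant surfaces a document the other misses, which is exactly the recall boost the technique targets.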

7) Prompt Retriever

Stores a prompt template with a name and description, to be later queried by MultiPromptChain.

Purpose: Retrieves predefined prompts to guide the model in generating structured and domain-specific responses.

How It Works:

  • Uses stored prompt templates based on a given prompt name.

  • Applies a system message and description to guide the model.

  • Provides structured instructions for consistent output generation.

Use Cases:

  • Domain-specific assistants (e.g., physics, medical, legal).

  • Reusing predefined prompts across workflows.
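A small sketch of the storage-and-lookup pattern: templates registered under a name with a description, then retrieved by name and filled in. The prompt names and wording are illustrative, not a real chain's registry.

```python
# Prompt-retriever sketch: named templates with descriptions, looked up
# later (e.g., by a routing chain that picks the best-matching name).
PROMPTS = {
    "physics": {
        "description": "Good for answering questions about physics",
        "template": "You are a physics professor. Answer concisely: {question}",
    },
    "legal": {
        "description": "Good for answering legal questions",
        "template": "You are a legal assistant. Cite relevant law: {question}",
    },
}

def get_prompt(name, question):
    entry = PROMPTS[name]
    return entry["template"].format(question=question)

prompt = get_prompt("physics", "Why is the sky blue?")
print(prompt)
# -> "You are a physics professor. Answer concisely: Why is the sky blue?"
```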

8) Reciprocal Rank Fusion Retriever

Uses Reciprocal Rank Fusion to re-rank search results produced by multiple generated queries.

Purpose: Combines results from multiple retrieval methods to improve overall ranking and relevance.

How It Works:

  • Retrieves results using a vector store retriever.

  • Uses multiple ranking strategies.

  • Applies Reciprocal Rank Fusion (RRF) to merge and re-rank results.

  • Returns the most relevant combined results.

Use Cases:

  • Improving retrieval accuracy.

  • Combining multiple retrieval strategies.
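RRF itself fits in a few lines: each input ranking contributes a score of 1 / (k + rank) per document, and documents are re-ordered by the summed score. The example rankings are made up; k = 60 is the constant used in the original RRF paper.

```python
# Reciprocal Rank Fusion: merge several rankings into one.
def rrf(rankings, k=60):
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=scores.get, reverse=True)

# Two strategies disagree; RRF rewards documents ranked well by both.
vector_results = ["a", "b", "c"]
keyword_results = ["b", "c", "a"]
fused = rrf([vector_results, keyword_results])
print(fused)  # -> ['b', 'a', 'c']
```

"b" wins because it places near the top in both lists, even though each individual ranking put a different document first.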

9) Similarity Score Threshold Retriever

Returns results that meet a minimum similarity percentage.

Purpose: Filters retrieved documents based on a minimum similarity score to ensure only relevant results are returned.

How It Works:

  • Performs a similarity search on a vector store.

  • Calculates similarity scores for retrieved documents.

  • Filters out results below the defined threshold.

  • Returns only high-relevance documents.

Use Cases:

  • Removing low-quality or irrelevant results.

  • Fine-tuning retrieval precision in RAG systems.
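A sketch of threshold-based search: matches are scored as a similarity percentage at query time, anything under the minimum is dropped, and survivors come back sorted by score. The 2-D vectors and store contents are hand-made stand-ins for real embeddings.

```python
# Threshold retrieval sketch: score every stored vector against the
# query, drop anything under min_pct, return the rest best-first.
import math

def cosine_pct(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return 100.0 * dot / (math.hypot(*a) * math.hypot(*b))

STORE = [
    ("billing faq", (0.9, 0.1)),
    ("api errors", (0.5, 0.5)),
    ("team offsite photos", (0.0, 1.0)),
]

def retrieve(query_vec, min_pct=70.0):
    scored = [(text, cosine_pct(query_vec, vec)) for text, vec in STORE]
    kept = [(text, round(score, 1)) for text, score in scored if score >= min_pct]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)

results = retrieve((1.0, 0.0), min_pct=70.0)
print(results)  # "team offsite photos" scores 0% and is dropped
```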

10) Vector Store Retriever

Stores a vector store as a retriever, to be later queried by a Multi Retrieval QA Chain.

Purpose: Retrieves relevant documents directly from a vector store based on similarity search.

How It Works:

  • Takes a query and converts it into embeddings.

  • Searches the vector store for similar embeddings.

  • Retrieves the most relevant documents.

  • Uses the retriever name and description for identification.

Use Cases:

  • Basic semantic search.

  • Retrieving documents for RAG pipelines.
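A minimal sketch of wrapping a store as a named retriever so a routing chain can later pick it by its name and description. The class, names, and vectors are illustrative stand-ins, not a real chain's API.

```python
# Named vector-store retriever sketch: the name/description identify it
# to a routing chain; retrieval is a plain similarity search.
class VectorStoreRetriever:
    def __init__(self, name, description, store):
        self.name = name
        self.description = description
        self.store = store  # list of (text, vector) pairs

    def retrieve(self, query_vec, k=1):
        def dot(a, b):
            return sum(x * y for x, y in zip(a, b))
        ranked = sorted(
            self.store, key=lambda item: dot(query_vec, item[1]), reverse=True
        )
        return [text for text, _ in ranked[:k]]

hr = VectorStoreRetriever(
    "hr-docs",
    "HR policies and benefits",
    [("vacation policy", (1.0, 0.0)), ("401k match", (0.0, 1.0))],
)
registry = {r.name: r for r in [hr]}  # a routing chain would select by name
hits = registry["hr-docs"].retrieve((0.9, 0.1))
print(hits)  # -> ['vacation policy']
```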

11) Voyage AI Rerank Retriever

Voyage AI Rerank ranks the documents from most to least semantically relevant to the query.

Purpose: Improves retrieval quality by re-ranking documents using Voyage AI models.

How It Works:

  • Retrieves initial results from a vector store.

  • Sends the results along with the query to the Voyage AI rerank model.

  • Reorders documents based on relevance.

  • Returns the most relevant ranked results.

Use Cases:

  • Improving search result accuracy.

  • Enhancing ranking in retrieval pipelines.
