Integrating Prompt Compression and Reranking in AIR


This tutorial demonstrates how to use the prompt compression API and reranker API within the AIR framework.


Introduction

In complex AI systems, efficiently retrieving and processing information is crucial. The prompt compression API reduces the size of input prompts without losing essential information, enabling faster and more cost-effective processing. The reranker API improves the relevance of retrieved documents by reordering them based on their pertinence to the query.

This tutorial showcases how to integrate these two APIs into a research agent within AIR, enhancing its ability to answer user queries by retrieving, compressing, and reranking relevant information.

Overview of the Flow

The process involves several steps:

  1. User Query Input: The user provides a query.
  2. Information Retrieval: The agent retrieves documents from various sources using the user's query.
  3. Reranking: The reranker API reorders the retrieved documents based on their relevance.
  4. Compression: The prompt compression API reduces the size of the top-ranked documents.
  5. Response Generation: The agent formats the compressed documents into a prompt and generates a comprehensive response.

Below is a textual representation of the flow, followed by a toy code sketch of the same ordering:

User Query
  → Information Retrieval (from multiple sources)
  → Retrieved Documents
  → Reranker API
  → Ranked Documents
  → Prompt Compression API
  → Compressed Documents
  → Response Generation
  → Final Answer
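
To make the ordering concrete, here is a toy, self-contained Python sketch of the same pipeline. Every function and document in it is a placeholder invented for illustration; in AIR, the ResearchAgent performs these steps internally with the configured reranker and compression models. Only the order of operations mirrors the flow above.

```python
# Toy, self-contained sketch of the retrieve -> rerank -> compress -> generate flow.
# Nothing here is part of the AIR SDK: the documents, the word-overlap "reranker",
# and the truncation-based "compressor" are stand-ins that only illustrate the order
# in which the ResearchAgent applies reranking and compression.

def retrieve(query: str) -> list[str]:
    # Stand-in for the configured retrievers (e.g., a web search retriever).
    return [
        "Generative AI can personalize offerings and improve customer experiences.",
        "Voice assistants are widely used on mobile devices.",
        "Chatbots handle routine customer inquiries around the clock.",
    ]

def rerank(query: str, docs: list[str]) -> list[str]:
    # Crude relevance proxy: order documents by word overlap with the query.
    query_words = set(query.lower().split())
    return sorted(docs, key=lambda d: -len(query_words & set(d.lower().split())))

def compress(doc: str, rate: float) -> str:
    # Crude compression proxy: keep roughly `rate` of the words.
    words = doc.split()
    return " ".join(words[: max(1, int(len(words) * rate))])

def generate(prompt: str) -> str:
    # Stand-in for the LLM call that produces the final report.
    return f"Report based on:\n{prompt}"

def answer_query(query: str, reranker_top_k: int = 15, compression_rate: float = 0.4) -> str:
    docs = retrieve(query)                                # 1. information retrieval
    if reranker_top_k > 0:                                # 2. keep top-k after reranking; skipped otherwise
        docs = rerank(query, docs)[:reranker_top_k]
    if compression_rate < 1:                              # 3. prompt compression (skipped if 1)
        docs = [compress(d, compression_rate) for d in docs]
    context = "\n\n".join(docs)                           # 4. response generation
    return generate(f"{context}\n\nQuestion: {query}")

print(answer_query("future of generative AI in customer growth"))
```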

Configuration Overview

First, it is essential to understand the configuration settings for the reranker and compression features. The ResearchAgent is configured using a YAML configuration file.

Here is the relevant configuration snippet:

```yaml
base_config:
  reranker_config:
    model: "BAAI/bge-reranker-large" # a reranker from our model catalog

  compression_config:
    model: "microsoft/llmlingua-2-bert-base-multilingual-cased-meetingbank" # a compression model from our model catalog

orchestrator:
  agent_list:
    - agent_name: "Research Agent"

utility_agents:
  - agent_class: ResearchAgent
    agent_name: "Research Agent"
    agent_description: "This agent can help you research the information needed by the user on the internet."
    config:
      reranker_top_k: 15
      compression_rate: 0.4
      retriever_config_list:
        - retriever_name: "Internet Search" # A name you choose for your retriever
          retriever_class: WebSearchRetriever # WebSearchRetriever is the type of retriever that performs web search via Google.
          description: "This data source can collect the latest news / information from the open internet to answer any queries." # Optional. A description of the retriever.
```

Explanation of Configuration Parameters

  • reranker_top_k:
    • Purpose: Determines how many top documents to keep after reranking.
    • Usage: If set to a positive integer (e.g., 15), the agent retains the top 15 most relevant documents after reranking.
    • Skipping Reranking: Setting this to a negative value will skip the reranking step entirely.
  • compression_rate:
    • Purpose: Defines the fraction of its original size to which each retrieved document is compressed.
    • Usage: A value between 0 and 1. For example, 0.4 compresses the documents to 40% of their original size.
    • No Compression: Setting this to 1 means no compression will be applied. Both opt-out values are shown in the example after this list.
  • retriever_config_list:
    • Purpose: Defines the retrievers (data sources) used by the research agent to find relevant information for user queries. Each retriever is configured with a name, a retriever class, and a description of its purpose.
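
As a quick illustration of the opt-out values described above, the fragment below (an assumed variant of the earlier snippet, not a complete configuration file) disables both steps:

```yaml
utility_agents:
  - agent_class: ResearchAgent
    agent_name: "Research Agent"
    config:
      reranker_top_k: -1    # negative value: skip the reranking step entirely
      compression_rate: 1   # 1 means no compression is applied
      retriever_config_list:
        - retriever_name: "Internet Search"
          retriever_class: WebSearchRetriever
```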

Project Execution

Next, use our DistillerClient API to create a distiller client. This client will interface with the AI Refinery service to run your project. Below is a function that sets up the distiller client. Here's what it does:

  • Instantiates a DistillerClient.
  • Creates a project named example using the configuration specified in the example.yaml file.
  • Runs the project in interactive mode.
```python
from air import DistillerClient  # AI Refinery SDK; adjust the import path if your installation differs


def interactive():
    distiller_client = DistillerClient()

    # Upload your config file to register a new distiller project.
    distiller_client.create_project(config_path="example.yaml", project="example")

    # Start an interactive session for the registered project.
    distiller_client.interactive(
        project="example",
        uuid="test_user",
    )


if __name__ == "__main__":
    # Run interactive mode.
    print("\nInteractive Mode")
    interactive()
```

Sample Output

Let's consider a sample user query and observe how the system processes it.

User Query:

"Research the future of generative AI in Customer Growth"

System Processing:

  1. Information Retrieval:
    • Retrieves documents from sources like industry reports, academic papers, and news articles using the user's query.
  2. Reranking:
    • Reranks the documents to prioritize the most relevant ones concerning the query.
  3. Compression:
    • Compresses the top-ranked documents to include only essential information, reducing the prompt size to 40% of the original (a rough way to check this ratio is sketched after this list).
    • Example:
      • Original Text:
        * Which industries stand to gain the most?
        * What activities will deliver the most value for organizations?
        * How do—and will—workers feel about the technology?
        * What safeguards are needed to ensure responsible use of gen AI?

        In this visual _Explainer_, we’ve compiled all the answers we have so far—in 15 charts. We expect this space to evolve rapidly and will continue to roll out our research as that happens. To stay up to date on this topic, register for our email alerts on “artificial intelligence” here.

        Gen AI finds its legs

        The advanced machine learning that powers gen AI–enabled products has been decades in the making. But since ChatGPT came off the starting block in late 2022, new iterations of gen AI technology have been released several times a month. In March 2023 alone, there were six major steps forward, including new customer relationship management solutions and support for the financial services industry.

        _Source: What every CEO should know about generative AI_

      • Compressed Text:
        industries gain most? activities value for organizations? workers feel technology? safeguards responsible use gen AI? compiled answers in 15 charts expect space to evolve rapidly roll out research. register email alerts on artificial intelligence. Gen AI finds legs advanced machine learning gen AI products decades in making ChatGPT late 2022 new iterations gen AI technology released. March 2023 six major steps forward new customer relationship management solutions support for financial services industry. every CEO know about generative

  4. Response Generation:
    • Formats the prompt with the compressed documents.
    • Generates a comprehensive report.
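
As a rough check of the compression example in step 3, you can compare word counts between the original and compressed passages. This is only an approximation of compression_rate: the configured compression model works at the token level, so the observed ratio will not match 0.4 exactly.

```python
# Approximate word-count check of a compression result. The configured compression
# model operates on tokens, so this ratio only roughly tracks `compression_rate`.

def observed_compression_ratio(original: str, compressed: str) -> float:
    return len(compressed.split()) / len(original.split())

original = "What safeguards are needed to ensure responsible use of gen AI?"
compressed = "safeguards responsible use gen AI?"

ratio = observed_compression_ratio(original, compressed)
print(f"Compressed to about {ratio:.0%} of the original word count")  # ~45%, close to the configured 0.4
```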

Final Answer:

# The Future of AI in Customer Growth

The future of Artificial Intelligence (AI) in customer growth is rapidly evolving, with various trends and technologies emerging to transform the way businesses interact with their customers. According to recent studies, AI is expected to play a significant role in enhancing customer experiences, improving customer loyalty, and driving business growth.

**Key Trends in AI-Driven Customer Growth**

- **Personalization**: AI-powered personalization is becoming increasingly important, with 79% of customers expecting personalized experiences from brands[^1]. AI helps businesses tailor interactions to individual customer preferences, driving higher engagement and loyalty.
- **Generative AI**: Generative AI is a rapidly emerging trend with the potential to revolutionize customer service and marketing. It can be used to personalize offerings, optimize marketing and sales activities, and improve customer experiences[^2].
- **Voice Assistants**: Voice assistants like Amazon Alexa and Apple Siri are becoming increasingly popular, with 97% of mobile users utilizing AI-powered voice assistants[^3]. Businesses are expected to integrate voice assistants into their customer service strategies to provide more seamless and personalized experiences.
- **Chatbots**: Chatbots are becoming more sophisticated, with 91% of customer success leaders considering AI chatbots effective for customer support[^4]. Businesses are investing more in chatbots to handle customer inquiries and provide 24/7 support.

**Benefits of AI in Customer Growth**

- **Improved Customer Satisfaction**: AI enables businesses to provide more personalized and seamless experiences, leading to higher customer satisfaction and loyalty.
- **Increased Efficiency**: AI automates routine tasks, freeing up human customer support agents to focus on more complex and high-value tasks.
- **Enhanced Customer Insights**: AI provides businesses with valuable insights into customer behavior and preferences, allowing for data-driven decisions and improved marketing and sales strategies.

**Challenges and Concerns**

- **Data Quality and Security**: Ensuring high-quality and secure data is crucial for effectively training and deploying AI models.
- **Transparency and Explainability**: Businesses must ensure AI decision-making processes are transparent and explainable to build trust with customers.
- **Job Displacement**: The adoption of AI may lead to job displacement; companies need to develop strategies to upskill and reskill employees.

**Conclusion**

The future of AI in customer growth is exciting and rapidly evolving. Businesses that adopt AI technologies and strategies can expect significant benefits, including improved customer satisfaction, increased efficiency, and enhanced customer insights. However, they must also address the challenges associated with AI adoption, such as data quality, transparency, and workforce impact.

**References**

1. Salesforce State of the Connected Customer report
2. McKinsey & Company Report on Generative AI
3. Tech Jury statistics on voice search and AI-powered voice assistants
4. HubSpot survey on the future of AI in customer service

Note: The references correspond to the retrieved and compressed documents.

Conclusion

By integrating the prompt compression and reranker APIs, the AIR system efficiently processes user queries, retrieves and prioritizes relevant information, and generates detailed, high-quality responses.