# Dota RAG Pipeline

This repository implements a **Retrieval-Augmented Generation (RAG) pipeline** for answering support questions. It leverages **AWS SSM** for secure parameter management, a **transformer-based embedding model**, **Pinecone** for vector search, and **AI71's Falcon-180B-chat model** for generating responses.

## 🚀 Features

- **🔐 Secure Parameter Management:** Uses AWS SSM to securely retrieve parameters and secrets.
- **🤖 Transformer Embeddings:** Uses a pre-trained transformer model to generate embeddings for queries.
- **🔍 Pinecone Integration:** Retrieves relevant context via vector search.
- **📚 RAG Pipeline:** Combines retrieved context with a chat model to generate informative answers.
- **📊 Reranking Mechanism:** Uses the `bge-reranker-v2-m3` model to enhance search relevance.
- **✅ Evaluation Script:** Processes a JSONL file of questions and saves the responses for evaluation.

---

## 🛠️ Requirements

- Python **3.8 or later**
- **pip** (Python package manager)
- **Pinecone API key**
- **AWS credentials** (for SSM)
- **AI71 API key** (for Falcon-180B-chat)

---

## 📦 Installation

1. **Clone the repository:**

   ```bash
   git clone https://github.com/yourusername/dota-rag.git
   cd dota-rag
   ```

2. **Install the required dependencies:**

   ```bash
   pip install -r requirements.txt
   ```

3. **Set up environment variables:**

   - Copy the sample environment file:

     ```bash
     cp env_sample .env
     ```

   - Open `.env` and insert your required keys (AWS credentials, Pinecone API key, AI71 API key, etc.).

---

## 🚀 Running the RAG Pipeline

To run the pipeline for a **single query**, execute:

```bash
python main.py
```

This script runs the retrieval and generation pipeline, using Pinecone to find relevant context and Falcon-180B-chat to generate a response.

To check that an answers file matches the required LiveRAG submission schema, run:

```bash
python verify_answer.py sample_answers.jsonl
```

---

## 📊 Evaluation (Batch Processing)

The repository includes an **evaluation script** that processes multiple questions from a JSONL file; a minimal sketch of the batch loop follows the steps below.

1. **Prepare your JSONL file:**

   Ensure your file is located at:

   ```
   data/testset/testset-50q.jsonl
   ```

   Each line should be a JSON object containing a **`Question`** field.

2. **Generate responses to evaluate:**

   ```bash
   python get_response.py data/testset/testset-50q.jsonl
   # or
   python get_response.py data/testset/testset-3q.jsonl
   python verify_answer.py data/out/testset-3q-result.jsonl
   ```

3. **Run the judge scoring:**

   ```bash
   python utils/eval/evaluate.py \
       --input_file "data/out/testset-3q-result.jsonl" \
       --eval_name "both"
   ```
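For orientation, here is a minimal sketch of what a batch step like `get_response.py` does: read the `Question` field from each JSONL line, generate an answer, and write the enriched record back out. The `answer_question` helper is hypothetical — the actual entry point exposed by `rag/rag_pipeline.py` may have a different name or signature:

```python
import json
import sys

# Hypothetical entry point: the actual function exposed by
# rag/rag_pipeline.py may have a different name or signature.
from rag.rag_pipeline import answer_question


def run_batch(in_path: str, out_path: str) -> None:
    """Read a JSONL testset and write one answered record per question."""
    with open(in_path, encoding="utf-8") as fin, \
         open(out_path, "w", encoding="utf-8") as fout:
        for line in fin:
            line = line.strip()
            if not line:
                continue
            record = json.loads(line)
            question = record.get("Question")
            if not question:
                # Missing or invalid questions are skipped
                # (see Notes & Limitations below).
                continue
            record["answer"] = answer_question(question)
            fout.write(json.dumps(record, ensure_ascii=False) + "\n")


if __name__ == "__main__":
    run_batch(sys.argv[1], "data/out/result.jsonl")
```

The actual script additionally derives the output name from the input (e.g. `testset-3q.jsonl` → `data/out/testset-3q-result.jsonl`).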
---

## 📂 Repository Structure

```
dota-rag/
├── data/
│   ├── data_morgana_examples_live-rag.csv   # CSV file with evaluation questions
│   └── testset/                             # JSONL testsets (e.g. testset-50q.jsonl)
├── utils/
│   └── eval/
│       ├── evaluate.py                      # Judge evaluation script
│       ├── prompt_faithfulness.py           # Faithfulness metric prompt templates
│       └── prompt_relevance.py              # Relevance metric prompt templates
├── rag/
│   ├── __init__.py                          # Package initialization and exports
│   ├── aws_ssm.py                           # AWS SSM utilities
│   ├── embedding.py                         # Transformer embedding functions
│   ├── pinecone_utils.py                    # Pinecone vector search and reranking functions
│   ├── rag_pipeline.py                      # RAG pipeline implementation
│   └── rerank.py                            # Reranking functions (bge-reranker-v2-m3)
├── env_sample                               # Sample environment file (copy to .env)
├── get_response.py                          # Batch script that generates responses for a testset
├── verify_answer.py                         # Checks answers against the submission schema
├── main.py                                  # Main script for running the pipeline
├── requirements.txt                         # Python dependencies
└── README.md
```

---

## 📌 Reranking Mechanism

The pipeline **enhances retrieval accuracy** by reranking results with the `bge-reranker-v2-m3` model:

1. Pinecone retrieves the top-k documents.
2. The **reranker model** reorders them by **query relevance**.
3. The **most relevant** documents are fed into the **chat model**.

### Example Usage

```python
from rag.rerank import rerank_documents

query = "Tell me about the tech company Apple"
documents = [
    "Apple is a fruit with a crisp texture.",
    "Apple Inc. is a technology company that makes the iPhone.",
    "Many people eat apples for their health benefits.",
    "Apple revolutionized the tech industry with its sleek designs."
]

reranked_results = rerank_documents(query, documents, top_n=3)
print(reranked_results)
```

**Output:**

```
[
    {"score": 0.98, "document": "Apple Inc. is a technology company that makes the iPhone."},
    {"score": 0.91, "document": "Apple revolutionized the tech industry with its sleek designs."},
    {"score": 0.75, "document": "Apple is a fruit with a crisp texture."}
]
```

---

## 🔥 Notes & Limitations

- **Token Limits:** Falcon-180B-chat has a **2048-token limit** (input + output). The pipeline **truncates** retrieved context to stay within this budget; a sketch of one way to do this follows below.
- **Retries & Skipped Questions:** If a question is missing (`NaN`) or invalid, it is **skipped** during evaluation to avoid errors.
- **AI71 API Authentication:** Ensure your **API keys** are set in `.env` before running.
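To illustrate the token-limit note, here is a minimal sketch of budget-aware context truncation. It approximates token counts with whitespace splitting as a stand-in; the repository's actual tokenizer and budget accounting may differ:

```python
from typing import List


def truncate_context(chunks: List[str], question: str,
                     max_tokens: int = 2048,
                     reserve_for_answer: int = 256) -> List[str]:
    """Keep the highest-ranked chunks that fit within the model's budget.

    Token counts are approximated by whitespace splitting; the real
    pipeline would count tokens with the model's own tokenizer.
    """
    budget = max_tokens - reserve_for_answer - len(question.split())
    kept, used = [], 0
    # Chunks are assumed to arrive ordered by reranker score (best first).
    for chunk in chunks:
        cost = len(chunk.split())
        if used + cost > budget:
            break
        kept.append(chunk)
        used += cost
    return kept
```

Reserving part of the budget for the completion keeps prompt plus answer under the 2048-token cap.

---

## 📜 License

This project is licensed under the **MIT License**.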