# echo-parrot

**Repository Path**: workwind/echo-parrot

## Basic Information

- **Project Name**: echo-parrot
- **Description**: No description available
- **Primary Language**: Java
- **License**: Apache-2.0
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 7
- **Forks**: 2
- **Created**: 2025-05-20
- **Last Updated**: 2026-01-02

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# Echo-Parrot

Echo-Parrot (Chinese: 回声鹦鹉) is an AI service that automatically constructs industry-specific knowledge bases from publicly available information, using large language models (LLMs) and other AI technologies.

## Project Introduction

Echo-Parrot is your dedicated industry knowledge mining expert. Define your target domain, and it continuously collects, organizes, and analyzes public information from the internet 24/7, building a structured, professional industry knowledge base tailored to your needs.

Say goodbye to tedious manual data collection and organization. Echo-Parrot uses web crawling to automatically gather the latest industry insights from websites, blogs, RSS feeds, and more, then applies AI techniques to process the material and extract valuable knowledge.

You can interact with the system in real time through text or voice, adjusting the construction direction and refining content quality so that the knowledge base meets your professional requirements.

Whether for market research, competitive analysis, or industry trend forecasting, Echo-Parrot provides comprehensive, accurate, and up-to-date industry knowledge, helping you quickly identify what matters in an ocean of information and gain a competitive edge in decision-making.

## Phase One Features (Completed)

### 1. Functional Features

- Create knowledge base
- Support batch upload of multiple documents
- Supported document formats: txt, md, doc, docx, ppt, pptx, pdf
- Save as standard text format (after reading but before chunking)
- Chunk and persist to database
- Knowledge base file list query
- File chunk query
- Similarity search
- Integration with xiaozhi-server-java

### 2. Knowledge Base Storage

- User-uploaded knowledge documents are stored in the system-configured local directory
  - `${datasets.path:datasets}/uploads` - stores original uploaded files
  - `${datasets.path:datasets}/segments` - stores segmented text chunks (currently in XML format)
- Phase one supports a limited set of document formats, including Word, Markdown, TXT, and PDF (images, Excel, and engineering drawings are not supported at this stage)
- Vectorized embedding data is saved directly to Neo4j
- Phase one uses standard paragraph segmentation followed by vectorization (knowledge graph triples are not created at this stage)

## Phase Two Feature Plan

### 1. Overview

Automatically build industry knowledge bases from public information, providing comprehensive market analysis, research, and product selection, and generating insightful reports.

"AI toys" serves as the demonstration domain for building an industry knowledge base, but Echo-Parrot is applicable to any industry sector.

### 2. MCP-Server Implementation

As a provider of industry knowledge bases, Echo-Parrot will implement a complete MCP-Server:

- Prompts are not provided directly through MCP
- Agents and various tools provide services externally through MCP
- Provide a complete HTTP+SSE MCP-Server implementation
- Knowledge base content is provided as MCP Resources

### 3. Core Functional Modules

#### 3.1 Information Source Management

- Support configuration of RSS sources (such as WeChat official account articles) for automatic acquisition of industry-related information
- Currently managed through configuration files (no visual interface is provided in this phase)

#### 3.2 Data Acquisition and Cleaning

- Build a full-lifecycle data storage structure
- Automatically process crawled data, removing duplicates and invalid information
- Automatically integrate, analyze, and summarize materials
- Establish relationships between products, companies, and industrial chains
- System runs 24/7 for network data crawling and processing

#### 3.3 Intelligent Knowledge Base

- Retain original document storage
- Vectorize documents to support semantic search
- Process raw materials into structured knowledge
- Automatically identify new materials and update the knowledge base, with historical tracking support

#### 3.4 Question-Answering Agent (Supports Voice I/O)

- Automatically plan tool calls and information acquisition based on user needs
- Support exporting analysis results as PDF, Markdown, and other formats
- Match the most relevant market information and product data based on user input
- Continuously optimize information acquisition and analysis strategies based on user feedback

#### 3.5 AI Toy Product Library (Example Domain)

- Catalog AI toy products currently on the market with detailed information
- Manage components by category, such as chips, PCBA, batteries, and speakers
- Record component technical specifications and associate them with specific products
- Provide product market trends, popularity indices, and other analyses

#### 3.6 AI Toy Industry Chain

- Visualize the upstream and downstream relationships of the AI toy industry chain

#### 3.7 AI Toy Company Library

- Catalog information on companies upstream and downstream in the industry chain
- Automatically collect and verify company contact information

### 4. User Management

- Use JWT for user authentication
- Implement permission management based on RBAC
- Simple, intuitive interface; no operation takes more than 3 steps
- Only a PC browser layout is provided in this phase
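
The RBAC item above can be illustrated with a minimal sketch. It is not taken from the Echo-Parrot codebase: the class `RbacSketch`, the roles (`VIEWER`, `EDITOR`, `ADMIN`), and the permissions (`KB_READ`, `KB_WRITE`, `KB_DELETE`, `USER_MANAGE`) are hypothetical names chosen for illustration. It assumes the user's roles have already been read from a verified JWT, and shows only the permission check itself.

```java
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical RBAC sketch; role and permission names are illustrative only. */
public class RbacSketch {

    /** Actions a user may perform (assumed set, not from the project). */
    enum Permission { KB_READ, KB_WRITE, KB_DELETE, USER_MANAGE }

    /** Roles, each mapped to a fixed permission set (assumed set). */
    enum Role { VIEWER, EDITOR, ADMIN }

    // Static role-to-permission table; a real system would likely load this from storage.
    private static final Map<Role, Set<Permission>> GRANTS = Map.of(
        Role.VIEWER, EnumSet.of(Permission.KB_READ),
        Role.EDITOR, EnumSet.of(Permission.KB_READ, Permission.KB_WRITE),
        Role.ADMIN, EnumSet.allOf(Permission.class));

    /** Returns true if at least one of the user's roles grants the permission. */
    static boolean isAllowed(Set<Role> roles, Permission needed) {
        for (Role role : roles) {
            if (GRANTS.getOrDefault(role, Set.of()).contains(needed)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Roles as they might be read from the claims of an already-verified JWT.
        Set<Role> roles = EnumSet.of(Role.EDITOR);
        System.out.println(isAllowed(roles, Permission.KB_WRITE));  // prints true
        System.out.println(isAllowed(roles, Permission.KB_DELETE)); // prints false
    }
}
```

Keeping the role-to-permission table separate from the check means new roles can be added without touching the call sites that enforce access.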