# watercrawl
**Repository Path**: ancodehub/watercrawl
## Basic Information
- **Project Name**: watercrawl
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 0
- **Forks**: 0
- **Created**: 2025-12-25
- **Last Updated**: 2025-12-25
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README

[](https://watercrawl.dev)
[](https://watercrawl.dev/pricing)
[](https://github.com/watercrawl/watercrawl/releases)
[](https://github.com/watercrawl/watercrawl/actions)
[](https://hub.docker.com/r/watercrawl/watercrawl)
[](https://github.com/watercrawl/watercrawl/stargazers)
[](https://github.com/watercrawl/watercrawl/issues)
[](https://www.python.org/downloads/)
π·οΈ WaterCrawl is a powerful web application that uses Python, Django, Scrapy, and Celery to crawl web pages and extract relevant data.
## π Quick Start
1. π³ [Quick start](#-quick-start)
2. π» [Development **(For Contributing)**](./CONTRIBUTING.md)
### π³ Quick start
To build and run WaterCrawl on Docker locally, please follow these steps:
1. Clone the repository:
```bash
git clone https://github.com/watercrawl/watercrawl.git
cd watercrawl
```
2. Build and run the Docker containers:
```bash
cd docker
cp .env.example .env
docker compose up -d
```
3. Access the application with open [http://localhost](http://localhost)
> **β οΈ IMPORTANT**: If you're deploying on a domain or IP address other than localhost, you MUST update the MinIO configuration in your .env file:
> ```bash
> # Change this from 'localhost' to your actual domain or IP
> MINIO_EXTERNAL_ENDPOINT=your-domain.com
>
> # Also update these URLs accordingly
> MINIO_BROWSER_REDIRECT_URL=http://your-domain.com/minio-console/
> MINIO_SERVER_URL=http://your-domain.com/
> ```
> Failure to update these settings will result in broken file uploads and downloads. For more details, see [DEPLOYMENT.md](./DEPLOYMENT.md).
> **Important:** Before deploying to production, ensure that you update the `.env` file with the appropriate configuration values. Additionally, make sure to set up and configure the database, MinIO, and any other required services. for more information, please read the [Deployment Guide](./DEPLOYMENT.md).
### π» Development (For Contributing)
For local development and contribution, please follow our [Contributing Guide](./CONTRIBUTING.md) π€
## β¨ Features
- **πΈοΈ Advanced Web Crawling & Scraping** - Crawl websites with highly customizable options for depth, speed, and targeting specific content
- **π Powerful Search Engine** - Find relevant content across the web with multiple search depths (basic, advanced, ultimate)
- **π Multi-language Support** - Search and crawl content in different languages with country-specific targeting
- **β‘ Asynchronous Processing** - Monitor real-time progress of crawls and searches via Server-Sent Events (SSE)
- **π REST API with OpenAPI** - Comprehensive API with detailed documentation and client libraries
- **π Rich Ecosystem** - Integrations with Dify, N8N, and other AI/automation platforms
- **π Self-hosted & Open Source** - Full control over your data with easy deployment options
- **π Advanced Results Handling** - Download and process search results with customizable parameters
Check our [API Overview](https://docs.watercrawl.dev/intro) to learn more about these features.
## π οΈ Client SDKs
- β
[**Python Client**](https://docs.watercrawl.dev/clients/python) - Full-featured SDK with support for all API endpoints
- β
[**Node.js Client**](https://docs.watercrawl.dev/clients/nodejs) - Complete JavaScript/TypeScript integration
- β
[**Go Client**](https://docs.watercrawl.dev/clients/go) - Full-featured SDK with support for all API endpoints
- β
[**PHP Client**](https://docs.watercrawl.dev/clients/php) - Full-featured SDK with support for all API endpoints
- π [**Rust Client**](https://docs.watercrawl.dev/clients/rust) - Coming soon
## π Integrations
- β
[Dify Plugin](https://marketplace.dify.ai/plugins/watercrawl/watercrawl) ([source code](https://github.com/watercrawl/watercrawl-dify-plugin))
- β
[N8N workflow node](https://www.npmjs.com/package/@watercrawl/n8n-nodes-watercrawl) ([source code](https://github.com/watercrawl/n8n-nodes-watercrawl))
- β
[Dify Knowledge Base](https://dify.ai/)
- π Langflow (Pull Request - Not Merged yet)
- π Flowise (Coming soon)
## π§ Plugins
- β
WaterCrawl plugin
- β
OpenAI Plugin
## β Star History
[](https://star-history.com/#watercrawl/watercrawl&Date)
## π Security Disclosure
β οΈ Please avoid posting security issues on GitHub. Instead, send your questions to support@watercrawl.dev and we will provide you with a more detailed answer.
## π License
This repository is available under the [WaterCrawl License](LICENSE), which is essentially MIT with a few additional restrictions.
---
Made with β€οΈ by the WaterCrawl Team