# Icrawler9

**Repository Path**: guixuqi/icrawler9

## Basic Information

- **Project Name**: Icrawler9
- **Description**: No description available
- **Primary Language**: Python
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2024-11-05
- **Last Updated**: 2024-11-05

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

Project Demo

1. Clone the Demo repository:
```shell
git clone https://gitee.com/guixuqi/icrawler9.git
```

2. (Recommended) Create a virtual environment to manage Python packages for your project:
```shell
# git clone names the directory after the repository URL, so it is lowercase
cd icrawler9
python3 -m venv venv
```

3. Activate the virtual environment:
```shell
# Windows:
.\venv\Scripts\activate

# Linux:
source venv/bin/activate
```

4. Install the required Python packages from requirements.txt:
```shell
pip install -r requirements.txt
```

5. Add tasks to Redis: for each search keyword, push a search URL onto the `digikey:tasks` key, built as `'https://www.digikey.com/en/products/result?keywords={}'.format(keyword)`.

6. Start the spider:
```shell
scrapy crawl 'spider name'
```

Project Tree

```text
├─Material
|  ├─middlewares
|  ├─pipelines
|  ├─items
|  ├─settings
|  ├─fileStores (file storage)
|  |  ├─excels (data in Excel format)
|  |  ├─imgs (shared directory for saved images)
|  |  ├─pdfs (local PDF files)
|  |  ├─rars (directory for compressed archives)
|  |  └─jsons (JSON files)
|  ├─special (auxiliary helpers)
|  ├─tools (shared methods and configuration)
|  |  ├─configs.py (shared configuration)
|  |  └─utils.py (shared methods)
|  └─spiders
|     ├─digikey
|     ├─....
├─Alscript (other functionality)
|  ├─dataPro (data processing)
├─.gitignore (files ignored by git)
├─requirements.txt (dependencies)
└─....
```
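
Step 5 above (adding tasks to Redis) could be sketched in Python roughly as follows. This is a minimal sketch, not the project's own tooling. The key name and URL template come from the README. Everything else is an assumption: the helper names `build_task_url`, `push_tasks`, and `main`, the use of `LPUSH` (the README only says "push"), URL-encoding the keyword, the sample keywords, and a Redis server on `localhost:6379`.

```python
from urllib.parse import quote_plus

# Key name and URL template as given in step 5 of the README.
TASK_KEY = "digikey:tasks"
SEARCH_URL = "https://www.digikey.com/en/products/result?keywords={}"


def build_task_url(keyword: str) -> str:
    """Fill the search template with the keyword.

    URL-encoding the keyword is an addition here (assumption); the README
    formats the raw keyword directly into the template.
    """
    return SEARCH_URL.format(quote_plus(keyword))


def push_tasks(client, keywords) -> int:
    """Push one search URL per keyword onto TASK_KEY; return how many were pushed.

    `client` is any object with a Redis-style lpush(key, value) method.
    LPUSH is an assumption: the spider may expect RPUSH or another structure.
    """
    count = 0
    for keyword in keywords:
        client.lpush(TASK_KEY, build_task_url(keyword))
        count += 1
    return count


def main() -> None:
    # redis-py is imported lazily so the helpers above work without it.
    import redis

    # Host, port, and db are assumptions for a default local Redis instance.
    client = redis.Redis(host="localhost", port=6379, db=0)
    # Sample keywords are hypothetical.
    pushed = push_tasks(client, ["STM32F103", "10k resistor"])
    print(f"pushed {pushed} task(s) to {TASK_KEY}")
```

Calling `main()` from a Python session inside the activated virtual environment would queue two search URLs, assuming `redis-py` is installed and a Redis server is reachable. Check the spider's settings for the actual connection details and queue direction before relying on this.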