# Advanced Python Web Crawling, Part 2 (Python网络爬虫进阶_中)

**Repository Path**: gitgn/spider4

## Basic Information

- **Project Name**: Python网络爬虫进阶_中
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2019-01-13
- **Last Updated**: 2020-12-19

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

1. Use the Scrapy framework together with Selenium to crawl product listing information from JD.com (at least 50 pages)

Project: scrapy_selenium

MongoDB configuration:
MONGO_URI = 'localhost'
MONGO_DB = 'jd'
Collection: products
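
The settings above are typically consumed by a Scrapy item pipeline. The following is a minimal sketch of such a pipeline, not the project's actual code: the class name `MongoPipeline` and the `client_factory` parameter are assumptions introduced here for illustration (the latter only so the class can be exercised without a running MongoDB server).

```python
class MongoPipeline:
    """Sketch: store every scraped item in the `products` collection."""

    collection_name = "products"

    def __init__(self, mongo_uri, mongo_db, client_factory=None):
        self.mongo_uri = mongo_uri
        self.mongo_db = mongo_db
        # Hypothetical hook: lets callers substitute a fake client in tests.
        self.client_factory = client_factory

    @classmethod
    def from_crawler(cls, crawler):
        # Read MONGO_URI and MONGO_DB from the Scrapy settings.
        return cls(
            mongo_uri=crawler.settings.get("MONGO_URI"),
            mongo_db=crawler.settings.get("MONGO_DB"),
        )

    def open_spider(self, spider):
        factory = self.client_factory
        if factory is None:
            # Imported lazily so pymongo is only required for a real run.
            import pymongo
            factory = pymongo.MongoClient
        self.client = factory(self.mongo_uri)
        self.db = self.client[self.mongo_db]

    def process_item(self, item, spider):
        # Convert the item to a plain dict before inserting it.
        self.db[self.collection_name].insert_one(dict(item))
        return item

    def close_spider(self, spider):
        self.client.close()
```

A pipeline like this would be enabled through the `ITEM_PIPELINES` setting in `settings.py`; the module path to use there depends on the project layout.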

Spider file: JD.py
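
One common way to combine Scrapy with Selenium is a downloader middleware that loads each request in a browser and hands the rendered HTML back to the spider. The sketch below illustrates that pattern under stated assumptions; it is not taken from this repository. The class name `SeleniumMiddleware` and the choice of Chrome are hypothetical.

```python
class SeleniumMiddleware:
    """Sketch: render each request with a Selenium-driven browser."""

    def __init__(self, driver):
        # `driver` is any Selenium WebDriver instance.
        self.driver = driver

    @classmethod
    def from_crawler(cls, crawler):
        # Imported lazily so the class can be defined without a browser setup.
        from scrapy import signals
        from selenium import webdriver

        middleware = cls(webdriver.Chrome())  # assumption: Chrome is installed
        # Quit the browser when the spider finishes.
        crawler.signals.connect(
            middleware.spider_closed, signal=signals.spider_closed
        )
        return middleware

    def process_request(self, request, spider):
        from scrapy.http import HtmlResponse

        # Load the page in the browser so JavaScript-rendered content exists.
        self.driver.get(request.url)
        # Returning a response here makes Scrapy skip its own download.
        return HtmlResponse(
            url=request.url,
            body=self.driver.page_source,
            encoding="utf-8",
            request=request,
        )

    def spider_closed(self, spider):
        self.driver.quit()
```

Such a middleware would be registered under `DOWNLOADER_MIDDLEWARES` in `settings.py`. Pages whose product list loads only after scrolling may also need explicit waits or scroll commands before `page_source` is read.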

2. Use scrapy-redis to crawl all course information from the CSDN Academy (CSDN学院) platform in a distributed fashion

Project: scrapy-redis
Redis configuration:
REDIS_HOST = '127.0.0.1'
REDIS_PORT = 6379
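
For context, a scrapy-redis project usually pairs the Redis connection values with a few more entries in `settings.py`. The fragment below is a sketch of a typical setup, not a copy of this project's settings; only `REDIS_HOST` and `REDIS_PORT` come from this README, and `SCHEDULER_PERSIST` in particular is an optional choice assumed here.

```python
# Schedule requests through Redis so several workers share one queue.
SCHEDULER = "scrapy_redis.scheduler.Scheduler"

# Deduplicate requests across all workers using a Redis-backed filter.
DUPEFILTER_CLASS = "scrapy_redis.dupefilter.RFPDupeFilter"

# Assumption: keep the queue and fingerprints in Redis between runs.
SCHEDULER_PERSIST = True

# Push scraped items into a Redis list for later processing.
ITEM_PIPELINES = {
    "scrapy_redis.pipelines.RedisPipeline": 300,
}

# Connection values as listed in this README.
REDIS_HOST = "127.0.0.1"
REDIS_PORT = 6379
```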

Script that saves the course data crawled into Redis to MongoDB:
slave\spider_CSDNCourse\spider_CSDNCourse\saveCourse.py

MongoDB storage:
Database: csdn
Collection: course
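
As a rough illustration of what a Redis-to-MongoDB transfer script can look like, here is a minimal sketch. It is not the contents of `saveCourse.py`. It assumes the items were pushed by scrapy-redis's `RedisPipeline`, which stores them as JSON strings in a Redis list; the key name `csdn_course:items`, the MongoDB host `localhost`, and the function names are all hypothetical.

```python
import json


def drain_items(redis_client, collection, key):
    """Pop JSON items from a Redis list and insert them into MongoDB.

    Returns the number of documents saved.
    """
    saved = 0
    while True:
        raw = redis_client.lpop(key)
        if raw is None:  # the list is empty
            break
        collection.insert_one(json.loads(raw))
        saved += 1
    return saved


def main():
    # Imported lazily so drain_items can be used without these packages.
    import pymongo
    import redis

    redis_client = redis.Redis(host="127.0.0.1", port=6379)
    mongo_client = pymongo.MongoClient("localhost")  # assumed host
    try:
        collection = mongo_client["csdn"]["course"]
        # Hypothetical key; the real one depends on the spider's name.
        count = drain_items(redis_client, collection, "csdn_course:items")
        print(f"saved {count} course documents")
    finally:
        mongo_client.close()
```

Calling `main()` once the crawl has finished moves whatever is currently in the list. For a long-running crawl, a blocking pop such as `blpop` inside a loop would be an alternative.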