# Scrapy_Book_Code

**Repository Path**: NiceBlueChai/Scrapy_Book_Code

## Basic Information

- **Project Name**: Scrapy_Book_Code
- **Description**: Source code from the book 《精通 Scrapy 网络爬虫》 (Mastering Scrapy Web Crawling) by Liu Shuo (刘硕)
- **Primary Language**: Python
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 1
- **Forks**: 0
- **Created**: 2019-12-28
- **Last Updated**: 2020-12-24

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# Source code from the book 《精通 Scrapy 网络爬虫》 (Mastering Scrapy Web Crawling) by Liu Shuo (刘硕)
### Environment: Python 3

### Chapter 1: Getting Started with Scrapy
example

### Chapter 5: Processing Data with Item Pipeline
Convert pounds sterling to RMB, filter duplicate data | Store data in MongoDB  
charpter5
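
The first two tasks above can be sketched as plain pipeline classes. This is a minimal illustration, not the book's code: the field names `name` and `price`, the `£53.74`-style price format, and the exchange rate are assumptions, and the `DropItem` fallback exists only so the sketch runs without Scrapy installed.

```python
try:
    from scrapy.exceptions import DropItem
except ImportError:
    # Fallback so the sketch runs without Scrapy installed.
    class DropItem(Exception):
        pass


class PriceConverterPipeline:
    # Assumed placeholder rate; replace with a current GBP-to-CNY rate.
    exchange_rate = 8.5

    def process_item(self, item, spider):
        # Assumes item["price"] is a string such as "£53.74".
        pounds = float(item["price"].lstrip("£"))
        item["price"] = "¥%.2f" % (pounds * self.exchange_rate)
        return item


class DuplicatesPipeline:
    def __init__(self):
        # Names of items already seen during this crawl.
        self.seen_names = set()

    def process_item(self, item, spider):
        # Assumes item["name"] identifies an item uniquely.
        name = item["name"]
        if name in self.seen_names:
            raise DropItem("Duplicate item found: %s" % name)
        self.seen_names.add(name)
        return item
```

In a Scrapy project, classes like these are enabled through the `ITEM_PIPELINES` setting, where the integer assigned to each class decides the order in which items pass through them.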

### Chapter 7: Adding an Excel Export Format
charpter7

### Chapter 8: Scraping Book Information
toscrape_book

### Chapter 9: Downloading Files and Images
1. Scrape the source files of the matplotlib examples  
matplotlib_examples

2. Download images from 360 Images  
so_image

### Chapter 10: Simulating Login
1. Simulated login to webscraping  
webscraping

2. CAPTCHA recognition  
charpter10_captcha

3. Cookie login  
browser_cookie

### Chapter 11: Scraping Dynamic Pages
Scrape famous quotes from toscrape | Scrape book information from JD.com  
charpter11

### Chapter 12: Storing Data in a Database
Store data in MySQL  
mysql_toscrape

### Chapter 13: Using HTTP Proxies
1. Implement random proxies  
proxy_example

2. Scrape Douban movie information  
douban_movie  
[Debugging a URL in Scrapy shell returns a 403 error](http://www.iamnancy.top/post/47/)
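
The random-proxy idea from item 1 can be sketched as a downloader middleware that sets `request.meta["proxy"]`, the key Scrapy's built-in `HttpProxyMiddleware` reads. This is a minimal sketch, not the code in `proxy_example`: the `PROXY_POOL` setting name and the scheme-to-list layout of the proxy data are assumptions made for illustration.

```python
import random
from urllib.parse import urlparse


class RandomProxyMiddleware:
    """Downloader middleware sketch: pick a random proxy for each request."""

    def __init__(self, proxies):
        # Maps a URL scheme ("http", "https") to a list of proxy URLs.
        self.proxies = proxies

    @classmethod
    def from_crawler(cls, crawler):
        # PROXY_POOL is a hypothetical setting name, not a built-in one.
        return cls(crawler.settings.getdict("PROXY_POOL"))

    def process_request(self, request, spider):
        scheme = urlparse(request.url).scheme
        candidates = self.proxies.get(scheme)
        if candidates:
            # Route this request through a randomly chosen proxy.
            request.meta["proxy"] = random.choice(candidates)
        # Returning None lets Scrapy continue handling the request.
        return None
```

Choosing the proxy per request rather than once per crawl spreads the requests across the whole pool, which is the point of a random-proxy setup.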

### Chapter 14: Distributed Crawling
Distributed crawling with scrapy-redis  
toscrape_book_distributed

[Using redis](http://www.iamnancy.top/post/51/)  
[Distributed crawling with scrapy-redis](http://www.iamnancy.top/post/50/)
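
For orientation, a scrapy-redis setup is mostly a matter of `settings.py`. The fragment below is a sketch of the usual settings, not the configuration in `toscrape_book_distributed`: the Redis address is a placeholder and should point at the Redis server shared by all crawler nodes.

```python
# settings.py fragment (sketch)

# Use the scrapy-redis scheduler so all nodes share one request queue.
SCHEDULER = "scrapy_redis.scheduler.Scheduler"

# Deduplicate requests through Redis instead of in local memory.
DUPEFILTER_CLASS = "scrapy_redis.dupefilter.RFPDupeFilter"

# Keep the queue and the dedup set in Redis after the spider closes.
SCHEDULER_PERSIST = True

# Placeholder address; replace with the shared Redis server.
REDIS_URL = "redis://localhost:6379"

# Optionally collect scraped items in Redis as well.
ITEM_PIPELINES = {
    "scrapy_redis.pipelines.RedisPipeline": 300,
}
```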