
yangjh / python

spider.html 57.82 KB
yangjh committed on 2020-05-07 17:48: relocation
<!DOCTYPE html>
<html lang="" xml:lang="">
<head>
<meta charset="utf-8" />
<meta http-equiv="X-UA-Compatible" content="IE=edge" />
<title>Chapter 12 Web Crawlers | Python Programming</title>
<meta name="description" content="Programming with Python" />
<meta name="generator" content="bookdown 0.17.2 and GitBook 2.6.7" />
<meta property="og:title" content="Chapter 12 Web Crawlers | Python Programming" />
<meta property="og:type" content="book" />
<meta property="og:description" content="Programming with Python" />
<meta name="github-repo" content="yangjh-xbmu/learningpython" />
<meta name="twitter:card" content="summary" />
<meta name="twitter:title" content="Chapter 12 Web Crawlers | Python Programming" />
<meta name="twitter:description" content="Programming with Python" />
<meta name="author" content="杨志宏" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<meta name="apple-mobile-web-app-capable" content="yes" />
<meta name="apple-mobile-web-app-status-bar-style" content="black" />
<link rel="prev" href="argparse.html"/>
<link rel="next" href="httplib.html"/>
<script src="libs/jquery-2.2.3/jquery.min.js"></script>
<link href="libs/gitbook-2.6.7/css/style.css" rel="stylesheet" />
<link href="libs/gitbook-2.6.7/css/plugin-table.css" rel="stylesheet" />
<link href="libs/gitbook-2.6.7/css/plugin-bookdown.css" rel="stylesheet" />
<link href="libs/gitbook-2.6.7/css/plugin-highlight.css" rel="stylesheet" />
<link href="libs/gitbook-2.6.7/css/plugin-search.css" rel="stylesheet" />
<link href="libs/gitbook-2.6.7/css/plugin-fontsettings.css" rel="stylesheet" />
<link href="libs/gitbook-2.6.7/css/plugin-clipboard.css" rel="stylesheet" />
<style type="text/css">
code.sourceCode > span { display: inline-block; line-height: 1.25; }
code.sourceCode > span { color: inherit; text-decoration: inherit; }
code.sourceCode > span:empty { height: 1.2em; }
.sourceCode { overflow: visible; }
code.sourceCode { white-space: pre; position: relative; }
pre.sourceCode { margin: 0; }
@media screen {
div.sourceCode { overflow: auto; }
}
@media print {
code.sourceCode { white-space: pre-wrap; }
code.sourceCode > span { text-indent: -5em; padding-left: 5em; }
}
pre.numberSource code
{ counter-reset: source-line 0; }
pre.numberSource code > span
{ position: relative; left: -4em; counter-increment: source-line; }
pre.numberSource code > span > a:first-child::before
{ content: counter(source-line);
position: relative; left: -1em; text-align: right; vertical-align: baseline;
border: none; display: inline-block;
-webkit-touch-callout: none; -webkit-user-select: none;
-khtml-user-select: none; -moz-user-select: none;
-ms-user-select: none; user-select: none;
padding: 0 4px; width: 4em;
color: #aaaaaa;
}
pre.numberSource { margin-left: 3em; border-left: 1px solid #aaaaaa; padding-left: 4px; }
div.sourceCode
{ }
@media screen {
code.sourceCode > span > a:first-child::before { text-decoration: underline; }
}
code span.al { color: #ff0000; font-weight: bold; } /* Alert */
code span.an { color: #60a0b0; font-weight: bold; font-style: italic; } /* Annotation */
code span.at { color: #7d9029; } /* Attribute */
code span.bn { color: #40a070; } /* BaseN */
code span.bu { } /* BuiltIn */
code span.cf { color: #007020; font-weight: bold; } /* ControlFlow */
code span.ch { color: #4070a0; } /* Char */
code span.cn { color: #880000; } /* Constant */
code span.co { color: #60a0b0; font-style: italic; } /* Comment */
code span.cv { color: #60a0b0; font-weight: bold; font-style: italic; } /* CommentVar */
code span.do { color: #ba2121; font-style: italic; } /* Documentation */
code span.dt { color: #902000; } /* DataType */
code span.dv { color: #40a070; } /* DecVal */
code span.er { color: #ff0000; font-weight: bold; } /* Error */
code span.ex { } /* Extension */
code span.fl { color: #40a070; } /* Float */
code span.fu { color: #06287e; } /* Function */
code span.im { } /* Import */
code span.in { color: #60a0b0; font-weight: bold; font-style: italic; } /* Information */
code span.kw { color: #007020; font-weight: bold; } /* Keyword */
code span.op { color: #666666; } /* Operator */
code span.ot { color: #007020; } /* Other */
code span.pp { color: #bc7a00; } /* Preprocessor */
code span.sc { color: #4070a0; } /* SpecialChar */
code span.ss { color: #bb6688; } /* SpecialString */
code span.st { color: #4070a0; } /* String */
code span.va { color: #19177c; } /* Variable */
code span.vs { color: #4070a0; } /* VerbatimString */
code span.wa { color: #60a0b0; font-weight: bold; font-style: italic; } /* Warning */
</style>
<link rel="stylesheet" href="css/style.css" type="text/css" />
</head>
<body>
<div class="book without-animation with-summary font-size-2 font-family-1" data-basepath=".">
<div class="book-summary">
<nav role="navigation">
<ul class="summary">
<li><a href="./">LearningPython</a></li>
<li class="divider"></li>
<li class="chapter" data-level="" data-path="index.html"><a href="index.html"><i class="fa fa-check"></i>Introduction to Python</a><ul>
<li class="chapter" data-level="0.1" data-path="index.html"><a href="index.html#python-发展历史"><i class="fa fa-check"></i><b>0.1</b> History of Python</a></li>
<li class="chapter" data-level="0.2" data-path="index.html"><a href="index.html#python-特点"><i class="fa fa-check"></i><b>0.2</b> Features of Python</a></li>
<li class="chapter" data-level="0.3" data-path="index.html"><a href="index.html#使用-python-的知名项目"><i class="fa fa-check"></i><b>0.3</b> Well-known projects that use Python</a></li>
<li class="chapter" data-level="0.4" data-path="index.html"><a href="index.html#学习资源"><i class="fa fa-check"></i><b>0.4</b> Learning resources</a></li>
</ul></li>
<li class="chapter" data-level="1" data-path="ide.html"><a href="ide.html"><i class="fa fa-check"></i><b>1</b> Setting up a Python development environment</a><ul>
<li class="chapter" data-level="1.1" data-path="ide.html"><a href="ide.html#在-windows-中安装-python-3"><i class="fa fa-check"></i><b>1.1</b> Installing Python 3 on Windows</a></li>
<li class="chapter" data-level="1.2" data-path="ide.html"><a href="ide.html#在-macos-中安装-python-3"><i class="fa fa-check"></i><b>1.2</b> Installing Python 3 on macOS</a><ul>
<li class="chapter" data-level="1.2.1" data-path="ide.html"><a href="ide.html#安装-xcode"><i class="fa fa-check"></i><b>1.2.1</b> Installing Xcode</a></li>
<li class="chapter" data-level="1.2.2" data-path="ide.html"><a href="ide.html#安装-homebrew"><i class="fa fa-check"></i><b>1.2.2</b> Installing Homebrew</a></li>
<li class="chapter" data-level="1.2.3" data-path="ide.html"><a href="ide.html#安装-python3"><i class="fa fa-check"></i><b>1.2.3</b> Installing Python 3</a></li>
<li class="chapter" data-level="1.2.4" data-path="ide.html"><a href="ide.html#使用-python3"><i class="fa fa-check"></i><b>1.2.4</b> Using Python 3</a></li>
</ul></li>
<li class="chapter" data-level="1.3" data-path="ide.html"><a href="ide.html#使用-python-虚拟环境"><i class="fa fa-check"></i><b>1.3</b> Using Python virtual environments</a><ul>
<li class="chapter" data-level="1.3.1" data-path="ide.html"><a href="ide.html#为什么要用虚拟环境"><i class="fa fa-check"></i><b>1.3.1</b> Why use virtual environments</a></li>
<li class="chapter" data-level="1.3.2" data-path="ide.html"><a href="ide.html#虚拟环境的创建与使用"><i class="fa fa-check"></i><b>1.3.2</b> Creating and using virtual environments</a></li>
<li class="chapter" data-level="1.3.3" data-path="ide.html"><a href="ide.html#使用-pipenv-管理虚拟环境"><i class="fa fa-check"></i><b>1.3.3</b> Managing virtual environments with pipenv</a></li>
<li class="chapter" data-level="1.3.4" data-path="ide.html"><a href="ide.html#虚拟环境与系统环境的区别"><i class="fa fa-check"></i><b>1.3.4</b> Differences between virtual and system environments</a></li>
</ul></li>
<li class="chapter" data-level="1.4" data-path="ide.html"><a href="ide.html#编辑器"><i class="fa fa-check"></i><b>1.4</b> Editors</a><ul>
<li class="chapter" data-level="1.4.1" data-path="ide.html"><a href="ide.html#visual-studio-code-中的必要设置"><i class="fa fa-check"></i><b>1.4.1</b> Essential settings in Visual Studio Code</a></li>
</ul></li>
</ul></li>
<li class="part"><span><b>I Core syntax</b></span></li>
<li class="chapter" data-level="2" data-path="systax.html"><a href="systax.html"><i class="fa fa-check"></i><b>2</b> Core syntax</a><ul>
<li class="chapter" data-level="2.1" data-path="systax.html"><a href="systax.html#注释"><i class="fa fa-check"></i><b>2.1</b> Comments</a></li>
<li class="chapter" data-level="2.2" data-path="systax.html"><a href="systax.html#变量"><i class="fa fa-check"></i><b>2.2</b> Variables</a><ul>
<li class="chapter" data-level="2.2.1" data-path="systax.html"><a href="systax.html#变量名称"><i class="fa fa-check"></i><b>2.2.1</b> Variable names</a></li>
<li class="chapter" data-level="2.2.2" data-path="systax.html"><a href="systax.html#变量赋值"><i class="fa fa-check"></i><b>2.2.2</b> Variable assignment</a></li>
<li class="chapter" data-level="2.2.3" data-path="systax.html"><a href="systax.html#同步赋值"><i class="fa fa-check"></i><b>2.2.3</b> Simultaneous assignment</a></li>
</ul></li>
<li class="chapter" data-level="2.3" data-path="systax.html"><a href="systax.html#数字类型"><i class="fa fa-check"></i><b>2.3</b> Numeric types</a><ul>
<li class="chapter" data-level="2.3.1" data-path="systax.html"><a href="systax.html#查看变量类型"><i class="fa fa-check"></i><b>2.3.1</b> Checking a variable's type</a></li>
<li class="chapter" data-level="2.3.2" data-path="systax.html"><a href="systax.html#整型"><i class="fa fa-check"></i><b>2.3.2</b> Integers</a></li>
<li class="chapter" data-level="2.3.3" data-path="systax.html"><a href="systax.html#浮点类型"><i class="fa fa-check"></i><b>2.3.3</b> Floating-point numbers</a></li>
<li class="chapter" data-level="2.3.4" data-path="systax.html"><a href="systax.html#复数"><i class="fa fa-check"></i><b>2.3.4</b> Complex numbers</a></li>
</ul></li>
<li class="chapter" data-level="2.4" data-path="systax.html"><a href="systax.html#运算符"><i class="fa fa-check"></i><b>2.4</b> Operators</a><ul>
<li class="chapter" data-level="2.4.1" data-path="systax.html"><a href="systax.html#运算符的优先级别"><i class="fa fa-check"></i><b>2.4.1</b> Operator precedence</a></li>
<li class="chapter" data-level="2.4.2" data-path="systax.html"><a href="systax.html#增强赋值运算符"><i class="fa fa-check"></i><b>2.4.2</b> Augmented assignment operators</a></li>
</ul></li>
<li class="chapter" data-level="2.5" data-path="systax.html"><a href="systax.html#序列"><i class="fa fa-check"></i><b>2.5</b> Sequences</a><ul>
<li class="chapter" data-level="2.5.1" data-path="systax.html"><a href="systax.html#索引"><i class="fa fa-check"></i><b>2.5.1</b> Indexing</a></li>
<li class="chapter" data-level="2.5.2" data-path="systax.html"><a href="systax.html#分片"><i class="fa fa-check"></i><b>2.5.2</b> Slicing</a></li>
<li class="chapter" data-level="2.5.3" data-path="systax.html"><a href="systax.html#序列相加"><i class="fa fa-check"></i><b>2.5.3</b> Sequence concatenation</a></li>
<li class="chapter" data-level="2.5.4" data-path="systax.html"><a href="systax.html#序列相乘"><i class="fa fa-check"></i><b>2.5.4</b> Sequence repetition</a></li>
<li class="chapter" data-level="2.5.5" data-path="systax.html"><a href="systax.html#成员资格"><i class="fa fa-check"></i><b>2.5.5</b> Membership</a></li>
<li class="chapter" data-level="2.5.6" data-path="systax.html"><a href="systax.html#长度最小值最大值"><i class="fa fa-check"></i><b>2.5.6</b> Length, minimum, and maximum</a></li>
</ul></li>
<li class="chapter" data-level="2.6" data-path="systax.html"><a href="systax.html#字符串"><i class="fa fa-check"></i><b>2.6</b> Strings</a><ul>
<li class="chapter" data-level="2.6.1" data-path="systax.html"><a href="systax.html#创建字符串"><i class="fa fa-check"></i><b>2.6.1</b> Creating strings</a></li>
<li class="chapter" data-level="2.6.2" data-path="systax.html"><a href="systax.html#字符串的不可变性"><i class="fa fa-check"></i><b>2.6.2</b> String immutability</a></li>
<li class="chapter" data-level="2.6.3" data-path="systax.html"><a href="systax.html#字符串操作"><i class="fa fa-check"></i><b>2.6.3</b> String operations</a></li>
<li class="chapter" data-level="2.6.4" data-path="systax.html"><a href="systax.html#字符串分片"><i class="fa fa-check"></i><b>2.6.4</b> String slicing</a></li>
<li class="chapter" data-level="2.6.5" data-path="systax.html"><a href="systax.html#in-和-not-in-操作符"><i class="fa fa-check"></i><b>2.6.5</b> The in and not in operators</a></li>
<li class="chapter" data-level="2.6.6" data-path="systax.html"><a href="systax.html#string-对象的方法"><i class="fa fa-check"></i><b>2.6.6</b> String object methods</a></li>
<li class="chapter" data-level="2.6.7" data-path="systax.html"><a href="systax.html#比较字符串"><i class="fa fa-check"></i><b>2.6.7</b> Comparing strings</a></li>
<li class="chapter" data-level="2.6.8" data-path="systax.html"><a href="systax.html#遍历字符串"><i class="fa fa-check"></i><b>2.6.8</b> Iterating over strings</a></li>
<li class="chapter" data-level="2.6.9" data-path="systax.html"><a href="systax.html#字符串内容检验"><i class="fa fa-check"></i><b>2.6.9</b> Checking string contents</a></li>
<li class="chapter" data-level="2.6.10" data-path="systax.html"><a href="systax.html#在字符串内查找和替换"><i class="fa fa-check"></i><b>2.6.10</b> Finding and replacing within strings</a></li>
</ul></li>
<li class="chapter" data-level="2.7" data-path="systax.html"><a href="systax.html#列表"><i class="fa fa-check"></i><b>2.7</b> Lists</a><ul>
<li class="chapter" data-level="2.7.1" data-path="systax.html"><a href="systax.html#列表赋值"><i class="fa fa-check"></i><b>2.7.1</b> List assignment</a></li>
<li class="chapter" data-level="2.7.2" data-path="systax.html"><a href="systax.html#删除元素"><i class="fa fa-check"></i><b>2.7.2</b> Deleting elements</a></li>
<li class="chapter" data-level="2.7.3" data-path="systax.html"><a href="systax.html#分片赋值"><i class="fa fa-check"></i><b>2.7.3</b> Slice assignment</a></li>
<li class="chapter" data-level="2.7.4" data-path="systax.html"><a href="systax.html#列表对象常用内置方法"><i class="fa fa-check"></i><b>2.7.4</b> Common built-in list methods</a></li>
</ul></li>
<li class="chapter" data-level="2.8" data-path="systax.html"><a href="systax.html#字典"><i class="fa fa-check"></i><b>2.8</b> Dictionaries</a><ul>
<li class="chapter" data-level="2.8.1" data-path="systax.html"><a href="systax.html#创建字典"><i class="fa fa-check"></i><b>2.8.1</b> Creating dictionaries</a></li>
<li class="chapter" data-level="2.8.2" data-path="systax.html"><a href="systax.html#获取修改和添加字典元素"><i class="fa fa-check"></i><b>2.8.2</b> Getting, modifying, and adding dictionary elements</a></li>
<li class="chapter" data-level="2.8.3" data-path="systax.html"><a href="systax.html#遍历字典"><i class="fa fa-check"></i><b>2.8.3</b> Iterating over dictionaries</a></li>
<li class="chapter" data-level="2.8.4" data-path="systax.html"><a href="systax.html#字典常用方法"><i class="fa fa-check"></i><b>2.8.4</b> Common dictionary methods</a></li>
<li class="chapter" data-level="2.8.5" data-path="systax.html"><a href="systax.html#字典的排序"><i class="fa fa-check"></i><b>2.8.5</b> Sorting dictionaries</a></li>
</ul></li>
<li class="chapter" data-level="2.9" data-path="systax.html"><a href="systax.html#元组"><i class="fa fa-check"></i><b>2.9</b> Tuples</a><ul>
<li class="chapter" data-level="2.9.1" data-path="systax.html"><a href="systax.html#创建元组"><i class="fa fa-check"></i><b>2.9.1</b> Creating tuples</a></li>
<li class="chapter" data-level="2.9.2" data-path="systax.html"><a href="systax.html#元组相关方法"><i class="fa fa-check"></i><b>2.9.2</b> Tuple methods</a></li>
</ul></li>
<li class="chapter" data-level="2.10" data-path="systax.html"><a href="systax.html#控制声明"><i class="fa fa-check"></i><b>2.10</b> Control statements</a><ul>
<li class="chapter" data-level="2.10.1" data-path="systax.html"><a href="systax.html#分支判断"><i class="fa fa-check"></i><b>2.10.1</b> Branching</a></li>
<li class="chapter" data-level="2.10.2" data-path="systax.html"><a href="systax.html#分支嵌套"><i class="fa fa-check"></i><b>2.10.2</b> Nested branches</a></li>
<li class="chapter" data-level="2.10.3" data-path="systax.html"><a href="systax.html#三元运算符"><i class="fa fa-check"></i><b>2.10.3</b> Ternary operator</a></li>
</ul></li>
<li class="chapter" data-level="2.11" data-path="systax.html"><a href="systax.html#循环"><i class="fa fa-check"></i><b>2.11</b> Loops</a><ul>
<li class="chapter" data-level="2.11.1" data-path="systax.html"><a href="systax.html#for-循环"><i class="fa fa-check"></i><b>2.11.1</b> for loops</a></li>
<li class="chapter" data-level="2.11.2" data-path="systax.html"><a href="systax.html#范围循环"><i class="fa fa-check"></i><b>2.11.2</b> Range loops</a></li>
<li class="chapter" data-level="2.11.3" data-path="systax.html"><a href="systax.html#while-循环"><i class="fa fa-check"></i><b>2.11.3</b> while loops</a></li>
<li class="chapter" data-level="2.11.4" data-path="systax.html"><a href="systax.html#中断循环"><i class="fa fa-check"></i><b>2.11.4</b> Breaking out of loops</a></li>
<li class="chapter" data-level="2.11.5" data-path="systax.html"><a href="systax.html#继续循环"><i class="fa fa-check"></i><b>2.11.5</b> Continuing loops</a></li>
</ul></li>
<li class="chapter" data-level="2.12" data-path="systax.html"><a href="systax.html#函数"><i class="fa fa-check"></i><b>2.12</b> Functions</a><ul>
<li class="chapter" data-level="2.12.1" data-path="systax.html"><a href="systax.html#创建函数"><i class="fa fa-check"></i><b>2.12.1</b> Creating functions</a></li>
<li class="chapter" data-level="2.12.2" data-path="systax.html"><a href="systax.html#函数返回值"><i class="fa fa-check"></i><b>2.12.2</b> Function return values</a></li>
<li class="chapter" data-level="2.12.3" data-path="systax.html"><a href="systax.html#全局变量和局域变量"><i class="fa fa-check"></i><b>2.12.3</b> Global and local variables</a></li>
<li class="chapter" data-level="2.12.4" data-path="systax.html"><a href="systax.html#参数的默认值"><i class="fa fa-check"></i><b>2.12.4</b> Default parameter values</a></li>
<li class="chapter" data-level="2.12.5" data-path="systax.html"><a href="systax.html#关键字参数"><i class="fa fa-check"></i><b>2.12.5</b> Keyword arguments</a></li>
<li class="chapter" data-level="2.12.6" data-path="systax.html"><a href="systax.html#返回多个值"><i class="fa fa-check"></i><b>2.12.6</b> Returning multiple values</a></li>
<li class="chapter" data-level="2.12.7" data-path="systax.html"><a href="systax.html#函数文档字符串"><i class="fa fa-check"></i><b>2.12.7</b> Function docstrings</a></li>
<li class="chapter" data-level="2.12.8" data-path="systax.html"><a href="systax.html#lambda表达式"><i class="fa fa-check"></i><b>2.12.8</b> Lambda expressions</a></li>
<li class="chapter" data-level="2.12.9" data-path="systax.html"><a href="systax.html#args-和-kwargs"><i class="fa fa-check"></i><b>2.12.9</b> <code>*args</code> and <code>**kwargs</code></a></li>
<li class="chapter" data-level="2.12.10" data-path="systax.html"><a href="systax.html#参考资料"><i class="fa fa-check"></i><b>2.12.10</b> References</a></li>
</ul></li>
</ul></li>
<li class="chapter" data-level="3" data-path="oop.html"><a href="oop.html"><i class="fa fa-check"></i><b>3</b> Object-oriented programming in Python</a><ul>
<li class="chapter" data-level="3.1" data-path="oop.html"><a href="oop.html#python对象和类"><i class="fa fa-check"></i><b>3.1</b> Python objects and classes</a><ul>
<li class="chapter" data-level="3.1.1" data-path="oop.html"><a href="oop.html#创建类"><i class="fa fa-check"></i><b>3.1.1</b> Creating classes</a></li>
<li class="chapter" data-level="3.1.2" data-path="oop.html"><a href="oop.html#从类中创建对象"><i class="fa fa-check"></i><b>3.1.2</b> Creating objects from classes</a></li>
<li class="chapter" data-level="3.1.3" data-path="oop.html"><a href="oop.html#隐藏数据字段"><i class="fa fa-check"></i><b>3.1.3</b> Hiding data fields</a></li>
</ul></li>
<li class="chapter" data-level="3.2" data-path="oop.html"><a href="oop.html#操作符重载"><i class="fa fa-check"></i><b>3.2</b> Operator overloading</a></li>
<li class="chapter" data-level="3.3" data-path="oop.html"><a href="oop.html#继承和多态"><i class="fa fa-check"></i><b>3.3</b> Inheritance and polymorphism</a><ul>
<li class="chapter" data-level="3.3.1" data-path="oop.html"><a href="oop.html#多重继承"><i class="fa fa-check"></i><b>3.3.1</b> Multiple inheritance</a></li>
<li class="chapter" data-level="3.3.2" data-path="oop.html"><a href="oop.html#重写方法"><i class="fa fa-check"></i><b>3.3.2</b> Overriding methods</a></li>
<li class="chapter" data-level="3.3.3" data-path="oop.html"><a href="oop.html#判断对象是否属于某类"><i class="fa fa-check"></i><b>3.3.3</b> Checking whether an object belongs to a class</a></li>
</ul></li>
</ul></li>
<li class="chapter" data-level="4" data-path="exception.html"><a href="exception.html"><i class="fa fa-check"></i><b>4</b> Exception handling</a><ul>
<li class="chapter" data-level="4.1" data-path="exception.html"><a href="exception.html#捕获异常"><i class="fa fa-check"></i><b>4.1</b> Catching exceptions</a><ul>
<li class="chapter" data-level="4.1.1" data-path="exception.html"><a href="exception.html#try-except"><i class="fa fa-check"></i><b>4.1.1</b> try-except</a></li>
<li class="chapter" data-level="4.1.2" data-path="exception.html"><a href="exception.html#多个except"><i class="fa fa-check"></i><b>4.1.2</b> Multiple except clauses</a></li>
<li class="chapter" data-level="4.1.3" data-path="exception.html"><a href="exception.html#自定义异常"><i class="fa fa-check"></i><b>4.1.3</b> Custom exceptions</a></li>
</ul></li>
</ul></li>
<li class="chapter" data-level="5" data-path="decorator.html"><a href="decorator.html"><i class="fa fa-check"></i><b>5</b> Python decorators</a><ul>
<li class="chapter" data-level="5.1" data-path="decorator.html"><a href="decorator.html#为什么要用装饰器"><i class="fa fa-check"></i><b>5.1</b> Why use decorators</a></li>
<li class="chapter" data-level="5.2" data-path="decorator.html"><a href="decorator.html#简单装饰器"><i class="fa fa-check"></i><b>5.2</b> Simple decorators</a></li>
<li class="chapter" data-level="5.3" data-path="decorator.html"><a href="decorator.html#语法糖"><i class="fa fa-check"></i><b>5.3</b> The @ syntactic sugar</a></li>
<li class="chapter" data-level="5.4" data-path="decorator.html"><a href="decorator.html#argskwargs"><i class="fa fa-check"></i><b>5.4</b> <code>*args</code>, <code>**kwargs</code></a></li>
<li class="chapter" data-level="5.5" data-path="decorator.html"><a href="decorator.html#带参数的装饰器"><i class="fa fa-check"></i><b>5.5</b> Decorators with arguments</a></li>
<li class="chapter" data-level="5.6" data-path="decorator.html"><a href="decorator.html#类装饰器"><i class="fa fa-check"></i><b>5.6</b> Class decorators</a></li>
<li class="chapter" data-level="5.7" data-path="decorator.html"><a href="decorator.html#functools.wraps"><i class="fa fa-check"></i><b>5.7</b> functools.wraps</a></li>
<li class="chapter" data-level="5.8" data-path="decorator.html"><a href="decorator.html#装饰器顺序"><i class="fa fa-check"></i><b>5.8</b> Decorator order</a></li>
<li class="chapter" data-level="" data-path="decorator.html"><a href="decorator.html#参考资料-1"><i class="fa fa-check"></i>References</a></li>
</ul></li>
<li class="chapter" data-level="6" data-path="module.html"><a href="module.html"><i class="fa fa-check"></i><b>6</b> Modules</a><ul>
<li class="chapter" data-level="6.1" data-path="module.html"><a href="module.html#创建模块"><i class="fa fa-check"></i><b>6.1</b> Creating modules</a></li>
<li class="chapter" data-level="6.2" data-path="module.html"><a href="module.html#使用模块中的指定内容"><i class="fa fa-check"></i><b>6.2</b> Using specific items from a module</a></li>
<li class="chapter" data-level="6.3" data-path="module.html"><a href="module.html#dir函数"><i class="fa fa-check"></i><b>6.3</b> The dir function</a></li>
<li class="chapter" data-level="6.4" data-path="module.html"><a href="module.html#包"><i class="fa fa-check"></i><b>6.4</b> Packages</a></li>
</ul></li>
<li class="part"><span><b>II Advanced topics</b></span></li>
<li class="chapter" data-level="7" data-path="buildinfunctions.html"><a href="buildinfunctions.html"><i class="fa fa-check"></i><b>7</b> Built-in functions</a><ul>
<li class="chapter" data-level="7.1" data-path="buildinfunctions.html"><a href="buildinfunctions.html#dict函数"><i class="fa fa-check"></i><b>7.1</b> The dict function</a><ul>
<li class="chapter" data-level="7.1.1" data-path="buildinfunctions.html"><a href="buildinfunctions.html#描述"><i class="fa fa-check"></i><b>7.1.1</b> Description</a></li>
<li class="chapter" data-level="7.1.2" data-path="buildinfunctions.html"><a href="buildinfunctions.html#语法"><i class="fa fa-check"></i><b>7.1.2</b> Syntax</a></li>
<li class="chapter" data-level="7.1.3" data-path="buildinfunctions.html"><a href="buildinfunctions.html#返回值"><i class="fa fa-check"></i><b>7.1.3</b> Return value</a></li>
<li class="chapter" data-level="7.1.4" data-path="buildinfunctions.html"><a href="buildinfunctions.html#实例"><i class="fa fa-check"></i><b>7.1.4</b> Examples</a></li>
</ul></li>
<li class="chapter" data-level="7.2" data-path="buildinfunctions.html"><a href="buildinfunctions.html#zip函数"><i class="fa fa-check"></i><b>7.2</b> The zip function</a><ul>
<li class="chapter" data-level="7.2.1" data-path="buildinfunctions.html"><a href="buildinfunctions.html#描述-1"><i class="fa fa-check"></i><b>7.2.1</b> Description</a></li>
<li class="chapter" data-level="7.2.2" data-path="buildinfunctions.html"><a href="buildinfunctions.html#语法-1"><i class="fa fa-check"></i><b>7.2.2</b> Syntax</a></li>
<li class="chapter" data-level="7.2.3" data-path="buildinfunctions.html"><a href="buildinfunctions.html#返回值-1"><i class="fa fa-check"></i><b>7.2.3</b> Return value</a></li>
<li class="chapter" data-level="7.2.4" data-path="buildinfunctions.html"><a href="buildinfunctions.html#特殊用法"><i class="fa fa-check"></i><b>7.2.4</b> Special usage</a></li>
<li class="chapter" data-level="7.2.5" data-path="buildinfunctions.html"><a href="buildinfunctions.html#实例-1"><i class="fa fa-check"></i><b>7.2.5</b> Examples</a></li>
</ul></li>
<li class="chapter" data-level="7.3" data-path="buildinfunctions.html"><a href="buildinfunctions.html#list函数"><i class="fa fa-check"></i><b>7.3</b> The list function</a><ul>
<li class="chapter" data-level="7.3.1" data-path="buildinfunctions.html"><a href="buildinfunctions.html#描述-2"><i class="fa fa-check"></i><b>7.3.1</b> Description</a></li>
<li class="chapter" data-level="7.3.2" data-path="buildinfunctions.html"><a href="buildinfunctions.html#语法-2"><i class="fa fa-check"></i><b>7.3.2</b> Syntax</a></li>
<li class="chapter" data-level="7.3.3" data-path="buildinfunctions.html"><a href="buildinfunctions.html#返回值-2"><i class="fa fa-check"></i><b>7.3.3</b> Return value</a></li>
<li class="chapter" data-level="7.3.4" data-path="buildinfunctions.html"><a href="buildinfunctions.html#实例-2"><i class="fa fa-check"></i><b>7.3.4</b> Examples</a></li>
</ul></li>
<li class="chapter" data-level="7.4" data-path="buildinfunctions.html"><a href="buildinfunctions.html#min函数"><i class="fa fa-check"></i><b>7.4</b> The min function</a><ul>
<li class="chapter" data-level="7.4.1" data-path="buildinfunctions.html"><a href="buildinfunctions.html#描述-3"><i class="fa fa-check"></i><b>7.4.1</b> Description</a></li>
<li class="chapter" data-level="7.4.2" data-path="buildinfunctions.html"><a href="buildinfunctions.html#语法-3"><i class="fa fa-check"></i><b>7.4.2</b> Syntax</a></li>
<li class="chapter" data-level="7.4.3" data-path="buildinfunctions.html"><a href="buildinfunctions.html#返回值-3"><i class="fa fa-check"></i><b>7.4.3</b> Return value</a></li>
<li class="chapter" data-level="7.4.4" data-path="buildinfunctions.html"><a href="buildinfunctions.html#实例-3"><i class="fa fa-check"></i><b>7.4.4</b> Examples</a></li>
</ul></li>
</ul></li>
<li class="chapter" data-level="8" data-path="iterator.html"><a href="iterator.html"><i class="fa fa-check"></i><b>8</b> Python iterables</a><ul>
<li class="chapter" data-level="8.1" data-path="iterator.html"><a href="iterator.html#可迭代对象"><i class="fa fa-check"></i><b>8.1</b> Iterables</a></li>
<li class="chapter" data-level="8.2" data-path="iterator.html"><a href="iterator.html#迭代器"><i class="fa fa-check"></i><b>8.2</b> Iterators</a></li>
<li class="chapter" data-level="8.3" data-path="iterator.html"><a href="iterator.html#生成器"><i class="fa fa-check"></i><b>8.3</b> Generators</a></li>
<li class="chapter" data-level="8.4" data-path="iterator.html"><a href="iterator.html#三者之间关系"><i class="fa fa-check"></i><b>8.4</b> How the three relate</a></li>
<li class="chapter" data-level="8.5" data-path="iterator.html"><a href="iterator.html#迭代器长度的计算"><i class="fa fa-check"></i><b>8.5</b> Computing the length of an iterator</a></li>
<li class="chapter" data-level="8.6" data-path="iterator.html"><a href="iterator.html#yield"><i class="fa fa-check"></i><b>8.6</b> yield</a></li>
</ul></li>
<li class="chapter" data-level="9" data-path="underline.html"><a href="underline.html"><i class="fa fa-check"></i><b>9</b> Single vs. double underscores</a><ul>
<li class="chapter" data-level="9.1" data-path="underline.html"><a href="underline.html#单下划线开头"><i class="fa fa-check"></i><b>9.1</b> Leading single underscore</a></li>
<li class="chapter" data-level="9.2" data-path="underline.html"><a href="underline.html#双下划线开头"><i class="fa fa-check"></i><b>9.2</b> Leading double underscore</a></li>
<li class="chapter" data-level="9.3" data-path="underline.html"><a href="underline.html#双下划线开头和结尾"><i class="fa fa-check"></i><b>9.3</b> Leading and trailing double underscores</a></li>
</ul></li>
<li class="chapter" data-level="10" data-path="re.html"><a href="re.html"><i class="fa fa-check"></i><b>10</b> Regular expressions</a><ul>
<li class="chapter" data-level="10.1" data-path="re.html"><a href="re.html#正则表达式发展简史"><i class="fa fa-check"></i><b>10.1</b> A brief history of regular expressions</a></li>
<li class="chapter" data-level="10.2" data-path="re.html"><a href="re.html#python中使用正则表达式的流程"><i class="fa fa-check"></i><b>10.2</b> Workflow for using regular expressions in Python</a></li>
<li class="chapter" data-level="10.3" data-path="re.html"><a href="re.html#正则表达式的构成要素"><i class="fa fa-check"></i><b>10.3</b> Building blocks of regular expressions</a><ul>
<li class="chapter" data-level="10.3.1" data-path="re.html"><a href="re.html#定界符"><i class="fa fa-check"></i><b>10.3.1</b> Delimiters</a></li>
<li class="chapter" data-level="10.3.2" data-path="re.html"><a href="re.html#原子"><i class="fa fa-check"></i><b>10.3.2</b> Atoms</a></li>
</ul></li>
<li class="chapter" data-level="10.4" data-path="re.html"><a href="re.html#元字符"><i class="fa fa-check"></i><b>10.4</b> Metacharacters</a><ul>
<li class="chapter" data-level="10.4.1" data-path="re.html"><a href="re.html#匹配单个字符的元字符"><i class="fa fa-check"></i><b>10.4.1</b> Metacharacters that match a single character</a></li>
<li class="chapter" data-level="10.4.2" data-path="re.html"><a href="re.html#常用转义符"><i class="fa fa-check"></i><b>10.4.2</b> Common escape sequences</a></li>
<li class="chapter" data-level="10.4.3" data-path="re.html"><a href="re.html#提供计数功能的元字符"><i class="fa fa-check"></i><b>10.4.3</b> Metacharacters that provide counting</a></li>
<li class="chapter" data-level="10.4.4" data-path="re.html"><a href="re.html#匹配位置的元字符"><i class="fa fa-check"></i><b>10.4.4</b> Metacharacters that match positions</a></li>
<li class="chapter" data-level="10.4.5" data-path="re.html"><a href="re.html#其他元字符"><i class="fa fa-check"></i><b>10.4.5</b> Other metacharacters</a></li>
</ul></li>
<li class="chapter" data-level="10.5" data-path="re.html"><a href="re.html#模式修饰符"><i class="fa fa-check"></i><b>10.5</b> Pattern modifiers</a></li>
<li class="chapter" data-level="10.6" data-path="re.html"><a href="re.html#在python中使用正则表达式"><i class="fa fa-check"></i><b>10.6</b> Using regular expressions in Python</a><ul>
<li class="chapter" data-level="10.6.1" data-path="re.html"><a href="re.html#search"><i class="fa fa-check"></i><b>10.6.1</b> search()</a></li>
<li class="chapter" data-level="10.6.2" data-path="re.html"><a href="re.html#compile"><i class="fa fa-check"></i><b>10.6.2</b> compile()</a></li>
<li class="chapter" data-level="10.6.3" data-path="re.html"><a href="re.html#findall"><i class="fa fa-check"></i><b>10.6.3</b> findall()</a></li>
</ul></li>
</ul></li>
<li class="chapter" data-level="11" data-path="argparse.html"><a href="argparse.html"><i class="fa fa-check"></i><b>11</b> The argparse command-line argument module</a><ul>
<li class="chapter" data-level="11.1" data-path="argparse.html"><a href="argparse.html#创建解析器"><i class="fa fa-check"></i><b>11.1</b> Creating a parser</a></li>
<li class="chapter" data-level="11.2" data-path="argparse.html"><a href="argparse.html#添加参数选项"><i class="fa fa-check"></i><b>11.2</b> Adding argument options</a><ul>
<li class="chapter" data-level="11.2.1" data-path="argparse.html"><a href="argparse.html#name-or-flags"><i class="fa fa-check"></i><b>11.2.1</b> name or flags</a></li>
<li class="chapter" data-level="11.2.2" data-path="argparse.html"><a href="argparse.html#help"><i class="fa fa-check"></i><b>11.2.2</b> help</a></li>
<li class="chapter" data-level="11.2.3" data-path="argparse.html"><a href="argparse.html#default和type"><i class="fa fa-check"></i><b>11.2.3</b> default and type</a></li>
</ul></li>
<li class="chapter" data-level="11.3" data-path="argparse.html"><a href="argparse.html#参数解析"><i class="fa fa-check"></i><b>11.3</b> Parsing arguments</a></li>
<li class="chapter" data-level="11.4" data-path="argparse.html"><a href="argparse.html#小结"><i class="fa fa-check"></i><b>11.4</b> Summary</a></li>
<li class="chapter" data-level="11.5" data-path="argparse.html"><a href="argparse.html#参考文献"><i class="fa fa-check"></i><b>11.5</b> References</a></li>
</ul></li>
<li class="part"><span><b>III Data Scraping</b></span></li>
<li class="chapter" data-level="12" data-path="spider.html"><a href="spider.html"><i class="fa fa-check"></i><b>12</b> Web Crawlers</a><ul>
<li class="chapter" data-level="12.1" data-path="spider.html"><a href="spider.html#网络爬虫原理"><i class="fa fa-check"></i><b>12.1</b> How Web Crawlers Work</a></li>
<li class="chapter" data-level="12.2" data-path="spider.html"><a href="spider.html#http请求原理"><i class="fa fa-check"></i><b>12.2</b> How HTTP Requests Work</a><ul>
<li class="chapter" data-level="12.2.1" data-path="spider.html"><a href="spider.html#url"><i class="fa fa-check"></i><b>12.2.1</b> URL</a></li>
<li class="chapter" data-level="12.2.2" data-path="spider.html"><a href="spider.html#请求方式"><i class="fa fa-check"></i><b>12.2.2</b> Request Methods</a></li>
<li class="chapter" data-level="12.2.3" data-path="spider.html"><a href="spider.html#requests包"><i class="fa fa-check"></i><b>12.2.3</b> The requests Package</a></li>
</ul></li>
<li class="chapter" data-level="12.3" data-path="spider.html"><a href="spider.html#编码"><i class="fa fa-check"></i><b>12.3</b> Encoding</a><ul>
<li class="chapter" data-level="12.3.1" data-path="spider.html"><a href="spider.html#编码方式"><i class="fa fa-check"></i><b>12.3.1</b> Encoding Schemes</a></li>
<li class="chapter" data-level="12.3.2" data-path="spider.html"><a href="spider.html#编码转换"><i class="fa fa-check"></i><b>12.3.2</b> Encoding Conversion</a></li>
</ul></li>
<li class="chapter" data-level="12.4" data-path="spider.html"><a href="spider.html#存储"><i class="fa fa-check"></i><b>12.4</b> Storage</a><ul>
<li class="chapter" data-level="12.4.1" data-path="spider.html"><a href="spider.html#存储到文件"><i class="fa fa-check"></i><b>12.4.1</b> Saving to a File</a></li>
</ul></li>
</ul></li>
<li class="chapter" data-level="13" data-path="httplib.html"><a href="httplib.html"><i class="fa fa-check"></i><b>13</b> HTTP Libraries</a><ul>
<li class="chapter" data-level="13.1" data-path="httplib.html"><a href="httplib.html#urllib"><i class="fa fa-check"></i><b>13.1</b> urllib</a><ul>
<li class="chapter" data-level="13.1.1" data-path="httplib.html"><a href="httplib.html#发送请求"><i class="fa fa-check"></i><b>13.1.1</b> Sending Requests</a></li>
<li class="chapter" data-level="13.1.2" data-path="httplib.html"><a href="httplib.html#处理异常"><i class="fa fa-check"></i><b>13.1.2</b> Handling Exceptions</a></li>
<li class="chapter" data-level="13.1.3" data-path="httplib.html"><a href="httplib.html#解析链接"><i class="fa fa-check"></i><b>13.1.3</b> Parsing URLs</a></li>
</ul></li>
<li class="chapter" data-level="13.2" data-path="httplib.html"><a href="httplib.html#使用requests"><i class="fa fa-check"></i><b>13.2</b> Using requests</a><ul>
<li class="chapter" data-level="13.2.1" data-path="httplib.html"><a href="httplib.html#安装requests"><i class="fa fa-check"></i><b>13.2.1</b> Installing requests</a></li>
<li class="chapter" data-level="13.2.2" data-path="httplib.html"><a href="httplib.html#发起请求"><i class="fa fa-check"></i><b>13.2.2</b> Making Requests</a></li>
</ul></li>
<li class="chapter" data-level="13.3" data-path="httplib.html"><a href="httplib.html#网页编码检测及转换"><i class="fa fa-check"></i><b>13.3</b> Detecting and Converting Page Encodings</a></li>
<li class="chapter" data-level="13.4" data-path="httplib.html"><a href="httplib.html#参考资料-2"><i class="fa fa-check"></i><b>13.4</b> References</a></li>
</ul></li>
<li class="chapter" data-level="14" data-path="pyquery.html"><a href="pyquery.html"><i class="fa fa-check"></i><b>14</b> Pyquery</a><ul>
<li class="chapter" data-level="14.1" data-path="pyquery.html"><a href="pyquery.html#安装"><i class="fa fa-check"></i><b>14.1</b> Installation</a></li>
<li class="chapter" data-level="14.2" data-path="pyquery.html"><a href="pyquery.html#初始化"><i class="fa fa-check"></i><b>14.2</b> Initialization</a></li>
<li class="chapter" data-level="14.3" data-path="pyquery.html"><a href="pyquery.html#获取信息"><i class="fa fa-check"></i><b>14.3</b> Extracting Information</a><ul>
<li class="chapter" data-level="14.3.1" data-path="pyquery.html"><a href="pyquery.html#通过选择符选定元素"><i class="fa fa-check"></i><b>14.3.1</b> Selecting Elements with Selectors</a></li>
<li class="chapter" data-level="14.3.2" data-path="pyquery.html"><a href="pyquery.html#通过迭代获取最终结果"><i class="fa fa-check"></i><b>14.3.2</b> Getting Final Results by Iteration</a></li>
</ul></li>
<li class="chapter" data-level="14.4" data-path="pyquery.html"><a href="pyquery.html#参考资料-3"><i class="fa fa-check"></i><b>14.4</b> References</a></li>
</ul></li>
<li class="chapter" data-level="15" data-path="databases.html"><a href="databases.html"><i class="fa fa-check"></i><b>15</b> Data Storage</a><ul>
<li class="chapter" data-level="15.1" data-path="databases.html"><a href="databases.html#mysql的存储"><i class="fa fa-check"></i><b>15.1</b> Storing Data in MySQL</a><ul>
<li class="chapter" data-level="15.1.1" data-path="databases.html"><a href="databases.html#安装pymysql和mysql"><i class="fa fa-check"></i><b>15.1.1</b> Installing PyMySQL and MySQL</a></li>
<li class="chapter" data-level="15.1.2" data-path="databases.html"><a href="databases.html#连接数据库"><i class="fa fa-check"></i><b>15.1.2</b> Connecting to the Database</a></li>
<li class="chapter" data-level="15.1.3" data-path="databases.html"><a href="databases.html#对数据库进行操作"><i class="fa fa-check"></i><b>15.1.3</b> Operating on the Database</a></li>
</ul></li>
<li class="chapter" data-level="15.2" data-path="databases.html"><a href="databases.html#mongodb"><i class="fa fa-check"></i><b>15.2</b> MongoDB</a><ul>
<li class="chapter" data-level="15.2.1" data-path="databases.html"><a href="databases.html#在macos中的安装"><i class="fa fa-check"></i><b>15.2.1</b> Installing on macOS</a></li>
<li class="chapter" data-level="15.2.2" data-path="databases.html"><a href="databases.html#连接mongodb"><i class="fa fa-check"></i><b>15.2.2</b> Connecting to MongoDB</a></li>
<li class="chapter" data-level="15.2.3" data-path="databases.html"><a href="databases.html#指定数据库"><i class="fa fa-check"></i><b>15.2.3</b> Selecting a Database</a></li>
<li class="chapter" data-level="15.2.4" data-path="databases.html"><a href="databases.html#指定集合"><i class="fa fa-check"></i><b>15.2.4</b> Selecting a Collection</a></li>
<li class="chapter" data-level="15.2.5" data-path="databases.html"><a href="databases.html#插入数据"><i class="fa fa-check"></i><b>15.2.5</b> Inserting Data</a></li>
<li class="chapter" data-level="15.2.6" data-path="databases.html"><a href="databases.html#查询数据-1"><i class="fa fa-check"></i><b>15.2.6</b> Querying Data</a></li>
<li class="chapter" data-level="15.2.7" data-path="databases.html"><a href="databases.html#计数-1"><i class="fa fa-check"></i><b>15.2.7</b> Counting</a></li>
<li class="chapter" data-level="15.2.8" data-path="databases.html"><a href="databases.html#排序"><i class="fa fa-check"></i><b>15.2.8</b> Sorting</a></li>
<li class="chapter" data-level="15.2.9" data-path="databases.html"><a href="databases.html#偏移和限定"><i class="fa fa-check"></i><b>15.2.9</b> Skip and Limit</a></li>
<li class="chapter" data-level="15.2.10" data-path="databases.html"><a href="databases.html#更新"><i class="fa fa-check"></i><b>15.2.10</b> Updating</a></li>
<li class="chapter" data-level="15.2.11" data-path="databases.html"><a href="databases.html#删除"><i class="fa fa-check"></i><b>15.2.11</b> Deleting</a></li>
<li class="chapter" data-level="15.2.12" data-path="databases.html"><a href="databases.html#其他操作"><i class="fa fa-check"></i><b>15.2.12</b> Other Operations</a></li>
</ul></li>
<li class="chapter" data-level="15.3" data-path="databases.html"><a href="databases.html#参考资料-4"><i class="fa fa-check"></i><b>15.3</b> References</a></li>
</ul></li>
<li class="chapter" data-level="16" data-path="selenium.html"><a href="selenium.html"><i class="fa fa-check"></i><b>16</b> Scraping Dynamically Rendered Pages with Selenium</a><ul>
<li class="chapter" data-level="16.1" data-path="selenium.html"><a href="selenium.html#安装-1"><i class="fa fa-check"></i><b>16.1</b> Installation</a></li>
<li class="chapter" data-level="16.2" data-path="selenium.html"><a href="selenium.html#selenium的使用"><i class="fa fa-check"></i><b>16.2</b> Using Selenium</a><ul>
<li class="chapter" data-level="16.2.1" data-path="selenium.html"><a href="selenium.html#初始化浏览器对象"><i class="fa fa-check"></i><b>16.2.1</b> Initializing the Browser Object</a></li>
<li class="chapter" data-level="16.2.2" data-path="selenium.html"><a href="selenium.html#访问页面"><i class="fa fa-check"></i><b>16.2.2</b> Visiting a Page</a></li>
<li class="chapter" data-level="16.2.3" data-path="selenium.html"><a href="selenium.html#查找节点"><i class="fa fa-check"></i><b>16.2.3</b> Finding Nodes</a></li>
<li class="chapter" data-level="16.2.4" data-path="selenium.html"><a href="selenium.html#节点操作"><i class="fa fa-check"></i><b>16.2.4</b> Interacting with Nodes</a></li>
<li class="chapter" data-level="16.2.5" data-path="selenium.html"><a href="selenium.html#执行javascript"><i class="fa fa-check"></i><b>16.2.5</b> Executing JavaScript</a></li>
<li class="chapter" data-level="16.2.6" data-path="selenium.html"><a href="selenium.html#获取节点信息"><i class="fa fa-check"></i><b>16.2.6</b> Getting Node Information</a></li>
<li class="chapter" data-level="16.2.7" data-path="selenium.html"><a href="selenium.html#延时等待"><i class="fa fa-check"></i><b>16.2.7</b> Waiting</a></li>
</ul></li>
<li class="chapter" data-level="16.3" data-path="selenium.html"><a href="selenium.html#扩展阅读"><i class="fa fa-check"></i><b>16.3</b> Further Reading</a></li>
</ul></li>
<li class="part"><span><b>IV Web Application Development</b></span></li>
<li class="chapter" data-level="17" data-path="django.html"><a href="django.html"><i class="fa fa-check"></i><b>17</b> Installing the Django Framework</a><ul>
<li class="chapter" data-level="17.1" data-path="django.html"><a href="django.html#安装-2"><i class="fa fa-check"></i><b>17.1</b> Installation</a><ul>
<li class="chapter" data-level="17.1.1" data-path="django.html"><a href="django.html#安装-django"><i class="fa fa-check"></i><b>17.1.1</b> Installing Django</a></li>
<li class="chapter" data-level="17.1.2" data-path="django.html"><a href="django.html#验证安装"><i class="fa fa-check"></i><b>17.1.2</b> Verifying the Installation</a></li>
<li class="chapter" data-level="17.1.3" data-path="django.html"><a href="django.html#创建项目并预览效果"><i class="fa fa-check"></i><b>17.1.3</b> Creating a Project and Previewing It</a></li>
</ul></li>
<li class="chapter" data-level="17.2" data-path="django.html"><a href="django.html#学习资源-1"><i class="fa fa-check"></i><b>17.2</b> Learning Resources</a></li>
</ul></li>
<li class="chapter" data-level="18" data-path="quicktutorial.html"><a href="quicktutorial.html"><i class="fa fa-check"></i><b>18</b> Quick-Start Example</a><ul>
<li class="chapter" data-level="18.1" data-path="quicktutorial.html"><a href="quicktutorial.html#在-django-项目中创建-app"><i class="fa fa-check"></i><b>18.1</b> Creating an App in a Django Project</a><ul>
<li class="chapter" data-level="18.1.1" data-path="quicktutorial.html"><a href="quicktutorial.html#初始化-app"><i class="fa fa-check"></i><b>18.1.1</b> Initializing the App</a></li>
<li class="chapter" data-level="18.1.2" data-path="quicktutorial.html"><a href="quicktutorial.html#将应用添加到项目中"><i class="fa fa-check"></i><b>18.1.2</b> Adding the App to the Project</a></li>
<li class="chapter" data-level="18.1.3" data-path="quicktutorial.html"><a href="quicktutorial.html#创建视图"><i class="fa fa-check"></i><b>18.1.3</b> Creating Views</a></li>
<li class="chapter" data-level="18.1.4" data-path="quicktutorial.html"><a href="quicktutorial.html#规划路由"><i class="fa fa-check"></i><b>18.1.4</b> Planning Routes</a></li>
<li class="chapter" data-level="18.1.5" data-path="quicktutorial.html"><a href="quicktutorial.html#预览效果"><i class="fa fa-check"></i><b>18.1.5</b> Previewing the Result</a></li>
</ul></li>
<li class="chapter" data-level="18.2" data-path="quicktutorial.html"><a href="quicktutorial.html#模型的创建及使用"><i class="fa fa-check"></i><b>18.2</b> Creating and Using Models</a><ul>
<li class="chapter" data-level="18.2.1" data-path="quicktutorial.html"><a href="quicktutorial.html#配置项目信息"><i class="fa fa-check"></i><b>18.2.1</b> Configuring the Project</a></li>
<li class="chapter" data-level="18.2.2" data-path="quicktutorial.html"><a href="quicktutorial.html#创建模型"><i class="fa fa-check"></i><b>18.2.2</b> Creating Models</a></li>
<li class="chapter" data-level="18.2.3" data-path="quicktutorial.html"><a href="quicktutorial.html#激活模型"><i class="fa fa-check"></i><b>18.2.3</b> Activating Models</a></li>
<li class="chapter" data-level="18.2.4" data-path="quicktutorial.html"><a href="quicktutorial.html#数据库的生成与迁移"><i class="fa fa-check"></i><b>18.2.4</b> Generating and Migrating the Database</a></li>
</ul></li>
<li class="chapter" data-level="18.3" data-path="quicktutorial.html"><a href="quicktutorial.html#后台管理"><i class="fa fa-check"></i><b>18.3</b> Admin Site</a><ul>
<li class="chapter" data-level="18.3.1" data-path="quicktutorial.html"><a href="quicktutorial.html#创建管理员账号"><i class="fa fa-check"></i><b>18.3.1</b> Creating an Admin Account</a></li>
<li class="chapter" data-level="18.3.2" data-path="quicktutorial.html"><a href="quicktutorial.html#添加引用到后台管理"><i class="fa fa-check"></i><b>18.3.2</b> Adding the App to the Admin Site</a></li>
<li class="chapter" data-level="18.3.3" data-path="quicktutorial.html"><a href="quicktutorial.html#进入后台管理界面"><i class="fa fa-check"></i><b>18.3.3</b> Opening the Admin Interface</a></li>
</ul></li>
<li class="chapter" data-level="18.4" data-path="quicktutorial.html"><a href="quicktutorial.html#创建视图-1"><i class="fa fa-check"></i><b>18.4</b> Creating Views</a><ul>
<li class="chapter" data-level="18.4.1" data-path="quicktutorial.html"><a href="quicktutorial.html#添加多个视图"><i class="fa fa-check"></i><b>18.4.1</b> Adding More Views</a></li>
<li class="chapter" data-level="18.4.2" data-path="quicktutorial.html"><a href="quicktutorial.html#模版命名空间"><i class="fa fa-check"></i><b>18.4.2</b> Template Namespaces</a></li>
<li class="chapter" data-level="18.4.3" data-path="quicktutorial.html"><a href="quicktutorial.html#在视图中使用模板"><i class="fa fa-check"></i><b>18.4.3</b> Using Templates in Views</a></li>
<li class="chapter" data-level="18.4.4" data-path="quicktutorial.html"><a href="quicktutorial.html#url命名空间"><i class="fa fa-check"></i><b>18.4.4</b> URL Namespaces</a></li>
</ul></li>
<li class="chapter" data-level="18.5" data-path="quicktutorial.html"><a href="quicktutorial.html#模型"><i class="fa fa-check"></i><b>18.5</b> Models</a><ul>
<li class="chapter" data-level="18.5.1" data-path="quicktutorial.html"><a href="quicktutorial.html#设置数据库信息-1"><i class="fa fa-check"></i><b>18.5.1</b> Configuring the Database</a></li>
<li class="chapter" data-level="18.5.2" data-path="quicktutorial.html"><a href="quicktutorial.html#创建模型-1"><i class="fa fa-check"></i><b>18.5.2</b> Creating Models</a></li>
<li class="chapter" data-level="18.5.3" data-path="quicktutorial.html"><a href="quicktutorial.html#使用模型"><i class="fa fa-check"></i><b>18.5.3</b> Using Models</a></li>
</ul></li>
<li class="chapter" data-level="18.6" data-path="quicktutorial.html"><a href="quicktutorial.html#参考文献-1"><i class="fa fa-check"></i><b>18.6</b> References</a></li>
</ul></li>
<li class="divider"></li>
<li><a href="https://bookdown.org" target="blank">This book is powered by bookdown</a></li>
</ul>
</nav>
</div>
<div class="book-body">
<div class="body-inner">
<div class="book-header" role="navigation">
<h1>
<i class="fa fa-circle-o-notch fa-spin"></i><a href="./">Python Programming</a>
</h1>
</div>
<div class="page-wrapper" tabindex="-1" role="main">
<div class="page-inner">
<section class="normal" id="section-">
<div id="spider" class="section level1">
<h1><span class="header-section-number">Chapter 12</span> Web Crawlers</h1>
<p>A web crawler is a program that automatically collects data from the web according to a set of rules.</p>
<div id="网络爬虫原理" class="section level2">
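<h2><span class="header-section-number">12.1</span> How Web Crawlers Work</h2>
<ol style="list-style-type: decimal">
<li>Request a web page and obtain its source code</li>
<li>Extract the information of interest</li>
<li>Store the data</li>
<li>Automate the process</li>
</ol>
<p>A web page is essentially a text file stored at a specified location. It is not necessarily static: a remote computer may generate it on request according to certain conditions. The task of a web crawler is therefore to fetch that remote text file, analyze it, and extract the data we want.</p>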
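<p>The four steps above can be sketched in a few functions. This is a minimal illustration using only the standard library, not a complete crawler: the names <code>fetch</code>, <code>extract_links</code>, and <code>store</code>, the choice of links as the data to extract, and the sample page are all assumptions made for the example.</p>

```python
from html.parser import HTMLParser
from urllib.request import urlopen


def fetch(url):
    """Step 1: request a page and return its source as text (assumes UTF-8)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8")


class LinkExtractor(HTMLParser):
    """Step 2: collect the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html):
    parser = LinkExtractor()
    parser.feed(html)
    return parser.links


def store(lines, path):
    """Step 3: save the extracted data, one item per line."""
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(lines))


# Step 4 (automation) would loop fetch -> extract -> store over many URLs.
# Here only the extraction step runs, on a hypothetical in-memory page,
# so the example needs no network access.
sample_page = '<html><body><a href="a.html">A</a> <a href="b.html">B</a></body></html>'
links = extract_links(sample_page)
print(links)  # ['a.html', 'b.html']
```

<p>In a real crawler, the links returned by the extraction step would typically be fed back into <code>fetch</code>, which is what makes the process automatic.</p>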
</div>
<div id="http请求原理" class="section level2">
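<h2><span class="header-section-number">12.2</span> How HTTP Requests Work</h2>
<p>HTTP follows a request/response model.</p>
<p>A web browser sends a request to a web server, and the web server processes the request and returns an appropriate response. Every HTTP exchange consists of such a request and response pair.</p>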
<div class="figure">
<img src="images/http.png" alt="" />
<p class="caption">The HTTP request/response model</p>
</div>
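<p>Acting as a client, the browser sends the server a request to view the web page at a given address.</p>
<p>The server accepts the client's request.</p>
<p>The client downloads the file from the server to the local machine.</p>
<p>The browser interprets the file and renders it.</p>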
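<p>The exchange described above can be tried out entirely on the local machine. The following sketch starts a toy web server in a background thread and then plays the client role with <code>http.client</code>. The handler name <code>HelloHandler</code>, the path <code>/index.html</code>, and the page content are made up for the illustration.</p>

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """A toy server-side handler: answers every GET with a small HTML page."""

    def do_GET(self):
        body = "<html><body>hello</body></html>".encode("utf-8")
        self.send_response(200)  # status line: 200 OK
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)  # response body

    def log_message(self, format, *args):
        pass  # keep the example quiet


# Port 0 asks the operating system for any free port on the local machine.
server = HTTPServer(("127.0.0.1", 0), HelloHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# The client side: send a request, then read the response.
connection = HTTPConnection("127.0.0.1", server.server_port, timeout=5)
connection.request("GET", "/index.html")
response = connection.getresponse()
status = response.status
content_type = response.getheader("Content-Type")
page = response.read().decode("utf-8")
connection.close()

server.shutdown()
server.server_close()

print(status, content_type)
print(page)
```

<p>A browser performs the same request and download steps and then renders <code>page</code>; a crawler stops after the download and analyzes the text instead.</p>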
<div id="url" class="section level3">
<h3><span class="header-section-number">12.2.1</span> URL</h3>
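<p>A Uniform Resource Locator (URL) is the standard address of a resource on the Internet. Every file on the Internet has a unique URL.
A basic URL consists of a scheme (also called the protocol), a server name (or IP address), a path, and a file name.</p>
<p>The general form is scheme://authority/path?query, for example:</p>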
<pre><code>http://alumni.xjtu.edu.cn:9090/donation/namelist?pageNo=1&amp;pageSize=10&amp;billnum=&amp;donateUserName=&amp;orderWay=&amp;donationid=0</code></pre>
<p>URLs are the main objects a crawler works with: it uses a URL to fetch the content of the file it needs and then processes that content further.</p>
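<p>To see how the parts of a URL map onto the scheme://authority/path?query form, the example URL from this section can be taken apart with the standard library's <code>urllib.parse</code> module. This is only an illustration; nothing is requested from the server.</p>

```python
from urllib.parse import parse_qs, urlsplit

url = ("http://alumni.xjtu.edu.cn:9090/donation/namelist"
       "?pageNo=1&pageSize=10&billnum=&donateUserName=&orderWay=&donationid=0")

parts = urlsplit(url)
print(parts.scheme)    # http
print(parts.hostname)  # alumni.xjtu.edu.cn
print(parts.port)      # 9090
print(parts.path)      # /donation/namelist

# keep_blank_values=True keeps parameters such as billnum that have no value.
query = parse_qs(parts.query, keep_blank_values=True)
print(query["pageNo"])   # ['1']
print(query["billnum"])  # ['']
```

<p>A crawler that walks through paginated results would usually vary a query parameter such as <code>pageNo</code> while keeping the rest of the URL fixed.</p>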
</div>
<div id="请求方式" class="section level3">
<h3><span class="header-section-number">12.2.2</span> Request Methods</h3>
<div id="get" class="section level4">
<h4><span class="header-section-number">12.2.2.1</span> GET</h4>
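<p>GET is the default HTTP request method and is commonly used to submit form data. However, form data submitted with GET is only lightly encoded (URL-encoded), and it is sent to the web server as part of the URL.</p>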
</div>
<div id="post" class="section level4">
<h4><span class="header-section-number">12.2.2.2</span> POST</h4>
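<p>POST is an alternative to GET. It is mainly used to submit form data to a web server, especially large amounts of data. With POST, the data is sent in the request body rather than as part of the URL. This avoids two drawbacks of GET: the data being exposed in the URL, and the limited amount of data a URL can carry. Note that POST only keeps the data out of the URL; the data is not encrypted unless the request is sent over HTTPS.</p>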
</div>
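<p>The difference between the two methods shows up in where the encoded data ends up. The sketch below builds, but does not send, one request of each kind with the standard library. Whether this particular server accepts POST for this address is not known; the URL is reused from the previous section purely as an example.</p>

```python
from urllib.parse import urlencode
from urllib.request import Request

base_url = "http://alumni.xjtu.edu.cn:9090/donation/namelist"
params = {"pageNo": 1, "pageSize": 10}
encoded = urlencode(params)  # 'pageNo=1&pageSize=10'

# GET: the encoded data is appended to the URL as the query string.
get_request = Request(base_url + "?" + encoded)

# POST: the same data travels in the request body as bytes.
post_request = Request(base_url, data=encoded.encode("ascii"))

print(get_request.get_method(), get_request.full_url)
print(post_request.get_method(), post_request.full_url, post_request.data)
```

<p>Passing either object to <code>urllib.request.urlopen</code> would actually send it.</p>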
</div>
<div id="requests包" class="section level3">
<h3><span class="header-section-number">12.2.3</span> The requests Package</h3>
<p>Requests is an HTTP library for Python. It lets you send standards-compliant HTTP/1.1 requests without manual work: you do not need to add query strings to URLs by hand or form-encode POST data yourself. Keep-alive and HTTP connection pooling are handled automatically.</p>
<p>Official website: <a href="http://cn.python-requests.org/zh_CN/latest/">http://cn.python-requests.org/zh_CN/latest/</a></p>
<p>Installation:</p>
<pre class="shell"><code>pip install requests</code></pre>
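<p>Once the package is installed, the two conveniences mentioned above can be observed without touching the network by preparing requests instead of sending them. This is a small sketch; the form field names <code>name</code> and <code>amount</code> are hypothetical, and the URL is the example from the URL section.</p>

```python
import requests

url = "http://alumni.xjtu.edu.cn:9090/donation/namelist"

# GET: pass the parameters as a dict; Requests builds the query string.
get_request = requests.Request(
    "GET", url, params={"pageNo": 1, "pageSize": 10}
).prepare()
print(get_request.url)

# POST: pass the form fields as a dict; Requests form-encodes the body.
# The field names are hypothetical.
post_request = requests.Request(
    "POST", url, data={"name": "Zhang San", "amount": 100}
).prepare()
print(post_request.body)
print(post_request.headers["Content-Type"])

# Sending is a separate step, e.g. requests.get(url, params=..., timeout=10),
# which needs network access and is therefore not run here.
```

<p>In everyday use the shortcut functions <code>requests.get</code> and <code>requests.post</code> do the preparing and sending in one call.</p>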
</div>
</div>
<div id="编码" class="section level2">
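<h2><span class="header-section-number">12.3</span> Encoding</h2>
<div id="编码方式" class="section level3">
<h3><span class="header-section-number">12.3.1</span> Encoding Schemes</h3>
<ol style="list-style-type: decimal">
<li>ASCII: a standard mapping between English characters and binary values.</li>
<li>GBK: one of the Chinese character encoding standards. It is an internal-code extension specification built on GB2312-80 and uses double-byte encoding.</li>
<li>GB2312: intended for information interchange between systems for Chinese character processing, Chinese character communication, and similar uses.</li>
<li>GB18030: the national standard GB18030-2005, "Information technology: Chinese coded character set", is China's most important Chinese character encoding standard after GB2312-1980 and GB13000.1-1993, and one of the basic standards that computer systems in China must comply with. GB18030 has two versions, GB18030-2000 and GB18030-2005. GB18030-2000 supersedes GBK; its main feature is that it adds the characters of CJK Unified Ideographs Extension A to GBK. GB18030-2005 in turn adds the characters of CJK Unified Ideographs Extension B to GB18030-2000.</li>
<li>UTF-8: short for 8-bit Unicode Transformation Format. It is a variable-length encoding that uses 1 to 4 bytes per character, depending on the character.</li>
<li>Unicode: a character set that assigns a code to every character in the world's writing systems. Unicode itself does not prescribe how those codes are stored as bytes; that is defined by encodings such as UTF-8.</li>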
</ol>
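<p>The differences between these encodings are easiest to see by encoding the same character in several of them. The short sketch below uses Python's built-in codecs; the sample character is chosen arbitrarily.</p>

```python
text = "中"

print(hex(ord(text)))        # Unicode code point: 0x4e2d
print(text.encode("utf-8"))  # b'\xe4\xb8\xad' (3 bytes)
print(text.encode("gbk"))    # b'\xd6\xd0' (2 bytes)
print("A".encode("ascii"))   # b'A' (1 byte)

# ASCII has no code for Chinese characters, so encoding fails.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
print(ascii_ok)              # False
```

<p>The code point is the same in every case; only the bytes used to store it change with the encoding.</p>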
</div>
<div id="编码转换" class="section level3">
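<h3><span class="header-section-number">12.3.2</span> Encoding Conversion</h3>
<p>Conversion normally goes through Unicode as the intermediate form: first decode the string from its original encoding into Unicode, then encode it from Unicode into the target encoding.</p>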
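<p>In Python 3 the intermediate Unicode form is simply the <code>str</code> type, so a conversion is a <code>decode</code> followed by an <code>encode</code>. The sketch below converts GBK bytes to UTF-8 bytes; the sample text stands in for bytes that might arrive from a GBK-encoded page.</p>

```python
# Bytes as they might arrive from a GBK-encoded page.
gbk_bytes = "网络爬虫".encode("gbk")

# Step 1: decode the GBK bytes into a str (Unicode text).
text = gbk_bytes.decode("gbk")

# Step 2: encode the str into the target encoding.
utf8_bytes = text.encode("utf-8")

print(len(gbk_bytes), len(utf8_bytes))  # 8 12
```

<p>The byte counts differ because GBK stores each of these characters in two bytes while UTF-8 uses three.</p>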
</div>
</div>
<div id="存储" class="section level2">
<h2><span class="header-section-number">12.4</span> Storage</h2>
<p>To store a string, you can encode it first and then write it to a file.</p>
<div id="存储到文件" class="section level3">
<h3><span class="header-section-number">12.4.1</span> Saving to a File</h3>
<p>The open function can create a file and write content to it.</p>
<div class="sourceCode" id="cb221"><pre class="sourceCode python"><code class="sourceCode python"><span id="cb221-1"><a href="spider.html#cb221-1"></a><span class="co"># content is the page text (a str); encode it to bytes before writing</span></span>
<span id="cb221-2"><a href="spider.html#cb221-2"></a><span class="cf">with</span> <span class="bu">open</span>(<span class="st">&#39;xjtu.html&#39;</span>, <span class="st">&#39;wb&#39;</span>) <span class="im">as</span> f:  <span class="co"># &#39;wb&#39;: binary mode accepts bytes</span></span>
<span id="cb221-3"><a href="spider.html#cb221-3"></a>    f.write(content.encode(<span class="st">&#39;utf-8&#39;</span>))</span></code></pre></div>
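<p>As an alternative to encoding by hand, <code>open</code> can be given an <code>encoding</code> argument in text mode, in which case it accepts a <code>str</code> and performs the encoding itself. The following sketch writes a stand-in page to a temporary directory and reads it back to check the result; the file name <code>page.html</code> and the page content are assumptions for the example.</p>

```python
import os
import tempfile

content = "<html><body>网络爬虫</body></html>"  # stand-in for a downloaded page

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "page.html")

    # Text mode with an explicit encoding: write a str, Python encodes it.
    with open(path, "w", encoding="utf-8") as f:
        f.write(content)

    # Read the raw bytes back to see what was actually stored.
    with open(path, "rb") as f:
        stored_bytes = f.read()

    # Read the file back as text using the same encoding.
    with open(path, "r", encoding="utf-8") as f:
        restored = f.read()

print(restored == content)
```

<p>Stating the encoding explicitly avoids depending on the platform default, which differs between operating systems.</p>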
</div>
</div>
</div>
</section>
</div>
</div>
</div>
<a href="argparse.html" class="navigation navigation-prev " aria-label="Previous page"><i class="fa fa-angle-left"></i></a>
<a href="httplib.html" class="navigation navigation-next " aria-label="Next page"><i class="fa fa-angle-right"></i></a>
</div>
</div>
<script src="libs/gitbook-2.6.7/js/app.min.js"></script>
<script src="libs/gitbook-2.6.7/js/lunr.js"></script>
<script src="libs/gitbook-2.6.7/js/clipboard.min.js"></script>
<script src="libs/gitbook-2.6.7/js/plugin-search.js"></script>
<script src="libs/gitbook-2.6.7/js/plugin-sharing.js"></script>
<script src="libs/gitbook-2.6.7/js/plugin-fontsettings.js"></script>
<script src="libs/gitbook-2.6.7/js/plugin-bookdown.js"></script>
<script src="libs/gitbook-2.6.7/js/jquery.highlight.js"></script>
<script src="libs/gitbook-2.6.7/js/plugin-clipboard.js"></script>
<script>
gitbook.require(["gitbook"], function(gitbook) {
gitbook.start({
"sharing": {
"github": true,
"facebook": false,
"twitter": true,
"linkedin": false,
"weibo": false,
"instapaper": false,
"vk": false,
"all": ["facebook", "twitter", "linkedin", "weibo", "instapaper"]
},
"fontsettings": {
"theme": "white",
"family": "sans",
"size": 2
},
"edit": {
"link": "https://github.com/yangjh-xbmu/learningpython/edit/master/1200-spider-spider.Rmd",
"text": "编辑"
},
"history": {
"link": null,
"text": null
},
"view": {
"link": null,
"text": null
},
"download": ["learningpython.pdf", "learningpython.epub"],
"toc": {
"collapse": "section"
}
});
});
</script>
<!-- dynamically load mathjax for compatibility with self-contained -->
<script>
(function () {
var script = document.createElement("script");
script.type = "text/javascript";
var src = "true";
if (src === "" || src === "true") src = "https://mathjax.rstudio.com/latest/MathJax.js?config=TeX-MML-AM_CHTML";
if (location.protocol !== "file:")
if (/^https?:/.test(src))
src = src.replace(/^https?:/, '');
script.src = src;
document.getElementsByTagName("head")[0].appendChild(script);
})();
</script>
</body>
</html>