g1879 / DrissionPage-demos

爬书籍列表.py 941 Bytes
g1879 authored 2020-12-29 15:55 . Adjust directory structure
#!/usr/bin/env python
# -*- coding:utf-8 -*-
from ListPage import ListPage, Xpaths, Targets, Recorder
# List page of all completed, free works
列表首页url = 'http://book.zongheng.com/store/c0/c0/b0/u0/p1/v0/s1/t0/u0/i1/ALL.html'
# Define the page structure
xpaths = Xpaths()
xpaths.pages_count = '//a[@title="下一页"]/preceding-sibling::a[1]'
xpaths.rows = '//div[@class="bookinfo"]'
xpaths.set_col('书名', '//div[@class="bookname"]/a')
xpaths.set_col('作者', '//div[@class="bookilnk"]/a')
# Define the scraping targets
targets = Targets(xpaths)
targets.add_target('书名', '书名')
targets.add_target('作者', '作者')
targets.add_target('链接', '书名', 'href')
targets.add_target('作者链接', '作者', 'href')
# Create the recorder
recorder = Recorder('书籍列表.csv', 200)
# Create the list page object
page = ListPage(xpaths, 列表首页url)
page.num_param = '/p'
page.get_list(targets, 3, 10, recorder=recorder, return_data=False)
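
The `page.num_param = '/p'` setting appears to mark which URL segment carries the page number (`/p1/` in the start URL above). As a rough illustration only, and assuming that interpretation is right, the sketch below shows how per-page URLs could be derived with the standard library. `build_page_url` is a hypothetical helper, not part of ListPage, whose actual logic may differ.

```python
import re


def build_page_url(first_url: str, num_param: str, page_num: int) -> str:
    """Return first_url with the number that follows num_param set to page_num.

    Hypothetical helper for illustration; not ListPage's real implementation.
    """
    pattern = re.escape(num_param) + r"\d+"
    # Replace only the first match so later URL segments stay untouched.
    new_url, count = re.subn(pattern, f"{num_param}{page_num}", first_url, count=1)
    if count == 0:
        raise ValueError(f"{num_param!r} followed by a number not found in URL")
    return new_url


first_url = "http://book.zongheng.com/store/c0/c0/b0/u0/p1/v0/s1/t0/u0/i1/ALL.html"
# URLs for the first three list pages, under the assumption stated above.
urls = [build_page_url(first_url, "/p", n) for n in range(1, 4)]
```

Raising an error when the marker is missing makes a wrong `num_param` visible early instead of silently fetching the first page repeatedly.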
