倪思涵 / pa
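This repository does not declare an open-source license file (LICENSE); before using it, check the project description and the upstream dependencies of its code.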
pa.py 1.03 KB
倪思涵 committed on 2020-06-12 15:53 . Initial commit
import re

import requests

# Browser-like User-Agent sent with every request.
HEADERS = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.97 Safari/537.36'}

# The chapter title is the page's <h1>. Each paragraph starts with four
# &nbsp; entities and is matched lazily up to the first space, which is
# presumably the one inside the trailing "<br />".
title_re = re.compile(r'<h1>(.*?)</h1>')
text_re = re.compile(r'&nbsp;&nbsp;&nbsp;&nbsp;([\s\S]*?) ')

for i in range(48632245, 48632255):
    req = requests.get('https://www.52bqg.com/book_126955/%d.html' % i,
                       headers=HEADERS)
    # The page bytes are decoded as GBK.
    result = req.content.decode('gbk')
    title = re.findall(title_re, result)  # extracted but not written out
    text = re.findall(text_re, result)
    # Page ids are consecutive, so the first id maps to chapter 1.
    count = i - 48632244
    # Mode 'a' appends, so re-running the script adds to an existing file.
    # The with-block closes the file even if a write fails.
    with open("C:\\Users\\86150\\Desktop\\小说\\第%d章.txt" % count, 'a') as file:
        for sentence in text:
            # Drop the "<br" left at the end of each lazy match.
            sentence = sentence.replace('<br', '')
            file.write(sentence)
            file.write('\n')
    print("第%d章下载完毕" % count)  # "Chapter %d downloaded"
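
The two regular expressions in the script above can be tried without any network access. The following is a minimal sketch, not part of the repository: the `SAMPLE` fragment and the `extract` helper are hypothetical, and the real markup served by the site may differ from what is assumed here.

```python
import re

# Same patterns as in pa.py.
title_re = re.compile(r'<h1>(.*?)</h1>')
text_re = re.compile(r'&nbsp;&nbsp;&nbsp;&nbsp;([\s\S]*?) ')

# Hypothetical fragment (assumption): paragraphs start with four &nbsp;
# entities and end with "<br />".
SAMPLE = (
    '<h1>Chapter 1</h1>'
    '&nbsp;&nbsp;&nbsp;&nbsp;First.<br />'
    '&nbsp;&nbsp;&nbsp;&nbsp;Second.<br />'
)


def extract(html):
    """Return (titles, paragraphs) found in an HTML string."""
    titles = title_re.findall(html)
    # The lazy match stops at the space inside "<br />", so each hit
    # still ends with "<br"; strip it as the script does.
    paragraphs = [p.replace('<br', '') for p in text_re.findall(html)]
    return titles, paragraphs


titles, paragraphs = extract(SAMPLE)
print(titles)
print(paragraphs)
```

Because the text pattern stops at the first space, a paragraph that itself contains a space would be cut short at that point; this limitation applies to the original pattern as well.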
https://gitee.com/ni_si_han/pa.git
git@gitee.com:ni_si_han/pa.git