刘洋/2019数据采集与融合

task3.py 937 Bytes
刘洋 authored 2021-10-13 17:54 +08:00 . Assignment 3
import requests
import re
url = 'https://www.shanghairanking.cn/_nuxt/static/1632381606/rankings/bcur/2021/payload.js'
loginheaders = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.164 Safari/537.36',
}
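# Fetch the page text
def getHTMLText(url, loginheaders):
    try:
        r = requests.get(url, headers=loginheaders, timeout=30)
        r.raise_for_status()
        r.encoding = r.apparent_encoding
        return r.text
    except requests.RequestException:
        # Network or HTTP error: return an empty string so the regexes below match nothing
        return ""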
data = getHTMLText(url,loginheaders)
#print(data)
# Extract university names and scores with regular expressions
name = re.findall(r'univNameCn:"(.*?)"', data)
score = re.findall(r'score:(.*?),', data)
#print(name)
#print(score)
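# Output
# chr(12288) is the full-width space, used as fill so the Chinese names stay aligned
tplt = "{0:^10}\t{1:{3}^10}\t{2:^10}"
# Header columns: rank, university name, total score
print(tplt.format("排名 ", "学校名称", "总分", chr(12288)))
# zip stops at the shorter list, so a count mismatch cannot raise IndexError
for i, (n, s) in enumerate(zip(name, score), start=1):
    print(tplt.format(i, n, s, chr(12288)))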
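The script above collects names and scores with two separate regular expressions and pairs them by position, so one record with a missing score would shift every later row. A minimal sketch of an alternative, assuming each record lists `univNameCn` before `score` on a single line: one pattern with two groups keeps each name paired with its own score. `SAMPLE_PAYLOAD`, `parse_rankings`, and `format_rows` are hypothetical names introduced for illustration, and the sample text is made up so the sketch runs offline. The real payload layout may differ.

```python
import re

# Hypothetical stand-in for the downloaded text; not real ranking data.
SAMPLE_PAYLOAD = (
    '{univNameCn:"示例大学甲",score:90.5,ranking:1},'
    '{univNameCn:"示例大学乙",score:85,ranking:2}'
)

# Assumption: within a record, univNameCn appears before score on one line.
RECORD_RE = re.compile(r'univNameCn:"(.*?)".*?score:(.*?),')


def parse_rankings(payload):
    """Return a list of (name, score) string tuples found in the payload."""
    return RECORD_RE.findall(payload)


def format_rows(records):
    """Build table rows, padding the name column with full-width spaces."""
    tplt = "{0:^10}\t{1:{3}^10}\t{2:^10}"
    fill = chr(12288)  # full-width space, matches the width of CJK characters
    rows = [tplt.format("排名", "学校名称", "总分", fill)]
    for rank, (name, score) in enumerate(records, start=1):
        rows.append(tplt.format(rank, name, score, fill))
    return rows


if __name__ == "__main__":
    for row in format_rows(parse_rankings(SAMPLE_PAYLOAD)):
        print(row)
```

To try this on live data, pass the string returned by `getHTMLText` to `parse_rankings` in place of `SAMPLE_PAYLOAD`. The extracted scores remain strings, as in the original script.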
https://gitee.com/liu-yangz/crawl_project.git