Table of Contents
- Preface
- Key Points:
- Development Environment:
- Basic Workflow:
- Code
- Closing
Preface
Hi everyone, 魔王 here ❤!
Key Points:
- The basic workflow of a web scraper
- Using requests
- Capturing the requests that load dynamic data
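Development Environment:
- Interpreter: Python 3.8
- Editor: PyCharm 2022.3
- requests: install with pip install requests
Installing third-party modules:
Press Win + R, type cmd, then run pip install followed by the module name. If the download is slow, you can switch to a mirror hosted in mainland China.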
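Switching mirrors comes down to pip's -i (--index-url) option. The sketch below only builds and prints the command from Python. The Tsinghua mirror URL is an example chosen here, not one the article names, and pip_install_command is a hypothetical helper.

```python
import sys

# Example mirror (an assumption, not from the article); any PyPI-compatible index works.
MIRROR = "https://pypi.tuna.tsinghua.edu.cn/simple"


def pip_install_command(module, index_url=None):
    """Build the pip command line for installing one module."""
    command = [sys.executable, "-m", "pip", "install", module]
    if index_url:
        # -i / --index-url tells pip which package index to download from
        command += ["-i", index_url]
    return command


# Print the command instead of running it, so nothing is installed by accident.
print(" ".join(pip_install_command("requests", MIRROR)))
```

Passing the returned list to subprocess.run(..., check=True) would perform the install. Typing the printed command into cmd does the same thing.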
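Basic Workflow:
I. Analysis
- Find the data source. Check whether the data is static (present in the page HTML) or dynamic (loaded by a separate request), and capture that request in the browser developer tools' Network panel.
- Work out the single-page flow: request the URL, get the response data, extract the fields we need, and save them.
- Collect multiple pages: analyze how the link changes from page to page, derive the pagination rule, and loop over the pages.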
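The pagination rule from the last step can be sketched as follows. In this case only the pageNo query parameter changes between pages. BASE_URL uses a placeholder host because the real one is masked in the article, and build_page_url is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Placeholder host (an assumption): the real one is masked as ***** in the article.
BASE_URL = "http://example.com/cwl_admin/front/cwlkj/search/kjxx/findDrawNotice"


def build_page_url(page, page_size=30):
    """Return the URL for one page; only pageNo changes between pages."""
    params = {
        "name": "ssq",
        "issueCount": "",
        "issueStart": "",
        "issueEnd": "",
        "dayStart": "",
        "dayEnd": "",
        "pageNo": page,
        "pageSize": page_size,
        "week": "",
        "systemType": "PC",
    }
    return f"{BASE_URL}?{urlencode(params)}"


# Build the first three page URLs to see the pattern.
urls = [build_page_url(page) for page in range(1, 4)]
for url in urls:
    print(url)
```

Keeping the parameters in a dict makes it easy to see which one varies, and urlencode takes care of escaping if a value ever contains special characters.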
II. Implementation
- Send the request
- Get the data
- Parse the data
- Save the data
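The four steps above map naturally onto four small functions. This is a minimal sketch rather than the article's code: the function names are hypothetical, the timeout value is an assumption, and the sample record is made up so that steps 2 to 4 can be tried without touching the network.

```python
import csv
import io


def send_request(url, headers):
    """Step 1: send the request and return the response object."""
    import requests  # third-party; imported lazily so the other steps work offline

    response = requests.get(url, headers=headers, timeout=10)
    response.raise_for_status()  # fail on 4xx/5xx instead of parsing an error page
    return response


def get_data(response):
    """Step 2: decode the JSON body into Python objects."""
    return response.json()


def parse_data(json_data):
    """Step 3: pull the wanted fields out of each record."""
    return [[rec["date"], rec["red"], rec["blue"]] for rec in json_data["result"]]


def save_data(rows, file_obj):
    """Step 4: write the rows as CSV to an open file-like object."""
    writer = csv.writer(file_obj)
    writer.writerow(["date", "red", "blue"])
    writer.writerows(rows)


# Offline demo with made-up sample data shaped like the fields the article reads.
sample = {"result": [{"date": "2023-01-01", "red": "01,02,03,04,05,06", "blue": "07"}]}
buffer = io.StringIO()
save_data(parse_data(sample), buffer)
print(buffer.getvalue())
```

In a real run, the output of get_data(send_request(url, headers)) would replace sample, and an opened CSV file would replace the in-memory buffer.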
Code
import csv

import requests  # third-party library; install it separately

# Request headers copied from the captured request
headers = {
    'Accept': 'application/json, text/javascript, */*; q=0.01',
    'Accept-Encoding': 'gzip, deflate',
    'Accept-Language': 'zh-CN,zh;q=0.9',
    'Cache-Control': 'no-cache',
    'Connection': 'keep-alive',
    'Cookie': 'HMF_CI=1b17efcb79bb1c54b0972d1e27d1af031f8912351c906f5874e3ee7ad1ca9563806c6b7e37f7dc287b3165e3422da231f587a0c6a2923dea32cb0e422e6553046a; 21_vq=4',
    'Host': 'www.cwl.gov.cn',
    'Pragma': 'no-cache',
    'Referer': 'http://*****/ygkj/wqkjgg/ssq/',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36',
    'X-Requested-With': 'XMLHttpRequest',
}

# Append mode: rerunning the script adds another header row and more data to the same file.
# The with block closes the file once all pages are written.
with open('双色球.csv', mode='a', newline='', encoding='utf-8') as f:
    csv_writer = csv.writer(f)
    csv_writer.writerow([
        "Date", "Red balls", "Blue ball", "Prize pool", "Winning details",
        "First prize winners", "First prize amount",
        "Second prize winners", "Second prize amount",
        "Third prize winners", "Third prize amount",
        "Fourth prize winners", "Fourth prize amount",
        "Fifth prize winners", "Fifth prize amount",
        "Sixth prize winners", "Sixth prize amount",
    ])
    for page in range(1, 54):
        print(f"Fetching page {page}")
        # The host is masked as ***** in the original article
        url = f'http://*****/cwl_admin/front/cwlkj/search/kjxx/findDrawNotice?name=ssq&issueCount=&issueStart=&issueEnd=&dayStart=&dayEnd=&pageNo={page}&pageSize=30&week=&systemType=PC'
        response = requests.get(url=url, headers=headers)
        json_data = response.json()
        # The draw records are the list under the 'result' key
        result = json_data['result']
        for res in result:
            reds = res['red']
            blue = res['blue']
            date = res['date']
            poolmoney = res['poolmoney']
            content = res['content']
            prizegrades = res['prizegrades']
            # Default to empty strings in case a prize grade is missing from the record
            one_prize, one_price = "", ""
            two_prize, two_price = "", ""
            three_prize, three_price = "", ""
            four_prize, four_price = "", ""
            five_prize, five_price = "", ""
            six_prize, six_price = "", ""
            for prizegrad in prizegrades:
                if prizegrad['type'] == 1:
                    one_prize = prizegrad['typenum']
                    one_price = prizegrad['typemoney']
                elif prizegrad['type'] == 2:
                    two_prize = prizegrad['typenum']
                    two_price = prizegrad['typemoney']
                elif prizegrad['type'] == 3:
                    three_prize = prizegrad['typenum']
                    three_price = prizegrad['typemoney']
                elif prizegrad['type'] == 4:
                    four_prize = prizegrad['typenum']
                    four_price = prizegrad['typemoney']
                elif prizegrad['type'] == 5:
                    five_prize = prizegrad['typenum']
                    five_price = prizegrad['typemoney']
                elif prizegrad['type'] == 6:
                    six_prize = prizegrad['typenum']
                    six_price = prizegrad['typemoney']
            print(date, reds, blue, poolmoney, content, one_prize, one_price, two_prize, two_price, three_prize, three_price, four_prize, four_price, five_prize, five_price, six_prize, six_price)
            # Save one table row per draw, in the same column order as the header row
            csv_writer.writerow([date, reds, blue, poolmoney, content, one_prize, one_price, two_prize, two_price, three_prize, three_price, four_prize, four_price, five_prize, five_price, six_prize, six_price])
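The six-branch if/elif chain can also be written as a loop over the grade numbers. This is an alternative sketch, not the article's code: flatten_prizes is a hypothetical name, and the sample records are made up to match the type, typenum, and typemoney keys the script reads.

```python
def flatten_prizes(prizegrades, grades=range(1, 7)):
    """Return [num_1, money_1, ..., num_6, money_6], with "" for missing grades."""
    # Index the records by grade number; a later duplicate overwrites an earlier one.
    by_type = {grade["type"]: grade for grade in prizegrades}
    row = []
    for grade_no in grades:
        grade = by_type.get(grade_no)
        if grade is None:
            row += ["", ""]
        else:
            row += [grade["typenum"], grade["typemoney"]]
    return row


# Made-up sample: only grades 1 and 2 are present.
sample = [
    {"type": 1, "typenum": "5", "typemoney": "8000000"},
    {"type": 2, "typenum": "120", "typemoney": "150000"},
]
print(flatten_prizes(sample))
```

The returned list has twelve items in the same order as the prize columns, so it could be appended to the date, ball, and pool fields before calling writerow.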
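Closing
Thanks for reading! This flight ends here 🛬
I hope this article helped 🎉 and that you learned something from it.
Even the stars in hiding 🍥 are working hard to shine, so keep going (let's work hard together).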