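Hi everyone, I'm @xiaomeng (小孟).
Welcome to today's article: scraping images with Python (a web crawler).
Web crawlers have been popular lately, so today I'm writing one too!
Main text:
First, install the module we need: pip install requests (nothing more to say about that).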
#1. Import the modules
import requests
import re
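##2. Fill in the URL and send the request (you fill in the URL yourself; search for whatever kind of image you want)
# Set the URL (the word= parameter here is the percent-encoded keyword 水果, "fruit")
url = 'https://image.baidu.com/search/index?tn=baiduimage&ps=1&ct=201326592&lm=-1&cl=2&nc=1&ie=utf-8&word=%E6%B0%B4%E6%9E%9C'
# Request headers copied from a browser so the request looks like a normal page visit
form_header = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.190 Safari/537.36",
    "Host": "image.baidu.com",
    "Accept-Language": "zh-CN,zh;q=0.9",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9",
}
# Send the request and keep the page source as text
res = requests.get(url, headers=form_header).text
print(res)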
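The `word=` part of the URL above is simply the search keyword, percent-encoded as UTF-8. Here is a minimal sketch of building that URL from any keyword with the standard library. The names `BASE_URL` and `build_search_url` are my own for illustration; they are not part of requests or of Baidu.

```python
from urllib.parse import quote

# Hypothetical constant: the URL from this article with the keyword left off.
BASE_URL = ('https://image.baidu.com/search/index?tn=baiduimage&ps=1'
            '&ct=201326592&lm=-1&cl=2&nc=1&ie=utf-8&word=')


def build_search_url(keyword):
    """Hypothetical helper: append the percent-encoded keyword to BASE_URL."""
    return BASE_URL + quote(keyword)


print(build_search_url('水果'))
```

For the keyword 水果 this gives exactly the URL used in this article, so swapping in another keyword is enough to search for different pictures.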
###3. Filter the data with a regular expression (pull every "objURL" value out of the page source)
image_urls = re.findall('"objURL":"(.*?)",', res)
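To see what this pattern captures without making a request, here is a small sketch run against a made-up string. The `sample` text only imitates the `"objURL"` fields; it is not real output from the site.

```python
import re

# Made-up sample imitating the "objURL" fields in the page source.
sample = ('{"objURL":"http://example.com/a.jpg","x":1},'
          '{"objURL":"http://example.com/b.png","x":2}')

# (.*?) is non-greedy, so each match stops at the first '",' after "objURL":"
image_urls = re.findall('"objURL":"(.*?)",', sample)
print(image_urls)
# ['http://example.com/a.jpg', 'http://example.com/b.png']
```

If `re.findall` returns an empty list on the real page, the page source probably does not contain `"objURL"` fields in this form, and the pattern needs adjusting.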
####4. Use a for ... in loop to go through the image URLs one by one
for image_url in image_urls:
    print(image_url)
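#####5. Set the image name; if the name has no image extension, append '.jpg' (checked with re.search)
    # Image name (this code stays inside the for loop from step 4)
    # Take the last part of the URL and drop anything after '&'
    image_name = image_url.split('/')[-1].split('&')[0]
    print(image_name)
    # The dot is escaped and the alternatives are separated by |, not /
    image_end = re.search(r'\.(jpg|png|jpeg|gif)$', image_name)
    if image_end is None:
        image_name = image_name + '.jpg'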
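The naming rule in this step can also be wrapped in a small function and tried on its own. This is only a sketch: `make_image_name` is a name I made up, and the example URLs are invented. It also strips anything after `?` and ignores upper or lower case in the extension, which goes slightly beyond the step above.

```python
import re


def make_image_name(image_url):
    """Hypothetical helper: derive a file name from an image URL."""
    # Take the last path segment and drop anything after '&' or '?'.
    name = re.split('[&?]', image_url.split('/')[-1])[0]
    # Append '.jpg' when the name has no known image extension.
    if re.search(r'\.(jpg|png|jpeg|gif)$', name, re.IGNORECASE) is None:
        name += '.jpg'
    return name


print(make_image_name('http://example.com/pics/apple.png'))         # apple.png
print(make_image_name('http://example.com/pics/banana&size=large')) # banana.jpg
```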
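######6. Download and save the image (you must first create an image folder in the same directory, otherwise the file cannot be saved)
    # Save (still inside the for loop)
    image = requests.get(image_url).content
    # split("&")[0] drops any '&...' remainder from the file name
    with open('./image/%s' % image_name.split("&")[0], 'wb') as file:
        file.write(image)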
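If you would rather not create the folder by hand, the saving step can create it when it is missing. A minimal sketch, assuming a helper I named `save_image` (not something requests provides), using only the standard library:

```python
import os


def save_image(data, image_name, folder='image'):
    """Hypothetical helper: write image bytes to folder/image_name."""
    # Create the folder if it does not exist yet; do nothing if it does.
    os.makedirs(folder, exist_ok=True)
    path = os.path.join(folder, image_name)
    with open(path, 'wb') as file:
        file.write(data)
    return path
```

Inside the loop it could be called as `save_image(requests.get(image_url, timeout=10).content, image_name)`. The `timeout=10` is my own choice so that one slow image does not hang the whole run.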
Here is the complete source code:
# Request
import requests
import re

# Set the URL
url = 'https://image.baidu.com/search/index?tn=baiduimage&ps=1&ct=201326592&lm=-1&cl=2&nc=1&ie=utf-8&word=%E6%B0%B4%E6%9E%9C'
form_header = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.190 Safari/537.36",
    "Host": "image.baidu.com",
    "Accept-Language": "zh-CN,zh;q=0.9",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9",
}
res = requests.get(url, headers=form_header).text
print(res)
image_urls = re.findall('"objURL":"(.*?)",', res)
for image_url in image_urls:
    print(image_url)
    # Image name: last part of the URL, without anything after '&'
    image_name = image_url.split('/')[-1].split('&')[0]
    print(image_name)
    image_end = re.search(r'\.(jpg|png|jpeg|gif)$', image_name)
    if image_end is None:
        image_name = image_name + '.jpg'
    # Save (the image folder must already exist)
    image = requests.get(image_url).content
    with open('./image/%s' % image_name.split("&")[0], 'wb') as file:
        file.write(image)
[Screenshot of the program running]
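One last thing from me:
Programming languages are amazing.
There is still so much knowledge waiting for us to explore. Keep going!!
_____________________________________________I am a fancy divider~_________________________________________________
If you liked this article, please follow me and give it a like!
Thank you for reading! See you next time!
*My email is cv6_post@163.com. Feel free to contact me if you have any questions~*
@Author: 小孟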