一、引言
百度是我们常用的搜索工具,其翻译是与爱词霸合作,总体看其反应速度较快,可以作为项目中重要的翻译工具。根据大家的需要,现提供两种Python获取百度翻译的两种办法:
二、requests法
我们引用requests模块,向百度发出post请求,得到回复后,进行解释为json格式数据。代码如下:
import requests #找一个趁手的工具,1. 导入这个工具
url = "https://fanyi.baidu.com/sug" # 2. 确定爬取的网址,存到变量里面url
#3. 伪装爬虫
headers = {"user-agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/110.0.0.0 Safari/537.36","cookie":'PSTM=1715648276; BIDUPSID=4FAECEB2016218E5337788D59D97E028; BAIDUID=A4621D4AF575EB35000B164A4D71A6C9:FG=1; BDSFRCVID=gDDOJexroG3M1QJttZbIbSp-4cGQd7cTDYLEOwXPsp3LGJLVYVGdEG0PtEXPGLtMnhULogKKBeOTHg4F_2uxOjjg8UtVJeC6EG0Ptf8g0M5; H_BDCLCKID_SF=Jn-O_C82JKL3HJOmMJjE5bcHMhoK--Q0KKJyLR-8Kb7VbPQCyUnkbfJBDloD-l3-ajnJ0qboKfO2KDbTyT5OL447yajDJjOuXKJi_JC-QbT8ob53DJ7pQT8r2xDOK5OibmDqslnNab3vOpoNXpO1bT0zBN5thURB2DkO-4bCWJ5TMl5jDh3Mb6ksDhAtqj0OfnIe_I0KfbjjfbTCMJOBq4k0-qJ-a4JQa5TbsJOOaCv5OpOOy4oTj6D0KqonK46JbIcBsRoR-TCMepk4MU5o3MvBKHKjWlRQWbcLof3V2fnq_Ij9Qft20bIEeMtjBbQuJtj7KR7jWhvIeq72y-tMQlRX5q79atTMfNTJ-qcH0KQpsIJM5-DWbT8bjHCJt6K8JbujoKvK-nO5KRQphRrhq4tehH4O5xJ9WDTOQJ7TthCaVqcm5h6D0-D9K2FJ5MbI5nv2-pbwBp5Cfb5K0jJHyUIzynjkLfbb3mkjbPbNyUJn8-bEh47lX-4syPRGKxRnWIkJKfA-b4ncjRcTehoM3xI8LNj405OTbIFO0KJzJCFMMDLwjTK2DTPBKMbeet5Kaj5tsJOOaCkWHqOOy4oWK441DPFDQxTJbNQabnoR-TCMHp3hb5Jj3M04X-oN-PQpLRvKahcIWMQZSl3gQft20b0kWb3xWn3uaDbP2n7jWhvIeq72y-tMQlRX5q79atTMfNTJ-qcH0KQpsIJM5-DWbT8IjHCJt6K8JbujoKvt-5rDHJTg5DTjhPrMbPORWMT-MTryKKOztxjkf-bL-xrD0nODjMbxQ40fKanRhlRNB-3iV-OxDUvnyxAZWJjxKUQxtNRJKxnx3hRM8Jrd3fvobUPULxo9LUvMJgcdot5yBbc8eIna5hjkbfJBQttjQn3hfIkj2CKLK-oj-D8lj5AM3j; BD_UPN=12314753; H_WISE_SIDS=60360_60600_60728_60749; delPer=0; BD_CK_SAM=1; BAIDUID_BFESS=A4621D4AF575EB35000B164A4D71A6C9:FG=1; BDSFRCVID_BFESS=gDDOJexroG3M1QJttZbIbSp-4cGQd7cTDYLEOwXPsp3LGJLVYVGdEG0PtEXPGLtMnhULogKKBeOTHg4F_2uxOjjg8UtVJeC6EG0Ptf8g0M5; H_BDCLCKID_SF_BFESS=Jn-O_C82JKL3HJOmMJjE5bcHMhoK--Q0KKJyLR-8Kb7VbPQCyUnkbfJBDloD-l3-ajnJ0qboKfO2KDbTyT5OL447yajDJjOuXKJi_JC-QbT8ob53DJ7pQT8r2xDOK5OibmDqslnNab3vOpoNXpO1bT0zBN5thURB2DkO-4bCWJ5TMl5jDh3Mb6ksDhAtqj0OfnIe_I0KfbjjfbTCMJOBq4k0-qJ-a4JQa5TbsJOOaCv5OpOOy4oTj6D0KqonK46JbIcBsRoR-TCMepk4MU5o3MvBKHKjWlRQWbcLof3V2fnq_Ij9Qft20bIEeMtjBbQuJtj7KR7jWhvIeq72y-tMQlRX5q79atTMfNTJ-qcH0KQpsIJM5-DWbT8bjHCJt6K8JbujoKvK-nO5KRQphRrhq4tehH4O5xJ9WDTOQJ7TthCaVqcm5h6D0-D9K2FJ5MbI5nv2-pbwBp5Cfb5K0jJHyUIzynjkLfbb3mkjbPbNyUJn8-bEh47lX-4syPRGKxRnWIkJKfA-b4ncjRcTehoM3xI8LNj405OTbIFO0KJzJCFMMDLwjTK2DTPBKMbeet5Kaj5tsJOOaCkWHqOOy4oWK441DPFDQxTJbNQabnoR-TCMHp3hb5Jj3M04X-oN-PQpLRvKahcIWMQZSl3gQft20b0kWb3xWn3uaDbP2n7jWhvIeq72y-tMQlRX5q79atTMfNTJ-qcH0KQpsIJM5-DWbT8IjHCJt6K8JbujoKvt-5rDHJTg5DTjhPrMbPORWMT-MTryKKOztxjkf-bL-xrD0nODjMbxQ40fKanRhlRNB-3iV-OxDUvnyxAZWJjxKUQxtNRJKxnx3hRM8Jrd3fvobUPULxo9LUvMJgcdot5yBbc8eIna5hjkbfJBQttjQn3hfIkj2CKLK-oj-D8lj5AM3j; channel=baidusearch; baikeVisitId=356e66e5-f4fb-4dd9-99a8-06bbc9ceb739; __bid_n=18ea75651ea4944bc97197; BD_HOME=1; H_WISE_SIDS_BFESS=60360_60600_60728_60749; B64_BOT=1; COOKIE_SESSION=5687_0_7_7_7_16_0_0_4_6_0_0_715485_0_6_0_1727255911_0_1727255905|9#169_6_1718709886|3; PSINO=7; BDRCVFR[S4-dAuiWMmn]=mk3SLVN4HKm; BDRCVFR[k-3xBxsWSJs]=mk3SLVN4HKm; H_PS_PSSID=60600_60794_60826_60844; BA_HECTOR=85808lal0h0g0k8g8h258h8h8nq73q1jfd5bi1v; ZFY=CcXTJMCi9ht2ML4gezdFtbE9Uar1JmpUfQgd:BgWOiis:C'}
while True:word = input("请输入单词:")resp = requests.post(url,headers = headers, data = {"kw":word}) # 发请求for item in resp.json()['data']:#想看效果,就打印Itemprint(item["v"])
三、urllib法
导入urllib,发出Request向url请求,再用urlopen读取并编码为utf-8,最后获取单词的翻译,代码如下:
from urllib import request, parse
import json
def translate_word(string):url = "https://fanyi.baidu.com/sug" # 请求数据的url链接data = {'kw': string}data_url = parse.urlencode(data)req = request.Request(url=url, data=data_url.encode('utf-8'))response = request.urlopen(req).read()res = json.loads(response)if 'data' in res and len(res['data']) > 0:trans_list = res['data'][0]['v']if ';' in trans_list:trans = trans_list.split(';')else:trans = trans_listelse:trans = "翻译失败"return trans_list
print(translate_word("think"))
四、学后反思
对比两种两法,获取的数据内容不尽相同,第二种获取的内容较少,第一种获取的内容更为全面,方法也简单,但是获取的内容有重复。
大家可以根据个人的需求,选择适合自己的代码。