This article evaluates translation accuracy with two tools: the BLEU metric, and the Python library fuzzywuzzy, which computes string similarity.
I. Evaluation based on the BLEU score
1. Evaluating a single passage. The code is as follows:
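from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

# Machine translation output
machine_translation = "How are you"
# List of reference translations
human_translations = ["How have you been lately?"]

# Note: raw strings are passed in, so NLTK iterates over them character by
# character and the result is a character-level BLEU score.
# Tokenize both sides (e.g. with .split()) for word-level scoring.

# Use SmoothingFunction.method1
chencherry = SmoothingFunction()

# Compute the BLEU score
bleu_score = sentence_bleu(human_translations, machine_translation, smoothing_function=chencherry.method1)
print(f"BLEU score: {bleu_score:.4f}")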
The result: the BLEU score is 0.17, or 17% when expressed as a percentage.
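To see where a score of about 0.17 comes from, the calculation can be reproduced without NLTK. The sketch below is a simplified, unsmoothed, single-reference BLEU (clipped n-gram precisions for n = 1 to 4, geometric mean, brevity penalty). The helper names `ngrams` and `simple_bleu` are hypothetical and are not part of NLTK. It assumes the inputs are either raw strings (compared character by character) or lists of tokens.

```python
import math
from collections import Counter


def ngrams(seq, n):
    """Return all contiguous n-grams of a sequence (string or token list)."""
    return [tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)]


def simple_bleu(reference, hypothesis, max_n=4):
    """Unsmoothed BLEU against a single reference (illustrative only)."""
    log_precisions = []
    for n in range(1, max_n + 1):
        hyp_counts = Counter(ngrams(hypothesis, n))
        ref_counts = Counter(ngrams(reference, n))
        total = sum(hyp_counts.values())
        # Clip each hypothesis n-gram count by its count in the reference
        clipped = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        if total == 0 or clipped == 0:
            return 0.0  # without smoothing, one empty precision zeroes the score
        log_precisions.append(math.log(clipped / total))
    # Brevity penalty: penalize hypotheses shorter than the reference
    if len(hypothesis) >= len(reference):
        bp = 1.0
    else:
        bp = math.exp(1 - len(reference) / len(hypothesis))
    return bp * math.exp(sum(log_precisions) / max_n)


ref = "How have you been lately?"
hyp = "How are you"

char_score = simple_bleu(ref, hyp)                  # strings: character n-grams
word_score = simple_bleu(ref.split(), hyp.split())  # lists: word n-grams

print(f"character-level: {char_score:.4f}")  # 0.1690
print(f"word-level:      {word_score:.4f}")  # 0.0000
```

The character-level value matches the 0.17 reported above. At word level the two sentences share no bigram, so the unsmoothed score collapses to 0, which is why a smoothing function matters for short sentences.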
2. Evaluating multiple passages. Put all the text to be evaluated into a single Excel file. The code is as follows:
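import pandas as pd
from nltk.translate.bleu_score import sentence_bleu

# Read the Excel file
file_path = r"C:\Users\Anita\Desktop\XXX\EN.xlsx"  # path to the Excel file
df = pd.read_excel(file_path, engine='openpyxl')

# Column names must match the spreadsheet headers:
# '参考语料' = reference corpus, '机器翻译结果' = machine translation output
references = df['参考语料'].tolist()
candidates = df['机器翻译结果'].tolist()

# List that stores the BLEU scores
bleu_scores = []

# Iterate over each reference/candidate pair and compute its BLEU score
for ref, cand in zip(references, candidates):
    if isinstance(ref, str) and isinstance(cand, str):
        # Tokenize on whitespace
        ref_words = ref.split()
        cand_words = cand.split()
        # Compute the BLEU score
        score = sentence_bleu([ref_words], cand_words)
        bleu_scores.append(score)
    else:
        print("Reference or candidate text is not a string")

# Print the BLEU scores
for i, score in enumerate(bleu_scores):
    print(f"BLEU score for row {i + 1}: {score:.4f}")

# Compute the average BLEU score (skipped if no row could be scored)
if bleu_scores:
    average_bleu = sum(bleu_scores) / len(bleu_scores)
    print(f"Average BLEU score: {average_bleu:.4f}")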
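Running the script prints the BLEU score of each row, followed by the average.
P.S. Whitespace splitting only works for languages that separate words with spaces. For other languages, such as Chinese, use a dedicated word-segmentation tool such as Jieba.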
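A minimal sketch of what that segmentation step might look like. `tokenize_zh` is a hypothetical helper; it assumes the `jieba` package is installed and uses `jieba.lcut`, and otherwise falls back to treating each character as a token, which is an assumption made here only so the example runs without extra dependencies.

```python
def tokenize_zh(text):
    """Segment Chinese text into tokens for BLEU scoring (illustrative)."""
    try:
        import jieba  # third-party; install with `pip install jieba`
        return jieba.lcut(text)
    except ImportError:
        # Fallback assumption: one token per non-space character
        return [ch for ch in text if not ch.isspace()]


sentence = "今天天气很好"

# str.split() finds no spaces, so the whole sentence becomes a single token
print(sentence.split())
# A segmenter (or the character fallback) yields several tokens instead
print(tokenize_zh(sentence))
```

The resulting token lists can be passed to `sentence_bleu` in place of `ref.split()` and `cand.split()`.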
II. Evaluation based on fuzzywuzzy
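from fuzzywuzzy import fuzz
import pandas as pd

# Read the Excel file
file_path = r"C:\Users\Anita\Desktop\AIStation\CN-CN.xlsx"  # path to the Excel file
df = pd.read_excel(file_path, engine='openpyxl')

# List that stores the accuracy of each row
accuracies = []

# Iterate over the first few rows of the DataFrame (adjust the count as needed)
for i in range(min(12, len(df))):  # first 12 rows, or fewer if the sheet is shorter
    value_a = df.iloc[i, 1]  # value in column B
    value_b = df.iloc[i, 2]  # value in column C
    # Compute the similarity with fuzzywuzzy; values are converted to strings first
    similarity_score = fuzz.ratio(str(value_a), str(value_b))
    # Use the similarity directly as the accuracy
    accuracy = similarity_score / 100
    accuracies.append(accuracy)
    # Print the accuracy for the current row
    # print(f"Similarity between A{i + 2} and B{i + 2}: {similarity_score}")
    print(f"Comparison accuracy of B{i + 2} vs C{i + 2}: {accuracy:.2%}")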
Running the script prints the comparison accuracy for each processed row.
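For intuition about what the similarity number means, the sketch below approximates `fuzz.ratio` with the standard library's `difflib.SequenceMatcher`, which fuzzywuzzy itself uses when the optional python-Levenshtein package is not installed (scores may differ slightly when it is). The `similarity` helper and the sample pairs are hypothetical. The sketch also averages the per-row accuracies, a step the script above collects data for but does not perform.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Return a 0-100 similarity score, roughly comparable to fuzz.ratio."""
    # ratio() = 2 * (matching characters) / (total characters in both strings)
    return int(round(100 * SequenceMatcher(None, a, b).ratio()))


# Hypothetical (machine output, reference) pairs for illustration
pairs = [
    ("How are you", "How are you"),
    ("abcd", "abef"),
]

accuracies = []
for candidate, reference in pairs:
    score = similarity(candidate, reference)
    accuracies.append(score / 100)
    print(f"{candidate!r} vs {reference!r}: {score / 100:.2%}")

# Average accuracy across all pairs (guarding against an empty list)
average = sum(accuracies) / len(accuracies) if accuracies else 0.0
print(f"Average accuracy: {average:.2%}")  # 75.00%
```

Because the score is based on shared characters rather than meaning, it is best suited to comparing near-identical strings, such as the same-language CN-CN comparison above.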