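Objective: understand the mathematical model of a discrete source and the concept of information entropy.
Task: using the contents of the attached English text file as the source, build a mathematical model whose source symbols are the 26 English letters (case-sensitive, i.e. 52 symbols in total), and output the probability of each letter and the information entropy of the model.
Requirements: use a programming language you are familiar with to model the source, and output the letter probabilities and the entropy of the source.
The implementation is written in Python and ends by drawing a bar chart of the letter probabilities.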
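- For sources with the same number of messages: the more the message probabilities differ, the smaller the source entropy and the lower the uncertainty; when the messages are equiprobable, the source entropy is at its maximum and the uncertainty is highest.
- For sources with equiprobable messages: the more messages there are, the larger the source entropy and the higher the uncertainty.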
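Both properties can be checked numerically. The following is a minimal sketch, not part of the original experiment: the helper name `entropy` and the example distributions are made up for illustration, and entropy is measured in bits (base-2 logarithm).

```python
import math


def entropy(probs):
    """Shannon entropy in bits; zero-probability terms are skipped."""
    return sum(p * math.log2(1 / p) for p in probs if p > 0)


# Same number of messages (4): a skewed distribution vs. an equiprobable one.
h_skewed4 = entropy([0.7, 0.1, 0.1, 0.1])
h_uniform4 = entropy([0.25] * 4)

# Equiprobable messages with a growing number of messages.
h_uniform2 = entropy([0.5] * 2)
h_uniform52 = entropy([1 / 52] * 52)

print("4 messages, skewed :", round(h_skewed4, 4))
print("4 messages, uniform:", round(h_uniform4, 4))
print("uniform with 2 / 4 / 52 messages:",
      round(h_uniform2, 4), round(h_uniform4, 4), round(h_uniform52, 4))
```

For an equiprobable source with n messages the entropy equals log2(n), so the 52-symbol alphabet used in this experiment can never exceed log2(52) bits per letter.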
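Source entropy has three physical interpretations:
- H(X) is the average amount of information provided by each discrete message after the source has emitted it.
- H(X) is the average uncertainty about the source before it emits a message.
- H(X) reflects the randomness of the variable X.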
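For reference, all three interpretations rest on the standard Shannon definition, which the text does not state explicitly. The symbols x_i and p(x_i) below are notation chosen here for illustration. The base-2 logarithm matches the `math.log(..., 2)` call used in the code, and terms with p(x_i) = 0 are taken to contribute zero.

```latex
I(x_i) = \log_2 \frac{1}{p(x_i)}, \qquad
H(X) = \sum_{i} p(x_i)\, I(x_i) = -\sum_{i} p(x_i) \log_2 p(x_i)
\quad \text{(bits per symbol)}
```

Here I(x_i) is the self-information of a single symbol, and H(X) is its average over the source distribution.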
The entropy is computed as follows:
for i in dict3:
    if dict3[i] != 0:
        sum1 += dict3[i] * (math.log(1 / (dict3[i]), 2))
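The same calculation can be wrapped in a function that works directly from letter counts. This is a sketch under assumptions, not the code used in the experiment: the name `letter_entropy` is hypothetical, only ASCII letters are counted, and the probabilities are kept at full precision instead of being rounded.

```python
import math
from collections import Counter


def letter_entropy(text):
    """Return (probabilities, entropy in bits) for the ASCII letters in text."""
    # Count each letter once; upper and lower case are distinct symbols.
    counts = Counter(ch for ch in text if ch.isascii() and ch.isalpha())
    total = sum(counts.values())
    if total == 0:
        return {}, 0.0  # no letters, so there is nothing to measure
    probs = {ch: n / total for ch, n in counts.items()}
    # Letters that never occur are absent from probs, so no zero check is needed.
    h = -sum(p * math.log2(p) for p in probs.values())
    return probs, h


probs, h = letter_entropy("AAbb")
print(probs, h)
```

Counting with `Counter` makes a single pass over the text, whereas calling `s.count(x)` for every character rescans the whole string each time.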
The complete code is shown below:
import string
import matplotlib.pyplot as plt
import math


def draw_from_dict(dicdata, RANGE, heng=0):
    # dicdata: dictionary mapping each letter to its probability.
    # RANGE: number of dictionary entries to display.
    # heng=0 draws vertical bars; heng=1 draws horizontal bars. Since text
    # runs left to right, horizontal bars can make the axis easier to read.
    by_key = sorted(dicdata.items(), key=lambda item: item[0], reverse=False)
    x = [d[0] for d in by_key][0:RANGE]
    y = [d[1] for d in by_key][0:RANGE]
    plt.xlabel("Sequential letters")
    plt.ylabel("Probability of occurrence of each letter")
    plt.title("Character probability statistics")
    if heng == 0:
        plt.bar(x, y)
        # Label every non-empty bar with its probability.
        for xx, yy in zip(x, y):
            if yy != 0:
                plt.text(xx, yy, '%.3f' % yy, ha='center', va='bottom', fontsize=5)
    elif heng == 1:
        plt.barh(x, y)
        # For horizontal bars the coordinates of the label are swapped.
        for xx, yy in zip(x, y):
            if yy != 0:
                plt.text(yy, xx, '%.3f' % yy, ha='left', va='center', fontsize=5)
    else:
        return "heng must be 0 or 1!"
    plt.show()


def countLetters(text):
    # Count the alphabetic characters in text (the parameter is no longer
    # named `string`, which shadowed the imported module).
    s_count = 0
    for i in text:
        if i.isalpha():
            s_count += 1
    print('Number of letters:', s_count)
    return s_count


s = 'Love is a set of emotions and behaviors characterized by intimacy, passion, and commitment. It involves care, ' \
    'closeness, protectiveness, attraction, affection, and trust. Love can vary in intensity and can change over ' \
    'time. It is associated with a range of positive emotions, including happiness, excitement, life satisfaction, ' \
    'and euphoria, but it can also result in negative emotions such as jealousy and stress. '
letterSum = countLetters(s)
print(letterSum)
asciiAll = string.ascii_lowercase + string.ascii_uppercase
# Every one of the 52 symbols starts at probability 0.
dict3 = {key: 0.0 for key in asciiAll}
# Visit each distinct character once instead of once per occurrence.
for x in set(s):
    if x.isalpha():
        dict3[x] = round(s.count(x) / letterSum, 6)
for i in sorted(dict3):
    print((i, dict3[i]), end="\n")
sum1 = 0
for i in dict3:
    if dict3[i] != 0:
        sum1 += dict3[i] * (math.log(1 / (dict3[i]), 2))
print('Computed entropy:', round(sum1, 4))
draw_from_dict(dict3, 52, 0)
The chart is rendered with the matplotlib package; the resulting bar chart is shown below.