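🎯 Key Points
- Optimizing loss functions and evaluation metrics
- Evaluating coastline detection algorithms
- Remote-sensing visual representation and text augmentation
- Breast cancer prediction model algorithms
- Separating scintillation light from Cherenkov light in liquids
- Evaluating performance on multi-label classification tasks
- Binary classification evaluation for directed acyclic graphs, multi-path labeling, and non-mandatory leaf node prediction
- Evaluating the trustworthiness of feature attributions
- Matthews correlation coefficient compared with other accuracy measures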
Python Sankey Diagram of a Confusion Matrix
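A Sankey diagram is a data visualization technique, a type of flow diagram, that emphasizes flow, movement, or change from one state to another or from one time to another. The width of each arrow is proportional to the flow rate of the extensive property being depicted. Sankey diagrams can also visualize energy accounts, material flow accounts at the regional or national level, and cost breakdowns, and they are commonly used to visualize material flow analysis. They emphasize the major transfers or flows within a system and help identify the most important contributions to a flow. They usually show conserved quantities within defined system boundaries.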
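To make the idea of conserved flows concrete for a confusion matrix, the sketch below turns each matrix cell into one link from a "True" node to a "Predicted" node and checks that the total leaving the true classes equals the total arriving at the predicted classes. The helper name `confusion_matrix_to_links` and the example labels are illustrative assumptions, not part of any library.

```python
# Rows are true classes, columns are predicted classes (illustrative values).
confusion_matrix = [[10, 4],
                    [2, 20]]
class_labels = ["Fraud", "Legit"]  # hypothetical labels for the example


def confusion_matrix_to_links(matrix, labels):
    """Return one (source, target, value) link per confusion-matrix cell."""
    links = []
    for i, true_label in enumerate(labels):
        for j, predicted_label in enumerate(labels):
            links.append((f"True {true_label}",
                          f"Predicted {predicted_label}",
                          matrix[i][j]))
    return links


links = confusion_matrix_to_links(confusion_matrix, class_labels)

# Outflow of a "True" node is its row sum; inflow of a "Predicted" node
# is its column sum.
outflow, inflow = {}, {}
for source, target, value in links:
    outflow[source] = outflow.get(source, 0) + value
    inflow[target] = inflow.get(target, 0) + value

print(outflow)  # instances per true class
print(inflow)   # instances per predicted class

# The quantity is conserved: every instance leaves one node and enters another.
assert sum(outflow.values()) == sum(inflow.values())
```

Each link's width in the diagram corresponds to one cell count, so the diagonal cells become the "correct" flows and the off-diagonal cells become the "misclassified" flows.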
Python Sankey Diagram and Confusion Matrix
import pandas as pd
import numpy as np
from plotly import graph_objects as go
RED = "rgba(245,173,168,0.6)"
GREEN = "rgba(211,255,216,0.6)"
def create_df_from_confusion_matrix(confusion_matrix, class_labels=None):
    """Convert a confusion matrix into a long-format link table for a Sankey diagram."""
    # Fall back to generic class names when no labels are supplied.
    if class_labels is None or len(class_labels) == 0:
        class_labels = [f"Class-{i+1}" for i in range(confusion_matrix.shape[0])]
    df = pd.DataFrame(
        data=confusion_matrix,
        index=[f"True {label}" for label in class_labels],
        columns=[f"Predicted {label}" for label in class_labels],
    )

    # One row per (actual, predicted) pair with its instance count.
    df = df.stack().reset_index()
    df.rename(columns={0: "instances", "level_0": "actual", "level_1": "predicted"}, inplace=True)

    # Green links are correct predictions, red links are misclassifications.
    df["colour"] = df.apply(
        lambda x: GREEN if x.actual.split()[1:] == x.predicted.split()[1:] else RED,
        axis=1,
    )

    # Replace node names with integer indices into node_labels.
    node_labels = pd.concat([df.actual, df.predicted]).unique()
    node_labels_indices = {label: index for index, label in enumerate(node_labels)}
    df = df.assign(
        actual=df.actual.apply(lambda x: node_labels_indices[x]),
        predicted=df.predicted.apply(lambda x: node_labels_indices[x]),
    )

    def get_link_text(row):
        """Build the hover text for a single link."""
        instance_count = row["instances"]
        source_class = " ".join(node_labels[row["actual"]].split()[1:])
        target_class = " ".join(node_labels[row["predicted"]].split()[1:])
        outcome = "correctly" if row["colour"] == GREEN else "incorrectly"
        return f"{instance_count} {source_class} instances {outcome} classified as {target_class}"

    df["link_text"] = df.apply(get_link_text, axis=1)
    return df, node_labels
Plotting the Sankey diagram from the confusion matrix and class labels
def plot_confusion_matrix_as_sankey(confusion_matrix, class_labels=None):
    """Return a plotly Figure showing the confusion matrix as a Sankey diagram."""
    df, labels = create_df_from_confusion_matrix(confusion_matrix, class_labels)
    fig = go.Figure(data=[go.Sankey(
        node=dict(
            pad=20,
            thickness=20,
            line=dict(color="gray", width=1.0),
            label=labels,
            hovertemplate="%{label} has total %{value:d} instances<extra></extra>",
        ),
        link=dict(
            source=df.actual,
            target=df.predicted,
            value=df.instances,
            color=df.colour,
            customdata=df["link_text"],
            hovertemplate="%{customdata}<extra></extra>",
        ),
    )])
    fig.update_layout(
        title_text="Confusion Matrix Sankey Diagram",
        font_size=15,
        width=500,
        height=400,
    )
    return fig
confusion_matrix = np.array([[10, 4], [2, 20]])
fig = plot_confusion_matrix_as_sankey(confusion_matrix, ['Fraud', 'Legit'])
fig.show()  # in a notebook, evaluating `fig` alone also renders the diagram
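Since the key points mention comparing the Matthews correlation coefficient (MCC) with other accuracy measures, the following minimal sketch computes both plain accuracy and MCC for the same 2x2 matrix. It assumes the first class ("Fraud") is treated as the positive class and that rows are true classes; the function name `accuracy_and_mcc` is hypothetical.

```python
import math


def accuracy_and_mcc(matrix):
    """Compute accuracy and MCC from a 2x2 confusion matrix.

    Assumes rows are true classes, columns are predicted classes,
    and the first class is the positive class.
    """
    (tp, fn), (fp, tn) = matrix
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is set to 0 when any marginal sum is zero.
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return accuracy, mcc


accuracy, mcc = accuracy_and_mcc([[10, 4], [2, 20]])
print(f"accuracy = {accuracy:.3f}, MCC = {mcc:.3f}")
```

For this matrix, accuracy is 30/36 (roughly 0.83) while MCC is roughly 0.64. MCC comes out lower because it uses all four cells of the matrix, so the four missed "Fraud" cases out of only fourteen weigh on it more than they do on plain accuracy.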