Table of Contents
- Overview
- Test Results
- Complete Code
Overview
First, refactor the code from the previous two articles, extracting their shared functions into utils.py.
Structure after refactoring:
- ComputationGraphLinearNet.py: the linear model that computes gradients with a computation graph (forward, backward); code unchanged. Code: https://itsven.blog.csdn.net/article/details/141168288
- NumericalGradientLinearNet.py: the linear model that computes gradients with numerical differentiation; code unchanged. Code: https://itsven.blog.csdn.net/article/details/141168156
- utils.py: builds the test data.
- main.py: ties the pieces above together and runs the comparison.
Test Results
Test premise:
No attempt is made to tune epoch, batch_size, or learning_rate; with these parameters held identical, we only compare the runtime and the results of the linear models built with numerical-differentiation gradients versus computation-graph gradients.
Parameter error:
Computed as the mean absolute error between parameters of the same name in the two models; across the runs, the error is very small.

```python
for key in graphNet.params.keys():
    diff = np.average(np.abs(graphNet.params[key] - numericalNet.params[key]))
    params[key] = diff
```
Time difference:
With epoch, batch_size, and learning_rate held identical, each of the 5 runs uses freshly generated data (5000 samples). The computation-graph method is clearly faster (times in milliseconds).
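The gap is expected: numerical differentiation re-evaluates the loss for every parameter element, while one forward/backward pass over the computation graph yields all gradients at once. As a rough sketch of what NumericalGradientLinearNet presumably does internally (an assumption for illustration; the actual code is in the linked article), central-difference gradient estimation looks like this:

```python
import numpy as np

def numerical_gradient(f, x, h=1e-4):
    """Central-difference estimate of df/dx, element by element.

    Every element of x costs two evaluations of f, so a model with P
    parameter elements needs ~2*P full loss computations per gradient
    step -- which is why this method is so much slower than backprop.
    """
    grad = np.zeros_like(x)
    it = np.nditer(x, flags=['multi_index'], op_flags=['readwrite'])
    while not it.finished:
        idx = it.multi_index
        orig = x[idx]
        x[idx] = orig + h
        fxh1 = f(x)        # f(x + h)
        x[idx] = orig - h
        fxh2 = f(x)        # f(x - h)
        grad[idx] = (fxh1 - fxh2) / (2 * h)
        x[idx] = orig      # restore the original value
        it.iternext()
    return grad
```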
Complete Code
For ComputationGraphLinearNet.py and NumericalGradientLinearNet.py, see:
Linear model built with computation-graph (forward, backward) gradients: https://itsven.blog.csdn.net/article/details/141168288
Linear model built with numerical-differentiation gradients: https://itsven.blog.csdn.net/article/details/141168156
utils.py

```python
import numpy as np


def build_data(weights, bias, num_examples):
    x = np.random.randn(num_examples, len(weights))
    y = x.dot(weights) + bias
    # add noise to y (a single random constant applied to every sample)
    y += np.random.rand(1)
    return x, y


def data_iter(features, labels, batch_size):
    num_examples = len(features)
    # build one index per sample
    indices = list(range(num_examples))
    # shuffle the index array
    np.random.shuffle(indices)
    for i in range(0, num_examples, batch_size):
        batch_indices = np.array(indices[i:min(i + batch_size, num_examples)])
        yield features[batch_indices], labels[batch_indices]
```
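For reference, a quick usage sketch of these helpers (the sizes here are illustrative, not the values used in the test):

```python
import numpy as np
import utils

true_w = np.random.rand(2, 1)   # illustrative ground-truth weights
true_b = np.random.rand(1)      # illustrative ground-truth bias
features, labels = utils.build_data(true_w, true_b, 200)

for x_batch, y_batch in utils.data_iter(features, labels, batch_size=50):
    print(x_batch.shape, y_batch.shape)   # (50, 2) (50, 1)
```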
main.py
```python
from ComputationGraphLinearNet import Network as GraphNet
from NumericalGradientLinearNet import Network as NumericalNet
import utils
import numpy as np
import matplotlib.pyplot as plt
import time


def numerical_test(x_train, y_train, batch_size, num_epochs):
    numericalNet = NumericalNet(2, 1, 0.01)
    loss_history = list()
    for i in range(num_epochs):
        for x, y in utils.data_iter(x_train, y_train, batch_size):
            grads = numericalNet.numerical_gradient(x, y)
            for key in grads:
                # learning_rate is read from the module-level name set in __main__
                numericalNet.params[key] -= learning_rate * grads[key]
            running_loss = numericalNet.loss(x, y)
            loss_history.append(running_loss)
    print(f'Final loss: {loss_history[-1]}')
    print(f'Estimated parameters: w1={numericalNet.params["w1"]}, b1={numericalNet.params["b1"]}')
    return numericalNet


def graph_test(x_train, y_train, batch_size, num_epochs):
    graphNet = GraphNet(2, 1, 0.01)
    loss_history = list()
    for i in range(num_epochs):
        for x_batch, y_batch in utils.data_iter(x_train, y_train, batch_size):
            grads = graphNet.gradient(x_batch, y_batch)
            for key in grads:
                graphNet.params[key] -= learning_rate * grads[key]
            running_loss = graphNet.loss(x_batch, y_batch)
            loss_history.append(running_loss)
    print(f'Final loss: {loss_history[-1]}')
    print(f'Estimated parameters: w1={graphNet.params["w1"]}, b1={graphNet.params["b1"]}')
    return graphNet


if __name__ == '__main__':
    test_num = 5
    num_epochs = 2
    batch_size = 50
    learning_rate = 0.01
    graph_time = list()
    numerical_time = list()
    error_list = list()
    for i in range(test_num):
        true_w1 = np.random.rand(2, 1)
        true_b1 = np.random.rand(1)
        x_train, y_train = utils.build_data(true_w1, true_b1, 5000)
        print(f'\n----------------------------Run {i + 1}')
        print(f'Run {i + 1}, true parameters: true_w1={true_w1}, true_b1={true_b1}\n')

        print("------------Numerical differentiation:")
        start = time.perf_counter()
        numericalNet = numerical_test(x_train, y_train, batch_size, num_epochs)
        end = time.perf_counter()
        print(f"Elapsed time: {(end - start) * 1000} ms")
        numerical_time.append((end - start) * 1000)

        print("------------Computation graph:")
        start = time.perf_counter()
        graphNet = graph_test(x_train, y_train, batch_size, num_epochs)
        end = time.perf_counter()
        print(f"Elapsed time: {(end - start) * 1000} ms")
        graph_time.append((end - start) * 1000)

        # mean absolute error between same-named parameters of the two models
        params = {}
        for key in graphNet.params.keys():
            diff = np.average(np.abs(graphNet.params[key] - numericalNet.params[key]))
            params[key] = diff
        error_list.append(params)

    plt.title("Speed difference: numerical differentiation vs computation graph")
    plt.xlabel("run")
    plt.ylabel("time (ms)")
    plt.xticks(range(0, test_num))
    plt.plot(graph_time, linestyle='dotted', marker='o', label='graph_time')
    plt.plot(numerical_time, linestyle='dotted', marker='*', label='numerical_time')
    plt.legend(loc='upper right')
    plt.show()

    for i in range(len(error_list)):
        print(f'Run {i + 1}: mean absolute error per parameter: {error_list[i]}')
```
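main.py only touches a small surface of the two Network classes: a params dict, a loss(x, y) method, and a gradient method (gradient on the graph version, numerical_gradient on the numerical one). The sketch below is a hypothetical minimal linear model satisfying that interface, assuming a mean-squared-error loss and an (input_size, output_size, weight_init_std) constructor; the real implementations are in the articles linked above.

```python
import numpy as np

class LinearNetSketch:
    """Hypothetical stand-in for the Network classes used by main.py."""

    def __init__(self, input_size, output_size, weight_init_std=0.01):
        self.params = {
            'w1': weight_init_std * np.random.randn(input_size, output_size),
            'b1': np.zeros(output_size),
        }

    def predict(self, x):
        return x.dot(self.params['w1']) + self.params['b1']

    def loss(self, x, y):
        # assumed loss: mean squared error between prediction and target
        return 0.5 * np.mean((self.predict(x) - y) ** 2)

    def gradient(self, x, y):
        # analytic MSE gradients for a single linear layer -- the result
        # one forward/backward pass over the computation graph produces
        dout = (self.predict(x) - y) / y.size
        return {'w1': x.T.dot(dout), 'b1': np.sum(dout, axis=0)}
```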