Python Data Analysis Case 62: Time Series Forecasting with MAGU-LSTM (Memory Augmentation Gating Unit)

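Background

LSTM-style time series forecasting has been done to death in academic papers. There is now code for a new MAGU-LSTM layer that performs reasonably well. It is rarely seen and, in my view, fairly novel, so I'll share the code and walk through a demo, combining it with mode decomposition and other methods for a broad comparison of deep learning time series models. This time I'm even bringing out my signature technique: the quantile neural network.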

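The Memory Augmentation Gating Unit (MAGU) is a neural network layer intended to strengthen the memory of a long short-term memory (LSTM) network. It adds a gating mechanism on top of the standard LSTM so that input information and historical memory are combined more effectively. Concretely, MAGU multiplies the input by an input weight matrix and the memory state by a memory weight matrix, adds the two products together with a bias term, and passes the result through a sigmoid activation to obtain the gate values. The gate then dynamically sets the mix between memory and input: the output is the gate times the memory plus one minus the gate times the input, so the augmented output can make better use of historical information. The design aims to retain memory better over long-range dependencies and thereby improve time series forecasting. In experiments on several benchmark datasets, LSTM models with MAGU are reported to improve markedly in both prediction accuracy and convergence speed, which suggests the mechanism has potential for complex time series data.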
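To make the gating description above concrete, here is a minimal NumPy sketch of a single forward pass. It mirrors the formula described in the text (a sigmoid gate that blends memory and input) but is not the Keras layer itself; the function name `magu_step`, the toy shapes, and the random weights are assumptions for illustration only.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def magu_step(inputs, memory, w_input, w_memory, b):
    """inputs: (batch, time, units); memory: (batch, units)."""
    # Repeat the memory vector along the time axis so it lines up with inputs
    memory_expanded = np.repeat(memory[:, None, :], inputs.shape[1], axis=1)
    # Gate = sigmoid(inputs @ W_in + memory @ W_mem + b), values in (0, 1)
    gate = sigmoid(inputs @ w_input + memory_expanded @ w_memory + b)
    # Convex blend: gate weights the memory, (1 - gate) weights the input
    return gate * memory_expanded + (1.0 - gate) * inputs

# Toy data (hypothetical sizes, not taken from the article)
rng = np.random.default_rng(0)
batch, time_steps, units = 2, 5, 4
x = rng.normal(size=(batch, time_steps, units))
m = rng.normal(size=(batch, units))
w_in = rng.normal(scale=0.05, size=(units, units))
w_mem = rng.normal(scale=0.05, size=(units, units))
bias = np.zeros(units)

out = magu_step(x, m, w_in, w_mem, bias)
print(out.shape)  # (2, 5, 4)
```

Because the gate lies strictly between 0 and 1, every output element falls between the corresponding input and memory values. Note also that the blend only works when the input's last dimension equals the number of units.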

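Data

I've already covered all kinds of time series in earlier posts:

Python Data Analysis Case 24: Lithium Battery Life Prediction with Deep Learning

Python Data Analysis Case 25: Offshore Wind Power Forecasting (Multivariate Recurrent Neural Networks)

Python Data Analysis Case 41: CSI 300 Closing Price Forecasting with CNN-BiLSTM

Python Data Analysis Case 42: Time Series Forecasting with Attention-BiGRU

Python Data Analysis Case 44: Electric Load Forecasting with Mode Decomposition and Deep Learning (VMD + BiGRU + Attention)

Python Data Analysis Case 50: Oil Price Forecasting with EEMD-LSTM

Python Data Analysis Case 52: Wind Speed Forecasting with SSA-LSTM (Sparrow Search Optimization)

Air quality, sunspots, blood glucose, traffic flow, rainfall, lithium battery life, wind speed, offshore wind power... it has all been done to death.

This time I'll keep it simple and use a stock index, the CSI 300, for time series forecasting.

This codebase is highly encapsulated, easy to use, and generic enough to reuse across papers. I've noticed that someone has already used it to publish a paper. The code is open source, and when I read that paper my reaction was: isn't this exactly the code I wrote? No matter, as long as it helps people; that is what open source is for. I just hope nobody quietly resells it to take advantage of others.

Readers who need the full code files and data for this case can refer to: Memory Augmentation Gating Unit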

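Code Implementation

I've published this codebase many times already and it is highly encapsulated; each time I only change a few small modules. So I won't explain everything in detail in this case. See my earlier posts to understand what each module does.

First, import the packages. The deep learning framework is Keras.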

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt 
import seaborn as sns
import os
import time
from datetime import datetime
import random as rn
import scipy.special as sc_special
plt.rcParams ['font.sans-serif'] ='SimHei'               # display Chinese characters
plt.rcParams ['axes.unicode_minus']=False               # display minus signs

from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_absolute_error
from sklearn.metrics import mean_squared_error,r2_score

import tensorflow as tf
import keras
import keras.backend as K
from keras.models import Model, Sequential
from keras.layers import Dense,Input, Dropout, Flatten,MaxPooling1D,Conv1D,SimpleRNN,LSTM,GRU,GlobalMaxPooling1D,Layer
from keras.layers import BatchNormalization,GlobalAveragePooling1D,MultiHeadAttention,AveragePooling1D,Bidirectional,LayerNormalization
from keras.callbacks import EarlyStopping

Next, read the CSI 300 index data.

data0=pd.read_excel('指数数据.xlsx',parse_dates=['trade_date'], usecols=['trade_date','close','open','high','low']).set_index('trade_date')[['open','high','low','close']].sort_index()
data0.head()

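The data runs from July 2014 to July 2024. Ten years of daily data is a reasonably large sample.

Then define the functions that split the data into training and test sets and build the time series samples.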

def build_sequences(text, window_size=24):
    # text: list of capacity
    x, y = [], []
    for i in range(len(text) - window_size):
        sequence = text[i:i+window_size]
        target = text[i+window_size]
        x.append(sequence)
        y.append(target)
    return np.array(x), np.array(y)

def get_traintest(data, train_ratio=0.8, window_size=24):
    train_size = int(len(data)*train_ratio)
    train = data[:train_size]
    # Start the test slice window_size rows early so the first test target is data[train_size]
    test = data[train_size-window_size:]
    X_train, y_train = build_sequences(train, window_size=window_size)
    X_test, y_test = build_sequences(test, window_size=window_size)
    # The target is the last column (close)
    return X_train, y_train[:,-1], X_test, y_test[:,-1]
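As a quick illustration of what the windowing does, here is a small sketch on a toy array. It repeats `build_sequences` so that it runs on its own; the toy series and the sizes are made up for the example.

```python
import numpy as np

def build_sequences(text, window_size=24):
    x, y = [], []
    for i in range(len(text) - window_size):
        x.append(text[i:i + window_size])   # window of past values
        y.append(text[i + window_size])     # the value right after the window
    return np.array(x), np.array(y)

toy = np.arange(10).reshape(-1, 1)          # 10 time steps, 1 feature
X_toy, y_toy = build_sequences(toy, window_size=3)
print(X_toy.shape, y_toy.shape)             # (7, 3, 1) (7, 1)
print(X_toy[0].ravel(), y_toy[0])           # [0 1 2] [3]

# The test slice starts window_size rows before train_size, so the first
# test target is exactly the first value after the training data.
train_size = 8
X_te, y_te = build_sequences(toy[train_size - 3:], window_size=3)
print(y_te.ravel())                         # [8 9]
```

A series of length n with window w yields n - w samples, which is why the test slice is extended backwards by one window.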

Scale the data to the [0, 1] range with min-max scaling.

data=data0.to_numpy()
scaler = MinMaxScaler() 
scaler = scaler.fit(data[:,:-1])
X=scaler.transform(data[:,:-1])
y_scaler = MinMaxScaler()
y_scaler = y_scaler.fit(data[:,-1].reshape(-1,1))
y=y_scaler.transform(data[:,-1].reshape(-1,1))
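The code above fits the scalers on the whole series, test period included. If you would rather keep test-period information out of the scaling step, one possible variation is to fit the minimum and maximum on the training rows only. The sketch below uses plain NumPy; the helper names `fit_minmax` and `apply_minmax` and the toy numbers are hypothetical and not part of the original code.

```python
import numpy as np

def fit_minmax(train):
    # Column-wise min and max computed from the training rows only
    return train.min(axis=0), train.max(axis=0)

def apply_minmax(values, lo, hi):
    # Same formula as min-max scaling: (x - min) / (max - min)
    return (values - lo) / (hi - lo)

series = np.array([[1.0], [2.0], [3.0], [4.0], [10.0]])
train_size = 4                                  # first 4 rows are "training"
lo, hi = fit_minmax(series[:train_size])
scaled = apply_minmax(series, lo, hi)
print(scaled.ravel())
```

With this variation the test values can fall outside [0, 1] (the last toy value does), which is expected when the test period reaches levels the training period never saw.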

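Split into training and test sets.

train_ratio=0.8     # fraction of data used for training
window_size=64      # sliding window size, i.e. the number of time steps fed to the recurrent network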
X_train,y_train,X_test,y_test=get_traintest(np.c_[X,y],window_size=window_size,train_ratio=train_ratio)
print(X_train.shape,y_train.shape,X_test.shape,y_test.shape)

Plot the split.

y_test1 = y_scaler.inverse_transform(y_test.reshape(-1,1))
test_size=int(len(data)*(1-train_ratio))
plt.figure(figsize=(10,5),dpi=256)
plt.plot(data0.index[:-test_size],data0.iloc[:,-1].iloc[:-test_size],label='train',color='#FA9905')
plt.plot(data0.index[-test_size:],data0.iloc[:,-1].iloc[-(test_size):],label='test',color='#FB8498',linestyle='dashed')
plt.legend()
plt.ylabel('CSI 300',fontsize=14)
plt.xlabel('date',fontsize=14)
plt.show()

Check the date ranges of the training and test sets.

print(f'Training set start: {data0.index[:-test_size][0]}, end: {data0.index[:-test_size][-1]}')
print(f'Test set start: {data0.index[-test_size:][0]}, end: {data0.index[-test_size:][-1]}')

Define a function that sets the random seeds, and a function that computes the usual regression metrics.

def set_my_seed():
    os.environ['PYTHONHASHSEED'] = '0'
    np.random.seed(1)
    rn.seed(12345)
    tf.random.set_seed(123)

def evaluation(y_test, y_predict):
    mae = mean_absolute_error(y_test, y_predict)
    mse = mean_squared_error(y_test, y_predict)
    rmse = np.sqrt(mse)
    mape = (abs(y_predict - y_test) / y_test).mean()
    r_2 = r2_score(y_test, y_predict)
    return rmse, mae, mape, r_2

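Let's build the simplest possible forecasting model. We all know the moving average (MA) model, for example using the mean of the past 5 days as the next prediction. Taking this to the extreme, we can use the "mean" of the past 1 day, that is, use yesterday's value as today's forecast, and see how well such a trivial model does.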

### Baseline forecasts
result = pd.DataFrame()
result['t'] = data0['close']
# Build columns t+1 to t+5; column t+i holds the close shifted by i rows (the value i days earlier)
for i in range(1, 6):
    result[f't+{i}'] = data0['close'].shift(i)
result = result.dropna()
for t in result.columns[1:]:
    score = list(evaluation(result['t'], result[t]))
    s = [round(i,3) for i in score]
    print(f'{t} forecast: RMSE:{s[0]},MAE:{s[1]},MAPE:{s[2]},R2:{s[3]}')

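As you can see, forecasting with just the previous day's value already reaches an R² of 99.3% with an MAE of 36, which is really small.

Remember the metric values of this trivial model, because you'll soon find that many neural networks can't beat it.

Now for the star of this case: the definition of the MAGU-LSTM network.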
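For readers who want to see the naive baseline without pandas, here is a self-contained sketch of the same idea: forecast each value with the value `lag` steps earlier and score the result. The function name `naive_metrics` and the five toy prices are made up for illustration; they are not the CSI 300 data.

```python
import numpy as np

def naive_metrics(series, lag=1):
    actual = series[lag:]          # values to be predicted
    predicted = series[:-lag]      # value from `lag` steps earlier
    err = predicted - actual
    rmse = np.sqrt(np.mean(err ** 2))
    mae = np.mean(np.abs(err))
    mape = np.mean(np.abs(err) / np.abs(actual))
    ss_res = np.sum(err ** 2)
    ss_tot = np.sum((actual - actual.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    return rmse, mae, mape, r2

prices = np.array([100.0, 102.0, 101.0, 105.0, 104.0])   # toy series
rmse, mae, mape, r2 = naive_metrics(prices, lag=1)
print(mae)   # 2.0
```

On a long, slowly drifting price series the day-to-day changes are small relative to the overall variance, which is why this baseline scores such a high R² and is hard to beat.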

class MemoryAugmentationGatingUnit(Layer):
    def __init__(self, units):
        super(MemoryAugmentationGatingUnit, self).__init__()
        self.units = units

    def build(self, input_shape):
        self.w_input = self.add_weight(shape=(input_shape[-1], self.units),
                                       initializer='random_normal', trainable=True)
        self.w_memory = self.add_weight(shape=(self.units, self.units),
                                        initializer='random_normal', trainable=True)
        self.b = self.add_weight(shape=(self.units,), initializer='zeros', trainable=True)

    def call(self, inputs, memory):
        if inputs is None or memory is None:
            raise ValueError("inputs and memory must not be None")
        # Expand memory so its shape matches inputs
        memory_expanded = tf.expand_dims(memory, axis=1)
        memory_expanded = tf.tile(memory_expanded, [1, tf.shape(inputs)[1], 1])
        # Compute the gate values
        gate = tf.sigmoid(tf.matmul(inputs, self.w_input) + tf.matmul(memory_expanded, self.w_memory) + self.b)
        # Augment the memory
        augmented_memory = gate * memory_expanded + (1 - gate) * inputs
        return augmented_memory

# Custom LSTM model
class CustomLSTMModel(Model):
    def __init__(self, lstm_units, mag_units, input_shape):
        super(CustomLSTMModel, self).__init__()
        self.lstm = LSTM(lstm_units, return_sequences=True, return_state=True)
        self.magu = MemoryAugmentationGatingUnit(mag_units)
        self.dropout = Dropout(0.2)
        self.dense = Dense(1)
        # Input layer
        self.build((None,) + input_shape)

    def call(self, inputs):
        lstm_output, h_state, c_state = self.lstm(inputs)
        # Use the final hidden state as the memory
        memory = h_state
        # Pass the LSTM output through the memory augmentation gating unit
        enhanced_output = self.magu(lstm_output, memory)
        dropout_output = self.dropout(enhanced_output)
        output = self.dense(dropout_output)
        return output

Next, define a custom attention layer.

class AttentionLayer(Layer):    # custom attention layer
    def __init__(self, **kwargs):
        super(AttentionLayer, self).__init__(**kwargs)

    def build(self, input_shape):
        self.W = self.add_weight(name='attention_weight',
                                 shape=(input_shape[-1], input_shape[-1]),
                                 initializer='random_normal', trainable=True)
        self.b = self.add_weight(name='attention_bias',
                                 shape=(input_shape[1], input_shape[-1]),
                                 initializer='zeros', trainable=True)
        super(AttentionLayer, self).build(input_shape)

    def call(self, x):
        # Applying a simpler attention mechanism
        e = K.tanh(K.dot(x, self.W) + self.b)
        a = K.softmax(e, axis=1)
        output = x * a
        return output

    def compute_output_shape(self, input_shape):
        return input_shape
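The attention layer above scores each time step with a tanh projection, normalizes the scores with a softmax along the time axis, and rescales the input elementwise. The NumPy sketch below reproduces that arithmetic on random data so the shapes are easy to check; the function names and toy sizes are assumptions for illustration.

```python
import numpy as np

def softmax(z, axis):
    z = z - z.max(axis=axis, keepdims=True)    # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def simple_attention(x, W, b):
    """x: (batch, time, features); W: (features, features); b: (time, features)."""
    e = np.tanh(x @ W + b)          # score for every time step and feature
    a = softmax(e, axis=1)          # normalize across the time axis
    return x * a, a                 # reweighted sequence, same shape as x

rng = np.random.default_rng(1)
x = rng.normal(size=(2, 6, 3))                  # toy batch: 2 samples, 6 steps, 3 features
W = rng.normal(scale=0.05, size=(3, 3))
b = np.zeros((6, 3))
out, weights = simple_attention(x, W, b)
print(out.shape)                                # (2, 6, 3)
print(weights.sum(axis=1))                      # each entry is 1 (up to rounding)
```

Since the output keeps the full sequence rather than collapsing it, another recurrent layer can follow it, which is how the model builder below uses it.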

Then define one function that builds all of the models.

def build_model(X_train, mode='LSTM', hidden_dim=[64,32]):
    set_my_seed()
    if mode=='MLP':
        model = Sequential()
        model.add(Flatten())
        model.add(Dense(hidden_dim[0],activation='relu',input_shape=(X_train.shape[-2],X_train.shape[-1])))
        model.add(Dense(hidden_dim[1],activation='relu'))
        #model.add(Dense(16,activation='relu'))
        model.add(Dense(1))
    elif mode=='RNN':
        model = Sequential()
        model.add(SimpleRNN(hidden_dim[0],return_sequences=True, input_shape=(X_train.shape[-2],X_train.shape[-1])))
        model.add(Dropout(0.2))
        model.add(SimpleRNN(hidden_dim[1]))
        model.add(Dropout(0.2))
        model.add(Dense(1))
#     elif mode=='CNN':
#         model = Sequential()
#         model.add(Conv1D(hidden_dim[0],8,activation='relu',input_shape=(X_train.shape[-2],X_train.shape[-1])))
#         model.add(GlobalMaxPooling1D())
#         model.add(Dense(hidden_dim[1],activation='relu'))
#         model.add(Dense(1))
    elif mode=='LSTM':
        model = Sequential()
        model.add(LSTM(hidden_dim[0],return_sequences=True, input_shape=(X_train.shape[-2],X_train.shape[-1])))
        model.add(Dropout(0.2))
        model.add(LSTM(hidden_dim[1]))
        model.add(Dropout(0.2))
        model.add(Dense(1))
    elif mode=='GRU':
        model = Sequential()
        model.add(GRU(hidden_dim[0],return_sequences=True, input_shape=(X_train.shape[-2],X_train.shape[-1])))
        model.add(Dropout(0.2))
        model.add(GRU(hidden_dim[1]))
        model.add(Dropout(0.2))
        model.add(Dense(1))
    elif mode=='BiLSTM':
        model = Sequential()
        model.add(Bidirectional(LSTM(hidden_dim[0],return_sequences=True, input_shape=(X_train.shape[-2],X_train.shape[-1]))))
        model.add(Dropout(0.2))
        model.add(Bidirectional(LSTM(hidden_dim[1])))
        model.add(Dropout(0.2))
        model.add(Dense(1))
    elif mode=='BiGRU':
        model = Sequential()
        model.add(Bidirectional(GRU(hidden_dim[0],return_sequences=True, input_shape=(X_train.shape[-2],X_train.shape[-1]))))
        model.add(Dropout(0.2))
        model.add(Bidirectional(GRU(hidden_dim[1])))
        model.add(Dropout(0.2))
        model.add(Dense(1))
    elif mode == 'CustomLSTM':
        input_shape = (X_train.shape[-2], X_train.shape[-1])
        model = Sequential()
        model.add(Input(shape=input_shape))
        custom_lstm = CustomLSTMModel(lstm_units=hidden_dim[0], mag_units=hidden_dim[0], input_shape=input_shape)
        model.add(custom_lstm)
        model.add(LSTM(hidden_dim[1]))
        model.add(Dense(1))
    elif mode == 'BiGRU-Attention':
        # Note: despite the name, the GRU layers in this branch are not wrapped in Bidirectional
        model = Sequential()
        model.add(GRU(hidden_dim[0], return_sequences=True, input_shape=(X_train.shape[-2], X_train.shape[-1])))
        model.add(AttentionLayer())
        # Adding normalization and dropout for better training stability and performance
        model.add(LayerNormalization())
        #model.add(Dropout(0.1))
        model.add(GRU(hidden_dim[1]))
        model.add(Dense(1))
    elif mode == 'BiLSTM-Attention':
        model = Sequential()
        model.add(Bidirectional(LSTM(hidden_dim[0], return_sequences=True), input_shape=(X_train.shape[-2], X_train.shape[-1])))
        model.add(AttentionLayer())
        model.add(LayerNormalization())
        model.add(Dropout(0.2))
        model.add(Bidirectional(LSTM(hidden_dim[1])))
        #model.add(Flatten())
        model.add(Dense(hidden_dim[1],activation='relu'))
        model.add(Dense(1))
    elif mode in ('CustomBiLSTM-Attention', 'CustomBiLSTM-Attention-QR', 'EEMD-CustomBiLSTM-Attention-QR'):
        # These three modes share one architecture; they differ only in the loss (QR) and the input data (EEMD components)
        input_shape = (X_train.shape[-2], X_train.shape[-1])
        model = Sequential()
        model.add(Input(shape=input_shape))
        custom_lstm = CustomLSTMModel(lstm_units=hidden_dim[0], mag_units=hidden_dim[0], input_shape=input_shape)
        model.add(custom_lstm)
        model.add(AttentionLayer())
        model.add(Bidirectional(LSTM(hidden_dim[1])))
        model.add(Dense(1))
    else:
        raise ValueError("Unsupported mode: " + mode)
    return model

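As you can see, the function above covers a dozen or so models, all of which are used later.

Next comes the custom loss function. The core of a quantile neural network is the quantile loss defined below.

This is my signature technique for writing papers, and I'm open-sourcing even this. Few published articles use quantile neural networks at the moment. I'm not worried about people learning it, though, because I have even better methods for papers.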

def get_lossfun(model=None, loss_kind='QR', tau=0.5):
    # Define the loss function
    if loss_kind=='QR':
        print('QRloss')
        def loss_func(y_true, y_pred):
            loss_1 = tf.constant(tau,dtype=tf.float32)
            loss_2 = tf.constant(1-tau,dtype=tf.float32)
            loss_mean = tf.reduce_mean(tf.where(tf.greater(y_true, y_pred),
                                                tf.abs(y_true-y_pred)*loss_1,
                                                tf.abs(y_true-y_pred)*loss_2))
            return loss_mean
        model.compile(optimizer='Adam', loss=loss_func, metrics=[tf.keras.metrics.RootMeanSquaredError(),"mape","mae"])
    else:
        print('mseloss')
        model.compile(optimizer='Adam', loss='mse', metrics=[tf.keras.metrics.RootMeanSquaredError(),"mape","mae"])
    return model
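To see what the quantile (pinball) loss above does, here is a NumPy sketch of the same formula with a tiny example. Under-predictions are weighted by tau and over-predictions by 1 - tau; the function name `pinball_loss` and the toy numbers are made up for illustration.

```python
import numpy as np

def pinball_loss(y_true, y_pred, tau=0.5):
    diff = y_true - y_pred
    # diff > 0: prediction too low, weight tau; otherwise weight (1 - tau)
    return np.mean(np.where(diff > 0, tau * diff, (tau - 1.0) * diff))

y_true = np.array([10.0, 10.0])
under = np.array([9.0, 9.0])     # predictions 1 below the truth
over = np.array([11.0, 11.0])    # predictions 1 above the truth

for tau in (0.1, 0.5, 0.9):
    print(tau, pinball_loss(y_true, under, tau), pinball_loss(y_true, over, tau))
```

With tau = 0.9, under-predicting by 1 costs 0.9 while over-predicting by 1 costs only 0.1, so the fitted model is pushed toward the 90th percentile. With tau = 0.5 both sides cost the same and the loss is half the mean absolute error, which is the setting used in this case.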

Then define the functions that plot the training loss and the fitted-versus-actual comparison.

def plot_loss(hist, imfname):
    plt.subplots(1,4,figsize=(16,2))
    for i,key in enumerate(hist.history.keys()):
        n = int(str('14')+str(i+1))
        plt.subplot(n)
        plt.plot(hist.history[key], 'k', label=f'Training {key}')
        plt.title(f'{imfname} Training {key}')
        plt.xlabel('Epochs')
        plt.ylabel(key)
        plt.legend()
    plt.tight_layout()
    plt.show()

def plot_fit(y_test, y_pred):
    plt.figure(figsize=(10,3.5))
    plt.plot(y_test, color="red", label="actual")
    plt.plot(y_pred, color="blue", label="predict")
    plt.title("Fitted vs. actual values")
    plt.xlabel("Time")
    plt.ylabel("data")
    plt.legend()
    plt.show()

Define the training function.

df_eval=pd.DataFrame(columns=['RMSE','MAE','MAPE','R2'])
df_pred=pd.DataFrame()
def train_fun(mode='LSTM',batch_size=32,epochs=30,hidden_dim=[32,16],loss_kind='MSE',tau=0.5,verbose=1,show_loss=True,show_fit=True):
    set_my_seed()
    model=build_model(X_train,mode=mode,hidden_dim=hidden_dim)
    model=get_lossfun(model=model,loss_kind=loss_kind,tau=tau)
    earlystop = EarlyStopping(monitor='loss', min_delta=0, patience=5)
    s = time.time()
    #hist=model.fit(np.concatenate((X_train,X_test),axis=0),np.concatenate((y_train,y_test_s),axis=0),
    #               batch_size=batch_size,epochs=epochs,callbacks=[earlystop],verbose=0)
    hist=model.fit(X_train,y_train,batch_size=batch_size,epochs=epochs,verbose=verbose,callbacks=[earlystop])
    if show_loss:
        plot_loss(hist,mode)
    y_pred = model.predict(X_test)
    y_pred = y_scaler.inverse_transform(y_pred)
    #print(f'shape of true y: {y_test.shape}, shape of predicted y: {y_pred.shape}')
    if show_fit:
        plot_fit(y_test1, y_pred)
    e=time.time()
    print(f"Run time: {e-s}")
    score=list(evaluation(y_test1, y_pred))
    # Store predictions and metrics in the global tables for the final comparison
    df_pred[mode]=y_pred.reshape(-1,)
    df_eval.loc[mode,:]=score
    s=[round(i,3) for i in score]
    print(f'{mode} forecast: RMSE:{s[0]},MAE:{s[1]},MAPE:{s[2]},R2:{s[3]}')
    print("=======================================Done==========================================")
    return s

Initialize the parameters:

batch_size=32
epochs=50
hidden_dim=[32,16]
loss_kind='MSE'
verbose=0
show_fit=True
show_loss=True

Now train the different networks and compare them, starting with the MLP.

train_fun(mode='MLP',batch_size=batch_size,epochs=epochs,hidden_dim=hidden_dim)

The RNN is rather slow, so I won't run it.

#train_fun(mode='RNN',batch_size=batch_size,epochs=epochs,hidden_dim=hidden_dim)

Next, the LSTM.

train_fun(mode='LSTM',batch_size=batch_size,epochs=epochs,hidden_dim=hidden_dim)

Next, the GRU.

train_fun(mode='GRU',batch_size=batch_size,epochs=epochs,hidden_dim=hidden_dim)

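For convenience I'll run the remaining models together. I won't show screenshots of each result here; all the evaluation metrics are collected and plotted together for comparison later.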

train_fun(mode='BiLSTM',batch_size=batch_size,epochs=epochs,hidden_dim=hidden_dim)
train_fun(mode='BiGRU',batch_size=batch_size,epochs=epochs,hidden_dim=hidden_dim)
train_fun(mode='CustomLSTM',batch_size=batch_size,epochs=epochs,hidden_dim=hidden_dim)
train_fun(mode='BiGRU-Attention',batch_size=batch_size,epochs=epochs,hidden_dim=hidden_dim)
train_fun(mode='BiLSTM-Attention',batch_size=batch_size,epochs=epochs,hidden_dim=hidden_dim)
train_fun(mode='CustomBiLSTM-Attention',batch_size=batch_size,epochs=epochs,hidden_dim=hidden_dim)
train_fun(mode='CustomBiLSTM-Attention-QR',loss_kind='QR',batch_size=batch_size,epochs=60,hidden_dim=hidden_dim)

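Simple, isn't it? Just wait a while and a dozen or so models finish running.

Switching models only takes a different string argument.

Adding EEMD

Mode decomposition takes a bit more work; a few extra pieces have to be added, following the steps below.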

from PyEMD import EMD, EEMD, CEEMDAN,Visualisation

def plot_imfs(data, method='EEMD', max_imf=4):
    # Extract the signal
    S = data[:,0]
    t = np.arange(len(S))
    # Initialize the decomposition algorithm for the chosen method
    if method == 'EMD':
        decomposer = EMD()
        imfs = decomposer.emd(S,max_imf=max_imf)
    elif method == 'EEMD':
        decomposer = EEMD()
        imfs = decomposer.eemd(S,max_imf=max_imf)
    elif method == 'CEEMDAN':
        decomposer = CEEMDAN()
        decomposer.ceemdan(S,max_imf=max_imf)
        imfs, res = decomposer.get_imfs_and_residue()
    else:
        raise ValueError("Unsupported method. Choose 'EMD', 'EEMD', or 'CEEMDAN'.")
    # Colors to cycle through
    colors = ['r', 'g', 'b', 'c', 'm', 'y', 'k']
    # Plot the raw data and the IMFs
    plt.figure(figsize=(8,len(imfs)+1), dpi=128)
    plt.subplot(len(imfs) + 1, 1, 1)
    plt.plot(t, S, 'k',lw=1)  # raw data in black
    plt.ylabel('raw data')
    # Plot each IMF (in reversed order)
    for n, imf in enumerate(imfs[::-1]):
        plt.subplot(len(imfs) + 1, 1, n + 2)
        plt.plot(t, imf, colors[n % len(colors)])
        plt.ylabel(f'IMF {n+1}')
    plt.tight_layout()
    plt.show()

Visualize the decomposition.

plot_imfs(data0['close'].to_numpy().reshape(-1,1))

Run the mode decomposition.

decomposer = EEMD()
imfs = decomposer.eemd(data0['close'].to_numpy(),max_imf=4)
imfs.shape

Put the decomposed components into a DataFrame.

df_names=pd.DataFrame()
for i in range(len(imfs)):
    a = imfs[::-1][i,:]
    dataframe = pd.DataFrame({'v{}'.format(i+1):a})
    df_names['imf'+str(i+1)] = dataframe
df_names

Redefine the train/test split function for a single component.

def get_traintest2(data, train_size=len(df_names), window_size=24):
    train = data[:train_size]
    test = data[train_size-window_size:]
    X_train, y_train = build_sequences(train, window_size=window_size)
    X_test, y_test = build_sequences(test, window_size=window_size)
    return X_train, y_train, X_test, y_test

Define the evaluation function that combines the component forecasts.

def evaluation_all(df_eval_all, mode, show_fit=True):
    # The final forecast is the sum of the per-component forecasts (every column after 'actual')
    df_eval_all['all_pred'] = df_eval_all.iloc[:,1:].sum(axis=1)
    #MAE2,RMSE2,MAPE2=evaluation(df_eval_all['actual'],df_eval_all['all_pred'])
    df_eval_all.rename(columns={'all_pred':'predict'},inplace=True)
    if show_fit:
        df_eval_all.loc[:,['predict','actual']].plot(figsize=(12,4),title=f'{mode} predict')
    score = list(evaluation(df_eval_all['actual'], df_eval_all['predict']))
    print('Overall forecast performance:')
    s = [round(i,3) for i in score]
    print(f'{mode} forecast: RMSE:{s[0]},MAE:{s[1]},MAPE:{s[2]},R2:{s[3]}')
    df_pred[mode] = df_eval_all['predict'].to_numpy().reshape(-1,)
    df_eval.loc[mode,:] = score
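The evaluation function above relies on the decomposition being additive: the components sum back to the original series, so summing the per-component forecasts gives a forecast of the series itself. The sketch below illustrates that idea with three hand-made components standing in for IMFs (this is an assumption for illustration, not an actual EEMD run) and a lag-1 forecast standing in for the trained networks.

```python
import numpy as np

t = np.linspace(0, 1, 200)
# Hand-made stand-ins for IMFs: a trend, a slow cycle, and a fast cycle
trend = 3000 + 500 * t
cycle = 50 * np.sin(2 * np.pi * 5 * t)
fast = 5 * np.sin(2 * np.pi * 40 * t)
components = np.vstack([trend, cycle, fast])

# Additivity: the components sum back to the full signal
signal = components.sum(axis=0)

# Forecast each component separately (lag-1 persistence as a placeholder model)
component_preds = components[:, :-1]
actual = signal[1:]

# Combine by summing the per-component forecasts, as evaluation_all does
combined_pred = component_preds.sum(axis=0)
mae = np.mean(np.abs(combined_pred - actual))
print(combined_pred.shape, actual.shape)
```

In the real pipeline each component gets its own network, so every component's forecast error adds into the final forecast.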

Define the training function for the mode decomposition setting.

def train_fuc2(mode='EEMD-CustomBiLSTM-Attention-QR',train_rat=0.8,window_size=64,batch_size=32,epochs=50,loss_kind='QR',hidden_dim=[32,16],tau=0.5,show_imf=True,show_loss=True,show_fit=True):
    df_all=df_names.copy()
    train_size=int(len(df_all)*train_rat)
    df_pred_all=pd.DataFrame(data0['close'].values[train_size:],columns=['actual'])
    for i,name in enumerate(df_names):
        print(f'Training component: {name}')
        data=df_all[name]
        X_train,y_train,X_test,y_test=get_traintest2(data.values,window_size=window_size,train_size=train_size)
        # Normalization (fitted on the training windows only)
        scaler = MinMaxScaler()
        scaler = scaler.fit(X_train)
        X_train = scaler.transform(X_train)
        X_test = scaler.transform(X_test)
        scaler_y = MinMaxScaler()
        scaler_y = scaler_y.fit(y_train.reshape(-1,1))
        y_train = scaler_y.transform(y_train.reshape(-1,1))
        set_my_seed()
        X_train = X_train.reshape((X_train.shape[0], X_train.shape[1], 1))
        X_test = X_test.reshape((X_test.shape[0], X_test.shape[1], 1))
        print(X_train.shape, y_train.shape, X_test.shape,y_test.shape)
        model=build_model(X_train=X_train,mode=mode,hidden_dim=hidden_dim)
        model=get_lossfun(model=model,loss_kind=loss_kind,tau=tau)
        earlystop = EarlyStopping(monitor='loss', min_delta=0, patience=5)
        start = datetime.now()
        hist=model.fit(X_train, y_train,batch_size=batch_size,epochs=epochs,verbose=0,callbacks=[earlystop])
        if show_loss:
            plot_loss(hist,name)
        # Predict; flatten to 1-D so it lines up with the 1-D y_test
        # (an (n,1) array minus an (n,) array would broadcast to (n,n) in the MAPE)
        y_pred = model.predict(X_test)
        y_pred = scaler_y.inverse_transform(y_pred).reshape(-1)
        #print(y_pred.shape)
        end = datetime.now()
        if show_imf:
            # Local names chosen so they don't shadow the global df_eval table
            df_imf=pd.DataFrame()
            df_imf['actual']=y_test
            df_imf['pred']=y_pred
            df_imf.plot(figsize=(7,3))
            plt.show()
        score=list(evaluation(y_test=y_test, y_predict=y_pred))
        s=[round(i,3) for i in score]
        print(f'{mode}, component {name}: RMSE:{s[0]},MAE:{s[1]},MAPE:{s[2]},R2:{s[3]}')
        elapsed=end-start
        df_pred_all[name+'_pred']=y_pred
        print(f'running time is {elapsed}')
        print('============================================================================================================================')
    evaluation_all(df_pred_all,mode=mode,show_fit=True)

Now train the networks on the decomposed components.

train_rat=0.8
set_my_seed()
train_fuc2(mode='EEMD-CustomBiLSTM-Attention-QR',window_size=window_size,train_rat=train_rat,batch_size=batch_size,epochs=epochs,hidden_dim=hidden_dim)

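The output is too long to show in full, so only the results for some components are shown here.

Because the training functions above already store every evaluation metric, we can now look at the comparison of the forecasts directly.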

df_eval.astype('float').style.bar(color='pink')

The results are fairly clear at a glance, and I can't be bothered to write up the analysis, so let's have GPT write it for us.

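“Model performance ranking:

Among all models, EEMD-CustomBiLSTM-Attention-QR performs best. It achieves the lowest values on all error metrics (RMSE, MAE, MAPE) and the highest coefficient of determination (R² = 0.982109), which indicates the best prediction accuracy.
CustomBiLSTM-Attention-QR and CustomBiLSTM-Attention rank second and third respectively, showing an advantage over CustomLSTM and the other models.

Error analysis:

RMSE (root mean squared error) and MAE (mean absolute error) are error measures; the lower the value, the smaller the deviation between the predictions and the actual values. EEMD-CustomBiLSTM-Attention-QR does well on both, which means its predictions deviate least from the actual values.
MAPE (mean absolute percentage error) is a dimensionless measure of prediction accuracy. EEMD-CustomBiLSTM-Attention-QR also stands out on this metric, further confirming its predictive performance.

Effect of model structure:

Introducing EEMD (ensemble empirical mode decomposition) and the Attention mechanism clearly improves the CustomBiLSTM model. EEMD in particular may help handle non-stationarity and noise in the input data, improving the stability and accuracy of the forecasts.
Adding the Attention mechanism also noticeably improves the model's ability to capture important temporal features, which improves the forecasts.

Comparison with traditional models:

The traditional LSTM and MLP (multilayer perceptron) models perform relatively poorly, especially the MLP, which fails to capture temporal dependencies as effectively as more complex structures such as BiLSTM or GRU.
By contrast, BiLSTM and BiGRU capture sequence information better through their bidirectional structure, but still fall short of the models that add Attention and other augmentation mechanisms.

R² analysis:

R² gives a quantitative measure of a model's explanatory power; a value close to 1 means the model explains the variability of the data well. EEMD-CustomBiLSTM-Attention-QR has the highest R², indicating the strongest ability to explain the fluctuations in the data.

Overall, the model that combines EEMD, BiLSTM, Attention and QR shows a marked improvement over the basic sequence models. By strengthening the handling of nonlinearity, noise and temporal dependence, this combination markedly improves prediction accuracy and stability.”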

Let's make a quick plot of the predictions.

df_pred.plot()

Save the predictions so they can be used for writing papers, drawing all kinds of plots, and computing further metrics.

df_pred.to_csv('预测结果.csv',index=False)

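Writing these posts takes effort. If you found this one useful, please follow and like; I'll keep posting code articles on Python data analysis. (For custom code along these lines, send me a private message.)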


Earlier posts can be found here: Data Analysis Case Collection