[LSTM regression prediction] TPA-LSTM: temporal pattern attention with a long short-term memory neural network for regression prediction in MATLAB (multiple inputs, single output) [includes Matlab source code, issue 1984]
2022-07-18 11:56:00 【Poseidon light】
One. Temporal pattern attention BiLSTM prediction
1 BiLSTM principle and structure
LSTM was proposed in 1997 to handle long-time-series problems. A typical LSTM structure is shown in Figure 2.
In Figure 2, x_t denotes the current input of the time series; C_t is the cell state of the current LSTM unit, which usually flows only inside the LSTM and serves as its internal memory; h_t denotes the current encoded hidden-state vector; f_t denotes the degree to which past information is forgotten; i_t denotes the degree to which input information is retained; C̃_t denotes the candidate information of the current state; o_t denotes the degree to which output information is retained; and tanh denotes the hyperbolic tangent function. The subscript t-1 denotes the corresponding state of the LSTM unit at the previous time step.
An LSTM unit has three gates: the forget gate, the input gate, and the output gate. The forget gate discards a certain proportion of past information; the input gate writes part of the current input information into the cell state; and the output gate selectively encodes the hidden-state vector and the cell state as the input of the LSTM unit at the next time step.
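A minimal numerical sketch of one LSTM step, following the gate definitions above (the weights W and biases b here are hypothetical, randomly initialized placeholders, not the parameters used in the source code below):

% One LSTM cell step. Assumed sizes: d-dimensional input x_t, m-dimensional hidden state h_t.
d = 4; m = 8;
xt = randn(d,1); ht_1 = randn(m,1); Ct_1 = randn(m,1);
Wf = randn(m,m+d); bf = zeros(m,1); % forget gate
Wi = randn(m,m+d); bi = zeros(m,1); % input gate
Wc = randn(m,m+d); bc = zeros(m,1); % candidate cell state
Wo = randn(m,m+d); bo = zeros(m,1); % output gate
z = [ht_1; xt];                     % previous hidden state concatenated with current input
sigmoid = @(a) 1./(1+exp(-a));
ft = sigmoid(Wf*z + bf);            % how much past cell state to forget
it = sigmoid(Wi*z + bi);            % how much new information to keep
Ct_cand = tanh(Wc*z + bc);          % candidate cell state
Ct = ft.*Ct_1 + it.*Ct_cand;        % updated cell state
ot = sigmoid(Wo*z + bo);            % how much state to expose
ht = ot.*tanh(Ct);                  % new hidden state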
The output at the current moment may be related not only to past information but also to future information, yet an LSTM cannot encode information from back to front. A BiLSTM reverses the time series and combines a forward and a backward LSTM, so it better captures the influence of the sequence in both directions. The BiLSTM output can be written as

h_t = concat(h_t^f, h_t^b)

where h_t denotes the hidden-state vector of the BiLSTM; concat denotes concatenation along the output dimension; and h_t^f and h_t^b denote the hidden-state vectors of the forward and backward LSTMs.
The BiLSTM structure is shown in Figure 3. By feeding the reversed forward sequence into the backward LSTM, the two networks can be trained simultaneously: the forward LSTM uses past information to predict the future, the backward LSTM uses future information to predict the past, and the final output is determined jointly by the outputs of the two networks. BiLSTM therefore predicts better for time series that depend on both past and future information, so this paper uses a BiLSTM neural network for bidirectional prediction of wind power.
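For illustration, such a network can be sketched with MATLAB's built-in bilstmLayer (an assumed generic setup, separate from the custom dlarray model in the partial source code below; numFeatures and numHiddenUnits are illustrative values):

% Generic BiLSTM regression network sketch (Deep Learning Toolbox)
numFeatures = 4; numHiddenUnits = 50;
layers = [ ...
    sequenceInputLayer(numFeatures)
    bilstmLayer(numHiddenUnits,'OutputMode','last') % forward and backward hidden states are concatenated
    fullyConnectedLayer(1)                          % single regression output
    regressionLayer];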
2 TPA Mechanism
The attention mechanism imitates the human brain: it pays more attention to important information and ignores relatively useless information. It has been widely used in natural language processing, image recognition, and speech recognition, and in recent years also in various prediction problems. The traditional attention mechanism assigns weights to different time points and works well when each time step contains only one variable. For predicting the power of multiple wind turbines in a region, however, each time step contains multiple variables, there may be complex nonlinear relations among them, and each variable sequence has its own characteristics and period, so it is hard to single out one time step as the focus of attention. TPA instead uses multiple one-dimensional CNN filters to extract features from the row vectors of the BiLSTM hidden states, so the model can learn the interdependence of multiple variables across different time steps. The TPA structure is shown in Figure 4.
Figure 2: A typical LSTM structure
Figure 3: BiLSTM structure sketch
The original time series is processed by the BiLSTM, yielding h_{t-w}, ..., h_t, the hidden-state vectors corresponding to the inputs at different times, where w is the length of the time series. Define the hidden-state matrix H = (h_{t-w}, h_{t-w+1}, ..., h_{t-1}). Each column of H collects, for one time step, the variables formed by the parameters of the BiLSTM's internal gate neurons, while each row represents the state of a single variable across all time steps.
In Figure 4, the boxes drawn on the hidden-state matrix H represent different one-dimensional convolution kernels. One-dimensional convolutions are applied along the m rows of H to extract the temporal pattern matrix H^C of the variable signals:

H^C_{i,j} = H_i * C_j

where C_j denotes the j-th filter of length T; T denotes the maximum length to attend to, usually taken as w; and * denotes convolution. There are k one-dimensional filter kernels, and each kernel is convolved along a row vector of the hidden-state matrix. The temporal pattern matrix captures the complex internal and temporal relations among the different sequences; it is a high-dimensional embodiment of their nonlinear relationships.
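A minimal sketch of this step, assuming T = w and small illustrative sizes (H and the filters in C are random placeholders here):

% Temporal pattern matrix H^C. With filter length T = w, a 'valid' 1-D
% convolution along a row of H reduces to one inner product per (row, filter) pair.
m = 8; w = 24; k = 5;
H = randn(m,w);  % row i: variable i across all w time steps
C = randn(k,w);  % row j: the j-th one-dimensional filter
HC = zeros(m,k); % temporal pattern matrix, m-by-k
for i = 1:m
    for j = 1:k
        HC(i,j) = H(i,:) * C(j,:)'; % valid convolution of two length-w vectors
    end
end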
The following attention scoring function is defined to compute the relevance:

α_i = σ(H^C_i W_a^T h_t)

where H^C_i is the i-th row vector of H^C; W_a is an m×k weight matrix; α_i is the attention weight; and σ denotes the sigmoid function. The attention weights α_i are used to form a weighted sum of the rows of H^C, giving the attention vector v_t:

v_t = Σ_{i=1}^{n} α_i H^C_i

where n denotes the number of features of the input variable x.
Finally, v_t and h_t are linearly mapped and added to obtain the final predicted value:

h'_t = W_h h_t + W_v v_t
y_{t-1+Δ} = W_{h'} h'_t

where y_{t-1+Δ} denotes the final predicted value; h'_t is the intermediate variable used to generate it; Δ denotes the prediction horizon of the given prediction task; and W_{h'}, W_h, and W_v are the weight matrices of the corresponding variables.
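A standalone sketch of the scoring and output step under the same assumed sizes (Wa, Wh, Wv, and Whp are hypothetical weight matrices; Whp plays the role of W_h'):

% TPA attention scoring and final output
m = 8; k = 5;
HC = randn(m,k); % temporal pattern matrix from the previous step
ht = randn(m,1); % current BiLSTM hidden-state vector
Wa = randn(m,k); % m-by-k attention weight matrix
sigmoid = @(a) 1./(1+exp(-a));
alpha = zeros(m,1);
for i = 1:m
    alpha(i) = sigmoid(HC(i,:) * Wa.' * ht); % relevance score of row i
end
vt = HC.' * alpha;           % attention vector: weighted sum of the rows of HC (k-by-1)
Wh = randn(m,m); Wv = randn(m,k);
hp = Wh*ht + Wv*vt;          % intermediate variable h'
Whp = randn(1,m);
y = Whp*hp;                  % final prediction y_{t-1+Delta}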
A traditional attention mechanism applies CNN feature extraction directly to the raw time series, so it can only extract the temporal features of a single sequence and cannot account for the correlations between different sequences. The variables of the BiLSTM hidden-state matrix, in contrast, contain the complex relationships among different sequences at different time steps; applying CNN feature extraction to the row vectors of the hidden-state matrix extracts the temporal relations and the cross-variable relations at the same time. Moreover, the attention vector v_t is a weighted sum of the rows of the temporal pattern matrix, which carry time information, so the model can select relevant information from different time steps. For problems such as ultra-short-term power prediction of multiple wind turbines, where the dependencies across time steps and across sequences are complex and nonlinear, TPA shows unique advantages.
Two. Partial source code
% Dataset: columns are features, rows are samples
%% Clear the command window and workspace variables
clc
clear
close all
%% Path settings
addpath('./')
%% Data import and processing
load('./Train.mat')
Train.weekend = dummyvar(Train.weekend);
Train.month = dummyvar(Train.month);
Train = movevars(Train,{'weekend','month'},'After','demandLag');
Train.ts = [];
% Train.hour = dummyvar(Train.hour);
% Inspect the variable formats in the workspace and adapt the preceding steps to your data
Train(1,:) =[];
y = Train.demand;
x = Train{:,2:5};
[xnorm,xopt] = mapminmax(x',0,1);
[ynorm,yopt] = mapminmax(y',0,1);
%
% xnorm = [xnorm;Train.weekend';Train.month'];
%%
% x = x';
xnorm = xnorm(:,1:1000);
ynorm = ynorm(1:1000);
k = 24; % lag (window) length
% build lagged sliding-window samples
for i = 1:length(ynorm)-k
Train_xNorm(:,i,:) = xnorm(:,i:i+k-1);
Train_yNorm(i) = ynorm(i+k-1);
Train_y(i) = y(i+k-1);
end
Train_yNorm= Train_yNorm';
ytest = Train.demand(1001:1170);
xtest = Train{1001:1170,2:5};
[xtestnorm] = mapminmax('apply', xtest',xopt);
[ytestnorm] = mapminmax('apply',ytest',yopt);
% xtestnorm = [xtestnorm; Train.weekend(1001:1170,:)'; Train.month(1001:1170,:)'];
xtest = xtest';
for i = 1:length(ytestnorm)-k
Test_xNorm(:,i,:) = xtestnorm(:,i:i+k-1);
Test_yNorm(i) = ytestnorm(i+k-1);
Test_y(i) = ytest(i+k-1);
end
Test_yNorm = Test_yNorm';
clear k i x y
%
Train_xNorm = dlarray(Train_xNorm,'CBT');
Train_yNorm = dlarray(Train_yNorm,'BC');
Test_xNorm = dlarray(Test_xNorm,'CBT');
Test_yNorm = dlarray(Test_yNorm,'BC');
%% Split into training and validation sets
TrainSampleLength = length(Train_yNorm);
validatasize = floor(TrainSampleLength * 0.1);
Validata_xNorm = Train_xNorm(:,end-validatasize+1:end,:);
Validata_yNorm = Train_yNorm(:,TrainSampleLength-validatasize+1:end);
Validata_y = Train_y(TrainSampleLength-validatasize+1:end); % +1 avoids overlapping the last training sample
Train_xNorm = Train_xNorm(:,1:end-validatasize,:);
Train_yNorm = Train_yNorm(:,1:end-validatasize);
Train_y = Train_y(1:end-validatasize);
%%
% Parameter settings
inputSize = size(Train_xNorm,1); % feature dimension of the input x
outputSize = 1; % dimension of the output y
numhidden_units1=50;
[params,~] = paramsInit(numhidden_units1,inputSize,outputSize); % initialize model parameters
[~,validatastate] = paramsInit(numhidden_units1,inputSize,outputSize); % initialize validation state
[~,TestState] = paramsInit(numhidden_units1,inputSize,outputSize); % initialize test state
% Training-related parameters (the TrainOptions script presumably defines
% numEpochs, minibatchsize, and validationFrequency used below)
TrainOptions;
numIterationsPerEpoch = floor((TrainSampleLength-validatasize)/minibatchsize);
LearnRate = 0.01;
%% Loop over epochs.
figure
start = tic;
lineLossTrain = animatedline('color','r');
validationLoss = animatedline('color',[0 0 0]./255,'Marker','o','MarkerFaceColor',[150 150 150]./255);
xlabel('Iteration')
ylabel('Loss')
% per-epoch updates
iteration = 0;
averageGrad = []; averageSqGrad = []; % Adam moment estimates; adamupdate expects them initialized (empty on first call)
for epoch = 1 : numEpochs
[~,state] = paramsInit(numhidden_units1,inputSize,outputSize); % re-initialize state at the start of each epoch
disp(['Epoch: ', int2str(epoch)])
% per-minibatch updates
for i = 1 : numIterationsPerEpoch
iteration = iteration + 1;
disp(['Iteration: ', int2str(iteration)])
idx = (i-1)*minibatchsize+1:i*minibatchsize;
dlX = gpuArray(Train_xNorm(:,idx,:));
dlY = gpuArray(Train_yNorm(idx));
[gradients,loss,state] = dlfeval(@TPAModel,dlX,dlY,params,state);
% L2 Regularization
% L2regulationFactor = 0.000011;
% gradients = dlupdate( @(g,parameters) L2Regulation(g,parameters,L2regulationFactor),gradients,params);
% gradients = dlupdate(@(g) thresholdL2Norm(g, gradientThreshold),gradients);
[params,averageGrad,averageSqGrad] = adamupdate(params,gradients,averageGrad,averageSqGrad,iteration,LearnRate);
% Evaluate on the validation set
if iteration == 1 || mod(iteration,validationFrequency) == 0
output_Ynorm = TPAModelPredict(gpuArray(Validata_xNorm),params,validatastate);
lossValidation = mse(output_Ynorm, gpuArray(Validata_yNorm));
end
Three. Running results
Four. Matlab version and references
1 Matlab version
2014a
2 Reference
[1] Wang Yuhong, Shi Yunxiang. Ultra-short-term power prediction of multiple wind turbines based on temporal-pattern-attention BiLSTM [J]. High Voltage Engineering, 2022, 48(05).
3 Remarks
This introduction is taken from online sources and is for reference only; in case of infringement, please contact us for deletion.