https://imgse.com/i/ppdkAc6
Each column represents one household's electricity consumption. I want to reconstruct the data with an encoder for prediction. The code is as follows:
import torch
import torch.nn as nn
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import sys
# Combining an LSTM with an autoencoder: an LSTM serves as the encoder, mapping the
# input sequence to hidden states, and another LSTM decodes those hidden states back
# into an output sequence. This can be used for feature extraction and
# reconstruction of sequence data.
class LSTMAutoencoder(nn.Module):
    def __init__(self, input_size, hidden_size):
        super(LSTMAutoencoder, self).__init__()
        self.input_size = input_size
        self.hidden_size = hidden_size
        # LSTM encoder: input_size -> hidden_size
        self.encoder = nn.LSTM(input_size=input_size, hidden_size=hidden_size,
                               num_layers=1, batch_first=True)
        # LSTM decoder: hidden_size -> input_size
        self.decoder = nn.LSTM(input_size=hidden_size, hidden_size=input_size,
                               num_layers=1, batch_first=True)

    def forward(self, x):
        # encode
        encoded, (hidden, cell) = self.encoder(x)
        # decode
        decoded, _ = self.decoder(encoded)  # [21, 10, 1]
        return decoded
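One point worth checking about the class above: the value an `nn.LSTM` returns is its hidden state, which is computed as a sigmoid output gate times `tanh` of the cell state, so every raw LSTM output is strictly bounded in (-1, 1) no matter how large the input is. A standalone check (not part of the original script) makes this visible:

```python
import torch
import torch.nn as nn

# Feed huge values through a plain LSTM and inspect the output range.
# h_t = o_t * tanh(c_t): o_t is in (0, 1) and tanh(c_t) is in (-1, 1),
# so the raw LSTM output can never leave (-1, 1).
lstm = nn.LSTM(input_size=1, hidden_size=1, num_layers=1, batch_first=True)
big_input = torch.full((2, 10, 1), 1e6)  # values on the scale of the meter readings
out, _ = lstm(big_input)
print(out.abs().max())  # always strictly below 1
```

This is consistent with reconstructions that never exceed 1: the decoder here is itself an LSTM with `hidden_size=input_size`, so its output is tanh-bounded.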
# Usage example
input_size = 1
hidden_size = 16
batch_size = 10
sequence_length = 50
LR = 0.005
EPOCH = 1000
# Read all the data from the file
all_data = pd.read_csv("./dataset/LD2011_2014.csv", index_col=0, parse_dates=[0])
all_data = all_data.replace(0.000000, np.nan)  # replace zeros with NaN
# Pick one household whose data contains no NaN
one_home_data = all_data["MT_158"].resample("W").sum()
print(np.any(np.isnan(one_home_data)))  # prints False if there is no NaN
plt.figure()
plt.plot(one_home_data, label='test')
one_home_data = torch.from_numpy(
    one_home_data.astype('float32').values.reshape(-1, batch_size, input_size))  # [21, 10, 1]
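Because the weekly sums are several orders of magnitude larger than anything a tanh-bounded LSTM can emit, one common preprocessing step is to min-max scale the series to [0, 1] before training and invert the scaling when plotting reconstructions. A minimal sketch; the helper names `minmax_scale` / `minmax_unscale` are illustrative, not from the original script:

```python
import torch

def minmax_scale(x: torch.Tensor):
    """Scale a tensor to [0, 1], returning the stats needed to invert it."""
    lo, hi = x.min(), x.max()
    return (x - lo) / (hi - lo), lo, hi

def minmax_unscale(x: torch.Tensor, lo, hi):
    """Map a [0, 1]-scaled tensor back to the original range."""
    return x * (hi - lo) + lo

raw = torch.tensor([[120000.0], [450000.0], [300000.0]])  # dummy meter-scale values
scaled, lo, hi = minmax_scale(raw)
restored = minmax_unscale(scaled, lo, hi)
print(scaled.min().item(), scaled.max().item())  # 0.0 1.0
```

With the targets in [0, 1], the MSE against the LSTM's (-1, 1) output is no longer dominated by the raw scale of the data.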
model = LSTMAutoencoder(input_size=input_size, hidden_size=hidden_size)
print(model)
loss_function = nn.MSELoss()  # loss function
optimizer = torch.optim.Adam(model.parameters(), lr=LR)  # optimizer
for epoch in range(EPOCH):
    reconstruction = model(one_home_data)  # [21, 10, 1]
    loss = loss_function(reconstruction, one_home_data)
    loss.backward()  # backpropagate to compute the gradients
    optimizer.step()
    optimizer.zero_grad()
    if epoch % 100 == 0:
        print('Epoch :', epoch, '|', 'train_loss:%.4f' % loss.data)

After training, every decoded output is very small and far from the original data, and the MSE loss is in the hundreds of millions every time. I don't know whether some part of the model is set up wrong or whether the problem is elsewhere. Should a fully connected layer be added after the model, or something like that? After preprocessing, the original data looks roughly like this: https://imgse.com/i/ppdkd4s. Every value is very large, yet none of the values reconstructed by the encoder-decoder exceed 1. Were they normalized by an activation function? I don't understand it. The decoded data looks like this: https://imgse.com/i/ppdkrvV. The plot of the original data looks like this: https://imgse.com/i/ppdk6DU. So I don't know where the problem is; any guidance would be appreciated.
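For reference, the fully connected layer mentioned above would look roughly like this: keep the decoder's hidden size larger than 1 and project back to `input_size` with `nn.Linear`, whose output has no activation and is therefore unbounded, so reconstructions are no longer stuck below 1. This is a sketch of one possible variant (the class name `LSTMAutoencoderFC` is hypothetical), not a verified fix for the loss values above:

```python
import torch
import torch.nn as nn

class LSTMAutoencoderFC(nn.Module):
    """LSTM autoencoder with a linear output head instead of a 1-unit LSTM decoder."""
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.encoder = nn.LSTM(input_size=input_size, hidden_size=hidden_size,
                               num_layers=1, batch_first=True)
        self.decoder = nn.LSTM(input_size=hidden_size, hidden_size=hidden_size,
                               num_layers=1, batch_first=True)
        # No activation on the linear head, so its range is unbounded.
        self.head = nn.Linear(hidden_size, input_size)

    def forward(self, x):
        encoded, _ = self.encoder(x)
        decoded, _ = self.decoder(encoded)
        return self.head(decoded)

model = LSTMAutoencoderFC(input_size=1, hidden_size=16)
out = model(torch.randn(21, 10, 1))
print(out.shape)  # torch.Size([21, 10, 1])
```

Even with the linear head, scaling the inputs first usually makes training far more stable than regressing raw values in the hundreds of thousands.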