Loss and predictions are NaN when training an LSTM encoder-decoder model
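I am a beginner with neural networks and am trying to build a basic encoder-decoder model for relation extraction. The inputs in the code are a mini example (in the same form as the real data) used to check that the code runs. index_** holds the indices of the text tokens and tag_** holds the indices of the tags. The problem is that both the loss and the predictions are NaN. Can anyone help?
Here is the output: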
2021-04-27 23:59:22.275718: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set
2021-04-27 23:59:22.276153: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-04-27 23:59:25.994977: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:116] None of the MLIR optimization passes are enabled (registered 2)
WARNING:tensorflow:Model was constructed with shape (None, 5) for input KerasTensor(type_spec=TensorSpec(shape=(None, 5), dtype=tf.float32, name='embedding_input'), name='embedding_input', description="created by layer 'embedding_input'"), but it was called on an input with incompatible shape (None, 1).
WARNING:tensorflow:Model was constructed with shape (None,1).
1/1 - 13s - loss: nan
1/1 - 0s - loss: nan
1/1 - 0s - loss: nan
1/1 - 0s - loss: nan
1/1 - 0s - loss: nan
1/1 - 0s - loss: nan
1/1 - 0s - loss: nan
1/1 - 0s - loss: nan
1/1 - 0s - loss: nan
1/1 - 0s - loss: nan
1/1 - 0s - loss: nan
1/1 - 0s - loss: nan
1/1 - 0s - loss: nan
1/1 - 0s - loss: nan
1/1 - 0s - loss: nan
1/1 - 0s - loss: nan
1/1 - 0s - loss: nan
1/1 - 0s - loss: nan
1/1 - 0s - loss: nan
1/1 - 0s - loss: nan
[[[nan]
[nan]
[nan]
[nan]
[nan]]
[[nan]
[nan]
[nan]
[nan]
[nan]]]
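The shape warning in the output above hints at how the data reaches the model. Below is a minimal NumPy sketch, assuming the padded training data is a list of equal-length integer lists (the array values and the names `W_train`, `one_row`, and `one_batch` are made up for illustration). Indexing a single row, as a per-row call like `fit(W_Train[i], T_Train[i])` does, yields a 1-D array of five scalars. Keras most likely treats these as five separate samples of one token each, which would match the `(None, 1)` in the warning. A slice or the full array keeps the `(batch, 5)` layout the model was built for.

```python
import numpy as np

# Hypothetical padded token indices: 4 sequences, each left-padded with 0 to length 5.
W_train = np.array([
    [0, 0, 1, 2, 3],
    [0, 4, 5, 6, 7],
    [8, 9, 10, 11, 12],
    [0, 0, 0, 13, 14],
])

one_row = W_train[0]      # 1-D: five scalars, not one sequence of five tokens
one_batch = W_train[0:1]  # 2-D: a batch holding a single sequence of five tokens

print(one_row.shape)
print(one_batch.shape)
print(W_train.shape)
```

Passing `one_batch` or the whole `W_train` array, with targets shaped the same way, would keep the time dimension intact.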
Here is my code:
import pickle
from keras import Sequential
from keras.layers import Bidirectional, LSTM, Dense, Embedding, Dropout, Activation, Softmax, TimeDistributed


def create_same_length(tokens, Tags, max_s):
    # Left-pad every token and tag sequence with 0 up to length max_s.
    W = []
    T = []
    for i in range(len(tokens)):
        w = [0] * (max_s - len(tokens[i]))
        t = [0] * (max_s - len(tokens[i]))
        w += tokens[i]
        W.append(w)
        t += Tags[i]
        T.append(t)
    return W, T


def build_model(W_Train, T_Train, max_s, W_dev, T_dev, souce_size, tag_size):
    model = Sequential()
    # Index 0 is reserved for padding and masked out.
    model.add(
        Embedding(input_dim=souce_size + 1, output_dim=300, input_length=max_s, mask_zero=True))
    # Encoder: bidirectional LSTM over the embedded tokens.
    model.add(
        Bidirectional(LSTM(300, return_sequences=True), merge_mode='concat'))
    # Decoder: LSTM followed by a per-time-step output layer.
    model.add(LSTM(300, return_sequences=True))
    model.add(TimeDistributed(Dense(1, activation='softmax')))
    model.add(Dropout(0.3))
    model.compile(loss='categorical_crossentropy', optimizer='rmsprop')
    return model


if __name__ == '__main__':
    index_Train = [[1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11, 12], [13, 14]]
    Tags_Train = [[1, 1], [3, 1, 4, 5], [1, 1]]
    max_s_Train = 5
    index_dev = [[1, 14, 15]]
    max_s_dev = 3
    Tags_dev = [[1, 3]]
    souce_size = 15
    tag_size = 5
    max_s = max(max_s_Train, max_s_dev)  # ,max_test)
    W_Train, T_Train = create_same_length(index_Train, Tags_Train, max_s)
    W_dev, T_dev = create_same_length(index_dev, Tags_dev, max_s)
    # W_test=create_same_length(index_test,max_s)
    blmodel = build_model(W_Train, T_Train, max_s, W_dev, T_dev, souce_size, tag_size)
    for epoch in range(5):
        # fit model for one epoch on this sequence
        # blmodel.fit(W_Train,epochs=1,batch_size=32,verbose=2)
        for i in range(len(W_Train)):
            blmodel.fit(W_Train[i], T_Train[i], verbose=2)
    T_dev_pre = blmodel.predict(W_dev, batch_size=None, verbose=0)
    print(T_dev_pre)
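One plausible source of the NaN values, offered as a guess rather than a confirmed diagnosis, is the end of the model: a softmax over a single unit always outputs 1.0, and the `Dropout` layer placed after it can zero that output. If the loss then rescales each time step so its predictions sum to 1, as Keras' probability-based categorical cross-entropy appears to do, a zeroed step becomes 0/0. The NumPy sketch below imitates that sequence of operations outside Keras. The `softmax` helper, the logits, and the fixed dropout mask are all illustrative assumptions, not values from the model.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


# 5 time steps with 1 output unit each; the logit values are arbitrary.
logits = np.array([[2.0], [-1.0], [0.5], [3.0], [-4.0]])
probs = softmax(logits)  # with a single unit, every entry is exactly 1.0

# Inverted dropout at rate 0.3: zero some outputs, scale the rest by 1 / 0.7.
# The mask is fixed here (instead of random) so the result is reproducible.
keep_mask = np.array([[1.0], [0.0], [1.0], [1.0], [0.0]])
dropped = probs * keep_mask / 0.7

# Rescale each time step to sum to 1; a dropped step becomes 0 / 0 = nan.
with np.errstate(invalid="ignore"):
    renormalised = dropped / dropped.sum(axis=-1, keepdims=True)

print(probs.ravel())
print(renormalised.ravel())
```

If this is indeed the cause, changes worth trying include moving `Dropout` ahead of the output layer, giving the final `Dense` layer one unit per tag class (for example `tag_size + 1`, to leave room for the padding index 0), and switching to `sparse_categorical_crossentropy` so the integer tags can be used directly as targets.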