
How do I fix NaN loss and predictions when training an LSTM encoder-decoder model?


I am a beginner with neural networks and am trying to build a basic encoder-decoder model for relation extraction. The inputs in the code below are a mini example used to test the code (the same form as the real data): index_** holds the indices of the text tokens and tag_** holds the indices of the tags. The problem is that both the loss and the predictions come out as NaN. Can anyone help?

Here is the output:

2021-04-27 23:59:22.275718: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set
2021-04-27 23:59:22.276153: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-04-27 23:59:25.994977: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:116] None of the MLIR optimization passes are enabled (registered 2)
WARNING:tensorflow:Model was constructed with shape (None, 5) for input KerasTensor(type_spec=TensorSpec(shape=(None, 5), dtype=tf.float32, name='embedding_input'), name='embedding_input', description="created by layer 'embedding_input'"), but it was called on an input with incompatible shape (None, 1).
1/1 - 13s - loss: nan
1/1 - 0s - loss: nan
1/1 - 0s - loss: nan
1/1 - 0s - loss: nan
1/1 - 0s - loss: nan
1/1 - 0s - loss: nan
1/1 - 0s - loss: nan
1/1 - 0s - loss: nan
1/1 - 0s - loss: nan
1/1 - 0s - loss: nan
1/1 - 0s - loss: nan
1/1 - 0s - loss: nan
1/1 - 0s - loss: nan
1/1 - 0s - loss: nan
1/1 - 0s - loss: nan
1/1 - 0s - loss: nan
1/1 - 0s - loss: nan
1/1 - 0s - loss: nan
1/1 - 0s - loss: nan
1/1 - 0s - loss: nan
[[[nan]
  [nan]
  [nan]
  [nan]
  [nan]]

 [[nan]
  [nan]
  [nan]
  [nan]
  [nan]]]
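Note the warning in this log: the model was built for inputs of shape (None, 5) but is called with shape (None, 1). A plain Python list of five ints is interpreted by Keras as five separate scalar samples rather than one padded sequence, which is a first hint at what goes wrong in the code below. A minimal sketch of the shape difference (the numpy usage here is my illustration, not part of the original post):

import numpy as np

seq = [0, 0, 1, 2, 3]    # one zero-padded sequence of length 5
np.array(seq).shape      # (5,)   -> Keras sees 5 samples of shape (1,)
np.array([seq]).shape    # (1, 5) -> one sample of length 5, as the model expects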

Here is my code:

import pickle
from keras import Sequential
from keras.layers import Bidirectional, LSTM, Dense, Embedding, Dropout, Activation, Softmax, TimeDistributed


def create_same_length(tokens, Tags, max_s):
    # Left-pad every token sequence and its tag sequence with zeros to length max_s.
    W = []
    T = []
    for i in range(len(tokens)):
        w = [0] * (max_s - len(tokens[i]))
        t = [0] * (max_s - len(tokens[i]))
        w += tokens[i]
        W.append(w)
        t += Tags[i]
        T.append(t)
    return W, T


def build_model(W_Train, T_Train, max_s, W_dev, T_dev, souce_size, tag_size):
    model = Sequential()
    model.add(
        Embedding(input_dim=souce_size + 1, output_dim=300, input_length=max_s, mask_zero=True))
    model.add(
        Bidirectional(LSTM(300, return_sequences=True), merge_mode='concat'))
    model.add(LSTM(300, return_sequences=True))
    model.add(TimeDistributed(Dense(1, activation='softmax')))
    model.add(Dropout(0.3))
    model.compile(loss='categorical_crossentropy', optimizer='rmsprop')
    return model


if __name__ == '__main__':
    index_Train = [[1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11, 12], [13, 14]]
    Tags_Train = [[1, 1], [3, 1, 4, 5], [1, 1]]
    max_s_Train = 5
    index_dev = [[1, 14, 15]]
    max_s_dev = 3
    Tags_dev = [[1, 3]]
    souce_size = 15
    tag_size = 5
    max_s = max(max_s_Train, max_s_dev)  # , max_test)
    W_Train, T_Train = create_same_length(index_Train, Tags_Train, max_s)
    W_dev, T_dev = create_same_length(index_dev, Tags_dev, max_s)
    # W_test = create_same_length(index_test, max_s)
    blmodel = build_model(W_Train, T_Train, max_s, W_dev, T_dev, souce_size, tag_size)
    for epoch in range(5):
        # fit model for one epoch on this sequence
        # blmodel.fit(W_Train, epochs=1, batch_size=32, verbose=2)
        for i in range(len(W_Train)):
            blmodel.fit(W_Train[i], T_Train[i], verbose=2)
    T_dev_pre = blmodel.predict(W_dev, batch_size=None, verbose=0)
    print(T_dev_pre)

Solution

No confirmed solution for this question has been posted yet.
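That said, a few things in the code above are known to produce exactly this behaviour. Below is a sketch of a corrected model; it assumes the tags are integer class indices 0..tag_size, with 0 reserved for padding, as the sample data suggests, and build_model_fixed is my name, not from the original post:

import numpy as np
from keras import Sequential
from keras.layers import Bidirectional, LSTM, Dense, Embedding, Dropout, TimeDistributed


def build_model_fixed(souce_size, tag_size, max_s):
    # Same encoder as in the question, with three changes aimed at the NaN:
    model = Sequential()
    model.add(Embedding(input_dim=souce_size + 1, output_dim=300,
                        input_length=max_s, mask_zero=True))
    model.add(Bidirectional(LSTM(300, return_sequences=True), merge_mode='concat'))
    model.add(LSTM(300, return_sequences=True))
    # 1) Dropout must come BEFORE the output layer. Dropping the softmax
    #    output itself randomly zeroes predictions, and log(0) in the
    #    crossentropy is -inf, which turns loss and gradients into NaN.
    model.add(Dropout(0.3))
    # 2) One output unit per tag class (plus padding class 0) instead of
    #    Dense(1): softmax over a single unit is constantly 1.0, so the
    #    original model could never learn anything.
    model.add(TimeDistributed(Dense(tag_size + 1, activation='softmax')))
    # 3) Integer tag indices call for sparse_categorical_crossentropy;
    #    categorical_crossentropy expects one-hot targets.
    model.compile(loss='sparse_categorical_crossentropy', optimizer='rmsprop')
    return model

Training should also happen on the whole padded batch as numpy arrays rather than one Python list at a time, which is what triggered the shape warning in the log:

# Assumes W_Train and T_Train were padded to matching (n_samples, max_s) shapes.
W = np.array(W_Train)
T = np.array(T_Train)  # integer tags, one per token position
model = build_model_fixed(souce_size=15, tag_size=5, max_s=5)
model.fit(W, T, epochs=5, batch_size=32, verbose=2)

(Note, too, that as posted index_Train has four sequences while Tags_Train has only three, and the tag lists are shorter than their token lists, so create_same_length would not pad the sample data consistently.)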
