Programming Q&A   Published: 2022-06-01   Source: 大佬教程  code.js-code.com

How to fix exploding/vanishing gradients in a PyTorch custom loss function?


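I have written a custom loss function following the description in this paper: https://epubs.siam.org/doi/pdf/10.1137/1.9781611976236.18 However, when I try to train a model with this loss function, all output values become NaN after the first loss.backward() call. My guess is that something in this code is causing the gradients to explode or vanish, but I can't tell where or what. Can anyone help? A minimal reproducible example follows: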

import torch
from torch import optim
import torchvision
from torchvision import datasets
import timm # you may need to pip install this!!

### LOSS FUNCTION ###

def class_means(Z,Y):
    '''Returns unique classes in batch and per class means'''
    classes = torch.unique(Y)
    # One mean vector per class, stacked in the same order as `classes`
    means = torch.stack([torch.mean(Z[Y == c],dim = 0) for c in classes])
    return classes,means


def intra_spread(Z,Y,classes,means):
    ''' Takes the L2 norm of all outputs (z) from their respective class means
    and averages them'''
    N = Z.shape[0]
    intraSpread = 0
    for i in range(classes.shape[0]):
        intraSpread += torch.sqrt(torch.sum((means[i] - Z[Y == classes[i]]) ** 2))
    return intraSpread / N

def similarity_matrix(mat):
    '''Return the distances between all rows of the input matrix'''
    r = torch.mm(mat,mat.t())
    diag = r.diag().unsqueeze(0)
    diag = diag.expand_as(r)
    D = diag + diag.t() - 2*r
    return D.sqrt()

def inter_separation(means):
    '''Returns the distance between the two closest means in input means'''
    return torch.min(similarity_matrix(means)[similarity_matrix(means) > 0])

def ii_loss(Z,Y):
    '''Returns intraSpread - interSep'''
    classes,means = class_means(Z,Y)
    intraSpread = intra_spread(Z,Y,classes,means)
    interSep = inter_separation(means)
    return intraSpread - interSep

criterion = ii_loss # use ii_loss as the criterion

# Loading in CIFAR 10. Make a files folder in your working directory
batch_size_Train = 32
Train_loader = torch.utils.data.DataLoader(
    torchvision.datasets.CIFAR10('./files/',train=True,download=True,
        transform=torchvision.transforms.Compose([
            torchvision.transforms.Resize((300,300)),
            torchvision.transforms.ToTensor(),
            torchvision.transforms.Normalize(
                (0.4914,0.4822,0.4465),(0.2023,0.1994,0.2010))
        ])),
    batch_size=batch_size_Train,shuffle=True)

### Fitting Model ###
## Load model
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu') # GPU 0 if available, otherwise CPU
m = timm.create_model('efficientnet_b3',pretrained = True)
m.classifier = torch.nn.Linear(m.classifier.in_features,10) # 10 classes in cifar10
m = m.to(device)

## Establish hyperparameters
learning_rate = 1e-4
momentum = 0.9
reg = 1e-4
epochs = 18
decay_epochs = 15
decay = 0.1
optimizer = optim.Adam(m.parameters(),lr = learning_rate)

## Train model
for i in range(epochs):
    print("Epoch: " + str(i + 1))
    for batch_idx,(inputs,targets) in enumerate(Train_loader):
        print("  " + str(batch_idx + 1) + "...",end = "")
        inputs = inputs.to(device) # X_batch
        targets = targets.to(device) # Y_batch
        optimizer.zero_grad()
        outputs = m(inputs) # Z_batch
        loss = criterion(outputs,targets) # loss_batch
        loss.backward()
        optimizer.step()
        print("DONE")
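
One way to narrow down where the NaN first appears (not part of the original post) is PyTorch's anomaly detection, which makes `backward()` raise as soon as a backward function returns NaN. The sketch below uses a two-element toy tensor rather than the real model. It mimics the pattern in `similarity_matrix` and `inter_separation`, where a square root is taken and zero entries are masked out afterwards; the variable names are illustrative assumptions.

```python
import torch

# Same pattern as the loss: sqrt at exactly 0, then the zero entry is masked out.
x = torch.tensor([0.0, 4.0], requires_grad=True)

caught = None
with torch.autograd.detect_anomaly():
    try:
        d = x.sqrt()
        d[d > 0].min().backward()
    except RuntimeError as err:
        caught = str(err)  # message names the backward function that returned NaN
```

In the training loop above, the forward pass and `loss.backward()` could be wrapped in the same context manager for a few batches. Anomaly detection slows training noticeably, so it is best kept for debugging only.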

Solution

No verified solution has been posted for this question yet.
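
One likely cause, offered as a sketch rather than a confirmed fix: `torch.sqrt` has an infinite derivative at 0, and autograd combines it with the zero gradient flowing back from masked-out entries, which yields NaN. In `similarity_matrix` the diagonal of `D` is numerically zero, so `D.sqrt()` hits this case even though `inter_separation` later filters those entries with `> 0`. `intra_spread` can hit it too when a class has only one sample in the batch. The helper names `safe_sqrt`, `pairwise_distances` and `inter_separation_safe`, and the `eps` value, are assumptions made for illustration and come from neither the original post nor the paper.

```python
import torch

# Minimal reproduction: sqrt at exactly 0, with that entry masked out afterwards.
x = torch.tensor([0.0, 4.0], requires_grad=True)
d = x.sqrt()
d[d > 0].min().backward()
grad_plain = x.grad.clone()  # first entry is NaN (0 / 0 inside sqrt's backward)

def safe_sqrt(t, eps=1e-12):
    """sqrt with the input clamped away from 0 (eps is an assumed value)."""
    return torch.sqrt(torch.clamp(t, min=eps))

def pairwise_distances(mat, eps=1e-12):
    """Euclidean distances between all rows of mat, avoiding sqrt(0)."""
    r = mat @ mat.t()
    sq = r.diag().unsqueeze(0)
    d2 = sq + sq.t() - 2 * r
    return safe_sqrt(d2, eps)

def inter_separation_safe(means):
    """Smallest distance between two distinct class means."""
    dist = pairwise_distances(means)
    # Exclude the diagonal explicitly instead of relying on `> 0`.
    off_diag = ~torch.eye(dist.shape[0], dtype=torch.bool, device=dist.device)
    return dist[off_diag].min()

torch.manual_seed(0)
means = torch.randn(5, 10, requires_grad=True)
inter_separation_safe(means).backward()
grad_safe = means.grad  # finite everywhere
```

The same `safe_sqrt` could replace `torch.sqrt` inside `intra_spread`. Whether this removes every NaN in the full training run has not been verified here.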
