Programming Q&A   Published: 2022-06-01   Source site: 大佬教程  code.js-code.com

How do I save the weights of a trained model in PyTorch?


I wrote a YOLO implementation with PyTorch in Colab. Training works, but I cannot save the model's weights: I keep getting a "Ran out of input" error. Is there a fix?

Code:
"""
Main file for training the YOLO model on the Pascal VOC dataset
"""

import sys
import time

import torch
import torchvision.transforms as transforms
import torch.optim as optim
import torchvision.transforms.functional as FT  # for image transforms
from tqdm import tqdm  # for the progress bar

from torch.utils.data import DataLoader
from model import Yolov1
from dataset import VOCDataset
from utils import (
    non_max_suppression,
    mean_average_precision,
    intersection_over_union,
    cellboxes_to_boxes,
    get_bboxes,
    plot_image,
    save_checkpoint,
    load_checkpoint,
)
from loss import YoloLoss

seed = 123
torch.manual_seed(seed)  # same dataset loading on every run

# Hyperparameters etc.
LEARNING_RATE = 2e-5
DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
BATCH_SIZE = 16
WEIGHT_DECAY = 0  # 0 because we train on a single batch only, i.e. no regularization
EPOCHS = 1000
NUM_WORKERS = 2
PIN_MEMORY = True
LOAD_MODEL = True
LOAD_MODEL_FILE = "/content/drive/MyDrive/YolO 5 adım/archive/best.pt"  # the overfitted model
IMG_DIR = "/content/drive/MyDrive/YolO 5 adım/archive/images"
LABEL_DIR = "/content/drive/MyDrive/YolO 5 adım/archive/labels"


class Compose(object):
    # The transforms are applied to the image only, because all we do is resize.
    def __init__(self, transforms):
        self.transforms = transforms

    def __call__(self, img, bboxes):
        for t in self.transforms:
            img, bboxes = t(img), bboxes

        return img, bboxes


# Resize, then convert to a tensor.
transform = Compose([transforms.Resize((448, 448)), transforms.ToTensor()])


def train_fn(train_loader, model, optimizer, loss_fn):
    # One training pass over the whole dataset.
    loop = tqdm(train_loader, leave=True)  # progress bar, as in the video
    mean_loss = []

    for batch_idx, (x, y) in enumerate(loop):
        x, y = x.to(DEVICE), y.to(DEVICE)
        out = model(x)
        loss = loss_fn(out, y)
        mean_loss.append(loss.item())
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        # Update the progress bar so it shows the loss of each batch.
        loop.set_postfix(loss=loss.item())

    print(f"Mean loss was {sum(mean_loss)/len(mean_loss)}")


def main():
    model = Yolov1(split_size=7, num_boxes=2, num_classes=20).to(DEVICE)
    optimizer = optim.Adam(
        model.parameters(), lr=LEARNING_RATE, weight_decay=WEIGHT_DECAY
    )
    loss_fn = YoloLoss()

    if LOAD_MODEL:
        # I get the error on the next line.
        load_checkpoint(torch.load(LOAD_MODEL_FILE), optimizer)

    train_dataset = VOCDataset(
        # Use train.csv for the full dataset; here we took only 100 examples.
        "/content/drive/MyDrive/YolO 5 adım/archive/8examples.csv",
        transform=transform,
        img_dir=IMG_DIR,
        label_dir=LABEL_DIR,
    )

    # Test split
    test_dataset = VOCDataset(
        "/content/drive/MyDrive/YolO 5 adım/archive/test.csv",
        transform=transform,
    )

    train_loader = DataLoader(
        dataset=train_dataset,
        batch_size=BATCH_SIZE,
        num_workers=NUM_WORKERS,
        pin_memory=PIN_MEMORY,
        shuffle=True,
        # Set to False while trying 8 examples: use False when everything fits
        # in a single batch, True when training on many examples.
        drop_last=False,
    )

    test_loader = DataLoader(
        dataset=test_dataset,
        drop_last=True,
    )

    for epoch in range(EPOCHS):
        # Visualization only: shows the predictions image by image.
        # To run this part, LOAD_MODEL in the hyperparameters above must be True.
        for x, y in train_loader:
            x = x.to(DEVICE)
            for idx in range(8):
                bboxes = cellboxes_to_boxes(model(x))
                bboxes = non_max_suppression(
                    bboxes[idx], iou_threshold=0.5, threshold=0.4, box_format="midpoint"
                )
                plot_image(x[idx].permute(1, 2, 0).to("cpu"), bboxes)

            sys.exit()

        pred_boxes, target_boxes = get_bboxes(train_loader, threshold=0.4)

        mean_avg_prec = mean_average_precision(
            pred_boxes, target_boxes, box_format="midpoint"
        )
        print(f"Train mAP: {mean_avg_prec}")  # printed for every epoch

        if mean_avg_prec > 0.9:
            checkpoint = {
                "state_dict": model.state_dict(),
                "optimizer": optimizer.state_dict(),
            }
            save_checkpoint(checkpoint, filename=LOAD_MODEL_FILE)  # this is what saves the model
            time.sleep(10)

        train_fn(train_loader, model, optimizer, loss_fn)


if __name__ == "__main__":
    main()

https://colab.research.google.com/drive/1l6Z86Qk8qu7Oo7fV6YNaWmJRFB9NPjCv?usp=sharing


I get the error at this line (shown in the image).

# save_checkpoint and load_checkpoint functions from utils
def save_checkpoint(state, filename="my_checkpoint.pth.tar"):
    print("=> Saving checkpoint")
    torch.save(state, filename)


def load_checkpoint(checkpoint, optimizer):
    print("=> Loading checkpoint")
    model.load_state_dict(checkpoint["state_dict"])
    optimizer.load_state_dict(checkpoint["optimizer"])

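Solution

I see that your code follows the implementation by the YouTuber "Aladdin Persson".

In that implementation, load_checkpoint takes the model as an explicit argument and looks like this: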

def load_checkpoint(checkpoint, model, optimizer):
    print("=> Loading checkpoint")
    model.load_state_dict(checkpoint["state_dict"])
    optimizer.load_state_dict(checkpoint["optimizer"])
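
As background on the message itself: "Ran out of input" is the text of the EOFError that Python's pickle module raises when it is asked to read an empty file. torch.load builds on pickle, so an empty or never-written checkpoint file is a likely trigger, but that link is an assumption on my part and not something the question confirms. A minimal sketch using only the standard library (the helper try_unpickle and the file names are made up for illustration):

```python
import os
import pickle
import tempfile


def try_unpickle(path):
    """Return the unpickled object, or a description of the EOFError raised."""
    try:
        with open(path, "rb") as f:
            return pickle.load(f)
    except EOFError as exc:
        return f"EOFError: {exc}"


tmp_dir = tempfile.mkdtemp()

# Hypothetical stand-in for a checkpoint file that exists but holds no data.
empty_path = os.path.join(tmp_dir, "empty_checkpoint.pth.tar")
open(empty_path, "wb").close()

# Hypothetical stand-in for a checkpoint that was actually written.
saved_path = os.path.join(tmp_dir, "saved_checkpoint.pth.tar")
with open(saved_path, "wb") as f:
    pickle.dump({"state_dict": {}, "optimizer": {}}, f)

empty_result = try_unpickle(empty_path)
saved_result = try_unpickle(saved_path)

print(empty_result)   # the empty file cannot be unpickled
print(saved_result)   # the written file round-trips to the original dict
```

If the same thing is happening here, the failure is in reading the file passed to torch.load, before load_checkpoint itself does any work.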

When you train the model for the first time, set LOAD_MODEL = False. Only once a checkpoint has been saved under the name "overfit.pth.tar" can you load it.
I tested the code and it runs.
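
One way to make the "save first, load later" rule hard to get wrong is to check the checkpoint file before loading it. The sketch below is only an illustration: checkpoint_is_usable and should_load are hypothetical helpers that appear neither in the question nor in the original tutorial code, and the file names are placeholders.

```python
import os
import tempfile


def checkpoint_is_usable(path):
    """Return True if `path` is an existing, non-empty file."""
    return os.path.isfile(path) and os.path.getsize(path) > 0


def should_load(load_model, path):
    """Load only when loading was requested AND a usable checkpoint is on disk."""
    return bool(load_model) and checkpoint_is_usable(path)


# Demonstration with throwaway files (paths are illustrative only).
demo_dir = tempfile.mkdtemp()
missing = os.path.join(demo_dir, "overfit.pth.tar")  # never saved
empty = os.path.join(demo_dir, "empty.pth.tar")      # created, but 0 bytes
open(empty, "wb").close()
saved = os.path.join(demo_dir, "saved.pth.tar")      # stands in for a real save
with open(saved, "wb") as f:
    f.write(b"checkpoint bytes")

print(should_load(True, missing))  # False: nothing has been saved yet
print(should_load(True, empty))    # False: file exists but is empty
print(should_load(True, saved))    # True
print(should_load(False, saved))   # False: loading was not requested

# In main() this could replace the bare `if LOAD_MODEL:` check, for example:
#     if should_load(LOAD_MODEL, LOAD_MODEL_FILE):
#         load_checkpoint(torch.load(LOAD_MODEL_FILE), model, optimizer)
```

With a guard like this, the first run trains from scratch even if LOAD_MODEL is left at True, and later runs pick up the checkpoint once it exists.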
