Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking arugment for argument mat1 in method wrapper_addmm)
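To detect tools, I trained a Faster R-CNN. I defined my model and everything worked fine. To get cleaner code without global variables, I then tried to write a class, Mymodel, that automatically defines every object and trains the model. Inside this class I define a dataset, self.dataset = ToolDataset.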
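In that dataset class I define my input (an image) and my output (a target, which is a dictionary with bbox, labels, area, ...). Then I build a data loader (so I have self.data_loader) and use the train_one_epoch function from the engine library. I pass this function my model (a Faster R-CNN), my data loader, and the device, which is cuda:0 (I printed it). The function iterates over my data loader: it builds a list of images and a list of targets and moves the values of both lists to the right device. Then it calls model(images, targets). At this step I get the error about two devices (I paste the error at the end of this message).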
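Even though every tensor (my images and every value of the target dictionaries) returns True for tensor.is_cuda, I still get the error. So I really do not understand why the error says I also have a cpu device. Below are my train method, the train_one_epoch function, and my images and targets variables: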
The train method:
import gc

import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
from tqdm import tqdm

def train(self, num_epoch=10, gpu=True):
    if gpu:
        # Note: this only assigns a local variable; it does not set the environment variable.
        CUDA_LAUNCH_BLOCKING = "1"
    # torch.set_default_tensor_type(torch.FloatTensor)
    model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)
    use_cuda = torch.cuda.is_available()
    device = torch.device("cuda:0" if use_cuda else "cpu")
    # Move the pretrained model to the device
    model.to(device)
    if self.multi_object_detection == False:
        num_classes = 2  # ['Tool', 'background']
    else:
        print("need to set a multi object detection code")
    in_features = torch.tensor(model.roi_heads.box_predictor.cls_score.in_features, dtype=torch.int64).to(device)
    print("in_features = {}".format(in_features))
    # Replace the box predictor head
    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)
    print("model.roi_heads.box_predictor {}".format(model.roi_heads.box_predictor))
    model_parameters = filter(lambda p: p.requires_grad, model.parameters())
    # params = sum([np.prod(p.size()) for p in model_parameters])
    params = [p for p in model.parameters() if p.requires_grad]
    optimizer = torch.optim.SGD(params, lr=0.001, momentum=0.9, weight_decay=0.0005)
    lr_scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.5)
    gc.collect()
    num_epochs = 5
    file_model_dict_gpu = "model_state_Dict__gpu_lab2_and_lab7_5epoch.pth"
    list_of_list_losses = []
    print("device = ", device)
    if self.data_loader.dataset is None:
        self.build_DataLoader(device)
    for epoch in tqdm(range(num_epochs)):
        # Train for one epoch, printing every 10 iterations
        train_his_, list_losses, list_losses_dict = train_one_epoch(model, optimizer, self.data_loader, device, epoch, print_freq=10)
        list_of_list_losses.append(list_losses)
        # Compute losses over the validation set
        # val_his_ = validate_one_epoch(model, val_data_loader, print_freq=10)
        # Update the learning rate
        print("lr before update : ", lr_scheduler)
        lr_scheduler.step()
        print("lr after update : ", lr_scheduler)
        # Store loss values to plot learning curves afterwards.
        if epoch == 0:
            train_history = {k: [v] for k, v in train_his_.items()}
            # val_history = {k: [v] for k, v in val_his_.items()}
        else:
            for k, v in train_his_.items():
                train_history[k] += [v]
            # for k, v in val_his_.items(): val_history[k] += [v]
        # The model could be saved inside the loop by adding a criterion: if the validation loss decreases
        # torch.save(model, save_path)
    torch.cuda.empty_cache()
    gc.collect()
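One thing worth checking in the method above is the order of operations: the box predictor is replaced after model.to(device) has already run, and a freshly constructed nn.Module starts out on the cpu unless it is moved explicitly. The following is a minimal sketch of the other ordering (replace the head first, then move the whole model), not a confirmed fix for this exact setup. TinyDetector is a hypothetical stand-in for Faster R-CNN so that the snippet runs quickly and without a GPU; only standard torch.nn calls are used.

```python
import torch
from torch import nn


class TinyDetector(nn.Module):
    """Hypothetical stand-in for Faster R-CNN: a backbone plus a replaceable head."""

    def __init__(self):
        super().__init__()
        self.backbone = nn.Linear(8, 4)
        self.head = nn.Linear(4, 3)

    def forward(self, x):
        return self.head(self.backbone(x))


device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

model = TinyDetector()
# Keep in_features as a plain int; nn.Linear takes an int here, not a tensor.
in_features = model.head.in_features
# Replace the head first ...
model.head = nn.Linear(in_features, 2)
# ... then move everything, so the new head is moved together with the rest.
model.to(device)

# Every parameter should now report the same device.
devices = {p.device for p in model.parameters()}
out = model(torch.randn(5, 8, device=device))
print(devices, out.shape)
```

In the method above, the equivalent change would be to call model.to(device) after assigning model.roi_heads.box_predictor, or to move the new predictor itself with .to(device).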
The train_one_epoch function (I print some information, shown in the output at the end of this message):
import math
import sys

from tqdm import tqdm

def train_one_epoch(model, optimizer, data_loader, device, epoch, print_freq):
    model.train()
    metric_logger = utilss.MetricLogger(delimiter=" ")
    metric_logger.add_meter('lr', utilss.SmoothedValue(window_size=1, fmt='{value:.6f}'))
    header = 'Epoch: [{}]'.format(epoch)
    list_losses = []
    list_losses_dict = []
    for i, values in tqdm(enumerate(metric_logger.log_every(data_loader, print_freq, header))):
        images, targets = values
        for image in images:
            print("before the to(device) operation, image.is_cuda = {}".format(image.is_cuda))
        # Move every image and every target value to the device
        images = list(image.to(device, dtype=torch.float) for image in images)
        targets = [{k: v.to(device) for k, v in t.items()} for t in targets]
        # images = [image.cuda() for image in images]
        for image in images:
            print(image)
            print("after the to(device) operation, image.is_cuda = {}".format(image.is_cuda))
        for target in targets:
            for t, dict_value in target.items():
                print("after the to(device) operation, dict_value.is_cuda = {}".format(dict_value.is_cuda))
        print("images = {}".format(images))
        print("targets = {}".format(targets))
        # Feed the training samples to the model and compute the losses
        loss_dict = model(images, targets)
        losses = sum(loss for loss in loss_dict.values())
        # Reduce losses over all GPUs for logging purposes
        loss_dict_reduced = utilss.reduce_dict(loss_dict)
        losses_reduced = sum(loss for loss in loss_dict_reduced.values())
        loss_value = losses_reduced.item()
        print("Loss is {}, stopping training".format(loss_value))
        if not math.isfinite(loss_value):
            print("Loss is {}, stopping training".format(loss_value))
            print(loss_dict_reduced)
            sys.exit(1)
        list_losses.append(loss_value)
        # Reset the gradients accumulated by the optimizer
        optimizer.zero_grad()
        # Compute gradients by backpropagation
        losses.backward()
        # Update the parameters using the current gradients
        optimizer.step()
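The loop above moves each image and each target value to the device by hand. The same step can be factored into one recursive helper, sketched below under the assumption that anything with a .to method should be moved. The names move_to_device and FakeTensor are hypothetical; FakeTensor only stands in for torch.Tensor so the snippet runs without PyTorch or a GPU. A helper like this covers the inputs only and says nothing about where the model's own weights live.

```python
def move_to_device(obj, device):
    """Recursively move tensor-like objects nested in lists, tuples, and dicts."""
    if hasattr(obj, "to"):
        return obj.to(device)
    if isinstance(obj, dict):
        return {k: move_to_device(v, device) for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        return type(obj)(move_to_device(v, device) for v in obj)
    # Leave anything else (ints, strings, None) untouched.
    return obj


class FakeTensor:
    """Hypothetical stand-in for torch.Tensor, tracking only a device name."""

    def __init__(self, device="cpu"):
        self.device = device

    def to(self, device):
        # Like Tensor.to, return a new object rather than mutating in place.
        return FakeTensor(device)


images = [FakeTensor(), FakeTensor()]
targets = [{"boxes": FakeTensor(), "labels": FakeTensor()}]

images, targets = move_to_device((images, targets), "cuda:0")
print([img.device for img in images], targets[0]["boxes"].device)
```

With real tensors the call is the same, for example images, targets = move_to_device((images, targets), device).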
Here are my output and the error (including my images and targets):
in_features = 1024
model.roi_heads.box_predictor FastRCNNPredictor(
  (cls_score): Linear(in_features=1024, out_features=2, bias=True)
  (bbox_pred): Linear(in_features=1024, out_features=8, bias=True)
)
device = cuda:0
100%|██████████| 515/515 [00:00<00:00,112118.06it/s]
100%|██████████| 761/761 [00:00<00:00,111005.96it/s]
0%| | 0/5 [00:00<?,?it/s]
0it [00:00,?it/s]
before the to(device) operation, image.is_cuda = True
tensor([[[0.0078,0.0078,...,0.0000,0.0000],[0.0078,0.0118,0.0118],[0.0235,0.0235,0.0235],[0.0353,0.0353,0.0314,0.0314]],[[0.0078,0.0039,0.0039],0.0157,0.0157],0.0235]],0.0078],0.0196,0.0196],0.0275,0.0275]]],device='cuda:0')
after the to(device) operation, image.is_cuda = True
after the to(device) operation, dict_value.is_cuda = True
after the to(device) operation, dict_value.is_cuda = True
images = [tensor([[[0.0078,device='cuda:0')]
targets = [{'boxes': tensor([[1118.8964,1368.9186,399.3243],[1043.0958,111.4863,1332.4319,426.1295]],device='cuda:0',dtype=torch.float64),'labels': tensor([1,1],device='cuda:0'),'index': tensor([311],'area': tensor([99839.9404,91037.6485],'iscrowd': tensor([0],device='cuda:0')}]
/home/nathaneberrebi/anaconda3/lib/python3.8/site-packages/torch/nn/functional.py:718: UserWarning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at /opt/conda/conda-bld/pytorch_1623448278899/work/c10/core/TensorImpl.h:1156.)
return torch.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode)
0it [00:02,?it/s]
0%| | 0/5 [00:02<?,?it/s]
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-15-51a35da5b1fe> in <module>
----> 1 class_model.train()
<ipython-input-7-d44d099a7743> in train(self, num_epoch, gpu)
144
145 # Train for one epoch, printing every 10 iterations
--> 146 train_his_, list_losses, list_losses_dict = train_one_epoch(model, optimizer, self.data_loader, device, epoch, print_freq=10)
147 list_of_list_losses.append(list_losses)
148 # Compute losses over the validation set
<ipython-input-6-347c12a81a2f> in train_one_epoch(model, optimizer, data_loader, device, epoch, print_freq)
519
520 # Feed the training samples to the model and compute the losses
--> 521 loss_dict = model(images, targets)
522 losses = sum(loss for loss in loss_dict.values())
523
~/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
1049 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
1050 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1051 return forward_call(*input, **kwargs)
1052 # Do not call functions when jit is used
1053 full_backward_hooks, non_full_backward_hooks = [], []
~/anaconda3/lib/python3.8/site-packages/torchvision/models/detection/generalized_rcnn.py in forward(self, images, targets)
95 features = OrderedDict([('0', features)])
96 proposals, proposal_losses = self.rpn(images, features, targets)
---> 97 detections, detector_losses = self.roi_heads(features, proposals, images.image_sizes, targets)
98 detections = self.transform.postprocess(detections, images.image_sizes, original_image_sizes)
99
~/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
-> 1051 return forward_call(*input, **kwargs)
~/anaconda3/lib/python3.8/site-packages/torchvision/models/detection/roi_heads.py in forward(self, features, proposals, image_shapes, targets)
752 box_features = self.box_roi_pool(features, proposals, image_shapes)
753 box_features = self.box_head(box_features)
--> 754 class_logits, box_regression = self.box_predictor(box_features)
755
756 result: List[Dict[str, torch.Tensor]] = []
~/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
-> 1051 return forward_call(*input, **kwargs)
~/anaconda3/lib/python3.8/site-packages/torchvision/models/detection/faster_rcnn.py in forward(self, x)
280 assert list(x.shape[2:]) == [1, 1]
281 x = x.flatten(start_dim=1)
--> 282 scores = self.cls_score(x)
283 bbox_deltas = self.bbox_pred(x)
284
~/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
-> 1051 return forward_call(*input, **kwargs)
~/anaconda3/lib/python3.8/site-packages/torch/nn/modules/linear.py in forward(self, input)
94
95 def forward(self, input: Tensor) -> Tensor:
---> 96 return F.linear(input, self.weight, self.bias)
97
98 def extra_repr(self) -> str:
~/anaconda3/lib/python3.8/site-packages/torch/nn/functional.py in linear(input, weight, bias)
1845 if has_torch_function_variadic(input, weight):
1846 return handle_torch_function(linear, (input, weight), input, weight, bias=bias)
-> 1847 return torch._C._nn.linear(input, weight, bias)
1848
1849
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking arugment for argument mat1 in method wrapper_addmm)
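The last frames of the traceback are inside the cls_score layer of the box predictor, which suggests that the cpu tensor may be a model weight rather than one of the inputs. One way to check is to group the model's named parameters by device. The sketch below assumes only that each item has a .device attribute; the helper name tensors_by_device is hypothetical, and the SimpleNamespace objects and the parameter names in the demo are illustrative stand-ins so the snippet runs without PyTorch or a GPU.

```python
from collections import defaultdict
from types import SimpleNamespace


def tensors_by_device(named_tensors):
    """Group names by the string form of each tensor's device."""
    groups = defaultdict(list)
    for name, tensor in named_tensors:
        groups[str(tensor.device)].append(name)
    return dict(groups)


# Illustrative stand-ins: one weight on the GPU, one left on the cpu.
fake_named_parameters = [
    ("backbone.body.conv1.weight", SimpleNamespace(device="cuda:0")),
    ("roi_heads.box_predictor.cls_score.weight", SimpleNamespace(device="cpu")),
]

groups = tensors_by_device(fake_named_parameters)
for device_name, names in groups.items():
    print(device_name, names)
```

With the real model, the call would be tensors_by_device(model.named_parameters()); buffers can be checked the same way with model.named_buffers(). If more than one device shows up, the names listed under cpu point at the modules that still need to be moved.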
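Thank you very much for your help; I have been stuck on this problem for a while. Because of the same error, I also cannot torch.jit.trace my previous model (the one from before I tried to clean up my code with a class that builds every object automatically from a single sequence of function calls). I need to fix this so I can use the model from C++ code. Let me know if you need more information.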
Here is my torch environment:
PyTorch version: 1.9.0
Is debug build: False
CUDA used to build PyTorch: 11.1
ROCM used to build PyTorch: N/A
OS: Ubuntu 20.04.2 LTS (x86_64)
GCC version: (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
Clang version: Could not collect
CMake version: Could not collect
libc version: glibc-2.31
Python version: 3.8 (64-bit runtime)
Python platform: Linux-5.8.0-59-generic-x86_64-with-glibc2.10
Is CUDA available: True
CUDA runtime version: Could not collect
GPU models and configuration: GPU 0: GeForce RTX 3060 Laptop GPU
Nvidia driver version: 460.80
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Versions of relevant libraries:
[pip3] numpy==1.20.2
[pip3] numpydoc==1.1.0
[pip3] torch==1.9.0
[pip3] torchaudio==0.9.0a0+33b2469
[pip3] torchvision==0.10.0
[conda] blas 1.0 mkl
[conda] cudatoolkit 11.1.74 h6bb024c_0 nvidia
[conda] mkl 2021.2.0 h06a4308_296
[conda] mkl-service 2.4.0 py38h497a2fe_0 conda-forge
[conda] mkl_fft 1.3.0 py38h42c9631_2
[conda] mkl_random 1.2.2 py38h1abd341_0 conda-forge
[conda] numpy 1.18.5 pypi_0 pypi
[conda] numpy-base 1.20.2 py38hfae3a4d_0
[conda] numpydoc 1.1.0 py_1 conda-forge
[conda] pytorch 1.9.0 py3.8_cuda11.1_cudnn8.0.5_0 pytorch
[conda] torch 1.9.0 pypi_0 pypi
[conda] torchaudio 0.9.0 py38 pytorch
[conda] torchvision 0.10.0 py38_cu111 pytorch