Programming Q&A | Published: 2022-06-01 | Site: 大佬教程 code.js-code.com

How can I resolve performance/memory problems when porting PyTorch code to TensorFlow (GPU)?

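I am trying to port an operation that uses convolution kernels to transform time series for downstream tasks (classification, for example). The following code is written in PyTorch and works on the GPU: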

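import numpy as np
import torch
from torch import nn
# ifnone, default_device and progress_bar are not defined in the snippet;
# they are assumed here to be the fastcore / fastai / fastprogress helpers.
from fastcore.basics import ifnone
from fastai.torch_core import default_device
from fastprogress.fastprogress import progress_bar

class ROCKET(nn.Module):
    def __init__(self, c_in, seq_len, n_kernels=10_000, kss=[7, 9, 11], device=None, verbose=False):
        super().__init__()
        device = ifnone(device, default_device())
        kss = [ks for ks in kss if ks < seq_len]
        convs = nn.ModuleList()
        for i in range(n_kernels):
            # Random kernel size, dilation and (optional) padding for each kernel
            ks = np.random.choice(kss)
            dilation = 2**np.random.uniform(0, np.log2((seq_len - 1) // (ks - 1)))
            padding = int((ks - 1) * dilation // 2) if np.random.randint(2) == 1 else 0
            # Conv1d weights have shape (out_channels, in_channels, kernel_size)
            weight = torch.randn(1, c_in, ks)
            weight -= weight.mean()
            bias = 2 * (torch.rand(1) - .5)
            layer = nn.Conv1d(c_in, 1, ks, padding=2 * padding, dilation=int(dilation), bias=True)
            # The kernels are fixed, so no gradients are needed
            layer.weight = torch.nn.Parameter(weight, requires_grad=False)
            layer.bias = torch.nn.Parameter(bias, requires_grad=False)
            convs.append(layer)
        self.convs = convs
        self.n_kernels = n_kernels
        self.kss = kss
        self.to(device=device)
        self.verbose = verbose

    def forward(self, X):
        _output = []
        for i in progress_bar(range(self.n_kernels), display=self.verbose, leave=False, comment='kernel/kernels'):
            out = self.convs[i](X).cpu()
            # Two features per kernel: the maximum and the proportion of positive values
            _max = out.max(dim=-1)[0]
            _ppv = torch.gt(out, 0).sum(dim=-1).float() / out.shape[-1]
            _output.append(_max)
            _output.append(_ppv)
        return torch.cat(_output, dim=1)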
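
To make the per-kernel computation concrete, here is a minimal framework-free sketch in plain Python. It is not the asker's code: the function names, the series and the kernel values are made up for illustration, and padding is left out. It shows how one dilated kernel slides over a single-channel series and how the two features (`_max` and `_ppv` in the code above) are derived from the result.

```python
def dilated_conv1d(series, weights, bias, dilation):
    """Unpadded 1-D cross-correlation of a single-channel series with one kernel."""
    span = (len(weights) - 1) * dilation  # distance between the first and last tap
    out = []
    for start in range(len(series) - span):
        taps = (series[start + j * dilation] for j in range(len(weights)))
        out.append(sum(w * x for w, x in zip(weights, taps)) + bias)
    return out


def rocket_features(series, weights, bias, dilation):
    """Return (max, ppv) for one kernel."""
    out = dilated_conv1d(series, weights, bias, dilation)
    ppv = sum(1 for v in out if v > 0) / len(out)  # proportion of positive values
    return max(out), ppv


# Hypothetical input: a series of squares and a zero-mean kernel of length 3.
series = [0.0, 1.0, 4.0, 9.0, 16.0, 25.0, 36.0]
weights = [-1.0, 0.0, 1.0]
print(dilated_conv1d(series, weights, bias=-20.0, dilation=2))  # [-4.0, 4.0, 12.0]
print(rocket_features(series, weights, bias=-20.0, dilation=2))
```

Without padding, the output has `len(series) - (len(weights) - 1) * dilation` elements, which is why the largest dilation is bounded by `(seq_len - 1) // (ks - 1)` in the constructor.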

So far, my approach in TensorFlow (GPU) looks like this and performs the same computation:

import numpy as np
import tensorflow as tf
from tqdm import trange

class ROCKET():
    def __init__(self, c_in, seq_len, n_kernels=10_000, kss=[7, 9, 11]):
        kss = [ks for ks in kss if ks < seq_len]
        weights = []
        biases = []
        dilations = []
        for i in (t := trange(n_kernels)):
            ks = np.random.choice(kss)
            dilation = 2**np.random.uniform(0, np.log2((seq_len - 1) // (ks - 1)))

            # conv1d filters have shape [filter_width, in_channels, out_channels]
            weight = tf.random.normal([ks, c_in, 1], dtype=tf.double)
            weight -= tf.math.reduce_mean(weight)
            weight = tf.Variable(weight)

            bias = tf.Variable(2 * (tf.random.normal([1], dtype=tf.double) - .5), dtype=tf.double)

            weights.append(weight)
            biases.append(bias)
            # tf.nn.conv1d expects an integer dilation rate
            dilations.append(int(dilation))

            t.set_description("set kernels")

        self.weights = weights
        self.biases = biases
        self.dilations = dilations
        self.n_kernels = n_kernels

    def forward(self, X):
        _output = []
        for i in (t := trange(self.n_kernels)):
            weight = self.weights[i]
            bias = self.biases[i]
            dilation = self.dilations[i]

            # No data_format is given, so the default layout is used
            tensor = tf.nn.conv1d(X, filters=weight, stride=1, padding='VALID', dilations=dilation)
            tensor = tf.nn.bias_add(tensor, bias)

            _max = tf.Variable(tf.math.reduce_max(tensor, axis=-1))

            temp = tf.cast(tf.math.greater(tensor, 0), tf.double)
            _ppv = tf.math.reduce_sum(temp, axis=-1) / tensor.shape[-1]

            _output.append(_max)
            _output.append(_ppv)

            t.set_description("apply kernels")

        return tf.concat(_output, axis=1)

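However, this approach runs much more slowly (the constructor alone takes 30 seconds).

In addition, my GPU quickly runs out of memory (6 GB of VRAM), so I can only use it on small datasets.

Am I missing some straightforward performance gain? I also noticed that the computation is executed by my CPU and the results are merely stored in GPU memory.

Would using tf.placeholders help?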

Solution

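So I managed to solve my problem. It turned out that I was simply passing the wrong data_format. I had assumed that the default format of tf.nn.conv1d was "NCW", as in PyTorch's implementation of the layer; it is actually the opposite, and the default is "NWC". The constructor call had taken so long because of tqdm... (it now takes ~1 second, as expected).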
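The layout mismatch can be illustrated without either framework. The following is only a sketch with hypothetical helper names (`ncw_to_nwc`, `shape`) operating on nested lists: a batch stored PyTorch-style as (N, C, W) has the same numbers as its (N, W, C) counterpart, but a routine that expects "NWC" would read the series length as the channel count.

```python
def ncw_to_nwc(batch):
    """Swap the last two axes of a nested list shaped (N, C, W) -> (N, W, C)."""
    return [[list(step) for step in zip(*sample)] for sample in batch]


def shape(nested):
    """Shape of a regular nested list, e.g. [[[1, 2]]] -> (1, 1, 2)."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)


# One univariate series of length 5, stored PyTorch-style as (N=1, C=1, W=5).
batch_ncw = [[[0.1, 0.2, 0.3, 0.4, 0.5]]]
batch_nwc = ncw_to_nwc(batch_ncw)

print(shape(batch_ncw))  # (1, 1, 5): read as NWC, this is 1 time step with 5 channels
print(shape(batch_nwc))  # (1, 5, 1): 5 time steps with 1 channel
```

In TensorFlow the same axis swap can be done with `tf.transpose(X, perm=[0, 2, 1])`; passing `data_format='NCW'` to the convolution avoids the transpose altogether.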

Finally, here is the working code:

import numpy as np
import tensorflow as tf
from tqdm import trange

class ROCKET():
    def __init__(self, c_in, seq_len, n_kernels=10_000, kss=[7, 9, 11]):
        kss = [ks for ks in kss if ks < seq_len]
        weights = []
        biases = []
        dilations = []
        for i in range(n_kernels):
            ks = np.random.choice(kss)
            dilation = 2**np.random.uniform(0, np.log2((seq_len - 1) // (ks - 1)))

            # The input is NCW; conv1d filters are [filter_width, in_channels, out_channels]
            weight = tf.random.normal([ks, c_in, 1], dtype=tf.float32)
            weight -= tf.math.reduce_mean(weight)

            # Note: the PyTorch version draws the bias from a uniform distribution (torch.rand)
            bias = 2 * (tf.random.normal([1], dtype=tf.float32) - .5)

            # Plain tensors are stored instead of tf.Variable objects
            weights.append(weight)
            biases.append(bias)
            # tf.nn.conv1d expects an integer dilation rate
            dilations.append(int(dilation))

        self.weights = weights
        self.biases = biases
        self.dilations = dilations
        self.n_kernels = n_kernels

    #@tf.function(input_signature=[tf.TensorSpec(shape=None,dtype=tf.float32)])
    def apply_kernels(self, X):
        _output = []
        for i in (t := trange(self.n_kernels)):
            #ks = self.ks[i]
            bias = self.biases[i]
            dilation = self.dilations[i]
            weight = self.weights[i]

            # X has shape (batch, channels, width), so data_format must be 'NCW'
            tensor = tf.nn.conv1d(X, filters=weight, stride=1, padding='VALID', data_format='NCW', dilations=dilation)
            tensor = tf.nn.bias_add(tensor, bias, data_format='NCW')

            _max = tf.math.reduce_max(tensor, axis=-1)

            temp = tf.cast(tf.math.greater(tensor, 0), tf.float32)
            _ppv = tf.math.reduce_sum(temp, axis=-1) / tensor.shape[-1]

            _output.append(_max)
            _output.append(_ppv)

            t.set_description("apply kernels")

        return tf.concat(_output, axis=1)

Thanks to @Lescurel, I also improved performance by removing the tf.Variables.
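
The working code also uses tf.float32 where the first attempt used tf.double. As a rough back-of-the-envelope sketch of what that means for memory, the plain-Python helpers below (hypothetical names, hypothetical dataset sizes) estimate the size of the concatenated feature matrix and of a single kernel's convolution output. This is simple arithmetic on tensor shapes, not a measurement of actual GPU usage, which also depends on the framework's allocator and on how long intermediate tensors are kept alive.

```python
BYTES_PER_ELEMENT = {"float32": 4, "float64": 8}


def feature_matrix_bytes(n_samples, n_kernels, dtype="float32"):
    """Size of the concatenated output: two features (max, ppv) per kernel."""
    return n_samples * 2 * n_kernels * BYTES_PER_ELEMENT[dtype]


def conv_output_bytes(n_samples, seq_len, ks, dilation, dtype="float32"):
    """Size of one kernel's unpadded convolution output (one output channel)."""
    out_len = seq_len - (ks - 1) * dilation
    return n_samples * out_len * BYTES_PER_ELEMENT[dtype]


# Hypothetical dataset: 1,000 series of length 500, transformed with 10,000 kernels.
for dtype in ("float64", "float32"):
    mib = feature_matrix_bytes(1_000, 10_000, dtype) / 2**20
    print(f"{dtype}: feature matrix is about {mib:.1f} MiB")
```

Under these assumptions, halving the element size halves every one of these tensors, so the dtype is worth checking alongside the data_format when memory is tight.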
