Programming Q&A. Published: 2022-05-31. Source site: 大佬教程 code.js-code.com

How can I speed up this R loop?

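I need to speed up the nested loop below. Scores linked to item IDs are recorded by date. For each item that has more than one score, I need to associate the scores with the time distance between them. On toy data like the example below it works fine, but when the test data is replaced with tens of thousands of rows it becomes too slow to be useful. Is there a better way to do the same thing?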

# create some simulated data: 6 records, 2 per item ID
test <- matrix(1:18,byrow=TRUE,nrow=6)
test[,1] <- c(1,2,1,3,2,3)
test[,2] <- c(70,92,62,90,85,82)
# assigning character dates converts the whole matrix to character
test[,3] <- c("2019-01-01","2019-01-01","2020-01-01","2019-01-01","2020-01-01","2020-01-01")
colnames(test) <- c("ID","score","Date")
test <- data.frame(test)
test$Date <- as.Date(test$Date)

library(lubridate)  # provides time_length()

# create a dataframe to hold all the post-loop data
df <- data.frame(matrix(ncol = 4,nrow = 0))
col_names <- c("ID","Years","Beginscore","Endscore")

# get all the unique item IDs
IDs <- unique(test$ID)

# loop through each unique item ID
for(i in seq_along(IDs))
{
   # get all the instances of that single item
   item <- test[test$ID == IDs[i],]
   # create a data frame to hold the data, one row per consecutive pair
   scores <- data.frame(matrix(1:((nrow(item)-1)*4),nrow=nrow(item)-1))
   colnames(scores) <- col_names

   # create an index, starting at the last row (because the real data is ordered by date)
   index <- nrow(item)
   # loop through the instances of the single item and fill in row j
   for(j in 1:(nrow(item)-1))
   {
     scores$Years[j] <- time_length(item[index,3]-item[(index -1),3],"years")
     scores$Beginscore[j] <- item[(index-1),2]
     scores$Endscore[j] <- item[index,2]
     scores$ID[j] <- item[index,1]
     index <- index - 1
   }
   # bind the single item to the collected data and then loop to next unique item
   df <- rbind(df,scores)
}

Solution

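A for loop is not the right tool for this kind of operation. Creating an empty matrix or data frame in R and then filling it step by step is also very inefficient.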
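
To illustrate the point about filling a data frame step by step, here is a minimal base R sketch that builds one small data frame per ID and binds them all in a single call. The names `scores_df` and `pair_scores` are hypothetical, the toy values only mirror the question's columns, and the year length of 365.25 days is an assumption chosen to match the outputs shown in this thread.

```r
# Toy data mirroring the question's columns (values are illustrative).
scores_df <- data.frame(
  ID    = c(1, 2, 1, 3, 2, 3),
  score = c(70, 92, 62, 90, 85, 82),
  Date  = as.Date(c("2019-01-01", "2019-01-01", "2020-01-01",
                    "2019-01-01", "2020-01-01", "2020-01-01"))
)

# Hypothetical helper: one row per consecutive pair of scores within one item.
pair_scores <- function(item) {
  item <- item[order(item$Date), ]
  n <- nrow(item)
  if (n < 2) return(NULL)  # items with a single score yield no pairs
  data.frame(
    ID         = item$ID[-1],
    Years      = as.numeric(diff(item$Date)) / 365.25,  # days -> years (assumed 365.25)
    BeginScore = item$score[-n],
    EndScore   = item$score[-1]
  )
}

# Build every piece first, then bind once instead of calling rbind() per iteration.
pieces <- lapply(split(scores_df, scores_df$ID), pair_scores)
result <- do.call(rbind, pieces)
rownames(result) <- NULL
```

Binding once avoids copying the accumulated result on every iteration, which is where a loop that repeatedly calls `rbind()` tends to lose time as the data grows.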

Tens of thousands of rows is not a lot of data. You could try this dplyr approach.

library(dplyr)
library(lubridate)

test %>%
  mutate(Date = as.Date(Date)) %>%
  group_by(ID) %>%
  # keep the last two records of each ID: second-to-last = begin, last = end
  summarise(BeginScore = nth(score, n() - 1),
            EndScore = last(score),
            Years = time_length(last(Date) - nth(Date, n() - 1), 'years'))

#  ID    BeginScore EndScore Years
#  <chr> <chr>      <chr>    <dbl>
#1 1     70         62       0.999
#2 2     92         85       0.999
#3 3     90         82       0.999

Using data.table and lubridate:

library(data.table)
library(lubridate)

# convert the data frame to a data.table by reference
setDT(test)

df <- test[,.(Years = time_length(max(Date) - min(Date),"years"),BeginScore = max(score),EndScore = min(score)),by = ID]

which produces:

  ID     Years BeginScore EndScore
1  1 0.9993155         70       62
2  2 0.9993155         92       85
3  3 0.9993155         90       82
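
The value 0.9993155 is consistent with 365 days divided by a 365.25-day year, which appears to be the convention `time_length()` applies to a plain time difference. A quick base R check of that arithmetic, offered as a sketch rather than a statement about lubridate internals:

```r
# Days between the two toy dates, using base R only.
days <- as.numeric(as.Date("2020-01-01") - as.Date("2019-01-01"))
years <- days / 365.25  # assumed year length of 365.25 days
round(years, 7)
# 0.9993155
```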

Edited to add:

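Because min / max will not work when there are several records with the same ID (they ignore the order of the records), the following code can be used instead. It takes the last two records of each ID: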

test[,.(Years = time_length(Date[.N] - Date[.N - 1],"years"),BeginScore = score[.N - 1],EndScore = score[.N]),by = ID]
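
Both answers return one row per ID, built from the last two records. The loop in the question appears to pair every consecutive score within an ID, so an item with three scores would give two rows. A package-free, vectorised sketch of that reading follows; `records` and `pairs` are hypothetical names, the extra third score for ID 1 is made up for illustration, and the 365.25-day year is an assumption.

```r
# Toy data: ID 1 has three scores, IDs 2 and 3 have two each.
records <- data.frame(
  ID    = c(1, 2, 1, 3, 2, 3, 1),
  score = c(70, 92, 62, 90, 85, 82, 65),
  Date  = as.Date(c("2019-01-01", "2019-01-01", "2020-01-01",
                    "2019-01-01", "2020-01-01", "2020-01-01", "2021-01-01"))
)

# Sort so that each ID's records are contiguous and in date order.
records <- records[order(records$ID, records$Date), ]
n <- nrow(records)

# Row k and row k + 1 form a pair only when they belong to the same ID.
same_id <- records$ID[-1] == records$ID[-n]

pairs <- data.frame(
  ID         = records$ID[-1][same_id],
  Years      = (as.numeric(diff(records$Date)) / 365.25)[same_id],
  BeginScore = records$score[-n][same_id],
  EndScore   = records$score[-1][same_id]
)
```

No per-ID loop or split is needed here: comparing each row with its neighbour after sorting does the grouping, which should scale comfortably to tens of thousands of rows.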
