程序问答   发布时间:2022-06-01  发布网站:大佬教程  code.js-code.com
大佬教程收集整理的这篇文章主要介绍了SQL/BigQuery:如何避免将多个非连续成员分组?大佬教程大佬觉得挺不错的,现在分享给大家,也给大家做个参考。

如何解决SQL/BigQuery:如何避免将多个非连续成员分组??

开发过程中遇到SQL/BigQuery:如何避免将多个非连续成员分组?的问题如何解决?下面主要结合日常开发的经验,给出你关于SQL/BigQuery:如何避免将多个非连续成员分组?的解决方法建议,希望对你解决SQL/BigQuery:如何避免将多个非连续成员分组?有所启发或帮助;

我遇到了一个我自己似乎无法解决的问题。我按位置和时间戳对行进行分组,并为实体保持静止的实例查找第一个和最后一个时间戳。问题是,对于我当前的代码,当实体返回到它之前所在的位置时,sql 会将行组合在一起。

在我的示例中,实体在 2020-05-24 05:22:00 位于位置 -66.89 10.5002,然后在 2020-05-24 11:13:00 返回该位置。当前查询的结果使它看起来好像该实体一直在该位置,尽管中间的行清楚地显示它已移动。这是一个概念性问题,我真的不知道如何在 sql 中解决。我正在 Big query 中执行此操作,但我记得在 sql Server 中遇到了类似的问题。

代码:

with SELEction as (
SELEct 1 as ID,timestAMP '2020-05-24 11:13:00' as timestamp_,'POINT(-66.89 10.5002)' as geom
union all SELEct
1,timestAMP '2020-05-24 05:22:00','POINT(-66.89 10.5002)'
union all SELEct
1,timestAMP '2020-05-24 05:25:00','POINT(-66.8881 10.4994)'
union all SELEct
1,timestAMP '2020-05-24 09:14:00','POINT(-66.8888 10.4958)'
union all SELEct
1,timestAMP '2020-05-24 07:37:00 UTC','POINT(-66.8873 10.5)'
union all SELEct
1,timestAMP'2020-05-24 07:52:00 UTC','POINT(-66.8873 10.5)'
)

SELEct ID,timestamp_,geom,first_value(timestamp_)
    OVER (PARTITION BY ID,geom ORDER BY timestamp_ ASC ROWS betweeN UNBOUNDED PRECEDING AND UNBOUNDED FolLOWING) AS interval_start,last_value(timestamp_)
    OVER (PARTITION BY ID,geom ORDER BY timestamp_ ASC ROWS betweeN UNBOUNDED PRECEDING AND UNBOUNDED FolLOWING) AS interval_end,FROM
SELEction order by ID,timestamp_

结果。注意第一行和最后一行的 interval_start 和 interval_end

ID 时间戳_ geom interval_start interval_end
1 2020-05-24 05:22:00 UTC 点(-66.89 10.5002) 2020-05-24 05:22:00 UTC 2020-05-24 11:13:00 UTC
1 2020-05-24 05:25:00 UTC 点(-66.8881 10.4994) 2020-05-24 05:25:00 UTC 2020-05-24 05:25:00 UTC
1 2020-05-24 07:37:00 UTC 点(-66.8873 10.5) 2020-05-24 07:37:00 UTC 2020-05-24 07:52:00 UTC
1 2020-05-24 07:52:00 UTC 点(-66.8873 10.5) 2020-05-24 07:37:00 UTC 2020-05-24 07:52:00 UTC
1 2020-05-24 09:14:00 UTC 点(-66.8888 10.4958) 2020-05-24 09:14:00 UTC 2020-05-24 09:14:00 UTC
1 2020-05-24 11:13:00 UTC 点(-66.89 10.5002) 2020-05-24 05:22:00 UTC 2020-05-24 11:13:00 UTC

想要的结果:

ID 时间戳_ geom interval_start interval_end
1 2020-05-24 05:22:00 UTC 点(-66.89 10.5002) 2020-05-24 05:22:00 UTC 2020-05-24 05:22:00 UTC
1 2020-05-24 05:25:00 UTC 点(-66.8881 10.4994) 2020-05-24 05:25:00 UTC 2020-05-24 05:25:00 UTC
1 2020-05-24 07:37:00 UTC 点(-66.8873 10.5) 2020-05-24 07:37:00 UTC 2020-05-24 07:52:00 UTC
1 2020-05-24 07:52:00 UTC 点(-66.8873 10.5) 2020-05-24 07:37:00 UTC 2020-05-24 07:52:00 UTC
1 2020-05-24 09:14:00 UTC 点(-66.8888 10.4958) 2020-05-24 09:14:00 UTC 2020-05-24 09:14:00 UTC
1 2020-05-24 11:13:00 UTC 点(-66.89 10.5002) 2020-05-24 11:13:00 UTC 2020-05-24 11:13:00 UTC

解决方法

虑以下

with SELEction as (
  SELEct 1 as id,timestAMP '2020-05-24 11:13:00' as timestamp_,'POINT(-66.89 10.5002)' as geom union all SELEct
  1,timestAMP '2020-05-24 05:22:00','POINT(-66.89 10.5002)' union all SELEct
  1,timestAMP '2020-05-24 05:25:00','POINT(-66.8881 10.4994)' union all SELEct
  1,timestAMP '2020-05-24 09:14:00','POINT(-66.8888 10.4958)' union all SELEct
  1,timestAMP '2020-05-24 07:37:00 UTC','POINT(-66.8873 10.5)' union all SELEct
  1,timestAMP'2020-05-24 07:52:00 UTC','POINT(-66.8873 10.5)'
),pregrouped_SELEction as (
  SELEct id,timestamp_,geom,countif(flag) over(partition by id order by timestamp_) grp
  from (
    SELEct id,geom != ifnull(lag(geom) over(partition by id order by timestamp_),geom) flag,from SELEction 
  )
  order by id,timestamp_
)
SELEct id,first_value(timestamp_)
    OVER (PARTITION BY id,grp ORDER BY timestamp_ ASC ROWS betweeN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS interval_start,last_value(timestamp_)
    OVER (PARTITION BY id,grp ORDER BY timestamp_ ASC ROWS betweeN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS interval_end,FROM
pregrouped_SELEction order by id,timestamp_    

带输出

SQL/BigQuery:如何避免将多个非连续成员分组?

如您所见 - 我几乎 100% 保留了您的原始查询 - 只是将 geom 替换为 grp 内的 over() 语句 AND 来自 pregrouped_SELEction,它确实计算了组数字 - grp

,

您可以使用窗口函数检查是否至少有两个不同的值:

@H_3_9@min(geom) over (partition by id) <> max(geom) over (partition by id) as has_moved,

大佬总结

以上是大佬教程为你收集整理的SQL/BigQuery:如何避免将多个非连续成员分组?全部内容,希望文章能够帮你解决SQL/BigQuery:如何避免将多个非连续成员分组?所遇到的程序开发问题。

如果觉得大佬教程网站内容还不错,欢迎将大佬教程推荐给程序员好友。

本图文内容来源于网友网络收集整理提供,作为学习参考使用,版权属于原作者。
如您有任何意见或建议可联系处理。小编QQ:384754419,请注明来意。
标签: