Get bin ranges from a temporary table in SQL
I have a question related to my previous one.
What I have is a database that looks like this:
category price date
-------------------------
Cat1 37 2019-03
Cat2 65 2019-03
Cat3 34 2019-03
Cat1 45 2019-03
Cat2 100 2019-03
Cat3 60 2019-03
This db has hundreds of categories, and it comes from another one that has different attributes for each observation.
Using this code:
WITH table AS
(
    SELECT
        category, price, date,
        substr(date, 1, 4) AS year,
        substr(date, 6, 2) AS month
    FROM
        original_table
    WHERE
        (year = "2019" OR year = "2020")
        AND (month = "03")
        AND product = "XXXXX"
    ORDER BY
        Anno
)
-- I get this from a bigger table, but prefer to make small steps
-- that anyone in the future can understand where this comes from as
-- the original table is expected to grow fast
SELECT
    category,
    round(1.0 * next_price / price - 1, 2) Pct_change,
    SUBSTR(Date, 1, 4) || '-' || SUBSTR(next_date, 1, 4) Period,
    tipo_establecimiento
FROM
    (SELECT
        *,
        LEAD(Price) OVER (PARTITION BY category ORDER BY year) next_price,
        LEAD(year) OVER (PARTITION BY category ORDER BY year) next_date,
        CASE
            WHEN (category_2 >= 35) AND (category_2 <= 61)
            THEN 'S'
            ELSE 'N'
        END 'tipo_establecimiento'
    FROM
        table)
WHERE
    next_date IS NOT NULL AND Pct_change >= 0
ORDER BY
    Pct_change DESC
This code gives me a view of the data that looks like this:
category  Pct_change  period
cat1      0.21        2019-2020
cat2      0.53        2019-2020
cat3      0.76        2019-2020
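For anyone who wants to reproduce this step, here is a minimal sketch of the percentage-change calculation, run against an in-memory SQLite database through Python's built-in sqlite3 module. The table name prices, the helper pct_changes and the assumption that the second price of each category was observed in 2020-03 are illustrative only and do not come from the real database. Window functions such as LEAD need SQLite 3.25 or later.

```python
import sqlite3

# Hypothetical sample: prices from the question, assuming the second
# observation of each category belongs to 2020-03.
ROWS = [
    ("Cat1", 37, "2019-03"), ("Cat2", 65, "2019-03"), ("Cat3", 34, "2019-03"),
    ("Cat1", 45, "2020-03"), ("Cat2", 100, "2020-03"), ("Cat3", 60, "2020-03"),
]

QUERY = """
WITH base AS (
    -- keep only the March observations and split the date into its parts
    SELECT category, price,
           substr(date, 1, 4) AS year,
           substr(date, 6, 2) AS month
    FROM prices
    WHERE substr(date, 6, 2) = '03'
),
paired AS (
    -- LEAD pairs each row with the next year's row of the same category
    SELECT category, price, year,
           LEAD(price) OVER (PARTITION BY category ORDER BY year) AS next_price,
           LEAD(year)  OVER (PARTITION BY category ORDER BY year) AS next_year
    FROM base
)
SELECT category,
       round(1.0 * next_price / price - 1, 2) AS pct_change,
       year || '-' || next_year AS period
FROM paired
WHERE next_year IS NOT NULL
ORDER BY pct_change DESC
"""


def pct_changes(rows):
    """Load the rows into an in-memory table and return (category, pct_change, period)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE prices (category TEXT, price REAL, date TEXT)")
    conn.executemany("INSERT INTO prices VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


for row in pct_changes(ROWS):
    print(row)
```

Splitting the work into two named CTEs keeps each step small, which matches the "small steps" intent stated in the query comments.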
This is great! But my next view has to take this one and give me ranges, showing how many categories fall in each range.
It should look like this:
range avg num_cat_in
[0.1- 0.4] 0.3 3
This last table is only an example of what I expect.
I have been trying code that looks like this, but I get nothing:
WITH table AS (
    SELECT category, price, date, substr(date, 1, 4) AS year, substr(date, 6, 2) AS month
    FROM original_table
    WHERE (year = "2019" OR year = "2020") AND (month = "03") AND product = "XXXXX"
    ORDER BY Anno
)
-- I get this from a bigger table, but prefer to make small steps that anyone in the future can understand where this comes from as the original table is expected to grow fast
SELECT category, round(1.0 * next_price / price - 1, 2) Pct_change, SUBSTR(Date, 1, 4) || '-' || SUBSTR(next_date, 1, 4) Period, tipo_establecimiento
FROM (
    SELECT *,
        LEAD(Price) OVER (PARTITION BY category ORDER BY year) next_price,
        LEAD(year) OVER (PARTITION BY category ORDER BY year) next_date,
        CASE
            WHEN (category_2 >= 35) AND (category_2 <= 61)
            THEN 'S'
            ELSE 'N'
        END 'tipo_establecimiento'
    FROM table
)
WHERE next_date IS NOT NULL AND Pct_change >= 0
ORDER BY Pct_change DESC
WHERE next_date IS NOT NULL AND Pct_change >= 0
)
SELECT
    count(CASE WHEN Pct_change > 0.12 AND Pct_change <= 0.22 THEN 1 END) AS [12 - 22],
    count(CASE WHEN Pct_change > 0.22 AND Pct_change <= 0.32 THEN 1 END) AS [22 - 32],
    count(CASE WHEN Pct_change > 0.32 AND Pct_change <= 0.42 THEN 1 END) AS [32 - 42],
    count(CASE WHEN Pct_change > 0.42 AND Pct_change <= 0.52 THEN 1 END) AS [42 - 52],
    count(CASE WHEN Pct_change > 0.52 AND Pct_change <= 0.62 THEN 1 END) AS [52 - 62],
    count(CASE WHEN Pct_change > 0.62 AND Pct_change <= 0.72 THEN 1 END) AS [62 - 72],
    count(CASE WHEN Pct_change > 0.72 AND Pct_change <= 0.82 THEN 1 END) AS [72 - 82]
Thanks!!!
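See my comment. I first assume that your ranges are not hard-coded and that you want to split your data evenly across the quantiles of Pct_change. This means the calculation finds the ranges that divide the sample as evenly as possible. In that case the following will work (where theview is the name of the view in which you previously computed the percentages):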
SELECT
    concat('[', min(Pct_change), '-', max(Pct_change), ']') AS `range`,
    avg(Pct_change) AS `avg`,
    count(*) AS num_cat_in
FROM (
    SELECT *, ntile(5) OVER (ORDER BY Pct_change) AS bin
    FROM theview
) t
GROUP BY bin
ORDER BY bin;
Here is a fiddle.
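As a self-contained check of the quantile idea, the sketch below runs it in SQLite through Python's sqlite3 module. It is an adaptation rather than the exact statement above: string concatenation uses || instead of concat, the percentages loaded into theview are made up, and the helper name quantile_bins is hypothetical. It needs SQLite 3.25 or later for ntile.

```python
import sqlite3

# Made-up percentage changes standing in for the contents of theview.
SAMPLE = [0.05, 0.12, 0.21, 0.25, 0.33, 0.41, 0.53, 0.58, 0.76, 0.90]

QUERY = """
SELECT '[' || min(pct_change) || '-' || max(pct_change) || ']' AS "range",
       round(avg(pct_change), 3) AS "avg",
       count(*) AS num_cat_in
FROM (
    -- ntile(5) numbers the rows 1..5 so that the bins hold (almost) equal counts
    SELECT pct_change, ntile(5) OVER (ORDER BY pct_change) AS bin
    FROM theview
) AS t
GROUP BY bin
ORDER BY bin
"""


def quantile_bins(values):
    """Return one (range, avg, num_cat_in) row per quantile bin."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE theview (pct_change REAL)")
    conn.executemany("INSERT INTO theview VALUES (?)", [(v,) for v in values])
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


for row in quantile_bins(SAMPLE):
    print(row)
```

With ten values and five bins, every bin receives two rows. When the row count is not divisible by the number of bins, ntile gives the extra rows to the lowest-numbered bins.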
If, on the other hand, your ranges are hard-coded, I assume they are stored in a table, which I create here:
create table theranges (lower DOUBLE,upper DOUBLE);
insert into theranges values (0,0.2),(0.2,0.4),(0.4,0.6),(0.6,0.8),(0.8,1);
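(You must make sure the ranges do not overlap. By convention, each range covers the percentages from its lower bound, included, up to its upper bound, excluded, except for the upper bound 1, which is included.) It is then a matter of left-joining the tables: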
SELECT
    concat('[', lower, '-', upper, ']') AS `range`,
    avg(Pct_change) AS `avg`,
    sum(if(Pct_change IS NULL, 0, 1)) AS num_cat_in
FROM theranges LEFT JOIN theview ON (Pct_change >= lower AND if(upper = 1, true, Pct_change < upper))
GROUP BY lower, upper
ORDER BY lower;
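(Note that in the bit that says upper=1, you must change 1 to the upper bound of your highest hard-coded range; here I assume your percentages lie between 0 and 1.)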
Here is a second fiddle.
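The hard-coded variant can be tried the same way. The sketch below is a SQLite adaptation run through Python's sqlite3 module, under a few stated assumptions: the percentages in theview are invented, count(pct_change) stands in for sum(if(...)) because count already skips NULLs, and the literal upper = 1 is replaced by a lookup of the largest upper bound so that the top range does not have to be edited by hand. The helper name range_counts is hypothetical.

```python
import sqlite3

# The ranges from the answer and made-up percentage changes for theview.
RANGES = [(0, 0.2), (0.2, 0.4), (0.4, 0.6), (0.6, 0.8), (0.8, 1)]
SAMPLE = [0.05, 0.12, 0.21, 0.25, 0.33, 0.53, 0.58, 1.0]

QUERY = """
SELECT '[' || r.lower || '-' || r.upper || ']' AS "range",
       avg(v.pct_change) AS "avg",
       count(v.pct_change) AS num_cat_in   -- counts only matched (non-NULL) rows
FROM theranges AS r
LEFT JOIN theview AS v
       ON v.pct_change >= r.lower
      -- the top range gets no upper check, so its upper bound is included
      -- (any value above it would also land there, as in the original join)
      AND (r.upper = (SELECT max(upper) FROM theranges)
           OR v.pct_change < r.upper)
GROUP BY r.lower, r.upper
ORDER BY r.lower
"""


def range_counts(ranges, values):
    """Return one (range, avg, num_cat_in) row per range, empty ranges included."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE theranges (lower REAL, upper REAL)")
    conn.execute("CREATE TABLE theview (pct_change REAL)")
    conn.executemany("INSERT INTO theranges VALUES (?, ?)", ranges)
    conn.executemany("INSERT INTO theview VALUES (?)", [(v,) for v in values])
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


for row in range_counts(RANGES, SAMPLE):
    print(row)
```

Because of the left join, a range with no matching percentage still appears in the output, with a count of 0 and a NULL average.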