大佬教程收集整理的这篇文章主要介绍了hive学习笔记之四:分区表,大佬教程大佬觉得挺不错的,现在分享给大家,也给大家做个参考。
https://github.com/zq2599/blog_demos
内容:所有原创文章分类汇总及配套源码,涉及Java、Docker、Kubernetes、DevOPS等;
本文是《hive学习笔记》系列的第四篇,要学习的是hive的分区表,简单来说hive的分区就是创建层级目录的一种方式,处于同一分区的记录其实就是数据在同一个子目录下,分区一共有两种:静态和动态,接下来逐一尝试;
先尝试用单个字段分区,t9表有三个字段:名称city、年龄age、城市city,以城市作为分区字段:
@R_@R_197_11263@_2102@ t9 (name String, agE int)
partitioned by (city String)
row format delimited
fields terminated by ',';
hive> desc t9;
OK
name String
age int
city String
# Partition Information
# col_name data_type comment
city String
Time taken: 0.159 seconds, Fetched: 8 row(s)
tom,11
jerry,12
load data
local inpath '/home/hadoop/temp/202010/25/009.txt'
into table t9
partition(city='shenzhen');
load data
local inpath '/home/hadoop/temp/202010/25/009.txt'
into table t9
partition(city='guangzhou');
hive> SELEct * FROM t9;
OK
t9.name t9.age t9.city
tom 11 guangzhou
jerry 12 guangzhou
tom 11 shenzhen
jerry 12 shenzhen
Time taken: 0.104 seconds, Fetched: 4 row(s)
[hadoop@node0 bin]$ ./hadoop fs -ls /user/hive/warehouse/t9/city=guangzhou
Found 1 items
-rwxr-xr-x 3 hadoop supergroup 16 2020-10-31 16:47 /user/hive/warehouse/t9/city=guangzhou/009.txt
[hadoop@node0 bin]$ ./hadoop fs -cat /user/hive/warehouse/t9/city=guangzhou/009.txt
tom,11
jerry,12
[hadoop@node0 bin]$
以上就是以单个字段做静态分区的实践,接下来尝试多字段分区;
@R_@R_197_11263@_2102@ t10 (name String, agE int)
partitioned by (province String, city String)
row format delimited
fields terminated by ',';
3. 第一次导入,province='shanxi', city='xian':
load data
local inpath '/home/hadoop/temp/202010/25/009.txt'
into table t10
partition(province='shanxi', city='xian');
load data
local inpath '/home/hadoop/temp/202010/25/009.txt'
into table t10
partition(province='shanxi', city='hanzhong');
load data
local inpath '/home/hadoop/temp/202010/25/009.txt'
into table t10
partition(province='guangdong', city='guangzhou');
load data
local inpath '/home/hadoop/temp/202010/25/009.txt'
into table t10
partition(province='guangdong', city='shenzhen');
hive> SELEct * FROM t10;
OK
t10.name t10.age t10.province t10.city
tom 11 guangdong guangzhou
jerry 12 guangdong guangzhou
tom 11 guangdong shenzhen
jerry 12 guangdong shenzhen
tom 11 shanxi hanzhong
jerry 12 shanxi hanzhong
tom 11 shanxi xian
jerry 12 shanxi xian
Time taken: 0.129 seconds, Fetched: 8 row(s)
[hadoop@node0 bin]$ ./hadoop fs -cat /user/hive/warehouse/t10/province=shanxi/city=hanzhong/009.txt
tom,11
jerry,12
set hive.exec.dynamic.partition=true
set hive.exec.dynamic.partition.mode=noStrict;
create external table t11 (name String, agE int, province String, city String)
row format delimited
fields terminated by ','
LOCATIOn '/data/external_t11';
tom,11,guangdong,guangzhou
jerry,12,guangdong,shenzhen
tony,13,shanxi,xian
john,14,shanxi,hanzhong
load data
local inpath '/home/hadoop/temp/202010/25/011.txt'
into table t11;
@R_@R_197_11263@_2102@ t12 (name String, agE int)
partitioned by (province String, city String)
row format delimited
fields terminated by ',';
insert overwrite table t12
partition(province, city)
SELEct name, age, province, city from t11;
[hadoop@node0 bin]$ ./hadoop fs -cat /user/hive/warehouse/t12/province=guangdong/city=guangzhou/000000_0
tom,11
微信搜索「程序员欣宸」,我是欣宸,期待与您一同畅游Java世界... https://github.com/zq2599/blog_demos
以上是大佬教程为你收集整理的hive学习笔记之四:分区表全部内容,希望文章能够帮你解决hive学习笔记之四:分区表所遇到的程序开发问题。
如果觉得大佬教程网站内容还不错,欢迎将大佬教程推荐给程序员好友。
本图文内容来源于网友网络收集整理提供,作为学习参考使用,版权属于原作者。
如您有任何意见或建议可联系处理。小编QQ:384754419,请注明来意。