This article introduces Hive SerDes: what serialization and deserialization mean, and how a SerDe is declared on a table.
SerDe is short for Serializer/Deserializer, combining the two words "serialize" and "deserialize". So what are serialization and deserialization?
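When processes communicate remotely, they can send each other data of any type, but whatever the type, it travels over the network as a binary sequence. The sender must convert an object into a byte sequence before it can be transmitted, which is called serializing the object; the receiver must restore the object from the byte sequence, which is called deserializing it.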
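The round trip described above can be sketched in a few lines of Python. This is only an illustration: the `LogRecord` class and the choice of JSON as the byte format are assumptions made for the example and are not part of Hive.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class LogRecord:
    # Hypothetical record type, used only for this illustration.
    host: str
    status: int


def serialize(record: LogRecord) -> bytes:
    """Sender side: turn the object into a byte sequence."""
    return json.dumps(asdict(record)).encode("utf-8")


def deserialize(payload: bytes) -> LogRecord:
    """Receiver side: rebuild the object from the byte sequence."""
    return LogRecord(**json.loads(payload.decode("utf-8")))


record = LogRecord(host="127.0.0.1", status=200)
payload = serialize(record)      # bytes that could be sent over a network
restored = deserialize(payload)  # an equal object on the receiving side
```

Only `payload` would cross the network; `restored` compares equal to `record` because every field survives the round trip.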
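In Hive, deserialization turns the key/value records read from storage into the value of each column of a Hive table. Because the SerDe interprets the data at read time, Hive can load data into a table without converting it first, which saves a great deal of time when processing massive data sets.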
What is a SerDe?
· RegexSerDe
CREATE TABLE apachelog (
  host STRING,
  identity STRING,
  user STRING,
  time STRING,
  request STRING,
  status STRING,
  size STRING,
  referer STRING,
  agent STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
  "input.regex" = "([^ ]*) ([^ ]*) ([^ ]*) (-|\\[[^\\]]*\\]) ([^ \"]*|\"[^\"]*\") (-|[0-9]*) (-|[0-9]*)(?: ([^ \"]*|\".*\") ([^ \"]*|\".*\"))?"
)
STORED AS TEXTFILE;
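To see what a regex-based SerDe does with one line of input, the sketch below applies the same pattern in Python, written in its usual escaped form, and maps each capture group to a column in order. The sample log line, the `deserialize_line` helper, and the choice to return all-`None` columns for a non-matching line are assumptions made for the illustration, not Hive's implementation.

```python
import re

# Column names in the same order as the capture groups.
COLUMNS = ["host", "identity", "user", "time", "request",
           "status", "size", "referer", "agent"]

# The access-log pattern; the last two groups (referer, agent) are optional.
INPUT_REGEX = re.compile(
    r'([^ ]*) ([^ ]*) ([^ ]*) (-|\[[^\]]*\]) ([^ "]*|"[^"]*") '
    r'(-|[0-9]*) (-|[0-9]*)(?: ([^ "]*|".*") ([^ "]*|".*"))?'
)


def deserialize_line(line):
    """Map one log line to a {column: value} dict, one capture group per column."""
    match = INPUT_REGEX.fullmatch(line)
    if match is None:
        # Assumption for this sketch: a non-matching line yields empty columns.
        return dict.fromkeys(COLUMNS)
    return dict(zip(COLUMNS, match.groups()))


# Hypothetical sample line in common log format (no referer or agent).
sample = ('127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
          '"GET /apache_pb.gif HTTP/1.0" 200 2326')
row = deserialize_line(sample)
```

Every matched value stays a string, which is why all columns in the table definition are declared as STRING.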
· JsonSerDe
ADD JAR /usr/lib/hive-hcatalog/lib/hive-hcatalog-core.jar;
CREATE TABLE my_table(a STRING, b BIGINT, ...)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
STORED AS TEXTFILE;
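Conceptually, a JSON SerDe reads one JSON object per line and picks out the fields whose names match the table's columns. The Python sketch below mimics that idea for the two named columns `a` and `b`; the sample lines and the rule that a missing key becomes `None` are assumptions made for the illustration.

```python
import json

# Columns of the hypothetical table: a STRING, b BIGINT.
COLUMNS = ("a", "b")


def deserialize_json_line(line):
    """Turn one JSON text line into a tuple of column values, by column name."""
    obj = json.loads(line)
    # Assumption for this sketch: a key missing from the JSON becomes None.
    return tuple(obj.get(name) for name in COLUMNS)


lines = ['{"a": "x", "b": 42}', '{"a": "y"}']
rows = [deserialize_json_line(line) for line in lines]
```

Unlike the regex example, the values keep their JSON types here, so `b` comes back as an integer rather than a string.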
· CSVSerDe
CREATE TABLE my_table(a STRING, b STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
  "separatorChar" = "\t",
  "quoteChar" = "'",
  "escapeChar" = "\\"
)
STORED AS TEXTFILE;
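The three properties (separator, quote, and escape character) can be tried out with Python's standard `csv` module, which accepts the same three settings. This is only an approximation of how such a file is split into columns, not the SerDe's own parser, and the sample lines are made up for the illustration.

```python
import csv


def deserialize_csv_lines(lines):
    """Split text lines into column values using tab, single quote, and backslash."""
    reader = csv.reader(
        lines,
        delimiter="\t",    # corresponds to separatorChar
        quotechar="'",     # corresponds to quoteChar
        escapechar="\\",   # corresponds to escapeChar
    )
    return [row for row in reader]


# Hypothetical input: the second line has an escaped quote and a quoted tab.
sample = [
    "'hello world'\tplain",
    "'it\\'s'\t'tab\tinside'",
]
rows = deserialize_csv_lines(sample)
```

The quote character protects a separator that appears inside a field, and the escape character protects a quote that appears inside a quoted field.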
· ORCSerDe
ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
As of Hive 0.14, a registration mechanism has been introduced for native Hive SerDes. It lets a single "STORED AS" keyword stand in for the full {SerDe, InputFormat, OutputFormat} triplet in CREATE TABLE statements.
The following mappings have been added through this registration mechanism:
Syntax | Equivalent
---|---
STORED AS AVRO / STORED AS AVROFILE | ROW FORMAT SERDE …
STORED AS ORC / STORED AS ORCFILE | ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde' STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
STORED AS PARQUET / STORED AS PARQUETFILE | ROW FORMAT SERDE …
STORED AS RCFILE | STORED AS INPUTFORMAT … OUTPUTFORMAT …
STORED AS SEQUENCEFILE | STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.SequenceFileInputFormat' OUTPUTFORMAT …
STORED AS TEXTFILE | STORED AS INPUTFORMAT … OUTPUTFORMAT …