This work was contributed to the Apache Hive project. See details here.
This SerDe adds real CSV input and output support to Hive using the excellent opencsv library.
add jar path/to/csv-serde.jar;
create table my_table(a string, b string, ...)
row format serde 'com.bizo.hive.serde.csv.CSVSerde'
stored as textfile
;
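Once the table exists, it can be loaded and queried like any other text-backed Hive table. The following is a minimal sketch, not taken from the project itself: the table name example_csv, its two columns, and the file path /tmp/example.csv are hypothetical placeholders.

```sql
-- Hypothetical two-column table backed by the CSV SerDe.
add jar path/to/csv-serde.jar;
create table example_csv(a string, b string)
row format serde 'com.bizo.hive.serde.csv.CSVSerde'
stored as textfile
;
-- Load a local CSV file (placeholder path) into the table.
load data local inpath '/tmp/example.csv' overwrite into table example_csv;
-- Fields are split by the SerDe at read time.
select a, b from example_csv limit 10;
```

The add jar statement has to run in every session that reads or writes the table, since Hive needs the SerDe class on its classpath.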
By default, the SerDe uses the separator, quote, and escape characters from the opencsv
library, plus a custom newline replacer defined by linewalks:
DEFAULT_NULLCHAR \u0000
DEFAULT_ESCAPE_CHARACTER \
DEFAULT_QUOTE_CHARACTER "
DEFAULT_SEPARATOR ,
You can also specify custom null, separator, quote, or escape characters.
add jar path/to/csv-serde.jar;
create table my_table(a string, b string, ...)
row format serde 'com.bizo.hive.serde.csv.CSVSerde'
with serdeproperties (
"nullChar" = "",
"separatorChar" = ",",
"quoteChar" = "'",
"escapeChar" = "\\"
)
stored as textfile
;
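As a further illustration, the sketch below overrides only the separator to read tab-delimited files. It assumes that properties left out of serdeproperties fall back to the defaults listed earlier; the table name example_tsv is a hypothetical placeholder.

```sql
-- Hypothetical table for tab-separated input; only the separator is overridden.
add jar path/to/csv-serde.jar;
create table example_tsv(a string, b string)
row format serde 'com.bizo.hive.serde.csv.CSVSerde'
with serdeproperties (
"separatorChar" = "\t"
)
stored as textfile
;
```

Note that the last property in the list takes no trailing comma.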
Run mvn package to build. Both a basic artifact and a "fat jar" (with opencsv) are produced.
Run mvn eclipse:eclipse to generate .project and .classpath files for Eclipse.
csv-serde is open source and licensed under the Apache 2 License.