Error loading CSV to table Error loading CSV to table hadoop hadoop

Error loading CSV to table


References:

Conform to Apache's DDL syntax for Create/Drop/Truncate Table, and, more specifically, to Row Formats & SerDe, in order to change the FIELDS/LINES/etc properties, you can use:

  • the native SerDe (if ROW FORMAT is not specified, or ROW FORMAT DELIMITED is set), or
  • a custom SerDe (by applying the SERDE clause).

Option 1: Using the native Serde

CREATE TABLE survey(country string, age string, industryType string, companyType string, occupation string, salary string, project string)ROW FORMAT DELIMITED     FIELDS         TERMINATED BY ","         ESCAPED BY "\\"    LINES TERMINATED BY "\n"STORED AS TEXTFILE;

Notice the lack of a clause which could specify the character to quote the field values.

Option 2: Using the custom Serde

See the record named "CSV/TSV" in the table from Row Formats & SerDe.

CREATE TABLE survey(country string, age string, industryType string, companyType string, occupation string, salary string, project string)ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'WITH SERDEPROPERTIES (    "separatorChar" = ",",    "quoteChar"     = "\"",    "escapeChar"    = "\\")     STORED AS TEXTFILE;

I suggest you to make a test with a CSV file containing one line of values. Inside the line enclose all column values into double quotes and see what happens. Beeing enclosed, the "," character (like the one in salary field) will/should be automatically escaped and, therefore, saved as part of the corresponding column value.

Good luck.