Hadoop/Hive : Loading data from .csv on a local machine

Let me work you through the following simple steps:

Steps:

First, create a table on hive using the field names in your csv file. Lets say for example, your csv file contains three fields (id, name, salary) and you want to create a table in hive called "staff". Use the below code to create the table in hive.

hive> CREATE TABLE Staff (id int, name string, salary double) row format delimited fields terminated by ',';

Second, now that your table is created in hive, let us load the data in your csv file to the "staff" table on hive.

hive>  LOAD DATA LOCAL INPATH '/home/yourcsvfile.csv' OVERWRITE INTO TABLE Staff;

Lastly, display the contents of your "Staff" table on hive to check if the data were successfully loaded

hive> SELECT * FROM Staff;

Thanks.

sql csv hadoop amazon-web-services hive

if you have a hive setup you can put the local dataset directly using Hive load command in hdfs/s3.

You will need to use "Local" keyword when writing your load command.

Syntax for hiveload command

LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename [PARTITION (partcol1=val1, partcol2=val2 ...)]

Refer below link for more detailed information.https://cwiki.apache.org/confluence/display/Hive/LanguageManual%20DML#LanguageManualDML-Loadingfilesintotables

sql csv hadoop amazon-web-services hive

There is another way of enabling this,

use hadoop hdfs -copyFromLocal to copy the .csv data file from your local computer to somewhere in HDFS, say '/path/filename'
enter Hive console, run the following script to load from the file to make it as a Hive table. Note that '\054' is the ascii code of 'comma' in octal number, representing fields delimiter.

CREATE EXTERNAL TABLE table name (foo INT, bar STRING) COMMENT 'from csv file' ROW FORMAT DELIMITED FIELDS TERMINATED BY '\054' STORED AS TEXTFILE LOCATION '/path/filename';

CodeHunter

Hadoop/Hive : Loading data from .csv on a local machine

Recent Posts

How can I color dots in a xy scatterplot according to column value?

How to update a claim in ASP.NET Identity?

What does {0} mean when initializing an object?

Accessing members of items in a JSONArray with Java

How to log SQL statements in Spring Boot?

Powershell Get-WebSite name parameter is ignored

How to detect scroll to bottom of html element

Java synchronized method

How to test controllers with CodeIgniter?

Detect Visual Composer

Matplotlib: Specify format of floats for tick labels

Rails join a list of strings with commas and "and" before the last