Installing Hive on Ubuntu


Step 1 : Download and Extract Hadoop
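Extracting the download can be sketched as below. The helper name extract_archive and the archive file name in the example call are placeholders, not something the original steps specify; use the name of the release you actually downloaded.

```shell
# Hypothetical helper: unpack a .tar.gz archive into a target directory.
extract_archive() {
    archive="$1"   # path to the downloaded archive (file name is an assumption)
    dest="$2"      # directory to unpack into; created if it does not exist
    mkdir -p "$dest" && tar -xzf "$archive" -C "$dest"
}

# Example call (placeholder file name, adjust to the release you downloaded):
# extract_archive "$HOME/Downloads/hadoop-x.y.z.tar.gz" "$HOME/opt"
```

The same helper should work for the Hive archive later on, assuming it is also a .tar.gz file.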

Step 2 : Set JAVA_HOME in conf/hadoop-env.sh //This step tells Hadoop where Java is installed
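One way to do this from the command line is sketched below. The helper set_java_home is hypothetical, and the JDK path in the example call is only a placeholder; check where Java lives on your own machine.

```shell
# Hypothetical helper: append an export line for JAVA_HOME to hadoop-env.sh.
set_java_home() {
    env_file="$1"    # e.g. conf/hadoop-env.sh
    java_home="$2"   # your JDK directory (an assumption, check your system)
    printf 'export JAVA_HOME=%s\n' "$java_home" >> "$env_file"
}

# Example call; the JDK path below is only a placeholder:
# set_java_home conf/hadoop-env.sh /usr/lib/jvm/your-jdk
```

Editing the file by hand works just as well; the point is only that hadoop-env.sh ends up with an export JAVA_HOME line.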

Step 3 : conf/core-site.xml:

<configuration>
  <property>
    <!-- Put your own NameNode host here if it is not localhost -->
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>

Step 4 : conf/hdfs-site.xml:

<configuration>
  <!-- Number of replicas kept for each file; 1 is enough on a single node, or add DataNodes to store more copies -->
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

Step 5 : conf/mapred-site.xml:

<configuration>
  <property>
    <!-- Replace localhost with your master host -->
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>
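If you would rather generate the three config files than edit them by hand, something like the following might do. The helper write_site_xml is hypothetical and overwrites the target file, so it only suits a fresh install where each file holds a single property as in Steps 3 to 5.

```shell
# Hypothetical helper: write a Hadoop *-site.xml file holding one property.
# Note: this overwrites the file if it already exists.
write_site_xml() {
    file="$1"; name="$2"; value="$3"
    cat > "$file" <<EOF
<configuration>
  <property>
    <name>$name</name>
    <value>$value</value>
  </property>
</configuration>
EOF
}

# Example calls, run from the Hadoop install directory:
# write_site_xml conf/core-site.xml   fs.default.name    hdfs://localhost:9000
# write_site_xml conf/hdfs-site.xml   dfs.replication    1
# write_site_xml conf/mapred-site.xml mapred.job.tracker localhost:9001
```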

Step 6 : Log in with ssh localhost and format a new distributed filesystem

bin/hadoop namenode -format

Step 7 : Start the hadoop daemons:

bin/start-all.sh
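Steps 6 and 7 can be wrapped into one small function, sketched here. The name format_and_start is made up for illustration; the function only runs the two commands shown above and refuses to start if bin/hadoop is missing. Formatting is meant for a fresh install, since it sets up a new, empty filesystem.

```shell
# Hypothetical helper: format the NameNode, then start all daemons.
format_and_start() {
    hadoop_dir="$1"   # directory where Hadoop was extracted
    if [ ! -x "$hadoop_dir/bin/hadoop" ]; then
        echo "bin/hadoop not found in $hadoop_dir" >&2
        return 1
    fi
    # Run both commands from the install directory; stop if formatting fails.
    ( cd "$hadoop_dir" && bin/hadoop namenode -format && bin/start-all.sh )
}

# Example call (placeholder directory):
# format_and_start "$HOME/opt/hadoop"
```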

Step 8 : Check the NameNode and JobTracker web interfaces at the ports below

http://localhost:50070/       //NameNode
http://localhost:50030/       //JobTracker

// It is also a good idea to try ssh to each node to check that your worker nodes are reachable
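The port check in Step 8 can also be done from a terminal. This is a minimal sketch that assumes curl is installed; the helper name ui_reachable is hypothetical. It only tests whether something answers HTTP on each port, not whether the daemons are healthy.

```shell
# Hypothetical helper: succeed if an HTTP endpoint answers within 5 seconds.
ui_reachable() {
    curl -s -o /dev/null --max-time 5 "$1"
}

# Probe the two web interfaces from Step 8 and report each one.
for url in http://localhost:50070/ http://localhost:50030/; do
    if ui_reachable "$url"; then
        echo "up:   $url"
    else
        echo "DOWN: $url"
    fi
done
```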

Step 9 : Download and Extract Hive

Step 10 : Set below Env variables.

export HADOOP_HOME=<hadoop-install-dir>
export HIVE_HOME=<hive-install-dir>
export PATH=$HIVE_HOME/bin:$PATH
$HIVE_HOME/bin/hive
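If these exports go into a shell startup file, PATH can pick up the same entry again every time the file is sourced. A small guard avoids that; the sketch below uses a hypothetical helper prepend_path, and the default directories are placeholders for your real install locations.

```shell
# Hypothetical helper: put a directory at the front of PATH
# unless it is already listed somewhere in PATH.
prepend_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already present, leave PATH alone
        *) PATH="$1:$PATH" ;;
    esac
    export PATH
}

# Placeholder locations; replace with your real install directories.
export HADOOP_HOME="${HADOOP_HOME:-$HOME/hadoop}"
export HIVE_HOME="${HIVE_HOME:-$HOME/hive}"
prepend_path "$HIVE_HOME/bin"
```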


Did step #3 go without a hitch? Up to step 3 you are downloading the binaries from SVN to your machine, and step 4 sets up those binaries with your Hadoop environment.

Step 4 suggests the following:

export PATH=$PATH:/usr/src/hive/build/dist/bin/

-> This adds the directory /usr/src/hive/build/dist/bin/ to your PATH environment variable. You must have installed Hive so that its binaries are in /usr/src/hive/build/dist/bin/; adding this folder to your PATH lets you run Hive on your machine.

export PATH=$PATH:/usr/src/hive/build/dist/lib/

-> This adds the directory /usr/src/hive/build/dist/lib/ to your PATH environment variable. Once Hive is installed on your machine, its libraries are located in /usr/src/hive/build/dist/lib/, so adding this directory to your PATH will help Hive run successfully.

export PATH=$PATH:/usr/local/hadoop/bin

-> If you already have Hadoop running on your machine, this should already be set; otherwise this command just adds the Hadoop binary folder to your PATH.

If you don't know what PATH is, just search the internet for "PATH in Linux".
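In short, PATH is the list of directories the shell searches, in order, when you type a command name. A quick way to see this is sketched below; the helper where_is is hypothetical and just wraps the standard command -v lookup.

```shell
# Print each PATH entry on its own line, in lookup order.
printf '%s\n' "$PATH" | tr ':' '\n'

# Hypothetical helper: report where a command would be found, if anywhere.
where_is() {
    command -v "$1" || { echo "$1: not found on PATH" >&2; return 1; }
}

where_is sh              # a command that should exist on any Linux system
where_is hive || true    # succeeds only once Hive's bin directory is on PATH
```

If where_is hive still reports "not found" after the exports above, the bin directory you added is probably not where Hive was actually installed.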


This PPA makes it pretty easy to install Hive on Ubuntu.