Can I point multiple locations to the same Hive external table?

Can I point multiple locations to the same Hive external table?


Simple answer: no. The location of a Hive external table has to be a single, unique path at creation time; the metastore needs it to know where your table lives.

That being said, you can probably get away with using partitions: you can specify a location for each partition, which seems to be what you ultimately want since you are splitting by month.

So create your table like this:

create external table logdata(col1 string, col2 string)
partitioned by (month string)
location 's3://logdata';

Then you can add partitions like this:

alter table logdata add partition(month='april') location 's3://logdata/april';

You do this for every month, and you can then query your table specifying whichever partitions you want; Hive will only look at the directories that actually hold the data you asked for (for example, if you're only processing april and june, Hive will not load may). See the sketch below.
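For instance, continuing the table above (the june partition and the column names here are just illustrative), a query like the one below should only read the april and june directories, thanks to partition pruning on the month column:

alter table logdata add partition(month='june') location 's3://logdata/june';
select col1, col2 from logdata where month in ('april', 'june');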


I checked out your scenario. I think you can achieve it by using multiple load inpath statements to enable multiple locations. Below are the steps I took for the test I ran.

hive> create external table xxx (uid int, name string, dept string) row format delimited fields terminated by '\t' stored as textfile;
hive> load data inpath '/input/tmp/user_bckt' into table xxx;
hive> load data inpath '/input/user_bckt' into table xxx;
hive> select count(*) from xxx;
10
hive> select * from xxx;
1   ankur   abinitio
2   lokesh  cloud
3   yadav   network
4   sahu    td
5   ankit   data
1   ankur   abinitio
2   lokesh  cloud
3   yadav   network
4   sahu    td
5   ankit   data

Let me know if this doesn't work for you.

EDIT: I just checked, and the data is getting moved into the Hive warehouse in this case, as opposed to the external-table concept of the data being left at its original location, as demonstrated below:

hduser@hadoopnn:~$ hls /input/tmp
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
14/10/05 14:47:18 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 2 items
-rw-r--r--   1 hduser hadoop         93 2014-10-04 18:54 /input/tmp/dept_bckt
-rw-r--r--   1 hduser hadoop         71 2014-10-04 18:54 /input/tmp/user_bckt
hduser@hadoopnn:~$ hcp /input/tmp/user_bckt /input/user_bckt
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
14/10/05 14:47:44 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
hduser@hadoopnn:~$ logout
Connection to nn closed.
hduser@hadoopdn2:~$ hls /input/tmp/
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
14/10/05 15:05:47 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 1 items
-rw-r--r--   1 hduser hadoop         93 2014-10-04 18:54 /input/tmp/dept_bckt
hduser@hadoopdn2:~$ hls /hive/wh/xxx
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
14/10/05 15:21:54 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 2 items
-rw-r--r--   1 hduser hadoop         71 2014-10-04 18:54 /hive/wh/xxx/user_bckt
-rw-r--r--   1 hduser hadoop         71 2014-10-05 14:47 /hive/wh/xxx/user_bckt_copy_1
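This move is actually consistent with how load data inpath behaves: it moves (rather than copies) the source files into the table's location, and since the table above was created without a location clause, that location defaults to the warehouse directory. As a quick check, describe formatted shows where Hive thinks the table lives; its Location: field should point at the warehouse path (/hive/wh/xxx here) rather than the original input directories:

hive> describe formatted xxx;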

I am currently looking into the issue here and will get back once done.


No, the location must be a single directory. You can alter the location to point to multiple directories, but querying the table will then fail with an error.

Example:

1. Alter the location of the table as below. I entered two HDFS directories separated by ':' (I also tried ',' and ';'). The alter statement itself succeeded.

hive> alter table ext set location 'hdfs:///solytr:/ext';
OK
Time taken: 0.086 seconds
2. But when the table was queried, it failed, because Hive stores the location as a single URI and hands the whole string to HDFS as one path name:

hive> select * from ext;
OK
Failed with exception java.io.IOException:java.lang.IllegalArgumentException: Pathname /solytr:/ext from hdfs:/solytr:/ext is not a valid DFS filename.
Time taken: 0.057 seconds