How to load xml file into Hive


You have several options:

  • Load the XML into a Hive table with a string column, one XML document per row (e.g. CREATE TABLE xmlfiles (id int, xmlfile string)). Then use the XPath UDFs to work on the XML.
  • Since you know the XPaths of what you want (e.g. //section1), follow the instructions in the second half of this tutorial to ingest directly into Hive via XPath.
  • Map your XML to Avro as described here because a SerDe exists for seamless Avro-to-Hive mapping.
  • Use XPath to store your data in a regular text file in HDFS and then ingest that into Hive.
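As a rough illustration of the first option, the built-in XPath UDFs can be tried on literal XML before pointing them at a table. This is only a sketch: the XML snippets are made up for the example, and it assumes a Hive version that accepts SELECT without a FROM clause (on older versions, add FROM some_table LIMIT 1).

```sql
-- xpath() returns every match as an array of strings
SELECT xpath('<a><b>b1</b><b>b2</b></a>', 'a/b/text()');

-- xpath_string() returns the text of the first matching node
SELECT xpath_string('<a><b>bb</b><c>cc</c></a>', 'a/c');

-- xpath_int() evaluates the expression to an integer
SELECT xpath_int('<a><b>1</b><b>2</b></a>', 'sum(a/b)');

-- The same functions applied to the xmlfiles table from the first option;
-- //section1 is the example path mentioned in the list above
SELECT id, xpath_string(xmlfile, '//section1') AS section1
FROM xmlfiles;
```

The scalar variants (xpath_string, xpath_int, xpath_double and so on) are usually the most convenient when each document holds a single value per path; xpath() is the one to reach for when a path can match several nodes.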

Which option fits best depends on your level of experience and comfort with these approaches.


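Use this, which reads each line of the file as one string row and skips the first and last line of every file: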

CREATE EXTERNAL TABLE test (name STRING) LOCATION '/user/sornalingam/zipped/output/Tagged/t1' TBLPROPERTIES ("skip.header.line.count"="1", "skip.footer.line.count"="1");

Then use the xpath functions (for example xpath_string) on the name column to extract the values you need.
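A query against that table might look like the sketch below. The element names (record, id, title) are hypothetical, since the actual layout of the XML is not shown here; it also assumes each line of the file holds one complete element, because the table is read line by line.

```sql
-- Hypothetical layout: every line is one <record>...</record> element
SELECT
  xpath_string(name, '/record/id')    AS id,
  xpath_string(name, '/record/title') AS title
FROM test
-- guard against lines that are not a complete record element
WHERE name LIKE '<record>%';
```

If a single record spans several lines in the file, this line-per-row approach will not work as is, and one of the other options (such as pre-processing the XML into a flat text file) is likely a better fit.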