How can Hadoop MapReduce get data input from a CSV file?


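By default, Hadoop uses TextInputFormat, which feeds the mapper one line at a time from the input file. The key passed to the mapper is the byte offset of the line within the file (a LongWritable), not a line number, and the value is the content of the line (a Text). Be careful with CSV files, though: a quoted field can contain a line break, in which case one record spans several lines and the default reader will cut it apart. You might want to look for a CSV input format like this one: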

https://github.com/mvallebr/CSVInputFormat/blob/master/src/main/java/org/apache/hadoop/mapreduce/lib/input/CSVNLineInputFormat.java

Keep in mind that when the mapper receives a raw line, you have to split it into fields in your own code.
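
As a rough illustration of that splitting step, here is a minimal sketch in plain Java. The class name `CsvLineSplitter` and its `split` method are made up for this example and are not part of Hadoop. It assumes comma-separated fields, double quotes around fields that contain commas, and `""` as the escape for a literal quote. It does not handle records that span multiple lines; for those you would still need a CSV-aware input format.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Hadoop): splits one CSV line into fields.
class CsvLineSplitter {

    // Splits a single line on commas, honoring double-quoted fields and
    // the "" escape for a literal quote inside a quoted field.
    static List<String> split(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;

        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        current.append('"'); // escaped quote inside a quoted field
                        i++;                 // skip the second quote of the pair
                    } else {
                        inQuotes = false;    // closing quote
                    }
                } else {
                    current.append(c);       // commas here belong to the field
                }
            } else if (c == '"') {
                inQuotes = true;             // opening quote
            } else if (c == ',') {
                fields.add(current.toString());
                current.setLength(0);        // start the next field
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());      // last field (may be empty)
        return fields;
    }
}
```

Inside a mapper that uses the default input format, you could call it on the incoming value, for example `List<String> fields = CsvLineSplitter.split(value.toString());`, where `value` is the `Text` holding the current line.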