Is it possible to run a MapReduce job on a Hadoop cluster with no input file?

java testing hadoop mapreduce

File paths are only relevant for FileInputFormat-based inputs such as SequenceFileInputFormat. Input formats that read from HBase or a database do not read from files at all, so you can write your own implementation of InputFormat and define its behaviour in getSplits and createRecordReader, together with the RecordReader that it returns. For inspiration, look at the source code of the TextInputFormat class.
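As a rough sketch of what a file-less getSplits might compute, the plain Java below divides a synthetic record count into contiguous ranges, one per map task. It deliberately uses no Hadoop classes so that it runs standalone; the names SyntheticSplits, RecordRange and computeSplits are hypothetical and not part of any Hadoop API. In a real InputFormat each range would presumably be wrapped in an InputSplit, and the RecordReader would iterate over the record numbers in its range.

```java
import java.util.ArrayList;
import java.util.List;

public class SyntheticSplits {

    /** A half-open range of record numbers [start, start + length) for one map task. */
    public static final class RecordRange {
        public final long start;
        public final long length;

        public RecordRange(long start, long length) {
            this.start = start;
            this.length = length;
        }
    }

    /**
     * Divides totalRecords into numSplits contiguous ranges whose lengths
     * differ by at most one. Assumes totalRecords * numSplits fits in a long.
     */
    public static List<RecordRange> computeSplits(long totalRecords, int numSplits) {
        if (totalRecords < 0 || numSplits <= 0) {
            throw new IllegalArgumentException("totalRecords must be >= 0 and numSplits > 0");
        }
        List<RecordRange> splits = new ArrayList<>();
        long start = 0;
        for (int i = 1; i <= numSplits; i++) {
            // End of the i-th range; the last range always ends at totalRecords.
            long end = totalRecords * i / numSplits;
            splits.add(new RecordRange(start, end - start));
            start = end;
        }
        return splits;
    }

    public static void main(String[] args) {
        // Example: 10 synthetic records spread over 3 map tasks.
        for (RecordRange r : computeSplits(10, 3)) {
            System.out.println("start=" + r.start + " length=" + r.length);
        }
    }
}
```

The number of ranges would normally come from the job configuration (the requested number of map tasks), which is left out here to keep the sketch independent of Hadoop.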

For unit testing MR jobs you can also use MRUnit. If you want to generate test data with Hadoop, I'd recommend having a look at the source code of Teragen.

I guess you are looking to test your MapReduce job on a small set of data, so in that case I would recommend the following.

Unit tests for MapReduce will solve your problem.

If you want to test your mapper/combiner/reducer against a single line of input from your file, the best approach is to write a unit test for each.

Sample code, using a Java mocking framework; you can run these test cases in your IDE.

Here I have used Mockito. MRUnit can also be used; it too depends on Mockito (a Java mocking framework).

    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;

    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.junit.Test;
    import org.mockito.Mockito;

    public class BoxPlotMapperTest {

        @Test
        public void validOutputTextMapper() throws IOException, InterruptedException {
            // Mapper stands in for your own Mapper subclass. map() is protected,
            // so keep this test in the same package as that subclass.
            Mapper mapper = new Mapper();
            Text line = new Text("single line from input-file"); // a single line of input
            Mapper.Context context = Mockito.mock(Mapper.Context.class);

            // The key was not used in my code, so it is null here.
            mapper.map(null, line, context);

            // Check that the mapper wrote the expected key/value pair.
            Mockito.verify(context).write(new Text("your expected key-output"), new Text("your expected value-output"));
        }

        @Test
        public void validOutputTextReducer() throws IOException, InterruptedException {
            // Reducer stands in for your own Reducer subclass (same package, as above).
            Reducer reducer = new Reducer();
            final List<Text> values = new ArrayList<Text>();
            values.add(new Text("value1"));
            values.add(new Text("value2"));
            values.add(new Text("value3"));
            values.add(new Text("value4"));
            Reducer.Context context = Mockito.mock(Reducer.Context.class);

            // A List is already an Iterable, so it can be passed to reduce() directly.
            reducer.reduce(new Text("key"), values, context);

            // Check that the reducer wrote the expected key/value pair.
            Mockito.verify(context).write(new Text("your expected key-output"), new Text("your expected value-output"));
        }
    }

CodeHunter
