Run a hadoop cluster on docker containers


I did this with two containers, one running the master node and one running the slave node, on two different Ubuntu hosts. I set up the networking between the containers using weave. I installed Hadoop the same way it is installed on separate hosts. I have pushed both images to the Docker Hub account div4, together with the commands to run Hadoop on them:

https://registry.hub.docker.com/u/div4/hadoop_master/
https://registry.hub.docker.com/u/div4/hadoop_slave/
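To give a rough idea of what "the same way as on separate hosts" involves, here is a minimal Python sketch that renders the two pieces of configuration a multi-node Hadoop setup usually needs: a `core-site.xml` pointing at the master, and a `slaves` file listing the workers. The addresses, the port 9000, and the helper names `core_site_xml` and `slaves_file` are my own assumptions for illustration; they are not taken from the images above, so use whatever addresses you gave the containers on the weave network.

```python
from typing import List
from xml.etree import ElementTree as ET


def core_site_xml(master_host: str, port: int = 9000) -> str:
    """Render a minimal core-site.xml that points a node at the master's NameNode.

    fs.defaultFS is the Hadoop 2.x property name (Hadoop 1.x used fs.default.name).
    The port is an assumption; use whatever your NameNode listens on.
    """
    configuration = ET.Element("configuration")
    prop = ET.SubElement(configuration, "property")
    ET.SubElement(prop, "name").text = "fs.defaultFS"
    ET.SubElement(prop, "value").text = f"hdfs://{master_host}:{port}"
    return ET.tostring(configuration, encoding="unicode")


def slaves_file(slave_hosts: List[str]) -> str:
    """Render the contents of the master's slaves file: one worker address per line."""
    return "\n".join(slave_hosts) + "\n"


if __name__ == "__main__":
    # Hypothetical addresses on the container network, chosen only for this example.
    master = "10.0.1.1"
    slaves = ["10.0.1.2"]
    print(core_site_xml(master))
    print(slaves_file(slaves), end="")
```

The same `core-site.xml` goes on both the master and the slave container, while the `slaves` file only matters on the master. This is exactly what you would do on two physical machines; the only difference is that the addresses are reachable through the container network rather than the host network.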


The people from SequenceIQ have created a new project called Cloudbreak, which is designed to work with different cloud providers and make it easy to create Hadoop clusters on them. You just have to enter your credentials, and then it works the same for all providers, as far as I can see.

So for EC2, this will now probably be the easiest solution (especially because of its nice GUI):

https://github.com/sequenceiq/cloudbreak-deployer