What is the difference between an Oozie workflow, coordinator and bundle?


Workflow:

A workflow is a sequence of actions, arranged as a directed acyclic graph (DAG). It is written in XML, and the actions can be MapReduce, Hive, Pig, etc.
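As a rough illustration of the above, a minimal workflow definition with a single Pig action might look like the sketch below. The application name, action name, script name and the `${jobTracker}` / `${nameNode}` properties are hypothetical placeholders, and the schema version is an assumption that depends on your Oozie release.

```xml
<!-- workflow.xml: a minimal sketch; all names and the schema version are assumptions -->
<workflow-app name="log-etl-wf" xmlns="uri:oozie:workflow:0.5">
    <!-- The workflow begins at the node named in "to" -->
    <start to="parse-logs"/>

    <!-- One action node; it could equally be a map-reduce or hive action -->
    <action name="parse-logs">
        <pig>
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <script>parse_logs.pig</script>
        </pig>
        <!-- Transitions define the order of actions; note there is no schedule here -->
        <ok to="end"/>
        <error to="fail"/>
    </action>

    <kill name="fail">
        <message>Action failed: [${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
```

Nothing in this file says when the workflow should run; it only runs when someone (or a coordinator) submits it.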

Coordinator:

A coordinator triggers actions (commonly workflow jobs) when a set of conditions is met. The conditions can be a time frequency or external events, such as the availability of input data.
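A time-triggered coordinator could be sketched roughly as follows. The coordinator name and the `${startTime}`, `${endTime}` and `${workflowAppPath}` properties are hypothetical and would normally come from job.properties; the schema version is again an assumption.

```xml
<!-- coordinator.xml: a minimal sketch; names, properties and schema version are assumptions -->
<coordinator-app name="log-etl-coord"
                 frequency="${coord:days(1)}"
                 start="${startTime}" end="${endTime}" timezone="UTC"
                 xmlns="uri:oozie:coordinator:0.4">
    <action>
        <!-- The workflow to launch once per day between start and end -->
        <workflow>
            <app-path>${workflowAppPath}</app-path>
        </workflow>
    </action>
</coordinator-app>
```

Here the only condition is time. A data-availability condition would add dataset and input-event definitions to the same file.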

Bundle:

A bundle is a higher-level Oozie abstraction that batches a set of coordinator jobs. We can also specify the time at which the bundle job starts.
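A bundle that groups two coordinators might look something like this sketch. The bundle and coordinator names and the `${kickOffTime}`, `${logCollectCoordPath}` and `${logParseCoordPath}` properties are made up for illustration, and the schema version is an assumption.

```xml
<!-- bundle.xml: a minimal sketch; names, properties and schema version are assumptions -->
<bundle-app name="log-pipeline-bundle" xmlns="uri:oozie:bundle:0.2">
    <controls>
        <!-- Optional: the time at which the bundle starts its coordinators -->
        <kick-off-time>${kickOffTime}</kick-off-time>
    </controls>

    <!-- Each entry points at the HDFS path of a coordinator application -->
    <coordinator name="log-collect-coord">
        <app-path>${logCollectCoordPath}</app-path>
    </coordinator>
    <coordinator name="log-parse-coord">
        <app-path>${logParseCoordPath}</app-path>
    </coordinator>
</bundle-app>
```

Starting, suspending or killing the bundle job then applies to all of the coordinators listed in it.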


A workflow has no time specification for running a Hadoop job. A coordinator job carries the time specification in coordinator.xml, using the frequency attribute. A collection of coordinator jobs is treated as a bundle job. Within a bundle, individual users can configure their own jobs through their own job.properties files.


From my understanding, a bundle groups several coordinators, which makes them easier to manage, view, and start or stop together.

For example, we have two data pipelines: one for log handling (collect/parse/ETL) and one for business logic.

So I created two bundles, each grouping the coordinators of one pipeline.