Airflow DAG Multiple Runs
I found a solution for my use case. It combines depends_on_past=True (mentioned by @Hitesh Gupta) with the following setting in your airflow.cfg file:
# The maximum number of active DAG runs per DAG
max_active_runs_per_dag = 1
This allowed us to have only one active DAG run at a time, and to not start the next DAG run if there was a failure in the previous run. I tested this on Airflow version 1.10.1.
- You can, in addition to supplying a start_date, provide your DAG an end_date
- Quoting the docstring:

  :param start_date: The timestamp from which the scheduler will attempt to backfill
  :type start_date: datetime.datetime
  :param end_date: A date beyond which your DAG won't run, leave to None for open ended scheduling
  :type end_date: datetime.datetime
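To see how an end_date bounds the schedule, here is a plain-datetime sketch (no Airflow needed) that counts the daily execution dates falling in the window; the dates are illustrative, and Airflow's real scheduler applies extra semantics such as running a period only after it closes:

```python
from datetime import datetime, timedelta


def execution_dates(start_date, end_date, interval=timedelta(days=1)):
    """Yield the execution dates a fixed-interval schedule would cover
    between start_date and end_date, both endpoints inclusive."""
    current = start_date
    while current <= end_date:
        yield current
        current += interval


# A DAG with start_date=Jan 1 and end_date=Jan 5 gets five daily runs.
dates = list(execution_dates(datetime(2019, 1, 1), datetime(2019, 1, 5)))
print(len(dates))  # → 5
```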
While unrelated, also have a look at the following scheduler settings in airflow.cfg, as mentioned in this article:
- run_duration
- num_runs
UPDATE-1
In his article Use apache airflow to run task exactly once, @Andreas P describes a clever technique which, I believe, can be adapted to your use case. While even that won't be a very tidy solution, it would at least let you specify beforehand the number of runs (an integer) for the DAG instead of an end_date.

Alternatively (assuming you implement the above approach), rather than baking this skip-after-max-runs functionality into each DAG, you could create a separate orchestrator DAG that disables a given DAG once its maximum number of runs has passed.
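For a fixed-interval schedule, the "number of runs instead of an end_date" idea can also be approximated arithmetically: derive the end_date that yields exactly N execution dates. This is a hypothetical stdlib helper, assuming both endpoints are inclusive as the end_date docstring suggests:

```python
from datetime import datetime, timedelta


def end_date_for_n_runs(start_date, n_runs, interval=timedelta(days=1)):
    """Return the end_date that allows exactly n_runs execution dates
    for a fixed-interval schedule, with both endpoints inclusive."""
    if n_runs < 1:
        raise ValueError("n_runs must be at least 1")
    return start_date + (n_runs - 1) * interval


# Asking for exactly 3 daily runs starting Jan 1 gives an end_date of Jan 3.
print(end_date_for_n_runs(datetime(2019, 1, 1), 3))  # → 2019-01-03 00:00:00
```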
You have to set the depends_on_past property. It is set in the DAG's default arguments section and refers to the previous DAG run's task instance. This should fix your problem.
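A minimal sketch of such a default arguments dict (a plain dict, shown without Airflow itself; the owner and date are placeholders), which you would then pass to the DAG as default_args:

```python
from datetime import datetime

# depends_on_past=True makes each task instance wait until the same task
# succeeded in the previous DAG run before it is scheduled.
default_args = {
    "owner": "airflow",          # placeholder owner
    "depends_on_past": True,
    "start_date": datetime(2019, 1, 1),
}

print(default_args["depends_on_past"])  # → True
```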