Airbnb Airflow using all system resources


I also tried everything I could to get the CPU usage down, and Matthew Housley's advice regarding MIN_FILE_PROCESS_INTERVAL is what did the trick.

At least until Airflow 1.10 came around... then the CPU usage went through the roof again.

So here is everything I had to do to get Airflow to work well on a standard DigitalOcean droplet with 2 GB of RAM and 1 vCPU:

1. Scheduler File Processing

Prevent Airflow from reloading the DAGs all the time by setting:

AIRFLOW__SCHEDULER__MIN_FILE_PROCESS_INTERVAL=60
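Airflow reads environment variables named AIRFLOW__{SECTION}__{KEY} as overrides for the matching entry in airflow.cfg, so the variable above corresponds to min_file_process_interval in the [scheduler] section. As a rough illustration of that naming rule, here is a minimal sketch; the helper env_to_cfg is hypothetical and not part of Airflow:

```python
def env_to_cfg(name):
    """Split an AIRFLOW__SECTION__KEY variable name into (section, key).

    Hypothetical helper for illustration only; Airflow does this internally.
    """
    prefix = "AIRFLOW__"
    if not name.startswith(prefix):
        raise ValueError("not an Airflow config variable: " + name)
    # The section and key are separated by a double underscore.
    section, sep, key = name[len(prefix):].partition("__")
    if not sep or not section or not key:
        raise ValueError("expected AIRFLOW__SECTION__KEY, got: " + name)
    return section.lower(), key.lower()


print(env_to_cfg("AIRFLOW__SCHEDULER__MIN_FILE_PROCESS_INTERVAL"))
```

This is handy in Docker setups, where passing environment variables is usually easier than editing airflow.cfg inside the image.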

2. Fix Airflow 1.10 scheduler bug

The AIRFLOW-2895 bug in Airflow 1.10 causes high CPU load because the scheduler keeps looping without a break.

It's already fixed in master and will hopefully be included in Airflow 1.10.1, but it could take weeks or months until it's released. In the meantime, this patch solves the issue:

--- jobs.py.orig    2018-09-08 15:55:03.448834310 +0000
+++ jobs.py     2018-09-08 15:57:02.847751035 +0000
@@ -564,6 +564,7 @@

         self.num_runs = num_runs
         self.run_duration = run_duration
+        self._processor_poll_interval = 1.0

         self.do_pickle = do_pickle
         super(SchedulerJob, self).__init__(*args, **kwargs)
@@ -1724,6 +1725,8 @@
             loop_end_time = time.time()
             self.log.debug("Ran scheduling loop in %.2f seconds",
                            loop_end_time - loop_start_time)
+            self.log.debug("Sleeping for %.2f seconds", self._processor_poll_interval)
+            time.sleep(self._processor_poll_interval)

             # Exit early for a test mode
             if processor_manager.max_runs_reached():

Apply it with:

patch -d /usr/local/lib/python3.6/site-packages/airflow/ < af_1.10_high_cpu.patch
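To show the idea behind the patch outside of Airflow: a loop that never sleeps spins as fast as the CPU allows, while a fixed pause after each pass caps the number of iterations per second. The sketch below is a simplified stand-in, not the scheduler's actual code; run_loop and its parameters are made-up names, and the sleep function is injectable only so the demo finishes instantly.

```python
import time


def run_loop(do_work, max_runs, poll_interval=1.0, sleep=time.sleep):
    """Call do_work() max_runs times, pausing poll_interval seconds after each pass."""
    for _ in range(max_runs):
        do_work()
        # Without this pause the loop would busy-wait and pin a CPU core.
        sleep(poll_interval)


# Demo: record the requested pauses instead of really sleeping.
slept = []
passes = []
run_loop(lambda: passes.append(1), max_runs=3, poll_interval=1.0, sleep=slept.append)
print(len(passes), "passes,", sum(slept), "seconds of requested sleep")
```

With the default time.sleep and a 1.0 second interval, the loop runs at most about once per second, which is what the patch enforces for the scheduler loop.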

3. RBAC webserver high CPU load

If you upgraded to use the new RBAC webserver UI, you may also notice that the webserver is using a lot of CPU persistently.

For some reason the RBAC interface uses a lot of CPU on startup. If you are running on a low-powered server, this can cause a very slow webserver startup and permanently high CPU usage.

I have documented this bug as AIRFLOW-3037. To solve it you can adjust the config:

AIRFLOW__WEBSERVER__WORKERS=2 # 2 * NUM_CPU_CORES + 1
AIRFLOW__WEBSERVER__WORKER_REFRESH_INTERVAL=1800 # Restart workers every 30 min instead of every 30 seconds
AIRFLOW__WEBSERVER__WEB_SERVER_WORKER_TIMEOUT=300 # Kill workers if they don't start within 5 min instead of 2 min
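If you generate these settings from a script (for example to write an env file for Docker), something like the following sketch could work. The helper webserver_env is hypothetical; the 2 * cores + 1 fallback is simply the rule of thumb from the comment above, and on a very small machine you may prefer to pass a lower worker count explicitly, as was done here with 2.

```python
import os


def webserver_env(workers=None, refresh_interval=1800, worker_timeout=300):
    """Build the three webserver settings as environment variable strings.

    Hypothetical helper; the variable names are the ones listed above.
    """
    if workers is None:
        # Rule of thumb from the comment above: 2 * NUM_CPU_CORES + 1.
        workers = 2 * (os.cpu_count() or 1) + 1
    return {
        "AIRFLOW__WEBSERVER__WORKERS": str(workers),
        "AIRFLOW__WEBSERVER__WORKER_REFRESH_INTERVAL": str(refresh_interval),
        "AIRFLOW__WEBSERVER__WEB_SERVER_WORKER_TIMEOUT": str(worker_timeout),
    }


# Print in KEY=VALUE form, suitable for an env file.
for name, value in webserver_env(workers=2).items():
    print(f"{name}={value}")
```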

With all of these tweaks, my Airflow uses only a few percent of CPU while idle on a standard DigitalOcean droplet with 1 vCPU and 2 GB of RAM.


I just ran into an issue like this. Airflow was consuming roughly a full vCPU in a t2.xlarge instance, with the vast majority of this coming from the scheduler container. Checking the scheduler logs, I could see that it was processing my single DAG more than once a second even though it only runs once a day.

I found that the MIN_FILE_PROCESS_INTERVAL was set to the default value of 0, so the scheduler was looping over the DAG. I changed the process interval to 65 seconds, and Airflow now uses less than 10 percent of a vCPU in a t2.medium instance.
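For a rough sense of scale, here is a back-of-the-envelope sketch. It assumes, as a simplification, that a DAG file is re-parsed at most once per interval and that with an interval of 0 the scheduler manages about one pass per second (consistent with the logs described above, but not measured precisely). The function name is made up for this example.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def max_parses_per_day(process_interval, loop_time=1):
    """Upper bound on how often one DAG file is parsed per day.

    Assumption: the effective gap between parses is the larger of the
    configured interval and the time one scheduler pass takes.
    """
    effective = max(process_interval, loop_time)
    return SECONDS_PER_DAY // effective


print(max_parses_per_day(0))   # interval 0, roughly one pass per second
print(max_parses_per_day(65))  # interval of 65 seconds
```

Under these assumptions the interval change cuts parsing of a once-a-day DAG from tens of thousands of passes per day to a little over a thousand.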


Try changing the config below in airflow.cfg:

# after how much time a new DAGs should be picked up from the filesystem
min_file_process_interval = 0

# How many seconds to wait between file-parsing loops to prevent the logs from being spammed.
min_file_parsing_loop_time = 1
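If you would rather script the change than edit the file by hand, a minimal sketch with Python's standard configparser might look like this. It works on an in-memory sample rather than a real airflow.cfg, the function name is made up, and the value 60 is just the one suggested earlier in this thread. Note that configparser drops comments when it writes the file back, so hand-editing may be preferable if you want to keep them.

```python
import configparser
import io

# Small stand-in for the [scheduler] section of airflow.cfg.
SAMPLE_CFG = """\
[scheduler]
min_file_process_interval = 0
min_file_parsing_loop_time = 1
"""


def set_scheduler_intervals(cfg_text, process_interval, loop_time):
    """Return cfg_text with the two scheduler intervals replaced."""
    # interpolation=None so that '%' characters elsewhere in a real
    # airflow.cfg are not treated as interpolation syntax.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    parser["scheduler"]["min_file_process_interval"] = str(process_interval)
    parser["scheduler"]["min_file_parsing_loop_time"] = str(loop_time)
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


print(set_scheduler_intervals(SAMPLE_CFG, 60, 1))
```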