Kubernetes executor does not parallelize sub-DAG execution in Airflow


The Kubernetes Executor in Airflow turns each first-level task into its own worker pod, and that worker pod runs with the Local Executor.

This means that your SubDagOperator, and the tasks inside it, are executed by the Local Executor in that pod.

To run the tasks under the SubDagOperator in parallel after the worker pod has been spawned, you need to set the parallelism configuration for the worker pod. If you define the worker pod in YAML, edit it to something like this:

apiVersion: v1
kind: Pod
metadata:
  name: dummy-name
spec:
  containers:
    - args: []
      command: []
      env:
        ###################################
        # This is the part you need to add
        ###################################
        # Kubernetes env values must be strings, so the number is quoted
        - name: AIRFLOW__CORE__PARALLELISM
          value: "10"
        ###################################
        - name: AIRFLOW__CORE__EXECUTOR
          value: LocalExecutor
        # Hard Coded Airflow Envs
        - name: AIRFLOW__CORE__FERNET_KEY
          valueFrom:
            secretKeyRef:
              name: RELEASE-NAME-fernet-key
              key: fernet-key
        - name: AIRFLOW__CORE__SQL_ALCHEMY_CONN
          valueFrom:
            secretKeyRef:
              name: RELEASE-NAME-airflow-metadata
              key: connection
        - name: AIRFLOW_CONN_AIRFLOW_DB
          valueFrom:
            secretKeyRef:
              name: RELEASE-NAME-airflow-metadata
              key: connection
      envFrom: []
      image: dummy_image
      imagePullPolicy: IfNotPresent
      name: base
      ports: []
      volumeMounts:
        - mountPath: "/opt/airflow/logs"
          name: airflow-logs
        - mountPath: /opt/airflow/dags
          name: airflow-dags
          readOnly: false
        - mountPath: /opt/airflow/dags
          name: airflow-dags
          readOnly: true
          subPath: repo/tests/dags
  hostNetwork: false
  restartPolicy: Never
  securityContext:
    runAsUser: 50000
  nodeSelector:
    {}
  affinity:
    {}
  tolerations:
    []
  serviceAccountName: 'RELEASE-NAME-worker-serviceaccount'
  volumes:
    - name: dags
      persistentVolumeClaim:
        claimName: RELEASE-NAME-dags
    - emptyDir: {}
      name: airflow-logs
    - configMap:
        name: RELEASE-NAME-airflow-config
      name: airflow-config
    - configMap:
        name: RELEASE-NAME-airflow-config
      name: airflow-local-settings
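
The variable name AIRFLOW__CORE__PARALLELISM follows Airflow's AIRFLOW__{SECTION}__{KEY} convention for overriding a configuration option through the environment (here, the parallelism option in the core section). As a minimal sketch, the following Python shows how such an env entry could be built; the helper names airflow_env_name and env_entry are hypothetical and are not part of Airflow or Kubernetes.

```python
def airflow_env_name(section: str, key: str) -> str:
    """Return the environment variable name Airflow reads for [section] key."""
    return f"AIRFLOW__{section.upper()}__{key.upper()}"


def env_entry(section: str, key: str, value) -> dict:
    """Build one item for a container's `env` list.

    Kubernetes expects `value` to be a string, so numbers are converted.
    """
    return {"name": airflow_env_name(section, key), "value": str(value)}


entry = env_entry("core", "parallelism", 10)
print(entry)
# {'name': 'AIRFLOW__CORE__PARALLELISM', 'value': '10'}
```

The string conversion matters in the YAML as well: an unquoted number under value is parsed as an integer, which a pod spec does not accept for environment variables.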

The SubDagOperator will then run its tasks in parallel, up to the parallelism you specified.
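
To illustrate what the parallelism setting controls, here is a toy model in plain Python. It is not Airflow code: a thread pool stands in for the executor inside the worker pod, and the hypothetical function run_tasks reports the highest number of dummy tasks that were running at the same time.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_tasks(task_count: int, parallelism: int) -> int:
    """Run `task_count` dummy tasks and return the peak number running at once."""
    lock = threading.Lock()
    running = 0
    peak = 0

    def task() -> None:
        nonlocal running, peak
        with lock:
            running += 1
            peak = max(peak, running)
        time.sleep(0.05)  # stand-in for real work
        with lock:
            running -= 1

    # The pool size plays the role of the parallelism setting (an analogy only).
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        for _ in range(task_count):
            pool.submit(task)
    # Leaving the `with` block waits for every submitted task to finish.
    return peak


peak_serial = run_tasks(task_count=6, parallelism=1)
peak_parallel = run_tasks(task_count=6, parallelism=3)
print(peak_serial, peak_parallel)
```

With a pool size of 1 the tasks run one after another, and with a larger pool size several can overlap but never more than the limit. The parallelism value in the pod template plays the same role for the tasks inside the sub-DAG.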