Kubernetes Executor does not parallelize SubDAG execution in Airflow
The Kubernetes Executor in Airflow turns each first-level task into a worker pod that runs with the Local Executor.
This means that the Local Executor inside that pod is what executes your SubDagOperator.
To run the tasks under a SubDagOperator in parallel once the worker pod has been spawned, you need to set the parallelism configuration for the worker pod. If you define the worker pod in YAML, edit the template to something like this:
apiVersion: v1
kind: Pod
metadata:
  name: dummy-name
spec:
  containers:
    - args: []
      command: []
      env:
        ###################################
        # This is the part you need to add
        ###################################
        - name: AIRFLOW__CORE__PARALLELISM
          value: "10"
        ###################################
        - name: AIRFLOW__CORE__EXECUTOR
          value: LocalExecutor
        # Hard Coded Airflow Envs
        - name: AIRFLOW__CORE__FERNET_KEY
          valueFrom:
            secretKeyRef:
              name: RELEASE-NAME-fernet-key
              key: fernet-key
        - name: AIRFLOW__CORE__SQL_ALCHEMY_CONN
          valueFrom:
            secretKeyRef:
              name: RELEASE-NAME-airflow-metadata
              key: connection
        - name: AIRFLOW_CONN_AIRFLOW_DB
          valueFrom:
            secretKeyRef:
              name: RELEASE-NAME-airflow-metadata
              key: connection
      envFrom: []
      image: dummy_image
      imagePullPolicy: IfNotPresent
      name: base
      ports: []
      volumeMounts:
        - mountPath: "/opt/airflow/logs"
          name: airflow-logs
        - mountPath: /opt/airflow/dags
          name: airflow-dags
          readOnly: false
        - mountPath: /opt/airflow/dags
          name: airflow-dags
          readOnly: true
          subPath: repo/tests/dags
  hostNetwork: false
  restartPolicy: Never
  securityContext:
    runAsUser: 50000
  nodeSelector: {}
  affinity: {}
  tolerations: []
  serviceAccountName: 'RELEASE-NAME-worker-serviceaccount'
  volumes:
    - name: dags
      persistentVolumeClaim:
        claimName: RELEASE-NAME-dags
    - emptyDir: {}
      name: airflow-logs
    - configMap:
        name: RELEASE-NAME-airflow-config
      name: airflow-config
    - configMap:
        name: RELEASE-NAME-airflow-config
      name: airflow-local-settings
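The variable name in the template follows Airflow's AIRFLOW__{SECTION}__{KEY} convention for overriding configuration through the environment, so AIRFLOW__CORE__PARALLELISM maps to the parallelism option in the [core] section. If you generate the pod template programmatically instead of editing the YAML by hand, the same change could be sketched in Python roughly as below. The helper names airflow_env_name and set_airflow_env are hypothetical (they are not part of Airflow), the pod is a plain dict, and the sketch assumes the worker container is named base as in the template above.

```python
from copy import deepcopy


def airflow_env_name(section: str, key: str) -> str:
    """Build the environment variable name Airflow reads for a config option."""
    return f"AIRFLOW__{section.upper()}__{key.upper()}"


def set_airflow_env(pod: dict, section: str, key: str, value, container: str = "base") -> dict:
    """Return a copy of a pod manifest with one Airflow option set via env.

    Hypothetical helper for illustration; it only manipulates plain dicts.
    """
    pod = deepcopy(pod)
    name = airflow_env_name(section, key)
    for c in pod["spec"]["containers"]:
        if c.get("name") != container:
            continue
        # Drop any existing entry so the variable is not defined twice.
        env = [e for e in (c.get("env") or []) if e.get("name") != name]
        # Kubernetes expects env values to be strings, hence str(value).
        env.append({"name": name, "value": str(value)})
        c["env"] = env
    return pod


# Minimal stand-in for the worker pod template (most fields omitted).
template = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "dummy-name"},
    "spec": {
        "containers": [
            {
                "name": "base",
                "image": "dummy_image",
                "env": [{"name": "AIRFLOW__CORE__EXECUTOR", "value": "LocalExecutor"}],
            }
        ]
    },
}

patched = set_airflow_env(template, "core", "parallelism", 10)
print(patched["spec"]["containers"][0]["env"])
```

The helper returns a modified copy and leaves the input untouched, so the same base template can be reused with different parallelism values.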
With this setting in place, the SubDagOperator follows the specified parallelism and runs its tasks in parallel.
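To confirm that the worker pod actually received the setting, one option is to inspect the environment from inside the pod. The sketch below uses only the standard library. The function name parallelism_from_env is hypothetical, and the fallback of 32 assumes Airflow's default for [core] parallelism. It does not read airflow.cfg, so treat it as a quick sanity check rather than the way Airflow itself resolves configuration.

```python
import os

# Assumed default of [core] parallelism when nothing overrides it.
AIRFLOW_DEFAULT_PARALLELISM = 32


def parallelism_from_env(environ=None, default=AIRFLOW_DEFAULT_PARALLELISM) -> int:
    """Read AIRFLOW__CORE__PARALLELISM from an environment mapping."""
    environ = os.environ if environ is None else environ
    raw = environ.get("AIRFLOW__CORE__PARALLELISM")
    # Treat a missing or blank variable as "not set" and use the fallback.
    if raw is None or not raw.strip():
        return default
    return int(raw)


if __name__ == "__main__":
    print("effective parallelism:", parallelism_from_env())
```

Run inside a worker pod created from the edited template, this should report the value set in the template if the change was picked up.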