Directed Acyclic Graph (DAG)
A DAG defines the workflow – a collection of tasks whose dependencies fix an explicit execution order and never form a cycle.
from airflow import DAG
from airflow.operators.bash import BashOperator
from datetime import datetime

with DAG(
    dag_id="example_dag",
    start_date=datetime(2024, 1, 1),
    # schedule_interval is deprecated since Airflow 2.4 in favor of schedule=
    schedule_interval="@daily",
) as dag:
    # A single task that prints the current date.
    BashOperator(task_id="print_date", bash_command="date")
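The "acyclic" requirement is what makes an execution order well defined. The following is a minimal sketch of that idea using only Python's standard-library `graphlib` module (Python 3.9+); it is not Airflow's own code, and the task names `extract`, `transform`, and `load` are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter

# Map each task to the set of tasks that must finish before it starts.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
}

# A valid graph yields an order in which every task follows its upstream tasks.
order = list(TopologicalSorter(dependencies).static_order())
print(order)

# Making "extract" depend on "load" closes a loop, so no valid order exists.
cyclic = {**dependencies, "extract": {"load"}}
try:
    list(TopologicalSorter(cyclic).static_order())
    has_cycle = False
except CycleError:
    has_cycle = True
print(has_cycle)
```

A graph with a cycle has no starting point that satisfies every dependency, which is why a workflow definition of this kind has to be rejected rather than scheduled.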
Task & Operator
A task is an instance of an operator. The operator defines the logic that runs when the task executes.
- BaseOperator – common functionality for all operators.
- BashOperator – runs a Bash command.
- PythonOperator – executes a Python callable.
- Sensor (subclasses of BaseSensorOperator) – waits for a condition to become true.
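To illustrate what "waits for a condition to become true" means for a sensor, here is a rough standard-library sketch of a polling loop. The parameter names `poke_interval` and `timeout` mirror the ones Airflow sensors expose, but the function `wait_for` and the sample condition are hypothetical, and this is not how Airflow implements sensors internally.

```python
import time
from typing import Callable


def wait_for(condition: Callable[[], bool], poke_interval: float, timeout: float) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poke_interval)


# Hypothetical condition that becomes true on the third check.
attempts = {"count": 0}


def ready_on_third_check() -> bool:
    attempts["count"] += 1
    return attempts["count"] >= 3


succeeded = wait_for(ready_on_third_check, poke_interval=0.01, timeout=1.0)
print(succeeded, attempts["count"])
```

The trade-off in choosing the interval is between reacting quickly and checking the external system too often.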
Scheduler
The Scheduler monitors DAG definitions, creates DAG runs and their task instances on schedule, and submits tasks whose upstream dependencies are met to the Executor.
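The core loop can be pictured as: find the tasks whose upstream tasks are finished, hand them to the executor, record them as done, and repeat. The sketch below models only that loop with the standard-library `graphlib` module (Python 3.9+). The function `schedule`, the `submit` callback, and the task names are hypothetical; the real Scheduler also handles timing, retries, and persistent state, none of which appear here.

```python
from graphlib import TopologicalSorter
from typing import Callable, Dict, List, Set


def schedule(dependencies: Dict[str, Set[str]], submit: Callable[[str], None]) -> List[List[str]]:
    """Submit tasks in waves; each wave holds tasks whose upstream tasks are done."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # tasks with no unfinished upstream tasks
        for task_id in ready:
            submit(task_id)       # stand-in for handing the task to an executor
            sorter.done(task_id)  # mark finished so downstream tasks become ready
        batches.append(ready)
    return batches


dependencies = {
    "extract_a": set(),
    "extract_b": set(),
    "merge": {"extract_a", "extract_b"},
    "report": {"merge"},
}
executed: List[str] = []
batches = schedule(dependencies, executed.append)
print(batches)
```

Tasks in the same wave have no dependency on each other, so an executor that supports parallelism is free to run them at the same time.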
Executor
Executors determine how and where tasks run. Common executors include:
- SequentialExecutor – runs tasks locally, one after another.
- LocalExecutor – runs tasks in parallel on the same machine.
- CeleryExecutor – distributes tasks across a Celery worker pool.
- KubernetesExecutor – launches a separate Kubernetes pod for each task instance.
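The difference between running tasks one after another and running them in parallel on one machine can be sketched with the standard library alone. This is only an analogy: the thread pool stands in for parallel workers, and `run_task` and the task names are hypothetical rather than part of any Airflow executor.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_task(task_id: str) -> str:
    time.sleep(0.05)  # stand-in for real work
    return f"{task_id}: done"


task_ids = ["task_a", "task_b", "task_c", "task_d"]

# Sequential style: each task starts only after the previous one finishes.
start = time.perf_counter()
sequential_results = [run_task(task_id) for task_id in task_ids]
sequential_seconds = time.perf_counter() - start

# Parallel style: up to four tasks run at the same time on this machine.
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel_results = list(pool.map(run_task, task_ids))
parallel_seconds = time.perf_counter() - start

print(f"sequential: {sequential_seconds:.2f}s, parallel: {parallel_seconds:.2f}s")
```

Both approaches produce the same results; only the elapsed time and the resources used differ, which is the choice an executor setting makes.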
Triggerer
Introduced in Airflow 2.2, the Triggerer runs the asynchronous triggers that deferrable operators and sensors wait on, so a deferred task does not occupy a worker slot while it waits.
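The reason deferral is cheap is that many waits can share one asyncio event loop instead of each holding a worker. A minimal sketch with plain `asyncio` follows; the coroutine `wait_then_fire` and the trigger names are hypothetical and do not use Airflow's trigger classes.

```python
import asyncio
from typing import List


async def wait_then_fire(name: str, delay: float) -> str:
    # Awaiting yields control, so other waits make progress in the meantime.
    await asyncio.sleep(delay)
    return f"{name} fired"


async def run_triggers() -> List[str]:
    # Both waits run concurrently inside a single event loop.
    return await asyncio.gather(
        wait_then_fire("file_arrived", 0.03),
        wait_then_fire("job_finished", 0.01),
    )


events = asyncio.run(run_triggers())
print(events)
```

`asyncio.gather` returns results in the order the coroutines were passed in, regardless of which one finished first.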
XCom (Cross‑communication)
XComs allow tasks to exchange small amounts of data.
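# ti is the TaskInstance available in the task's context.
# Push (in the upstream task)
ti.xcom_push(key="result", value=42)
# Pull (in a downstream task; "producer" is a hypothetical task_id of the pushing task)
ti.xcom_pull(task_ids="producer", key="result")  # returns 42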
Connections & Variables
Connections store the credentials and connection details (such as host, port, login, and password) for external services; Variables store generic key‑value pairs.
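One common way to supply a connection is an environment variable named `AIRFLOW_CONN_<CONN_ID>` holding a URI (Variables can likewise come from `AIRFLOW_VAR_<NAME>`). The sketch below shows how such a URI breaks down into connection fields, using only `urllib.parse` rather than Airflow's own parser. The URI, host, and credentials are made-up examples.

```python
from urllib.parse import unquote, urlsplit

# Hypothetical value of an AIRFLOW_CONN_MY_DB environment variable.
uri = "postgres://report_user:s3cret@db.example.com:5432/analytics"

parts = urlsplit(uri)
connection = {
    "conn_type": parts.scheme,
    "login": unquote(parts.username or ""),
    "password": unquote(parts.password or ""),
    "host": parts.hostname,
    "port": parts.port,
    "schema": parts.path.lstrip("/"),  # the URI path names the database/schema
}
print(connection)
```

Keeping credentials in a connection rather than in DAG code means the same DAG file can run against different environments without edits.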