Working with Variables
What are Variables?
Airflow Variables are a general-purpose, key-value store for arbitrary metadata. They are particularly useful for storing configuration values that can change between environments or deployments, such as database credentials, API keys, or feature flags. Variables can be accessed within DAGs and tasks, providing a dynamic way to configure your workflows without hardcoding sensitive or environment-specific information.
They offer a centralized and secure way to manage configuration, reducing the need to modify DAG code for simple configuration changes.
Accessing and Managing Variables via the UI
Airflow provides a user-friendly interface for managing variables. You can access this feature through the Airflow webserver:
- Navigate to the Airflow UI.
- Click on the "Admin" menu item in the top navigation bar.
- Select "Variables".
On the Variables page, you can perform the following actions:
- View Variables: See a list of all existing variables, their keys, values (often masked for security), and their last updated timestamps.
- Add New Variable: Click the "Add a new record" button to create a new variable. You'll need to provide a unique 'Key' and a 'Value'.
- Edit Variable: Click the edit icon next to a variable to modify its key or value.
- Delete Variable: Click the delete icon to remove a variable.
For sensitive values, it's recommended to use Airflow's secrets backend integrations.
Programmatic Access to Variables
You can access Airflow Variables directly within your Python DAG code using the Variable class from the airflow.models module.
Retrieving a Variable
To get the value of a variable, use the Variable.get() method.
from airflow.models import Variable
# Get the value of a variable with key 'my_api_key'
api_key = Variable.get("my_api_key")
# Avoid printing secret values; confirm only that the variable was loaded
print("API key loaded")
# Get a variable with a default value if it doesn't exist
db_name = Variable.get("database_name", default_var="airflow_db")
print(f"Database Name: {db_name}")
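When a variable is missing and no default is supplied, Variable.get() raises a KeyError in Airflow 2.x. The sketch below wraps that behaviour in a small helper. The names read_setting, getter, and fake_store are hypothetical and exist only for illustration, and the Airflow import is deferred so the helper can be exercised without an Airflow installation.

```python
from typing import Callable, Optional


def read_setting(
    key: str,
    fallback: Optional[str] = None,
    getter: Optional[Callable[[str], str]] = None,
) -> Optional[str]:
    """Return the value stored under `key`, or `fallback` if the key is missing."""
    if getter is None:
        # Deferred import: only needed when no custom getter is supplied.
        from airflow.models import Variable

        getter = Variable.get
    try:
        return getter(key)
    except KeyError:
        # Assumption: a missing key surfaces as KeyError (Airflow 2.x behaviour).
        return fallback


# Usage with a plain dict standing in for the Variable store.
fake_store = {"my_api_key": "abc123"}
found = read_setting("my_api_key", getter=fake_store.__getitem__)
missing = read_setting("database_name", fallback="airflow_db", getter=fake_store.__getitem__)
```

For a simple fallback value, default_var is usually enough. An explicit handler like this is mainly useful when a missing variable should trigger extra logic, such as logging a warning.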
Setting a Variable (Less Common in DAGs)
Setting variables in top-level DAG code is rarely appropriate: the scheduler re-parses DAG files frequently, so the write would repeat on every parse. You can still set variables programmatically, which is more common in custom operators or setup scripts.
import json
from airflow.models import Variable
# Set a new variable or update an existing one
Variable.set("my_config_value", "production")
# Store a dictionary by serializing it to a JSON string first
config_data = {"timeout": 60, "retries": 3}
Variable.set("app_settings", json.dumps(config_data))
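A setup script of the kind mentioned above might look roughly like the following sketch. The function seed_variables and its setter parameter are hypothetical names, not part of Airflow. When no setter is passed, the function falls back to Variable.set, which requires an Airflow installation and a reachable metadata database.

```python
import json
from typing import Callable, List, Mapping, Optional


def seed_variables(
    settings: Mapping[str, object],
    setter: Optional[Callable[[str, str], None]] = None,
) -> List[str]:
    """Write every entry in `settings` as a variable and return the keys written."""
    if setter is None:
        # Deferred import: only needed when writing to a real Airflow deployment.
        from airflow.models import Variable

        setter = Variable.set
    written = []
    for key, value in settings.items():
        # Strings are stored as-is; anything else is serialized to JSON text.
        text = value if isinstance(value, str) else json.dumps(value)
        setter(key, text)
        written.append(key)
    return written


# Dry run against a plain dict instead of the Airflow metadata database.
store = {}
written_keys = seed_variables(
    {"my_config_value": "production", "app_settings": {"timeout": 60, "retries": 3}},
    setter=store.__setitem__,
)
```

Passing the setter as a parameter makes it possible to dry-run the script before it touches a real deployment.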
Deleting a Variable (Less Common in DAGs)
Similar to setting, deletion is usually an administrative task.
from airflow.models import Variable
# Delete a variable
Variable.delete("old_api_key")
Variable Types and Serialization
Airflow Variables store values as strings. To store a more complex data structure, such as a list or a dictionary, serialize it to a string format (for example, JSON) before setting it and deserialize it after retrieving it.
import json
from airflow.models import Variable
# Storing a dictionary
my_dict = {"host": "localhost", "port": 5432}
Variable.set("db_connection_details", json.dumps(my_dict))
# Retrieving and deserializing
retrieved_json = Variable.get("db_connection_details")
db_details = json.loads(retrieved_json)
print(f"DB Host: {db_details['host']}, Port: {db_details['port']}")
# Storing a list
my_list = ["user1", "user2", "admin"]
Variable.set("allowed_users", json.dumps(my_list))
# Retrieving and deserializing
retrieved_list_json = Variable.get("allowed_users")
user_list = json.loads(retrieved_list_json)
print(f"Allowed users: {user_list}")
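Variable.set() and Variable.get() also accept serialize_json and deserialize_json flags that perform the JSON conversion for you. The sketch below shows the manual round trip with a type check on the way out, followed by the flag-based equivalent. The helper names encode_value, decode_value, and round_trip_with_airflow are hypothetical, and the last one assumes a working Airflow installation.

```python
import json
from typing import Any


def encode_value(value: Any) -> str:
    """Serialize a list or dict to the string form a Variable stores."""
    return json.dumps(value)


def decode_value(raw: str, expected_type: type) -> Any:
    """Parse a stored string and check that it has the expected top-level type."""
    parsed = json.loads(raw)
    if not isinstance(parsed, expected_type):
        raise TypeError(
            f"expected {expected_type.__name__}, got {type(parsed).__name__}"
        )
    return parsed


def round_trip_with_airflow(key: str, value: Any) -> Any:
    """Same round trip using Airflow's built-in JSON flags (requires Airflow)."""
    from airflow.models import Variable

    Variable.set(key, value, serialize_json=True)
    return Variable.get(key, deserialize_json=True)


db_details = decode_value(encode_value({"host": "localhost", "port": 5432}), dict)
user_list = decode_value(encode_value(["user1", "user2", "admin"]), list)
```

Checking the type after decoding catches the case where someone edits the variable by hand and replaces a dictionary with a list or a bare string.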
Security Considerations
Variables can store sensitive information like passwords and API keys. It is crucial to handle them securely:
- Use Secrets Backends: For production environments, configure Airflow to use a dedicated secrets management system (like HashiCorp Vault, AWS Secrets Manager, GCP Secret Manager, Azure Key Vault) instead of storing secrets directly in the Airflow metadata database. This is the recommended approach for sensitive data. Refer to the Secrets Management documentation for detailed instructions.
- Restrict Access: If using the default Airflow backend, be mindful of who has access to the Airflow UI and its database.
- Avoid Hardcoding: Never hardcode sensitive values directly into your DAG files. Always use Variables or a secrets backend.
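- Value Masking: By default, Airflow masks a variable's value in the UI and in task logs when its key contains a sensitive keyword such as password, secret, token, or api_key. Variables with other key names are displayed in plain text, so choose key names for sensitive values accordingly.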
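Airflow decides whether to mask a variable from its key name, not from its value. The sketch below approximates that check so you can test a naming choice before creating the variable. It is not Airflow's implementation: the keyword tuple is an illustrative assumption (the real list is configurable and may differ between versions), and looks_sensitive and display_value are hypothetical names.

```python
# Illustrative keyword list (an assumption); check your Airflow version's
# masking documentation for the authoritative, configurable list.
SENSITIVE_KEYWORDS = (
    "access_token",
    "api_key",
    "apikey",
    "authorization",
    "passphrase",
    "passwd",
    "password",
    "private_key",
    "secret",
    "token",
)


def looks_sensitive(key: str) -> bool:
    """Return True if the key name contains one of the sensitive keywords."""
    lowered = key.lower()
    return any(word in lowered for word in SENSITIVE_KEYWORDS)


def display_value(key: str, value: str) -> str:
    """Return the value as a masking-aware view would show it."""
    return "***" if looks_sensitive(key) else value


masked = display_value("my_api_key", "abc123")
visible = display_value("database_name", "airflow_db")
```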
Best Practices for Using Variables
- Namespace Your Variables: Use a consistent naming convention to avoid conflicts, especially in larger deployments. Prefixing variables with the DAG name or a logical group (e.g., my_dag.api_key, etl.s3_bucket) can be helpful.
- Keep Values Small: Variables are best suited for small configuration values. For larger configuration files or data, consider storing them in external storage (like S3, GCS) and referencing the location/key in a variable.
- Use Default Values: Employ the default_var parameter in Variable.get() to make your DAGs more robust and prevent failures if a variable is missing.
- Document Your Variables: Keep a record of your variables, their purpose, and expected format, especially if they are used across multiple DAGs.
- Separate Configuration from Code: Treat variables as external configuration that can be changed without touching DAG code.
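The namespacing convention can be enforced with a small helper, so keys are built in one place instead of being typed by hand in each DAG. This is a minimal sketch: variable_key and keys_in_group are hypothetical names, and the dot separator simply follows the my_dag.api_key style shown in the list.

```python
from typing import Iterable, List


def variable_key(group: str, name: str) -> str:
    """Build a namespaced key such as 'etl.s3_bucket'."""
    group, name = group.strip(), name.strip()
    if not group or not name:
        raise ValueError("group and name must be non-empty")
    return f"{group}.{name}"


def keys_in_group(keys: Iterable[str], group: str) -> List[str]:
    """Return the keys that belong to `group`, preserving their order."""
    prefix = f"{group}."
    return [key for key in keys if key.startswith(prefix)]


bucket_key = variable_key("etl", "s3_bucket")
etl_keys = keys_in_group(["etl.s3_bucket", "my_dag.api_key", "etl.region"], "etl")
```

The resulting key can then be passed to Variable.get() together with default_var, which combines the namespacing and default-value practices.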