Apache Airflow

Working with Variables

What are Variables?

Airflow Variables are a general-purpose key-value store for arbitrary configuration and metadata. They are particularly useful for values that change between environments or deployments, such as database names, API keys, or feature flags. Variables can be read from DAGs and tasks, so workflows can be configured at runtime without hardcoding sensitive or environment-specific information.

They offer a centralized and secure way to manage configuration, reducing the need to modify DAG code for simple configuration changes.

Accessing and Managing Variables via the UI

Airflow provides a user-friendly interface for managing variables. You can access this feature through the Airflow webserver:

  1. Navigate to the Airflow UI.
  2. Click on the "Admin" menu item in the top navigation bar.
  3. Select "Variables".

On the Variables page, you can create, edit, and delete variables, and import or export them in bulk.

For sensitive values, it's recommended to use Airflow's secrets backend integrations.
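
If many non-sensitive values need to be loaded at once, the Variables page can import them from a JSON file that maps each key to its value. The following is a minimal sketch of generating such a file. The function name, file name, and keys are hypothetical, and the sketch assumes plain string values.

```python
import json
from pathlib import Path


def write_variables_file(variables: dict, path: Path) -> None:
    """Write a key-to-value mapping as a single JSON object for bulk import."""
    # sort_keys keeps the file stable across runs, which makes diffs easier to review.
    text = json.dumps(variables, indent=2, sort_keys=True)
    path.write_text(text, encoding="utf-8")


# Example call (hypothetical keys and file name):
# write_variables_file(
#     {"database_name": "airflow_db", "my_config_value": "production"},
#     Path("variables.json"),
# )
```

Keeping such a file under version control gives a reviewable record of configuration changes, provided it never contains secrets.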

Programmatic Access to Variables

You can access Airflow Variables directly within your Python DAG code using the Variable class from the airflow.models module.

Retrieving a Variable

To get the value of a variable, use the Variable.get() method.


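from airflow.models import Variable

# Get the value of the variable with key 'my_api_key'.
# Raises KeyError if the variable does not exist and no default is given.
api_key = Variable.get("my_api_key")
# Avoid printing secrets; confirm only that a value was loaded.
print(f"API key loaded: {bool(api_key)}")

# Get a variable, falling back to a default value if it doesn't exist
db_name = Variable.get("database_name", default_var="airflow_db")
print(f"Database Name: {db_name}")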
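
In Airflow 2.x, Variable.get() raises a KeyError when the key is missing and no default is given, so a task that needs several variables fails on the first one it cannot find. A minimal sketch of checking them all up front follows. The helper load_required and its get parameter are hypothetical names, not part of Airflow; get is expected to behave like Variable.get, and the last lines substitute a plain dictionary so the sketch runs without an Airflow installation.

```python
from typing import Callable, Dict, Iterable


def load_required(get: Callable[[str], str], keys: Iterable[str]) -> Dict[str, str]:
    """Fetch every key via `get` and report all missing keys in a single error."""
    values: Dict[str, str] = {}
    missing = []
    for key in keys:
        try:
            values[key] = get(key)
        except KeyError:
            missing.append(key)
    if missing:
        raise KeyError(f"Missing Airflow Variables: {', '.join(sorted(missing))}")
    return values


# Stand-in for Variable.get (an assumption for this demo):
# a dict lookup also raises KeyError for unknown keys.
fake_store = {"my_api_key": "abc123", "database_name": "airflow_db"}
settings = load_required(fake_store.__getitem__, ["my_api_key", "database_name"])
```

Inside a task, the same helper would be called as load_required(Variable.get, [...]), which surfaces every missing key in one failure instead of one per run.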
            

Setting a Variable (Less Common in DAGs)

Setting variables is not typically done within DAG definitions: the scheduler re-parses DAG files frequently, so a top-level Variable.set() call would run on every parse. Setting variables programmatically is more common in custom operators or setup scripts.


import json

from airflow.models import Variable

# Set a new variable or update an existing one
Variable.set("my_config_value", "production")

# Set a variable with a JSON payload (stored as a string)
config_data = {"timeout": 60, "retries": 3}
Variable.set("app_settings", json.dumps(config_data))
            

Deleting a Variable (Less Common in DAGs)

Similar to setting, deletion is usually an administrative task.


from airflow.models import Variable

# Delete a variable
Variable.delete("old_api_key")
            

Variable Types and Serialization

Airflow Variables store values as strings. To store more complex data structures such as lists or dictionaries, serialize them to a string format (e.g., JSON) before setting and deserialize them after retrieving. Variable.set() and Variable.get() also accept serialize_json=True and deserialize_json=True respectively, which perform the JSON conversion for you.


import json
from airflow.models import Variable

# Storing a dictionary
my_dict = {"host": "localhost", "port": 5432}
Variable.set("db_connection_details", json.dumps(my_dict))

# Retrieving and deserializing
retrieved_json = Variable.get("db_connection_details")
db_details = json.loads(retrieved_json)
print(f"DB Host: {db_details['host']}, Port: {db_details['port']}")

# Storing a list
my_list = ["user1", "user2", "admin"]
Variable.set("allowed_users", json.dumps(my_list))

# Retrieving and deserializing
retrieved_list_json = Variable.get("allowed_users")
user_list = json.loads(retrieved_list_json)
print(f"Allowed users: {user_list}")
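
Because json.loads() returns untyped dictionaries, a misspelled key or a value of the wrong type only surfaces when a task uses it. One option is to convert the decoded payload into a small typed object straight away. The sketch below reuses the timeout/retries payload from the app_settings example; the AppSettings class and parse_app_settings function are hypothetical names, and the raw string is passed in directly so the code runs without Airflow.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class AppSettings:
    timeout: int
    retries: int


def parse_app_settings(raw: str) -> AppSettings:
    """Decode a JSON string and validate it into an AppSettings instance."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("app_settings must be a JSON object")
    # A missing key raises KeyError; a non-numeric value raises ValueError or TypeError.
    return AppSettings(timeout=int(data["timeout"]), retries=int(data["retries"]))


# In a task this string would come from Variable.get("app_settings").
example = parse_app_settings('{"timeout": 60, "retries": 3}')
```

Failing at parse time keeps malformed configuration from reaching the code that depends on it.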
            

Security Considerations

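Variables can store sensitive information such as passwords and API keys, so it is crucial to handle them securely. Airflow encrypts Variable values in the metadata database when a Fernet key is configured, and it masks the values of Variables whose keys contain words such as "password" or "secret" in the UI and task logs.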
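
One easy way to leak a secret is to print it or write it to a task log. A minimal sketch of a redaction helper follows. mask_secret is a hypothetical function, not an Airflow API; it keeps only the last few characters of a value so that log lines remain useful for telling keys apart.

```python
def mask_secret(value: str, visible: int = 4) -> str:
    """Return `value` with all but the last `visible` characters replaced by '*'."""
    if visible <= 0 or len(value) <= visible:
        # Too short to reveal anything safely: mask the whole value.
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]


# Hypothetical key value, used only for illustration.
print(f"API Key: {mask_secret('sk_live_1234567890')}")
```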

Best Practices for Using Variables