I've been using the built‑in Airflow alerts, but they're a bit limited for our multi‑cloud setup. Has anyone tried integrating PagerDuty or Opsgenie directly with Airflow? I'd love to hear about pros, cons, and any gotchas.
We switched to Prometheus + Grafana for metric‑based alerts. Airflow emits its metrics over StatsD, Prometheus scrapes them through a StatsD exporter, and Grafana handles the alert routing. It works well for us, especially for silencing alerts during maintenance windows.
We built a small wrapper around Airflow's on_failure_callback that publishes to an SNS topic. From there, we fan out to Slack, email, and even a custom dashboard. It's lightweight, and each consumer can subscribe to the topic however it needs to.
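
For anyone who wants a starting point, here is a minimal sketch of that kind of wrapper, not our exact code. It assumes boto3 is installed and AWS credentials are available on the workers. The names `build_failure_message` and `make_sns_failure_callback` and the payload fields are made up for illustration. The `sns_client` parameter is only there so the callback can be exercised with a stub.

```python
import json


def build_failure_message(context):
    """Turn an Airflow task context into a small JSON-serializable dict.

    The field selection is an illustrative assumption; include whatever
    your subscribers need.
    """
    ti = context["task_instance"]
    return {
        "dag_id": ti.dag_id,
        "task_id": ti.task_id,
        "run_id": context.get("run_id"),
        "try_number": ti.try_number,
        "log_url": ti.log_url,
        "exception": str(context.get("exception")),
    }


def make_sns_failure_callback(topic_arn, sns_client=None):
    """Return a function usable as an Airflow on_failure_callback.

    topic_arn:  ARN of the SNS topic to publish to.
    sns_client: optional pre-built client (hypothetical hook for testing);
                when omitted, a boto3 SNS client is created at call time.
    """

    def _callback(context):
        client = sns_client
        if client is None:
            # Imported lazily so that DAG parsing does not depend on boto3.
            import boto3

            client = boto3.client("sns")

        message = build_failure_message(context)
        subject = f"Airflow task failed: {message['dag_id']}.{message['task_id']}"
        client.publish(
            TopicArn=topic_arn,
            Subject=subject[:100],  # SNS limits subjects to 100 characters
            Message=json.dumps(message),
        )

    return _callback
```

To wire it in, pass the result as `on_failure_callback` in a DAG's `default_args` (or on individual tasks), for example `default_args={"on_failure_callback": make_sns_failure_callback(topic_arn)}` with your own topic ARN. Slack, email, and dashboard delivery then live entirely in the SNS subscriptions rather than in Airflow.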