How to Setup Alertmanager


Nov 10, 2025 - 12:00

Alertmanager is a critical component of the Prometheus monitoring ecosystem, designed to handle alerts sent by Prometheus servers and route them to the appropriate notification channels. Whether you're managing cloud infrastructure, microservices, or on-premise systems, effective alerting is non-negotiable for maintaining system reliability and minimizing downtime. Alertmanager doesn't just send notifications; it consolidates, deduplicates, and silences alerts to prevent alert fatigue, ensuring that your team receives only the most relevant and actionable information.

Unlike basic alerting tools that fire off every minor anomaly, Alertmanager provides intelligent routing based on labels, grouping rules, and time-based policies. It supports integrations with email, Slack, PagerDuty, Microsoft Teams, Webhooks, and more, making it highly adaptable to any operational workflow. Setting up Alertmanager correctly is not just a technical task; it's a strategic decision that impacts your team's responsiveness, system uptime, and overall operational maturity.

This guide walks you through every step of configuring Alertmanager from scratch, covering installation, configuration, integration with Prometheus, best practices, real-world examples, and troubleshooting. By the end, you'll have a fully functional, production-ready alerting system that scales with your infrastructure.

Step-by-Step Guide

Prerequisites

Before beginning the setup, ensure you have the following:

  • A Linux-based server (Ubuntu 20.04/22.04 or CentOS 8/9 recommended)
  • Access to the command line with sudo privileges
  • Prometheus server already installed and running
  • Basic understanding of YAML configuration files
  • Network access to external notification services (e.g., Slack, email SMTP)

Alertmanager is designed to work alongside Prometheus, so if you haven't installed Prometheus yet, begin by following the official Prometheus installation guide. Once Prometheus is operational, proceed with Alertmanager setup.

Step 1: Download and Install Alertmanager

Alertmanager is distributed as a binary executable. Visit the official GitHub releases page to find the latest stable version. As of this writing, version 0.26.0 is recommended for production use.

Use wget to download the binary:

wget https://github.com/prometheus/alertmanager/releases/download/v0.26.0/alertmanager-0.26.0.linux-amd64.tar.gz

Extract the archive:

tar xvfz alertmanager-0.26.0.linux-amd64.tar.gz

Move the extracted files to a standard system location:

sudo mv alertmanager-0.26.0.linux-amd64/alertmanager /usr/local/bin/

sudo mv alertmanager-0.26.0.linux-amd64/amtool /usr/local/bin/

Create a dedicated system user for Alertmanager to run under for security:

sudo useradd --no-create-home --shell /bin/false alertmanager

Step 2: Create Configuration Directories

Organize your Alertmanager files in a standard structure:

sudo mkdir -p /etc/alertmanager

sudo mkdir -p /etc/alertmanager/templates

sudo mkdir -p /var/lib/alertmanager

Set ownership to the alertmanager user:

sudo chown alertmanager:alertmanager /usr/local/bin/alertmanager

sudo chown alertmanager:alertmanager /usr/local/bin/amtool

sudo chown alertmanager:alertmanager /etc/alertmanager

sudo chown alertmanager:alertmanager /var/lib/alertmanager

Step 3: Create the Alertmanager Configuration File

The core of Alertmanager is its configuration file: /etc/alertmanager/alertmanager.yml. This YAML file defines how alerts are routed, grouped, silenced, and notified.

Create the file:

sudo nano /etc/alertmanager/alertmanager.yml

Heres a minimal but functional configuration:

global:
  resolve_timeout: 5m
  smtp_smarthost: 'smtp.gmail.com:587'
  smtp_from: 'your-email@gmail.com'
  smtp_auth_username: 'your-email@gmail.com'
  smtp_auth_password: 'your-app-password'
  smtp_hello: 'localhost'
  smtp_require_tls: true

route:
  group_by: ['alertname', 'cluster', 'service']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 3h
  receiver: 'email-notifications'

receivers:
  - name: 'email-notifications'
    email_configs:
      - to: 'ops-team@example.com'
        html: '{{ template "email.default.html" . }}'
        headers:
          subject: '[Alertmanager] {{ .CommonLabels.alertname }} - {{ .CommonLabels.severity }}'

templates:
  - '/etc/alertmanager/templates/email.tmpl'

Let's break down the key sections:

  • global: Defines default settings for all alerts, including SMTP server details for email notifications and timeout values.
  • route: Determines how alerts are grouped and routed. group_by ensures similar alerts are bundled together. group_wait delays the initial notification to allow more alerts to accumulate. repeat_interval controls how often a notification is re-sent while an alert remains unresolved.
  • receivers: Specifies where alerts should be sent. In this case, an email receiver named email-notifications.
  • templates: Points to custom email templates for richer notification formatting.
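To make group_by concrete, here is a minimal Python sketch of the grouping idea (illustrative only, not Alertmanager's actual implementation): alerts that share the same values for the configured labels end up in a single notification group.

```python
# Minimal sketch of Alertmanager-style grouping (illustrative only).
# Alerts sharing the same values for the group_by labels land in one group.

GROUP_BY = ("alertname", "cluster", "service")

def group_key(alert_labels):
    """Build the grouping key from the configured group_by labels."""
    return tuple(alert_labels.get(name, "") for name in GROUP_BY)

def group_alerts(alerts):
    """Bucket alerts into notification groups keyed by their group_by values."""
    groups = {}
    for labels in alerts:
        groups.setdefault(group_key(labels), []).append(labels)
    return groups

alerts = [
    {"alertname": "InstanceDown", "cluster": "prod", "service": "api", "instance": "10.0.0.1"},
    {"alertname": "InstanceDown", "cluster": "prod", "service": "api", "instance": "10.0.0.2"},
    {"alertname": "HighLatency", "cluster": "prod", "service": "web", "instance": "10.0.0.3"},
]

groups = group_alerts(alerts)
print(len(groups))  # 2: both InstanceDown alerts are bundled into one group
```

The two InstanceDown alerts differ only in instance, which is not part of the grouping key, so they produce a single notification.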

For production environments, avoid hardcoding passwords. Use environment variables or secret management tools like HashiCorp Vault or Kubernetes Secrets.

Step 4: Create a Custom Email Template (Optional but Recommended)

Custom templates improve readability and provide context. Create a template file:

sudo nano /etc/alertmanager/templates/email.tmpl

Add the following Go template:

{{ define "email.default.html" }}
<html>
<head>
<style>
body { font-family: Arial, sans-serif; }
.alert { background-color: #f8d7da; border-left: 4px solid #721c24; padding: 10px; margin: 10px 0; }
.label { font-weight: bold; color: #495057; }
</style>
</head>
<body>
<h2>Alertmanager Notification</h2>
<div class="alert">
<p><span class="label">Alert Name:</span> {{ .CommonLabels.alertname }}</p>
<p><span class="label">Severity:</span> {{ .CommonLabels.severity }}</p>
<p><span class="label">Instance:</span> {{ .CommonLabels.instance }}</p>
<p><span class="label">Description:</span> {{ .CommonAnnotations.description }}</p>
{{ range .Alerts }}
<p><span class="label">Start Time:</span> {{ .StartsAt }}</p>
<p><a href="{{ .GeneratorURL }}">View in Prometheus</a></p>
{{ end }}
</div>
</body>
</html>
{{ end }}

This template renders a clean, styled HTML email with key alert details and a direct link to the Prometheus UI for deeper investigation.

Step 5: Configure Prometheus to Send Alerts to Alertmanager

Alertmanager doesn't generate alerts; it receives them from Prometheus. You must configure Prometheus to forward alerts to Alertmanager.

Open your Prometheus configuration file (typically /etc/prometheus/prometheus.yml):

sudo nano /etc/prometheus/prometheus.yml

Add or update the alerting section:

alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - localhost:9093

Ensure that the alerting block is at the same level as scrape_configs and not nested inside it.

Also, verify that your alert rules are defined in a separate file (e.g., /etc/prometheus/alerts.yml) and referenced in the main config:

rule_files:
  - "alerts.yml"

Example alert rule (/etc/prometheus/alerts.yml):

groups:
  - name: example
    rules:
      - alert: HighRequestLatency
        expr: job:request_latency_seconds:mean5m{job="myjob"} > 0.5
        for: 10m
        labels:
          severity: page
        annotations:
          summary: "High request latency detected"
          description: "{{ $labels.instance }} has a high request latency of {{ $value }}s."
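The for: 10m clause means the expression must stay above the threshold continuously for ten minutes before the alert fires; until then it sits in a pending state. A rough Python sketch of that state machine (illustrative only, not Prometheus's actual implementation):

```python
# Illustrative sketch of the "for:" clause: an alert only fires after its
# condition has held continuously for the configured duration.

FOR_DURATION = 10 * 60  # seconds

def alert_state(samples, threshold=0.5, for_duration=FOR_DURATION):
    """samples: list of (timestamp_seconds, value), oldest first.
    Returns 'inactive', 'pending', or 'firing'."""
    pending_since = None
    state = "inactive"
    for ts, value in samples:
        if value > threshold:
            if pending_since is None:
                pending_since = ts  # breach just started
            state = "firing" if ts - pending_since >= for_duration else "pending"
        else:
            pending_since = None  # condition cleared; timer resets
            state = "inactive"
    return state

# Latency breaches the threshold but only for 5 minutes: still pending.
print(alert_state([(0, 0.9), (300, 0.9)]))              # pending
# Breach holds for a full 10 minutes: the alert fires.
print(alert_state([(0, 0.9), (300, 0.9), (600, 0.9)]))  # firing
```

Note that any dip below the threshold resets the timer, which is exactly why for: protects you from transient spikes.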

Restart Prometheus after making changes:

sudo systemctl restart prometheus

Step 6: Create a Systemd Service for Alertmanager

To ensure Alertmanager starts automatically on boot and restarts on failure, create a systemd service file:

sudo nano /etc/systemd/system/alertmanager.service

Add the following content:

[Unit]
Description=Alertmanager
Wants=network-online.target
After=network-online.target

[Service]
Type=simple
User=alertmanager
Group=alertmanager
ExecStart=/usr/local/bin/alertmanager \
  --config.file=/etc/alertmanager/alertmanager.yml \
  --storage.path=/var/lib/alertmanager \
  --web.listen-address=0.0.0.0:9093
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target

Note that templates are loaded via the templates section of alertmanager.yml, not a command-line flag.

Reload systemd and enable the service:

sudo systemctl daemon-reload

sudo systemctl enable alertmanager

sudo systemctl start alertmanager

Verify the service is running:

sudo systemctl status alertmanager

You should see active (running). If not, check logs with:

journalctl -u alertmanager -f

Step 7: Access the Alertmanager Web UI

Alertmanager includes a built-in web interface for monitoring active alerts, silences, and configuration status. By default, it runs on port 9093.

Open your browser and navigate to:

http://your-server-ip:9093

You'll see a dashboard showing:

  • Active alerts grouped by labels
  • Alert history
  • Configuration validation status
  • Silence management interface

Test the setup by posting a simulated alert directly to Alertmanager with amtool:

amtool alert add --alertmanager.url=http://localhost:9093 alertname=TestAlert severity=warning

Or temporarily stop a monitored target so its up metric drops to 0 and the corresponding rule fires. Within minutes, you should receive an email notification and see the alert appear in the web UI.

Step 8: Integrate with Slack (Optional but Highly Recommended)

Slack is one of the most popular notification channels. To integrate:

  1. Create an incoming webhook in your Slack workspace: go to https://your-workspace.slack.com/apps → search for Incoming Webhooks → Add to workspace → Create new webhook → copy the webhook URL.
  2. Update your alertmanager.yml to include a Slack receiver:
receivers:
  - name: 'slack-notifications'
    slack_configs:
      - api_url: 'https://hooks.slack.com/services/YOUR/WEBHOOK/URL'
        channel: '#alerts'
        send_resolved: true
        text: |
          {{ .CommonLabels.alertname }} - {{ .CommonLabels.severity }}
          {{ range .Alerts }}
          *Description:* {{ .Annotations.description }}
          *Instance:* {{ .Labels.instance }}
          *Starts At:* {{ .StartsAt.Format "2006-01-02 15:04:05" }}
          <{{ .GeneratorURL }}|View in Prometheus>
          {{ end }}
  - name: 'email-notifications'
    email_configs:
      - to: 'ops-team@example.com'
        html: '{{ template "email.default.html" . }}'

route:
  group_by: ['alertname', 'cluster', 'service']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 3h
  receiver: 'slack-notifications'
  routes:
    - match:
        severity: 'page'
      receiver: 'slack-notifications'
    - match:
        severity: 'warning'
      receiver: 'email-notifications'

Key points:

  • Use send_resolved: true to notify when an alert clears.
  • Use nested routes to send critical alerts to Slack and warnings to email.
  • Always test webhook integration with a dummy alert before relying on it in production.
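The routing behavior above can be pictured as a small decision procedure: Alertmanager walks the child routes in order and uses the first one whose match labels are all present on the alert, falling back to the top-level receiver. A simplified Python sketch (single-level routes only, illustrative, not Alertmanager's actual implementation):

```python
# Simplified sketch of Alertmanager route matching (single level, illustrative).
# The first child route whose match labels all equal the alert's labels wins;
# otherwise the top-level receiver is used.

ROUTE = {
    "receiver": "slack-notifications",  # top-level default
    "routes": [
        {"match": {"severity": "page"}, "receiver": "slack-notifications"},
        {"match": {"severity": "warning"}, "receiver": "email-notifications"},
    ],
}

def pick_receiver(alert_labels, route=ROUTE):
    for child in route.get("routes", []):
        if all(alert_labels.get(k) == v for k, v in child["match"].items()):
            return child["receiver"]
    return route["receiver"]

print(pick_receiver({"severity": "warning", "service": "api"}))  # email-notifications
print(pick_receiver({"severity": "info"}))                       # slack-notifications (default)
```

Real routes also support continue, regex matchers, and nesting, but the first-match-wins principle is the same.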

Restart Alertmanager after changes:

sudo systemctl restart alertmanager

Best Practices

1. Use Labels and Annotations Effectively

Labels (e.g., severity, instance, job) are used for grouping and routing. Annotations (e.g., description, summary) provide human-readable context. Always define consistent labels across all alert rules. Use severity with values like info, warning, critical, and page to enable tiered alerting.

2. Avoid Alert Storms with Grouping and Suppression

Alertmanager's grouping feature reduces noise by combining similar alerts. For example, if 50 servers lose connectivity due to a network outage, Alertmanager sends one grouped alert instead of 50 individual ones. Combine this with group_wait (e.g., 30s) to allow time for multiple alerts to accumulate before notification.

3. Set Appropriate Repeat Intervals

Too frequent repeats (e.g., every 5 minutes) cause alert fatigue. For critical alerts, 1-3 hours is sufficient. Use repeat_interval to prevent repetitive notifications for unresolved issues.

4. Use Silences Strategically

Silences allow you to temporarily mute alerts during maintenance windows or known outages. Always include a reason and expiration time when creating a silence. Use the web UI or amtool to manage them:

amtool silence add --author="admin" --comment="Maintenance" --duration=2h alertname=HighCPUUsage
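Conceptually, a silence is a set of label matchers plus an expiry time; any alert whose labels satisfy every matcher is muted until the silence expires. A rough Python sketch of that check (equality matchers only; real silences also support regex matchers and are managed by Alertmanager itself):

```python
import time

# Rough sketch of silence matching (equality matchers only, illustrative).
# An alert is muted if any active (non-expired) silence matches all its matchers.

def is_silenced(alert_labels, silences, now=None):
    now = time.time() if now is None else now
    for silence in silences:
        active = now < silence["ends_at"]
        matches = all(alert_labels.get(k) == v
                      for k, v in silence["matchers"].items())
        if active and matches:
            return True
    return False

silences = [{"matchers": {"alertname": "HighCPUUsage"}, "ends_at": 1000.0}]
print(is_silenced({"alertname": "HighCPUUsage", "instance": "web-1"}, silences, now=500.0))   # True
print(is_silenced({"alertname": "HighCPUUsage", "instance": "web-1"}, silences, now=2000.0))  # False: expired
```

Because silences expire on their own, a forgotten maintenance silence cannot permanently hide a real outage.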

5. Separate Environments

Use different Alertmanager instances or routing rules for dev, staging, and production. For example, route all dev alerts to a #dev-alerts Slack channel and production alerts to #prod-alerts. This prevents noise from non-critical environments.

6. Secure Configuration Files

Never store secrets like SMTP passwords or Slack webhook URLs in plaintext. Note that Alertmanager does not expand environment variables inside its YAML configuration, so use the file-based variants, which read the secret from disk at runtime:

smtp_auth_password_file: '/etc/alertmanager/secrets/smtp_password'

Restrict the secret file to the alertmanager user (chmod 600), then restart Alertmanager after changes:

sudo systemctl restart alertmanager

Alternatively, use tools like Vault, AWS Secrets Manager, or Kubernetes Secrets in containerized environments.

7. Monitor Alertmanager Itself

Alertmanager exposes metrics at /metrics. Set up a Prometheus scrape job for Alertmanager to monitor its health:

- job_name: 'alertmanager'
  static_configs:
    - targets: ['localhost:9093']

Then create an alert for when Alertmanager is down:

- alert: AlertmanagerDown
  expr: up{job="alertmanager"} == 0
  for: 5m
  labels:
    severity: critical
  annotations:
    summary: "Alertmanager is down"
    description: "Alertmanager has been unreachable for 5 minutes."

8. Test Alert Rules Before Deployment

Use Prometheus's promtool to validate alert rules:

promtool check rules /etc/prometheus/alerts.yml

Also, simulate alert routing with amtool config routes test --config.file=/etc/alertmanager/alertmanager.yml severity=page, which prints the receiver that would handle an alert with those labels.

Tools and Resources

Configuration Validators

  • promtool: Command-line utility to validate Prometheus configurations and rule files (use amtool to validate Alertmanager configs).
  • YAML Linter: Use online tools like YAMLLint to catch syntax errors before restarting services.

Monitoring and Debugging Tools

  • amtool: Command-line interface for managing silences, checking configuration, and testing routing.
  • curl: Test Alertmanager's HTTP API, e.g., curl http://localhost:9093/api/v2/alerts
  • Wireshark / tcpdump: For debugging webhook delivery failures.

Real Examples

Example 1: E-Commerce Platform Alerting

Scenario: An online store uses Prometheus to monitor API latency, error rates, and database connections.

Alert Rules:

  • High API Error Rate: rate(http_requests_total{status=~"5.."}[5m]) > 0.05
  • Database Connection Pool Exhausted: database_connections{type="active"} / database_connections{type="max"} > 0.9
  • Checkout Service Unreachable: up{job="checkout-service"} == 0
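The error-rate rule above relies on rate(), which converts a monotonically increasing counter into a per-second rate over the lookback window. A simplified Python sketch of the idea (ignoring counter resets and extrapolation, which real Prometheus handles):

```python
# Illustrative sketch of what rate(http_requests_total{status=~"5.."}[5m]) computes:
# the per-second increase of a counter over a window.

def simple_rate(samples):
    """samples: list of (timestamp_seconds, counter_value), oldest first."""
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    return (v1 - v0) / (t1 - t0)

# 90 new 5xx responses over a 300-second window -> 0.3 errors/second,
# which breaches the > 0.05 threshold in the rule above.
samples = [(0, 1000), (150, 1040), (300, 1090)]
print(simple_rate(samples))  # 0.3
```

This is why rules threshold on rates rather than raw counter values: the counter itself only ever grows.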

Alertmanager Routing:

  • All severity: page alerts → Slack channel #prod-alerts + PagerDuty
  • All severity: warning alerts → Email to dev team + Slack channel #dev-warnings
  • Alerts with service=checkout → Escalate to on-call engineer after 15 minutes

Outcome: During a Black Friday sale, a spike in errors triggered a grouped alert. The team responded within 3 minutes, identified a misconfigured load balancer, and restored service before revenue loss occurred.

Example 2: Kubernetes Cluster Monitoring

Scenario: A team manages 20+ Kubernetes clusters across multiple regions.

Alert Rules:

  • Kubelet Down: up{job="kubelet"} == 0
  • Pod CrashLoopBackOff: sum by (namespace, pod) (increase(kube_pod_container_status_restarts_total[1h])) > 5
  • Node Memory Pressure: node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes

Alertmanager Configuration:

  • Group by cluster, namespace, alertname
  • Send cluster-wide alerts to #k8s-alerts
  • Send namespace-specific alerts to team Slack channels (e.g., #team-frontend, #team-backend)
  • Use silences for scheduled maintenance windows

Outcome: During a node upgrade, 120 alerts were grouped into 12 consolidated messages. The team avoided alert fatigue and focused on the root cause.

Example 3: Hybrid Cloud Infrastructure

Scenario: A company runs workloads on AWS, Azure, and on-premises data centers.

Challenge: Different teams manage different environments with varying SLAs.

Solution:

  • Use labels: cloud_provider=aws, cloud_provider=azure, cloud_provider=onprem
  • Route AWS alerts to cloud team, on-prem alerts to internal IT
  • Set longer repeat intervals for on-prem alerts (4h) due to slower response cycles
  • Use a custom webhook to send alerts to an internal ticketing system

Result: Reduced misrouted alerts by 85%. Each team now receives only alerts relevant to their domain.

FAQs

Q1: Can Alertmanager work without Prometheus?

No. Alertmanager is designed as a companion to Prometheus. It does not generate alerts; it only receives and routes them. Other monitoring systems (e.g., Zabbix, Datadog) have their own alerting engines.

Q2: How do I test if my Alertmanager configuration is valid?

Use the amtool command:

amtool config check /etc/alertmanager/alertmanager.yml

It will return Success or list syntax errors. Always validate before restarting the service.

Q3: Why am I not receiving email alerts?

Common causes:

  • Incorrect SMTP credentials or port
  • Two-factor authentication enabled on the email account (use app-specific passwords)
  • Firewall blocking outbound SMTP traffic
  • Email being marked as spam

Check the Alertmanager logs (journalctl -u alertmanager -f) for SMTP errors.

Q4: How do I silence an alert permanently?

You cannot silence alerts permanently. Silences have a fixed duration (e.g., 1h, 1d). For long-term suppression, modify or remove the alert rule itself, or use inhibition rules to automatically mute alerts while a related alert is firing.
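If the goal is to suppress lower-severity alerts whenever a related higher-severity alert is already firing, inhibition rules in alertmanager.yml are a better fit than silences. A sketch (the severity values and equal labels are illustrative):

```yaml
inhibit_rules:
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'warning'
    # Only inhibit when these labels match between the two alerts
    equal: ['alertname', 'instance']
```

With this rule, a warning for an instance stays muted as long as a critical alert with the same alertname and instance is active.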

Q5: Can I use Alertmanager with multiple Prometheus servers?

Yes. Configure each Prometheus server to send alerts to the same Alertmanager instance. Use labels like prometheus_cluster to distinguish sources in routing.
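One way to attach that label, assuming you control each server's Prometheus configuration, is external_labels in prometheus.yml; these labels are added to every alert (and metric) the server forwards. The cluster name below is illustrative:

```yaml
# In each Prometheus server's prometheus.yml
global:
  external_labels:
    prometheus_cluster: 'us-east-prod'
```

Alertmanager routes can then match on prometheus_cluster to send each server's alerts to the right team.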

Q6: What's the difference between Alertmanager and Prometheus alert rules?

Prometheus alert rules define when to trigger an alert (e.g., CPU > 90% for 5m). Alertmanager defines how to handle the alert (e.g., group by service, send to Slack, wait 30s, repeat every 3h). They work together but serve different purposes.

Q7: How do I upgrade Alertmanager?

Download the new binary, stop the service, replace the executable, validate the config, then restart. Always test in a staging environment first.

Q8: Is Alertmanager stateful? Does it store alerts?

Yes. Alertmanager stores active alerts and silences in its storage.path directory. If it restarts, it retains pending alerts. However, it does not persist historical alert data. For long-term alert history, integrate with external systems like Loki or Grafana.

Conclusion

Setting up Alertmanager is more than a technical configuration; it's a foundational step toward building a resilient, observable infrastructure. When properly configured, Alertmanager transforms raw metric anomalies into actionable, prioritized alerts that empower your team to respond quickly and confidently.

In this guide, we covered the full lifecycle of Alertmanager setup: from downloading and installing the binary, to writing precise routing rules, integrating with Slack and email, securing secrets, and validating configurations. We explored best practices that prevent alert fatigue, ensure scalability, and align alerting with real-world operational needs. Real-world examples demonstrated how organizations across industries, from e-commerce to Kubernetes clusters, leverage Alertmanager to reduce downtime and improve system reliability.

Remember: the goal of alerting is not to notify you of every small fluctuation, but to ensure you're alerted to the right problems at the right time. Avoid over-alerting. Prioritize ruthlessly. Test continuously. Monitor Alertmanager itself. And always keep your templates clean, your labels consistent, and your silences intentional.

As your infrastructure grows, so should your alerting strategy. Alertmanager scales gracefully with your needs, and with the practices outlined here, you're now equipped to build an alerting system that's not just functional but exceptional.