Monitoring up and running with Docker Compose, Prometheus and Grafana


Usually, the monitoring setup for a cloud native application is deployed on Kubernetes with service discovery and high availability (e.g. using a Kubernetes operator like the Prometheus Operator). To quickly prototype dashboards and experiment with different metric types (e.g. histogram vs. gauge), you may want a similar setup locally. This post explains how to set up a Prometheus, Alertmanager and Grafana monitoring stack locally with Docker Compose.

First, let's define the general components of the stack: Prometheus to scrape, store and evaluate metrics, Alertmanager to route the resulting alerts, and Grafana to visualize everything on dashboards.

The following docker-compose.yml file summarizes the configuration of all those components:

## docker-compose.yml ##

version: '3'

volumes:
  prometheus_data: {}
  grafana_data: {}

services:

  alertmanager:
    container_name: alertmanager
    hostname: alertmanager
    image: prom/alertmanager
    volumes:
      - ./alertmanager.conf:/etc/alertmanager/alertmanager.conf
    command:
      - '--config.file=/etc/alertmanager/alertmanager.conf'
    ports:
      - 9093:9093

  prometheus:
    container_name: prometheus
    hostname: prometheus
    image: prom/prometheus
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - ./alert_rules.yml:/etc/prometheus/alert_rules.yml
      - prometheus_data:/prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
    links:
      - alertmanager:alertmanager
    ports:
      - 9090:9090

  grafana:
    container_name: grafana
    hostname: grafana
    image: grafana/grafana
    volumes:
      - ./grafana_datasources.yml:/etc/grafana/provisioning/datasources/all.yaml
      - ./grafana_config.ini:/etc/grafana/config.ini
      - grafana_data:/var/lib/grafana
    ports:
      - 3000:3000
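
All of the bind mounts above assume the configuration files sit next to docker-compose.yml, so the working directory ends up looking roughly like this (the file names are only a convention; they just have to match the volume entries in the compose file):

.
├── docker-compose.yml
├── prometheus.yml
├── alert_rules.yml
├── alertmanager.conf
├── grafana_config.ini
└── grafana_datasources.yml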

Second, optionally define the Alertmanager configuration (see the reference documentation - link). For local development you probably don't need much here, unless you want to test alerts being pushed to an external service by Alertmanager. For instance, the following configuration defines how alerts are sent to PagerDuty.

## alertmanager.conf ##

global:
  resolve_timeout: 1m
  pagerduty_url: 'https://events.pagerduty.com/v2/enqueue'

route:
  receiver: 'pagerduty-notifications'

receivers:
- name: 'pagerduty-notifications'
  pagerduty_configs:
  - service_key: 0c1cc665a594419b6d215e81f4e38f7
    send_resolved: true
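
If you do not need any external integration locally, a minimal do-nothing configuration should be enough: a single catch-all receiver with no notification settings, so alerts are accepted but sent nowhere. A sketch (the receiver name 'default' is arbitrary):

route:
  receiver: 'default'

receivers:
  - name: 'default'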

Third, optionally define Prometheus alerting rules in alert_rules.yml. Grouping them in this file lets you validate that the alerts are triggered properly in your local setup before configuring them in the production instance. For instance, the following configuration defines an alert that fires when the ratio of committed to maximum heap memory crosses 80%.

## alert_rules.yml ##

groups:  
  - name: JVMMemory 
    rules:
      - alert: JVMMemoryThresholdCrossed
        # Condition for alerting
        expr: jvm_memory_committed_bytes{region="heap"}/jvm_memory_max_bytes{region="heap"} > 0.8
        # Annotation - additional informational labels to store more information
        annotations:
          title: 'Instance {{ $labels.instance }} has crossed 80% heap memory usage'
          description: '{{ $labels.instance }} of job {{ $labels.job }} has crossed 80% heap memory usage'
        # Labels - additional labels to be attached to the alert
        labels:
          severity: 'critical'
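
Before wiring the rules into Prometheus, you can sanity-check the file with promtool, which ships inside the prom/prometheus image. A quick sketch, assuming alert_rules.yml is in the current directory:

$ docker run --rm -v "$(pwd)":/cfg --entrypoint promtool prom/prometheus check rules /cfg/alert_rules.yml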

Fourth, and most importantly, define the Prometheus configuration in the prometheus.yml file. This defines the global scrape and evaluation intervals, where to send alerts (the Alertmanager started above), which rule files to load, and which targets to scrape.

This is an example configuration file:

## prometheus.yml ##

# global settings
global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

alerting:
  alertmanagers:
    - static_configs:
      - targets: ["alertmanager:9093"]

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  - /etc/prometheus/alert_rules.yml

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'vad-metrics'
    metrics_path: '/metrics'
    scrape_interval: 5s
    static_configs:
      - targets: ['docker.for.mac.host.internal:9091']

Note: in this case, Prometheus will connect to a metrics endpoint on port 9091 running outside Docker on the machine itself, hence the hostname docker.for.mac.host.internal.
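The whole configuration, including the referenced rule file, can also be validated with promtool before starting the stack. A sketch, assuming prometheus.yml and alert_rules.yml are both in the current directory (mounting it over /etc/prometheus so the rule_files path resolves):

$ docker run --rm -v "$(pwd)":/etc/prometheus --entrypoint promtool prom/prometheus check config /etc/prometheus/prometheus.yml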

Fifth, define the Grafana startup configuration. For instance, grafana_config.ini may look like this:

## grafana_config.ini ##

[paths]
provisioning = /etc/grafana/provisioning

[server]
enable_gzip = true
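
For a local-only setup you may also want to skip the Grafana login screen. One option (purely optional, and only sensible because nothing here is exposed beyond your machine) is to enable anonymous access in the same file:

[auth.anonymous]
enabled = true
org_role = Admin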

Sixth, define where the Grafana service can find the Prometheus service in grafana_datasources.yml. In our case, Prometheus is running in the container named prometheus, so the file looks like this:

## grafana_datasources.yml ##

apiVersion: 1

datasources:
  - name: 'prometheus'
    type: 'prometheus'
    access: 'proxy'
    url: 'http://prometheus:9090'
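
Dashboards can be provisioned the same way if you want them to survive container restarts as files rather than only in the grafana_data volume. A minimal sketch of a dashboard provider file; it assumes an extra bind mount in docker-compose.yml (e.g. to /etc/grafana/provisioning/dashboards/all.yaml) plus the dashboard JSON files placed in the configured path:

apiVersion: 1

providers:
  - name: 'default'
    orgId: 1
    folder: ''
    type: file
    options:
      path: /var/lib/grafana/dashboards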

Finally, after defining all the configuration files as well as the docker-compose file, we can start the monitoring stack with docker-compose up:

$ docker-compose up
Creating network "monitoring_default" with the default driver
Creating volume "monitoring_prometheus_data" with default driver
Creating volume "monitoring_grafana_data" with default driver
Creating grafana      ... done
Creating alertmanager ... done
Creating prometheus   ... done
Attaching to grafana, alertmanager, prometheus
alertmanager    | level=info ts=2022-01-03T21:52:24.751Z caller=main.go:225 msg="Starting Alertmanager" version="(version=0.23.0, branch=HEAD, revision=61046b17771a57cfd4c4a51be370ab930a4d7d54)"
prometheus      | level=info ts=2022-01-03T21:52:25.364Z caller=main.go:438 msg="Starting Prometheus" version="(version=2.30.3, branch=HEAD, revision=f29caccc42557f6a8ec30ea9b3c8c089391bd5df)"
grafana         | t=2022-01-03T21:52:26+0000 lvl=info msg="Live Push Gateway initialization" logger=live.push_http
grafana         | t=2022-01-03T21:52:26+0000 lvl=info msg="inserting datasource from configuration " logger=provisioning.datasources name=prometheus uid=
grafana         | t=2022-01-03T21:52:26+0000 lvl=info msg="HTTP Server Listen" logger=http.server address=[::]:3000 protocol=http subUrl= socket=

Once the containers are up and running, you can visit Prometheus at http://localhost:9090, Alertmanager at http://localhost:9093 and Grafana at http://localhost:3000, since those ports are mapped to the host in docker-compose.yml.
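
You can also verify the wiring from the command line. A quick sketch using the Prometheus and Grafana HTTP APIs; the second call assumes Grafana's default admin/admin credentials:

$ curl -s http://localhost:9090/api/v1/targets | grep -o '"health":"[^"]*"'
$ curl -s http://admin:admin@localhost:3000/api/datasources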

To stop the monitoring stack, run the following command. The -v flag also removes the named volumes (prometheus_data and grafana_data), so stored metrics and dashboards are discarded; drop it if you want to keep them.

$ docker-compose down -v