Introduction to Prometheus

Open-source monitoring and alerting toolkit

Prometheus is an open-source systems monitoring and alerting toolkit originally built at SoundCloud. It is now a graduated project of the Cloud Native Computing Foundation (CNCF) and is widely used for monitoring cloud-native applications.

Why Prometheus?

Multi-dimensional data model

Prometheus stores time series data identified by metric name and key/value pairs (labels), enabling flexible querying and aggregation.

Powerful query language

PromQL (Prometheus Query Language) allows you to slice and dice your metrics data for dashboards, alerts, and ad-hoc analysis.

Pull-based architecture

Prometheus scrapes metrics from instrumented targets over HTTP, making it easy to monitor dynamic environments.
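
To make the pull model concrete, here is a minimal sketch of a scrape target built with only the Python standard library. The metric name `demo_requests_total`, the port 8000, and the helper names are assumptions made for illustration; a real application would normally use an official client library rather than rendering the text format by hand.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-process counter state, keyed by HTTP method.
REQUESTS = {"GET": 3, "POST": 1}


def render_metrics(requests):
    """Render counters in the Prometheus text exposition format."""
    lines = [
        "# HELP demo_requests_total Requests handled, by method.",
        "# TYPE demo_requests_total counter",
    ]
    for method, value in sorted(requests.items()):
        lines.append(f'demo_requests_total{{method="{method}"}} {value}')
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Prometheus scrapes /metrics by default.
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(REQUESTS).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Block forever, serving /metrics for Prometheus to scrape."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling `serve()` and adding the host and port as a scrape target is enough for Prometheus to start collecting the counter. The application never needs to know where the Prometheus server lives.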

Service discovery

Prometheus can automatically discover targets in cloud and container environments like Kubernetes, AWS, and Azure.

Core concepts

Metric types

Prometheus supports four metric types:

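Counter: A cumulative metric whose value only increases, or resets to zero when the process restarts (e.g., total requests)

http_requests_total

Gauge: A metric whose value can go up and down (e.g., temperature, memory usage)

temperature_celsius

Histogram: Samples observations and counts them in configurable cumulative buckets, alongside a sum and a count of all observations (e.g., request durations). Quantiles are estimated at query time with histogram_quantile().

http_request_duration_seconds_bucket

Summary: Like a histogram, it tracks a sum and a count of observations, but it calculates configurable quantiles on the client over a sliding time window, so the quantiles cannot be aggregated across instances.

http_request_duration_seconds{quantile="0.99"}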
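
The histogram type is the least obvious of the four, so a small sketch may help. The class below is a toy model, not the client library implementation; the bucket bounds and method names are assumptions chosen for illustration. It shows why the `_bucket` series are cumulative and where the `_sum` and `_count` series come from.

```python
import math


class Histogram:
    """Toy histogram with cumulative buckets, like the `_bucket` series."""

    def __init__(self, upper_bounds):
        # Every histogram has an implicit +Inf bucket that counts everything.
        self.bounds = sorted(upper_bounds) + [math.inf]
        self.bucket_counts = [0] * len(self.bounds)
        self.total = 0.0  # exposed as <name>_sum
        self.count = 0    # exposed as <name>_count

    def observe(self, value):
        self.total += value
        self.count += 1
        # Buckets are cumulative: an observation increments every bucket
        # whose upper bound (the `le` label) is >= the observed value.
        for i, bound in enumerate(self.bounds):
            if value <= bound:
                self.bucket_counts[i] += 1

    def samples(self, name):
        """Return the series this histogram would expose, as text lines."""
        out = []
        for bound, n in zip(self.bounds, self.bucket_counts):
            le = "+Inf" if bound == math.inf else str(bound)
            out.append(f'{name}_bucket{{le="{le}"}} {n}')
        out.append(f"{name}_sum {self.total}")
        out.append(f"{name}_count {self.count}")
        return out
```

For example, observing 0.05, 0.3, 0.7 and 2.0 seconds with bounds 0.1, 0.5 and 1.0 leaves the buckets at 1, 2, 3 and 4: each bucket counts every observation at or below its bound.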

Labels

Labels enable multi-dimensional data modeling:

http_requests_total{method="GET", endpoint="/api/users", status="200"}
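
As a rough illustration of what "multi-dimensional" means in practice, the sketch below treats each label set as a separate series and aggregates over one dimension, similar in spirit to `sum by (...)` in PromQL. The sample values and the `sum_by` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical samples: (label set, value) pairs for http_requests_total.
SAMPLES = [
    ({"method": "GET", "endpoint": "/api/users", "status": "200"}, 120.0),
    ({"method": "GET", "endpoint": "/api/users", "status": "500"}, 3.0),
    ({"method": "POST", "endpoint": "/api/users", "status": "200"}, 40.0),
]


def sum_by(samples, *keep):
    """Sum values across series, keeping only the labels named in `keep`."""
    totals = defaultdict(float)
    for labels, value in samples:
        # Series that agree on the kept labels collapse into one result.
        key = tuple((name, labels.get(name, "")) for name in keep)
        totals[key] += value
    return dict(totals)
```

Calling `sum_by(SAMPLES, "method")` collapses the three series into one total per HTTP method, while `sum_by(SAMPLES, "status")` slices the same data by status code instead.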

PromQL basics

# Instant vector: the latest sample of every matching series
http_requests_total

# Range vector: all samples from the last 5 minutes
http_requests_total[5m]

# Per-second rate of requests, averaged over the last 5 minutes
rate(http_requests_total[5m])

# Sum the request rate by label
sum by (method) (rate(http_requests_total[5m]))

# Filter by label with a regex matcher (5xx statuses)
http_requests_total{status=~"5.."}
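
The idea behind `rate()` can be sketched in a few lines of Python. This is a deliberate simplification: the real PromQL function also extrapolates to the edges of the range window, which this version does not. The function name and the sample data layout are assumptions for illustration.

```python
def simple_rate(samples):
    """Per-second increase of a counter over (timestamp, value) samples.

    Simplification: real PromQL `rate()` also extrapolates to the window
    boundaries; this sketch only handles counter resets.
    """
    if len(samples) < 2:
        return None  # a rate needs at least two samples
    increase = 0.0
    prev = samples[0][1]
    for _, value in samples[1:]:
        if value < prev:
            # Counter reset (e.g. a process restart): count from zero again.
            increase += value
        else:
            increase += value - prev
        prev = value
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed
```

Because a counter only goes up, any drop between two samples is treated as a reset rather than a negative increase. This is why `rate()` should be applied to counters and not to gauges.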

Architecture

Prometheus server

The main Prometheus server scrapes and stores time series data. It consists of:

Retrieval: Pulls metrics from targets
TSDB: Time series database for storage
HTTP server: Serves PromQL queries and web UI
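
The HTTP server also answers PromQL queries programmatically through its `/api/v1/query` endpoint. The sketch below builds such a request with the Python standard library; the base URL `http://localhost:9090` and the helper names are assumptions, and error handling is omitted.

```python
import json
import urllib.request
from urllib.parse import urlencode


def instant_query_url(base_url, promql):
    """Build the URL for an instant query against the HTTP API."""
    params = urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + params


def run_instant_query(base_url, promql, timeout=5):
    """Send the query and return the decoded JSON response."""
    with urllib.request.urlopen(instant_query_url(base_url, promql),
                                timeout=timeout) as resp:
        return json.load(resp)
```

With a server running locally, `run_instant_query("http://localhost:9090", "up")` would return a JSON document whose `data.result` field lists one entry per scraped target.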

Exporters

Exporters expose metrics from third-party systems that cannot be instrumented directly:

Node Exporter: Hardware and OS metrics
MySQL Exporter: MySQL server metrics
Redis Exporter: Redis metrics

Alertmanager

The Alertmanager handles alerts sent by the Prometheus server:

Deduplication
Grouping
Routing to receivers (email, Slack, PagerDuty)
Silencing and inhibition
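
Deduplication and grouping can be pictured with a short sketch. This is not how the Alertmanager is implemented; it only illustrates the idea that alerts with identical label sets are treated as one alert and that alerts sharing the chosen grouping labels are batched into a single notification. The function name and alert dictionaries are hypothetical.

```python
from collections import defaultdict


def group_alerts(alerts, group_by):
    """Deduplicate alerts by label set, then group them by `group_by`."""
    groups = defaultdict(dict)
    for alert in alerts:
        group_key = tuple(alert.get(label, "") for label in group_by)
        # Alerts with an identical label set are the same alert: keep one.
        fingerprint = tuple(sorted(alert.items()))
        groups[group_key][fingerprint] = alert
    return {key: list(members.values()) for key, members in groups.items()}
```

Grouping by `alertname`, for instance, turns many per-instance alerts of the same kind into one notification that lists the affected instances.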

Pushgateway

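Short-lived jobs, such as batch or cron jobs, may exit before Prometheus can scrape them. The Pushgateway accepts metrics pushed by these jobs and exposes them for Prometheus to scrape.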
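
A push is an HTTP request to the gateway's `/metrics/job/<job name>` path with the metrics in the text exposition format as the body. The sketch below builds such a request with the Python standard library. The gateway address `http://pushgateway:9091`, the job name, and the metric name are assumptions for illustration.

```python
import urllib.request
from urllib.parse import quote


def build_push_request(gateway_url, job, metric_name, value):
    """Build a PUT request that replaces the metrics stored for `job`."""
    # The body uses the text exposition format and must end with a newline.
    body = f"{metric_name} {value}\n".encode("utf-8")
    url = f"{gateway_url.rstrip('/')}/metrics/job/{quote(job, safe='')}"
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "text/plain; version=0.0.4"},
    )
```

A batch job would pass the result to `urllib.request.urlopen()` as its last step, for example to record the timestamp of its last successful run.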

Basic configuration

# prometheus.yml
global:
  scrape_interval: 15s
  evaluation_interval: 15s

alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - alertmanager:9093

rule_files:
  - "alerts.yml"

scrape_configs:
  - job_name: "prometheus"
    static_configs:
      - targets: ["localhost:9090"]

  - job_name: "node"
    static_configs:
      - targets: ["node-exporter:9100"]

Alert rules

# alerts.yml
groups:
  - name: example
    rules:
      - alert: HighMemoryUsage
        expr: node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes < 0.1
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High memory usage detected"
          description: "Memory available is less than 10%"
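
The `for` clause in the rule above means the alert does not fire the moment the expression becomes true. It first enters a pending state and only fires once the condition has held continuously for the given duration. The following sketch models that behaviour under the assumption of evenly spaced rule evaluations; the function name and parameters are hypothetical.

```python
def alert_states(conditions, interval_s=15, for_s=300):
    """Return 'inactive', 'pending' or 'firing' for each rule evaluation.

    `conditions` holds one boolean per evaluation: did the expression
    return a result at that point in time?
    """
    states = []
    active_since = None
    for i, holds in enumerate(conditions):
        now = i * interval_s
        if not holds:
            # Any evaluation where the condition is false resets the timer.
            active_since = None
            states.append("inactive")
            continue
        if active_since is None:
            active_since = now
        # Pending until the condition has held for the whole `for` duration.
        states.append("firing" if now - active_since >= for_s else "pending")
    return states
```

With the 15-second evaluation interval from the configuration above and `for: 5m`, a brief memory spike that clears within a few evaluations never leaves the pending state, which keeps short-lived blips from paging anyone.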

Integration with DevOps Hub

Monitor your DevOps Hub pipelines with Prometheus:

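scrape_configs:
  - job_name: "devopshub"
    scheme: https
    # Prometheus does not expand environment variables in this file, so
    # replace ${DEVOPSHUB_API_KEY} with the real key before loading it
    # (for example with envsubst at deploy time).
    authorization:
      type: Bearer
      credentials: ${DEVOPSHUB_API_KEY}
    static_configs:
      - targets: ["metrics.assistance.bg"]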