Introduction to Prometheus

Open-source monitoring and alerting toolkit


Prometheus is an open-source systems monitoring and alerting toolkit originally built at SoundCloud. It is now a graduated project of the Cloud Native Computing Foundation (CNCF) and is widely used to monitor cloud-native applications.

Why Prometheus?

Multi-dimensional data model

Prometheus stores time series data identified by metric name and key/value pairs (labels), enabling flexible querying and aggregation.
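
As a rough illustration of this data model (a sketch, not how Prometheus stores data internally), a series can be thought of as a metric name plus a label set, with a value attached. The helper names `series_key` and `sum_by` and the sample values are assumptions made for the example.

```python
from collections import defaultdict


def series_key(name, labels):
    """Identify a series by its metric name plus its sorted label pairs."""
    return (name, tuple(sorted(labels.items())))


# Latest sample value per series (illustrative data only).
samples = {
    series_key("http_requests_total", {"method": "GET", "status": "200"}): 120.0,
    series_key("http_requests_total", {"method": "GET", "status": "500"}): 3.0,
    series_key("http_requests_total", {"method": "POST", "status": "200"}): 40.0,
}


def sum_by(samples, name, label):
    """Aggregate all series of one metric, grouped by a single label."""
    totals = defaultdict(float)
    for (metric, label_pairs), value in samples.items():
        if metric == name:
            totals[dict(label_pairs).get(label, "")] += value
    return dict(totals)
```

Grouping by `method` collapses the `status` dimension, which is the same idea as aggregating over a label in a query.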

Powerful query language

PromQL (Prometheus Query Language) lets you select, filter, and aggregate time series for dashboards, alerts, and ad-hoc analysis.

Pull-based architecture

Prometheus scrapes metrics from instrumented targets over HTTP, making it easy to monitor dynamic environments.
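
The pull model can be sketched with the Python standard library alone: a target serves plain-text metrics at `/metrics`, and a scraper fetches them over HTTP. This is only an illustration under assumptions; the metric value, the handler class, and the `scrape_once` helper are made up for the example, and a real application would normally use a Prometheus client library.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Illustrative payload in the Prometheus text exposition format.
METRICS_TEXT = (
    "# TYPE http_requests_total counter\n"
    'http_requests_total{method="GET"} 42\n'
)


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = METRICS_TEXT.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def scrape_once():
    """Start a throwaway target on a free local port and scrape it once."""
    server = HTTPServer(("127.0.0.1", 0), MetricsHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    # Ignore any proxy settings for this local request.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        url = f"http://127.0.0.1:{server.server_port}/metrics"
        with opener.open(url, timeout=5) as response:
            return response.read().decode("utf-8")
    finally:
        server.shutdown()
        server.server_close()
```

The target only needs to answer HTTP GET requests; the scraper decides when and how often to collect.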

Service discovery

Prometheus can automatically discover targets in cloud and container environments like Kubernetes, AWS, and Azure.
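
Alongside the cloud and container integrations, Prometheus also supports file-based discovery (`file_sd_configs`), which reads target groups from JSON or YAML files and picks up changes to them. The sketch below writes such a file; the host names, label values, and the `write_targets` helper are assumptions for illustration.

```python
import json
import os
import tempfile


def write_targets(path, hosts, labels):
    """Write a file_sd target group list, replacing the file atomically."""
    groups = [{"targets": sorted(hosts), "labels": labels}]
    directory = os.path.dirname(os.path.abspath(path))
    with tempfile.NamedTemporaryFile(
        "w", dir=directory, delete=False, suffix=".tmp"
    ) as tmp:
        json.dump(groups, tmp, indent=2)
        tmp_name = tmp.name
    # Rename into place so a reader never sees a half-written file.
    os.replace(tmp_name, path)
    return groups
```

A small script like this can bridge any inventory source that has no built-in discovery mechanism.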

Core concepts

Metrics types

Prometheus supports four metric types:

Counter: A cumulative metric that only increases, or resets to zero on restart (e.g., total requests)

    http_requests_total

Gauge: A metric that can go up and down (e.g., temperature, memory usage)

    temperature_celsius

Histogram: Samples observations and counts them in configurable cumulative buckets, and also exposes the sum and count of all observations (e.g., request durations)

    http_request_duration_seconds_bucket

Summary: Similar to a histogram, but calculates configurable quantiles on the client side over a sliding time window

    http_request_duration_seconds
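
To make the histogram type more concrete, here is a toy sketch of cumulative buckets: each observation is counted in every bucket whose upper bound (`le`) is at least the observed value, and a sum and a count are kept alongside. The `Histogram` class below is a simplified stand-in written for this example, not a client library API.

```python
import math


class Histogram:
    """Toy histogram with cumulative buckets, in the spirit of Prometheus."""

    def __init__(self, upper_bounds):
        # The +Inf bucket always exists and counts every observation.
        self.upper_bounds = sorted(upper_bounds) + [math.inf]
        self.bucket_counts = [0] * len(self.upper_bounds)
        self.total = 0.0
        self.count = 0

    def observe(self, value):
        self.total += value
        self.count += 1
        # Cumulative: every bucket whose upper bound covers the value counts it.
        for i, bound in enumerate(self.upper_bounds):
            if value <= bound:
                self.bucket_counts[i] += 1

    def expose(self, name):
        """Render the series a histogram contributes, one line per series."""
        lines = []
        for bound, n in zip(self.upper_bounds, self.bucket_counts):
            le = "+Inf" if bound == math.inf else str(bound)
            lines.append(f'{name}_bucket{{le="{le}"}} {n}')
        lines.append(f"{name}_sum {self.total}")
        lines.append(f"{name}_count {self.count}")
        return lines
```

This is why a single histogram shows up as several series with the `_bucket`, `_sum`, and `_count` suffixes.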

Labels

Labels enable multi-dimensional data modeling:

    http_requests_total{method="GET", endpoint="/api/users", status="200"}
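
A labeled sample like the one above can be rendered as a line of the text exposition format. The sketch below is a minimal formatter; `format_sample` and `escape_label_value` are hypothetical helper names and the value is made up, while the escaping of backslashes, double quotes, and newlines in label values follows the text format rules.

```python
def escape_label_value(value):
    """Escape backslash, double quote, and newline inside a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def format_sample(name, labels, value):
    """Render one sample as `name{label="value",...} value`."""
    if not labels:
        return f"{name} {value}"
    pairs = ",".join(
        f'{key}="{escape_label_value(val)}"' for key, val in sorted(labels.items())
    )
    return f"{name}{{{pairs}}} {value}"
```

Each distinct combination of label values produces its own time series, so labels with very many possible values (user IDs, for example) are best avoided.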

PromQL basics

    # Instant vector
    http_requests_total

    # Range vector (last 5 minutes)
    http_requests_total[5m]

    # Per-second rate of requests, averaged over the last 5 minutes
    rate(http_requests_total[5m])

    # Sum by label
    sum by (method) (rate(http_requests_total[5m]))

    # Filter by label with a regex match (5xx status codes)
    http_requests_total{status=~"5.."}
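
To give an intuition for what `rate()` computes, the sketch below approximates the per-second increase of a counter over a window of samples and treats any drop in value as a counter reset. It is deliberately simplified: the real function also extrapolates to the window boundaries, and `simple_rate` is a hypothetical name.

```python
def simple_rate(samples):
    """Approximate per-second increase of a counter over a window.

    `samples` is a list of (unix_timestamp, value) pairs in time order.
    """
    if len(samples) < 2:
        return None  # not enough points in the window
    increase = 0.0
    for (_, previous), (_, current) in zip(samples, samples[1:]):
        if current >= previous:
            increase += current - previous
        else:
            # Counter went backwards: treat it as a reset to zero.
            increase += current
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed if elapsed > 0 else None
```

Because resets are handled inside the window, `rate()` should be applied to the raw counter before any aggregation such as `sum`.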

Architecture

Prometheus server

The main Prometheus server scrapes and stores time series data. It consists of:

  • Retrieval: Pulls metrics from targets
  • TSDB: Time series database for storage
  • HTTP server: Serves PromQL queries and web UI
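
The HTTP server exposes PromQL through the `/api/v1/query` endpoint of the HTTP API. As a small sketch, the helper below builds the URL for an instant query; the function name `instant_query_url` and the base URL used in the usage note are assumptions.

```python
from urllib.parse import urlencode


def instant_query_url(base_url, promql, at_time=None):
    """Build the URL for an instant query against the Prometheus HTTP API."""
    params = {"query": promql}
    if at_time is not None:
        # Optional evaluation timestamp (Unix seconds or RFC 3339).
        params["time"] = str(at_time)
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode(params)}"
```

For example, `instant_query_url("http://localhost:9090", "up")` yields a URL that can be fetched with any HTTP client; the response is JSON.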

Exporters

Exporters expose metrics from third-party systems:

  • Node Exporter: Hardware and OS metrics
  • MySQL Exporter: MySQL server metrics
  • Redis Exporter: Redis metrics

Alertmanager

Alertmanager handles alerts sent by the Prometheus server:

  • Deduplication
  • Grouping
  • Routing to receivers (email, Slack, PagerDuty)
  • Silencing and inhibition
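
The deduplication and grouping steps listed above can be sketched as follows. This is a conceptual illustration only, not Alertmanager's implementation; the alert dictionaries and the `group_alerts` helper are assumptions.

```python
from collections import defaultdict


def group_alerts(alerts, group_by):
    """Deduplicate alerts by full label set, then group by selected labels."""
    seen = set()
    groups = defaultdict(list)
    for alert in alerts:
        fingerprint = tuple(sorted(alert["labels"].items()))
        if fingerprint in seen:
            continue  # identical label set: a duplicate of an earlier alert
        seen.add(fingerprint)
        key = tuple(alert["labels"].get(name, "") for name in group_by)
        groups[key].append(alert)
    return dict(groups)
```

Each resulting group would become a single notification, which keeps a wide outage from producing one message per affected instance.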

Pushgateway

For short-lived jobs that may not run long enough to be scraped, the Pushgateway accepts pushed metrics and exposes them for Prometheus to scrape.
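
A push is a plain HTTP request to the Pushgateway's `/metrics/job/<job>` path with a body in the text exposition format. The sketch below only builds the request; the gateway address, the job name, the metric, and the `build_push_request` helper are assumptions for illustration.

```python
import urllib.request
from urllib.parse import quote


def build_push_request(gateway_url, job, metrics_text):
    """Build a request that replaces all metrics pushed for `job`."""
    if not metrics_text.endswith("\n"):
        metrics_text += "\n"  # the text format expects a trailing newline
    url = f"{gateway_url.rstrip('/')}/metrics/job/{quote(job, safe='')}"
    return urllib.request.Request(
        url,
        data=metrics_text.encode("utf-8"),
        method="PUT",  # PUT replaces the whole group; POST replaces by metric name
        headers={"Content-Type": "text/plain; version=0.0.4"},
    )
```

A batch job would call `urllib.request.urlopen(build_push_request("http://pushgateway:9091", "nightly_backup", "backup_last_success_timestamp_seconds 1700000000"))` just before exiting.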

Basic configuration

    # prometheus.yml
    global:
      scrape_interval: 15s
      evaluation_interval: 15s

    alerting:
      alertmanagers:
        - static_configs:
            - targets:
                - alertmanager:9093

    rule_files:
      - "alerts.yml"

    scrape_configs:
      - job_name: "prometheus"
        static_configs:
          - targets: ["localhost:9090"]

      - job_name: "node"
        static_configs:
          - targets: ["node-exporter:9100"]
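
Values such as `15s` use the Prometheus duration syntax: an integer followed by a unit (`ms`, `s`, `m`, `h`, `d`, `w`, `y`), optionally combined from largest to smallest unit as in `1h30m`. As a small sketch, the hypothetical helper `duration_seconds` below converts such a string to seconds, which can be handy when checking a configuration from a script.

```python
import re

# Units must appear from largest to smallest, each at most once.
_DURATION = re.compile(
    r"^(?:(\d+)y)?(?:(\d+)w)?(?:(\d+)d)?(?:(\d+)h)?"
    r"(?:(\d+)m)?(?:(\d+)s)?(?:(\d+)ms)?$"
)
# Seconds per unit, in the same order as the regex groups.
_UNIT_SECONDS = (365 * 86400, 7 * 86400, 86400, 3600, 60, 1, 0.001)


def duration_seconds(text):
    """Convert a duration such as '15s' or '1h30m' to seconds."""
    match = _DURATION.match(text)
    if not text or match is None:
        raise ValueError(f"invalid duration: {text!r}")
    return sum(
        int(number) * unit
        for number, unit in zip(match.groups(), _UNIT_SECONDS)
        if number
    )
```

With the configuration above, both the scrape and the rule evaluation interval come out as 15 seconds.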

Alert rules

    # alerts.yml
    groups:
      - name: example
        rules:
          - alert: HighMemoryUsage
            expr: node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes < 0.1
            for: 5m
            labels:
              severity: warning
            annotations:
              summary: "High memory usage detected"
              description: "Memory available is less than 10%"
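
The `for: 5m` clause means the expression must stay true across consecutive evaluations for five minutes before the alert fires; until then the alert is pending. The sketch below models that behaviour in a simplified form; `alert_state` is a hypothetical helper, not part of Prometheus.

```python
def alert_state(condition_history, for_seconds, interval_seconds):
    """Return 'inactive', 'pending', or 'firing' after a run of evaluations.

    `condition_history` holds one boolean per rule evaluation, oldest first.
    """
    active_for = None  # seconds the condition has held continuously
    for holds in condition_history:
        if not holds:
            active_for = None  # any false evaluation resets the timer
        elif active_for is None:
            active_for = 0  # first true evaluation: alert becomes pending
        else:
            active_for += interval_seconds
    if active_for is None:
        return "inactive"
    return "firing" if active_for >= for_seconds else "pending"
```

With a 15-second evaluation interval, a brief memory spike that clears within five minutes never leaves the pending state, so it does not notify anyone.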

Integration with DevOps Hub

Monitor your DevOps Hub pipelines with Prometheus:

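    scrape_configs:
      - job_name: "devopshub"
        scheme: https
        # Prometheus does not expand environment variables in its
        # configuration file: replace the placeholder with your API key
        # before loading the file, or use bearer_token_file.
        bearer_token: ${DEVOPSHUB_API_KEY}
        static_configs:
          - targets: ["metrics.assistance.bg"]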
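
Before wiring the job into Prometheus, it can help to check the endpoint by hand. The sketch below builds the same kind of request a scrape would send, with the API key as a bearer token. It assumes the endpoint serves metrics at the default `/metrics` path (the configuration above does not override `metrics_path`) and that the key is available in the `DEVOPSHUB_API_KEY` environment variable; `build_scrape_request` is a hypothetical helper.

```python
import os
import urllib.request


def build_scrape_request(target, api_key=None, metrics_path="/metrics"):
    """Build an HTTPS GET request carrying the API key as a bearer token."""
    if api_key is None:
        # Assumption: the key is exported in the environment.
        api_key = os.environ["DEVOPSHUB_API_KEY"]
    return urllib.request.Request(
        f"https://{target}{metrics_path}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
```

Passing the result to `urllib.request.urlopen` should return the metrics in plain text if the key is valid.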

Next steps