Prometheus Interview Guide

🟢 Easy (Basics)

1. What is Prometheus?
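An open-source monitoring system with a built-in time-series database (TSDB); it scrapes metrics over HTTP (pull model) and is queried with PromQL.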
2. Metric types?
Counter, Gauge, Histogram, Summary.
3. Exporter?
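A component that exposes metrics from a third-party system at an HTTP /metrics endpoint (e.g. node_exporter for host metrics, blackbox_exporter for probing endpoints).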
4. PromQL example?
rate(http_requests_total[5m]) gives the per-second request rate, averaged over the last 5 minutes.
5. Alerting?
Prometheus evaluates alerting rules; firing alerts go to Alertmanager, which groups, deduplicates, and routes them to receivers.
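
As a rough illustration of question 4 above, the sketch below computes a per-second rate from raw counter samples and tolerates a counter reset. It is a simplified model, not Prometheus's actual implementation (the real rate() also extrapolates to the window boundaries); the function name simple_rate and the sample data are made up for this example.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (unix timestamp in seconds, counter value)


def simple_rate(samples: List[Sample]) -> float:
    """Average per-second increase across the samples, tolerating counter resets."""
    if len(samples) < 2:
        raise ValueError("need at least two samples in the window")
    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        raise ValueError("samples must span a positive time range")
    increase = 0.0
    for (_, prev), (_, curr) in zip(samples, samples[1:]):
        # A drop means the process restarted and the counter began again at zero,
        # so the whole current value counts as increase.
        increase += curr - prev if curr >= prev else curr
    return increase / elapsed


# Hypothetical scrapes every 60 s; the counter resets between t=120 and t=180.
window = [(0.0, 100.0), (60.0, 160.0), (120.0, 220.0), (180.0, 10.0), (240.0, 70.0)]
print(simple_rate(window))
```

The reset handling is the reason counters are queried through rate() rather than by subtracting raw values: a restart would otherwise look like a large negative jump.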

🟡 Medium (Hands‑on)

1. Recording rules?
Precompute expensive or frequently used queries and store the results as new series, which speeds up dashboards and alerts.
2. HA?
Run two identical Prometheus servers scraping the same targets and let Alertmanager deduplicate their alerts; use Thanos or Mimir for scale.
3. Remote write/read?
Remote write ships samples to long-term storage backends; remote read queries them back through Prometheus.
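4. Histogram vs Summary?
Histograms expose bucket counts that can be aggregated across instances, with quantiles estimated at query time via histogram_quantile(); summaries compute quantiles client-side, and those quantiles cannot be meaningfully aggregated.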
5. Service discovery?
Kubernetes, EC2, Consul, and other mechanisms discover scrape targets automatically.
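
To make question 4 above concrete, the sketch below merges cumulative histogram buckets from two instances and then estimates a quantile by linear interpolation inside the matching bucket, which is the general idea behind histogram_quantile(). It is a simplified model under stated assumptions: both instances share the same bucket layout, the lowest bucket starts at 0, and the names merge, estimate_quantile, pod_a, and pod_b are invented for the example.

```python
import math
from typing import Dict

Buckets = Dict[float, float]  # upper bound ("le") -> cumulative observation count


def merge(*instances: Buckets) -> Buckets:
    """Sum cumulative bucket counts across instances with identical bucket bounds."""
    merged: Buckets = {}
    for buckets in instances:
        for le, count in buckets.items():
            merged[le] = merged.get(le, 0.0) + count
    return merged


def estimate_quantile(q: float, buckets: Buckets) -> float:
    """Estimate the q-quantile by interpolating linearly within one bucket."""
    if not 0.0 < q <= 1.0:
        raise ValueError("q must be in (0, 1]")
    bounds = sorted(buckets)
    total = buckets[bounds[-1]]  # the +Inf bucket counts every observation
    if total == 0:
        return math.nan
    rank = q * total
    lower_bound, lower_count = 0.0, 0.0
    for le in bounds:
        count = buckets[le]
        if count >= rank:
            if math.isinf(le):
                # No finite upper edge to interpolate towards.
                return lower_bound
            fraction = (rank - lower_count) / (count - lower_count)
            return lower_bound + (le - lower_bound) * fraction
        lower_bound, lower_count = le, count
    return lower_bound


# Hypothetical latency histograms (seconds) from two pods.
pod_a = {0.1: 50.0, 0.5: 90.0, 1.0: 100.0, math.inf: 100.0}
pod_b = {0.1: 10.0, 0.5: 40.0, 1.0: 90.0, math.inf: 100.0}
merged = merge(pod_a, pod_b)
print(estimate_quantile(0.9, merged))
```

Because bucket counts are plain counters, summing them before estimating gives a quantile over all observations; averaging two precomputed summary quantiles would not.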

🔴 Hard (Advanced)

1. Cardinality issues?
Avoid unbounded label values (user IDs, request IDs); drop offending labels or series with relabeling.
2. SLO burn‑rate?
Alert on the error-budget burn rate over multiple windows (a long one and a short one) to catch both fast and slow burns.
3. Global view?
Thanos or Mimir backed by object storage, giving one query view across many Prometheus servers; Thanos can also downsample old data.
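
The burn-rate idea from question 2 above comes down to a little arithmetic, sketched here. Burn rate is the observed error ratio divided by the error budget (1 minus the SLO target). The specific numbers are assumptions for illustration: a 99.9% SLO over 30 days, paging when 2% of the budget is spent within 1 hour, and a 1-hour long window paired with a 5-minute short window. All function names are hypothetical.

```python
def burn_rate(error_ratio: float, slo_target: float) -> float:
    """Observed error ratio divided by the error budget (1 - SLO target)."""
    return error_ratio / (1.0 - slo_target)


def page_threshold(budget_fraction: float, slo_period_hours: float,
                   window_hours: float) -> float:
    """Burn rate that spends `budget_fraction` of the budget within `window_hours`."""
    return budget_fraction * slo_period_hours / window_hours


def should_page(long_ratio: float, short_ratio: float,
                slo_target: float, threshold: float) -> bool:
    # The long window shows the burn is significant; the short window shows
    # it is still happening, so a recovered incident stops paging quickly.
    return (burn_rate(long_ratio, slo_target) > threshold
            and burn_rate(short_ratio, slo_target) > threshold)


SLO = 0.999  # assumed target
threshold = page_threshold(0.02, 30 * 24, 1)  # 2% of a 30-day budget in 1 hour

# Assumed error ratios for the 1-hour and 5-minute windows.
print(should_page(0.02, 0.03, SLO, threshold))   # burn is ongoing
print(should_page(0.02, 0.001, SLO, threshold))  # errors have already stopped
```

In practice each ratio would come from a PromQL expression over the corresponding window, and a second, lower threshold over longer windows would catch slow burns.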

🧪 Scenario Questions & Answers

1. Alert floods.
Tune thresholds, add inhibition rules, group related alerts, and silence alerts during maintenance.
2. High p95 latency in one zone.
Slice latency by labels (zone, instance); shift traffic away from the affected zone; check for node resource pressure.
3. Disk usage explosion.
Shorten retention, check that compaction is keeping up, and remote-write cold data to long-term storage.
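
For scenario 3 above, a back-of-the-envelope sizing helps show which lever matters. The sketch follows the rule of thumb from the Prometheus storage documentation (disk is roughly retention time × ingested samples per second × bytes per sample, at about 1 to 2 bytes per sample). The ingestion figure below is an assumed example; the real value can be read from rate(prometheus_tsdb_head_samples_appended_total[5m]).

```python
SECONDS_PER_DAY = 86_400


def estimate_disk_bytes(retention_days: float, samples_per_second: float,
                        bytes_per_sample: float = 2.0) -> float:
    """Rough TSDB disk need: retention seconds * ingestion rate * bytes per sample."""
    return retention_days * SECONDS_PER_DAY * samples_per_second * bytes_per_sample


# Assumed workload: 15 days of retention at 100,000 samples per second.
baseline = estimate_disk_bytes(15, 100_000)
# Dropping half the series (for example, high-cardinality ones) halves ingestion.
after_dropping_series = estimate_disk_bytes(15, 50_000)

print(f"baseline: {baseline / 1e9:.1f} GB")
print(f"after dropping series: {after_dropping_series / 1e9:.1f} GB")
```

Both retention and ingestion rate scale the estimate linearly, so reducing series count is as effective as shortening retention and does not sacrifice history.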

Generated for quick interview revision — basics, hands-on, advanced, and scenarios.