# What is Prometheus and Why Do You Need It?

Prometheus is a powerful monitoring system that collects and processes numerical data (metrics) from applications. It helps you track key indicators such as:

- The number of requests handled by your service.
- The response time for each request.
- Memory and CPU usage.
- The number of errors occurring in the system.

By using Prometheus, you can answer critical questions like:

- "Is my service running efficiently?"
- "What are the performance bottlenecks?"
- "Do we need to scale up our infrastructure?"

## How Does Prometheus Collect Metrics?

There are two primary ways Prometheus gathers data:

- **Pull model** – Prometheus actively queries services for their metrics.
- **Push model (Pushgateway)** – Services push their metrics to an intermediary, which Prometheus then collects.

Let's break them down.

## Pull Model

In the pull model, Prometheus actively fetches metrics from your application via HTTP (e.g., `http://your-service:8080/metrics`). This is the default and most commonly used method.

### Setting up Prometheus with Golang (Pull Model)

**Install the necessary libraries:**

```bash
go get github.com/prometheus/client_golang/prometheus
go get github.com/prometheus/client_golang/prometheus/promhttp
```

**Define your metrics (e.g., counting HTTP requests):**

```go
import (
    "github.com/prometheus/client_golang/prometheus"
    "github.com/prometheus/client_golang/prometheus/promauto"
)

var httpRequestsTotal = promauto.NewCounter(prometheus.CounterOpts{
    Name: "http_requests_total",
    Help: "Total number of HTTP requests",
})
```

**Expose a /metrics endpoint:**

```go
import (
    "net/http"

    "github.com/prometheus/client_golang/prometheus/promhttp"
)

func main() {
    http.Handle("/metrics", promhttp.Handler())
    // Start the HTTP server so Prometheus can actually reach the endpoint.
    http.ListenAndServe(":8080", nil)
}
```

**Configure Prometheus to scrape metrics from your service in `prometheus.yml`:**

```yaml
scrape_configs:
  - job_name: "example_service"
    static_configs:
      - targets: ["localhost:8080"]
```

Now, Prometheus will automatically query http://localhost:8080/metrics every few seconds to collect data.
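One thing the setup above never does is actually increment `http_requests_total`, so the counter would sit at zero forever. Below is a minimal, self-contained sketch of how you might wire the counter into a real handler and, since response time was one of the key indicators mentioned earlier, also record latency with a histogram. The `/hello` route, the `instrument` wrapper, and the `http_request_duration_seconds` histogram are illustrative names of my own, not part of the original example.

```go
package main

import (
    "log"
    "net/http"
    "time"

    "github.com/prometheus/client_golang/prometheus"
    "github.com/prometheus/client_golang/prometheus/promauto"
    "github.com/prometheus/client_golang/prometheus/promhttp"
)

var (
    // Same counter as above: total number of handled requests.
    httpRequestsTotal = promauto.NewCounter(prometheus.CounterOpts{
        Name: "http_requests_total",
        Help: "Total number of HTTP requests",
    })

    // Illustrative histogram for request latency (this name is an assumption,
    // not something from the original setup).
    httpRequestDuration = promauto.NewHistogram(prometheus.HistogramOpts{
        Name:    "http_request_duration_seconds",
        Help:    "HTTP request latency in seconds",
        Buckets: prometheus.DefBuckets,
    })
)

// instrument wraps any handler so every request is counted and timed.
func instrument(next http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        start := time.Now()
        next.ServeHTTP(w, r)
        httpRequestsTotal.Inc()
        httpRequestDuration.Observe(time.Since(start).Seconds())
    })
}

func main() {
    http.Handle("/hello", instrument(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        w.Write([]byte("hello"))
    })))
    http.Handle("/metrics", promhttp.Handler())
    log.Fatal(http.ListenAndServe(":8080", nil))
}
```

Once a few requests have gone through, a query such as `rate(http_requests_total[5m])` in Prometheus shows the request rate per second, which is a good starting point for the "is my service running efficiently?" question.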
### Why is the Pull Model Preferred?

- **Simplicity** – Prometheus controls the scraping schedule and frequency.
- **Fewer points of failure** – No need for an additional service to receive metrics.
- **Automatic cleanup** – If a service stops responding, Prometheus simply stops receiving data, avoiding stale metrics.

## Push Model (Pushgateway Approach)

In the push model, a service sends its metrics to an intermediary service called Pushgateway, which stores them until Prometheus fetches them.

### How it Works (Push Model)

**Your application pushes metrics to Pushgateway:**

```go
import (
    "github.com/prometheus/client_golang/prometheus"
    "github.com/prometheus/client_golang/prometheus/push"
)

func main() {
    registry := prometheus.NewRegistry()

    jobCounter := prometheus.NewCounter(prometheus.CounterOpts{
        Name: "job_execution_count",
        Help: "Number of executed jobs",
    })
    registry.MustRegister(jobCounter)

    jobCounter.Inc()

    // Push everything registered in `registry`.
    // Pushgateway listens on :9091 by default; :9090 is Prometheus itself.
    err := push.New("http://localhost:9091", "my_service_or_job").
        Gatherer(registry).
        Grouping("instance", "worker_1").
        Push()
    if err != nil {
        panic(err)
    }
}
```

**Configure Prometheus to collect data from Pushgateway:**

```yaml
scrape_configs:
  - job_name: "pushgateway"
    honor_labels: true # keep the job/instance labels set by the pushing service
    static_configs:
      - targets: ["localhost:9091"]
```
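A detail worth knowing if a job pushes repeatedly: the client library distinguishes between `Push()` and `Add()`. The sketch below (reusing the same hypothetical job name and grouping labels as above) assumes the `Add` method of the `push` package, which is available in current versions of `client_golang`.

```go
package main

import (
    "log"

    "github.com/prometheus/client_golang/prometheus"
    "github.com/prometheus/client_golang/prometheus/push"
)

func main() {
    registry := prometheus.NewRegistry()
    jobCounter := prometheus.NewCounter(prometheus.CounterOpts{
        Name: "job_execution_count",
        Help: "Number of executed jobs",
    })
    registry.MustRegister(jobCounter)
    jobCounter.Inc()

    pusher := push.New("http://localhost:9091", "my_service_or_job").
        Grouping("instance", "worker_1").
        Gatherer(registry)

    // Push replaces *everything* previously stored for this job/grouping
    // with exactly the metrics in `registry`.
    if err := pusher.Push(); err != nil {
        log.Fatalf("push failed: %v", err)
    }

    // Add only replaces metrics with the same names; other metrics pushed
    // earlier for this group stay in Pushgateway untouched.
    if err := pusher.Add(); err != nil {
        log.Fatalf("add failed: %v", err)
    }
}
```

This distinction matters for the overwriting problem discussed further down: `Push()` wipes whatever an earlier run of the same group pushed, while `Add()` merges by metric name.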
### When is the Push Model Actually Useful?

- **Short-lived jobs** (batch tasks, cron jobs) that complete before Prometheus can scrape them.
- **Network restrictions** where Prometheus cannot directly access the service.
- **External data sources** (IoT devices, external APIs) that cannot be scraped directly.

## Which Model Should You Use?

| Method | Best for... | Pros | Cons |
| --- | --- | --- | --- |
| Pull (Recommended) | Web services, APIs, long-running applications | Simple setup, fewer dependencies, automatic cleanup | Not suitable for very short-lived tasks |
| Push (Pushgateway) | Batch jobs, tasks without stable network access | Allows pushing data from short-lived jobs | Stale data, extra complexity, risk of bottlenecks |
## Why the Push Model is Not Ideal

Although Pushgateway solves some problems (e.g., short-lived processes that terminate before Prometheus scrapes them), it introduces several new issues:

**Difficult to manage stale data**

- If a service dies, its old metrics remain in Pushgateway.
- Prometheus has no way of knowing if the service is still running.
- You must manually delete outdated metrics using `push.Delete(...)` or configure expiry policies (see the sketch after this list).

**Additional complexity**

- Instead of a direct Service → Prometheus link, you now have Service → Pushgateway → Prometheus.
- Pushgateway is an extra dependency, increasing maintenance overhead.

**Potential bottlenecks**

- If many services push metrics frequently, Pushgateway can become overwhelmed.
- Unlike direct Prometheus scrapes (which distribute the load), all requests hit a single Pushgateway instance.

**Data consistency issues**

- If multiple services push metrics with the same name but different values, data may be overwritten, leading to incorrect results.
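On the stale-data point: the cleanup can at least be scripted from the client side. Here is a minimal sketch, assuming the `Delete` method of the `push` package and the same hypothetical job/grouping labels used earlier; you would run something like this when a worker is decommissioned for good.

```go
package main

import (
    "log"

    "github.com/prometheus/client_golang/prometheus/push"
)

func main() {
    // Remove every metric stored for the
    // ("my_service_or_job", instance="worker_1") group as a final cleanup step.
    err := push.New("http://localhost:9091", "my_service_or_job").
        Grouping("instance", "worker_1").
        Delete()
    if err != nil {
        log.Fatalf("failed to delete stale metrics: %v", err)
    }
}
```

Delete removes only the metrics for that exact job and grouping-label combination, so the labels have to match what was used when pushing.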
## Conclusion

Prometheus is a powerful and reliable tool for monitoring services.

For most applications, the pull model is the best choice: it's simple, efficient, and ensures fresh data without additional complexity. However, if you're working with short-lived processes like Lambda functions or batch jobs, the push model via Pushgateway can be useful to capture metrics before the process exits.

Choosing the right approach ensures better observability and maintainability of your system.

Take care!