“Dashboard anything. Observe everything.”

Overview

Recently I started working on a project heavily focused on observability and monitoring, in which the Prometheus configuration and all the Grafana dashboards and alarms had been implemented by someone else. Basically, I was working blindfolded with a black box, both in terms of application know-how and from a monitoring perspective.

Foundation

Some important concepts before going further:

Prometheus stores data in a time-series data model, basically streams of values associated with a timestamp. Prometheus scrapes endpoints, also known as instances, and a collection of instances forms a job. Every time series is uniquely identified by its metric name and optional key-value pairs called labels.

# <metric name>{<label name>=<label value>, ...}
kube_node_labels{cluster="aws-01", label_kubernetes_io_role="master"}

Grafana supports many different storage backends for your time-series data (data sources). We will focus on Prometheus.

Start exploring

Get all labels

As I said, we know close to nothing about our metrics, so some probing (in terms of metrics and labels) is required. For this we can create a new ephemeral dashboard and tinker with a variable of type Query. Let's say we want to get all the labels… kind of greedy. We have the following options:

label_names(): returns a list of label names.
label_values(label): returns a list of label values for the label in every metric.
label_values(metric, label): returns a list of label values for the label in the specified metric.
metrics(metric): returns a list of metrics matching the specified metric regex.
query_result(query): returns a list of Prometheus query results for the query.

Keep in mind that the label functions don't support queries and aren't Prometheus functions; they are part of Grafana templating (that's why we created a new dashboard).
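The difference between the label functions listed above is easier to see on a toy data set. The following is a minimal Python sketch, not how Grafana implements them: the SERIES list, the second cluster, the kubelet job, and the helper signatures are all hypothetical and only mimic the behaviour described in the list.

```python
# Hypothetical sample data: each series is a metric name plus a dict of labels.
# Real Prometheus also exposes the metric name as the __name__ label; it is
# left out here to keep the toy model small.
SERIES = [
    ("kube_node_labels", {"cluster": "aws-01", "label_kubernetes_io_role": "master"}),
    ("kube_node_labels", {"cluster": "aws-02", "label_kubernetes_io_role": "node"}),
    ("up", {"job": "kubelet", "instance": "10.0.0.1:10250"}),
]


def label_names(series):
    """Mimic label_names(): every label name seen on any series."""
    names = set()
    for _metric, labels in series:
        names.update(labels)
    return sorted(names)


def label_values(series, label, metric=None):
    """Mimic label_values(label) and label_values(metric, label).

    With metric=None the label is looked up on every metric; otherwise only
    series of the given metric are considered.
    """
    values = set()
    for name, labels in series:
        if metric is not None and name != metric:
            continue
        if label in labels:
            values.add(labels[label])
    return sorted(values)


print(label_names(SERIES))
print(label_values(SERIES, "cluster"))
print(label_values(SERIES, "job", metric="up"))
```

Restricting the lookup to one metric is what keeps a variable's drop-down short on a busy Prometheus instance, which is why the two-argument form is usually the more practical one.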
We can also get all the labels using the Prometheus API endpoint /api/v1/labels:

Grafana: label_names()
Prometheus API: curl -s http://127.0.0.1:9091/api/v1/labels

Extract all the values for a particular label

Let's assume that we want all the values for the label dockerVersion. We can get them using the Grafana function label_values or the Prometheus API endpoint /api/v1/label/<label_name>/values:

Grafana: label_values(<label_name>) or label_values(<metric_name>,<label_name>)
Prometheus API: curl -s http://127.0.0.1:9091/api/v1/label/<label_name>/values

Get all the jobs

The Prometheus API exposes the following endpoint, which lists all the jobs scraped by that particular instance:

/api/v1/label/job/values

Quick check if the instances are healthy/reachable: automatically generated labels and time series

When Prometheus scrapes a target, it automatically attaches some labels (job and instance) to the scraped time series, which serve to identify the scraped target. It also generates the up time series, whose value is 1 when the scrape succeeded and 0 when it failed.

# up time series is useful for instance availability monitoring
up{job="<job-name>", instance="<instance-id>"}

Going further, we can also check all the labels for a particular job using the up metric:

Grafana: up{job="<job_name>"}
Prometheus API: curl -s http://127.0.0.1:9091/api/v1/label/job/values

Conclusions

Grafana and Prometheus are frequently used in organizations, and they make such a useful monitoring stack because, even when we don't know all the implementation details of metric collection, they let us shed some light on it using the tips above.
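As a scripted counterpart to the curl commands used in this post, here is a minimal Python sketch using only the standard library. The helper names are hypothetical, BASE_URL simply reuses the address from the curl examples, and the /api/v1/query endpoint is not mentioned in the post; it is included on the assumption that you also want to evaluate an expression such as up{job="<job_name>"} outside Grafana.

```python
import json
import urllib.parse
import urllib.request

# Address taken from the curl examples; adjust it to your own instance.
BASE_URL = "http://127.0.0.1:9091"


def labels_url(base):
    """URL that lists all label names."""
    return f"{base.rstrip('/')}/api/v1/labels"


def label_values_url(base, label):
    """URL that lists all values of one label, e.g. 'job'."""
    quoted = urllib.parse.quote(label, safe="")
    return f"{base.rstrip('/')}/api/v1/label/{quoted}/values"


def query_url(base, expr):
    """URL for an instant query (assumption: not used in the post itself)."""
    return f"{base.rstrip('/')}/api/v1/query?" + urllib.parse.urlencode({"query": expr})


def extract_data(body):
    """Unwrap the {"status": ..., "data": ...} envelope of an API response."""
    payload = json.loads(body)
    if payload.get("status") != "success":
        raise RuntimeError(f"Prometheus API error: {payload.get('error', 'unknown')}")
    return payload["data"]


def fetch(url, timeout=5):
    """Perform the HTTP GET and return the unwrapped 'data' field."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return extract_data(resp.read().decode("utf-8"))
```

With a reachable instance, fetch(label_values_url(BASE_URL, "job")) should return the list of job names, and fetch(labels_url(BASE_URL)) the list of label names. Nothing is requested at import time, so the helpers can be pasted into a notebook and called one by one while exploring.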