paint-brush
How to Export Metrics from Databricks Serving Endpoint to Datadogby@lexaneon

How to Export Metrics from Databricks Serving Endpoint to Datadog

by Alexey ArtemovFebruary 12th, 2024
Read on Terminal Reader
Read this story w/o Javascript
tldt arrow

Too Long; Didn't Read

If you are using Databricks serving endpoint, and you wish to export metrics to Datadog, you can face with some challenges in Datadog documentation.
featured image - How to Export Metrics from Databricks Serving Endpoint to Datadog
Alexey Artemov HackerNoon profile picture

If you are using Databricks serving endpoint, and you wish to export metrics to Datadog, you can face with some challenges in Datadog documentation. This article will help you to solve them, at least it will help you to make a PoC.


Here are list of links which you can use:


Databricks serving endpoint is a service which allows you to deploy your model and serve it. It is a part of Databricks MLflow. You can find more information about it here: https://docs.databricks.com/applications/mlflow/model-serving.html

Once you created an endpoint, out of the box you can see some metrics in Databricks UI. Also list of metrics can be received via REST API:

curl --request GET "https://${DATABRICKS_HOST}/api/2.0/serving-endpoints/{endpoint_name}/metrics" --header "Authorization: Bearer ${DATABRICKS_TOKEN}"


But if you want to export them to Datadog, you need to install and configure Datadog agent.

Pre-requisites

  • Created Databricks serving endpoint
  • Bearer PAT token behalf of Databricks user/service principal, which has `CAN_QUERY` permission on the endpoint
  • Datadog API key
  • Enabled OpenMetrics integration in your Datadog account, link: https://app.datadoghq.com/integrations

Install Datadog agent

Datadog offers several ways to install agent, they are listed here: https://app.datadoghq.com/account/settings/agent/latest?platform=overview, however looks like some of them are not working, or at least I was not able to make them work. That is why I decided to use the most simple way — install agent on Docker.

Before agent installation, you need to create a file `datadog.yaml`, which should be located `/etc/datadog-agent/datadog.yaml`, with the following content:

api_key: "__DD_API_KEY__"
site: "datadoghq.com"
hostname: "__YOUR_HOST_NAME__"


Then you need to create `metric-config.yaml` file, which should be located `/etc/datadog-agent/conf.d/openmetrics.d/conf.yaml.default`, with the following content:

instances:
 - openmetrics_endpoint: https://__DATABRICKS_HOST__/api/2.0/serving-endpoints/__ENDPOINT_NAME__/metrics

   metrics:
     - cpu_usage_percentage:
         name: cpu_usage_percentage
         type: gauge
     - mem_usage_percentage:
        name: mem_usage_percentage
        type: gauge
     - provisioned_concurrent_requests_total:
        name: provisioned_concurrent_requests_total
        type: gauge
     - request_4xx_count_total:
        name: request_4xx_count_total
        type: gauge
     - request_5xx_count_total:
        name: request_5xx_count_total
        type: gauge
     - request_count_total:
        name: request_count_total
        type: gauge
     - request_latency_ms:
        name: request_latency_ms
        type: histogram

   tag_by_endpoint: false

   send_distribution_buckets: true

   headers:
     Authorization: Bearer __DATABRICKS_TOKEN__
     Content-Type: application/openmetrics-text


The full possible options for `conf.yaml.default` can be found in template file which is located in a container under `/opt/datadog-agent/agent/conf.d/openmetrics.d/`


I’ve created the next Dockerfile, because I have some difficulties with that one which is provided in Datadog documentation:

FROM ubuntu:20.04

# Set environment variables
ENV DD_API_KEY=[SPECIFY_DD_API_KEY]
ENV DD_SITE=datadoghq.com
ENV DD_HOSTNAME=datadog-agent-ubuntu
ENV DATABRICKS_HOST=[SPECIFY_DATABRICKS_HOST]
ENV DATABRICKS_TOKEN=[SPECIFY_DATABRICKS_TOKEN]
ENV ENDPOINT_NAME=[SPECIFY_ENDPOINT_NAME]

# Install curl
RUN apt-get update && apt-get install -y curl

# Copy the datadog.yaml file into the Docker image
COPY datadog.yaml /etc/datadog-agent/datadog.yaml
RUN sed -i -e "s@__DD_API_KEY__@$DD_API_KEY@g" /etc/datadog-agent/datadog.yaml

# Install the Datadog agent
RUN bash -c "$(curl -L https://s3.amazonaws.com/dd-agent/scripts/install_script_agent7.sh)"

# Copy the metrics-config.yaml file into the Docker image
COPY metrics-config.yaml /etc/datadog-agent/conf.d/openmetrics.d/conf.yaml.default

RUN sed -i -e "s@__DATABRICKS_HOST__@$DATABRICKS_HOST@g; s@__ENDPOINT_NAME__@$ENDPOINT_NAME@g; s@__DATABRICKS_TOKEN__@$DATABRICKS_TOKEN@g; " /etc/datadog-agent/conf.d/openmetrics.d/conf.yaml.default

RUN service datadog-agent restart

# Run a process that doesn't exit
CMD ["tail", "-f", "/dev/null"]

Usage

  • Run serving endpoint
  • Define environment variables in `Dockerfile’
  • Build docker image
  • Run docker container
# Step 1: Build the Docker image
# Navigate to the directory containing the Dockerfile
cd /path/to/directory/with/dockerfile

# Build the Docker image
# Replace 'my-image' with the name you want to give to your Docker image
docker build -t datadoc-agent-image .

# Step 2: Run a container from the built image
# Replace 'my-container' with the name you want to give to your Docker container
docker run --name datadog-agent -d datadoc-agent-image

# Step 3: Connect to a container and restart agent
docker exec -it datadog-agent  bash
service datadog-agent restart

Some usefully commands for DataDog agent:

datadog-agent status
datadog-agent health

service datadog-agent start
service datadog-agent restart


Also appears here.