InfluxDB has become a popular way for sensors and services to store time-series data, and InfluxData makes that quick and easy thanks to InfluxDB Cloud. Some of the customers we've spoken to aren't storing everything in a managed cloud service though. Sometimes they have specific business, regulatory, or compliance reasons to store the data themselves in their own data center. Other times they're using multiple outputs in Telegraf to send data to both InfluxDB Cloud and a self-managed InfluxDB instance as part of their disaster recovery plan.
Whatever the reason, having even a single client connect securely to a private InfluxDB database potentially involves setting up network routes, opening firewalls, and adjusting NACLs and security groups. Not to mention navigating whatever internal processes might be involved to get all those changes approved and deployed. The value of InfluxDB isn't in having just one client writing data occasionally, it's in being able to support hundreds or more. Repeat that network administration overhead for each new device, or whenever a device's network configuration changes, and you very quickly reach a situation where maintaining those tight network controls becomes unsustainable. Most people would then weigh up a couple of options: 1) Require a VPN be established between the remote client and the network the InfluxDB database is in. That (potentially) means less ongoing admin for the network admins where InfluxDB is running, but a higher setup burden for whoever is managing the clients. And do they really want their network connected to yours via a VPN? Or 2) Expose the InfluxDB database directly to the public internet, and trust the authentication system to be enough to protect it.
Today I'm going to take you through a third choice. It'll only take a few minutes to implement, require no network changes, and will prevent you from having to put your private database onto the public internet. We'll also get a whole host of other benefits thrown in along the way which I'll cover once we've got it working.
Note: The code examples below may change as we continue to improve the developer experience. Please check https://docs.ockam.io/ for the most up to date documentation.
If you'd like to jump straight into a working example of this, we have a README available on GitHub.
To simulate an end-to-end example of a client sending data into InfluxDB we're going to start a Telegraf instance, have it emit CPU events to InfluxDB, and then change the way those two connect to show how we'd connect them across different hosts or networks. For the sake of this example though, we'll be running everything on a single machine.
If you've previously set up Ockam you can skip this section and move straight to setting up InfluxDB below.
brew install build-trust/ockam/ockam
(if you're not using brew for package management, we have installation instructions for other systems in our documentation)
Once installed you need to enroll your local identity with Ockam Orchestrator, run the command below and follow the instructions provided:
ockam enroll
If you've already got InfluxDB running on your local machine, and have an authentication token you can use to write to it, you can skip this section and move straight on to installing Telegraf below.
The prerequisites for this section are to first install:
Once installed we need to start InfluxDB so that we have somewhere to store the metrics data we're going to generate. On most systems that will be as simple as running:
influxd
You should then see some log output, with the final line confirming that influxd is now listening on port 8086:
2023-02-21T23:49:43.106268Z info Listening {"log_id": "0fv9CURl000", "service": "tcp-listener", "transport": "http", "addr": ":8086", "port": 8086}
If influxd started successfully then you can open a new terminal session and leave this running in the background. If influxd did not start successfully, check the InfluxDB documentation for troubleshooting guidance.
Now we're going to use the influx CLI command to complete the initial database setup so that influxd can receive our data. Run the setup command and complete the required prompts; remember the organization and bucket names you use as we'll need them later:
influx setup
Next you'll need to copy the token for the user you just created, which you can retrieve with the auth command:
influx auth list
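If you'd rather not copy the token by hand, one option is to ask the CLI for machine-readable output and extract the field with jq. This is a sketch, not the only way to do it: it assumes jq is installed, that the `--json` flag is supported by your influx CLI version, and that the first entry in the list is the token you want.

```shell
# Capture the first token from `influx auth list` for reuse in later
# commands. Assumes the jq JSON processor is installed.
INFLUX_TOKEN=$(influx auth list --json | jq -r '.[0].token')
echo "$INFLUX_TOKEN"
```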
With Telegraf installed, generate the base configuration by running:
telegraf config \
--section-filter agent:inputs:outputs \
--input-filter cpu \
--output-filter influxdb_v2 > telegraf.conf
Open the generated telegraf.conf file and find the [[outputs.influxdb_v2]] section, which should look like this:
[[outputs.influxdb_v2]]
## The URLs of the InfluxDB cluster nodes.
##
## Multiple URLs can be specified for a single cluster, only ONE of the
## urls will be written to each interval.
## ex: urls = ["https://us-west-2-1.aws.cloud2.influxdata.com"]
urls = ["http://127.0.0.1:8086"]
## Token for authentication.
token = ""
## Organization is the name of the organization you wish to write to.
organization = ""
## Destination bucket to write into.
bucket = ""
Replace the empty values for token, organization, and bucket with the values from the InfluxDB setup section, then start Telegraf:
telegraf --config telegraf.conf
To make it easy to re-use your values for future commands and testing, store the appropriate values (i.e., replace the placeholders below with your actual values) into a series of environment variables:
export INFLUX_PORT=8086 \
INFLUX_TOKEN=your-token-here \
INFLUX_ORG=your-org \
INFLUX_BUCKET=your-bucket
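As a quick sanity check that the token and organization values are valid, you can list the buckets over InfluxDB's v2 HTTP API. This is just one convenient read-only endpoint to poke at; it assumes influxd is still running locally:

```shell
# Build the buckets URL from the variables above, then ask InfluxDB to
# list buckets. A response naming your bucket confirms the token works.
BUCKETS_URL="http://localhost:${INFLUX_PORT}/api/v2/buckets?org=${INFLUX_ORG}"
curl --silent --header "Authorization: Token $INFLUX_TOKEN" "$BUCKETS_URL" \
  || echo "could not reach influxd on port ${INFLUX_PORT}"
```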
Now we can check that Telegraf is regularly sending data to InfluxDB. The configuration we created earlier will emit CPU stats every 10 seconds, so we can send a query to InfluxDB and ask it to return all data it has for the past minute:
curl \
--header "Authorization: Token $INFLUX_TOKEN" \
--header "Accept: application/csv" \
--header 'Content-type: application/vnd.flux' \
--data "from(bucket:\"$INFLUX_BUCKET\") |> range(start:-1m)" \
http://localhost:$INFLUX_PORT/api/v2/query?org=$INFLUX_ORG
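If the query comes back empty, you can also confirm the write path independently of Telegraf by pushing a single point by hand to the v2 write endpoint in line protocol. The measurement name debug_metric below is an arbitrary example, not anything Telegraf produces:

```shell
# Write one point in line protocol to the v2 write endpoint. An empty
# (204 No Content) response means InfluxDB accepted the point.
WRITE_URL="http://localhost:${INFLUX_PORT}/api/v2/write?org=${INFLUX_ORG}&bucket=${INFLUX_BUCKET}"
curl --silent --request POST \
  --header "Authorization: Token $INFLUX_TOKEN" \
  --data-binary 'debug_metric,source=manual value=1' \
  "$WRITE_URL" || echo "could not reach influxd on port ${INFLUX_PORT}"
```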
The example above connects these two services, running on the same host, by using the default unencrypted HTTP transport. Most non-trivial configurations will have InfluxDB running on a separate host with one or more Telegraf nodes sending data in. In production it is unlikely that an unencrypted transport is acceptable, and it's also not always desirable to expose the InfluxDB port to the public internet.
In this section we'll show you how both of these problems can be solved with very minimal configuration changes to any existing services.
The first step is to enroll yourself with Ockam, and create enrollment tokens for your InfluxDB and Telegraf nodes:
ockam enroll
export OCKAM_INFLUXDB_TOKEN=$( \
ockam project enroll --attribute component=influxdb)
export OCKAM_TELEGRAF_TOKEN=$( \
ockam project enroll --attribute component=telegraf)
Now we can create a node for our InfluxDB service:
ockam identity create influxdb
ockam project authenticate $OCKAM_INFLUXDB_TOKEN --identity influxdb
ockam node create influxdb --identity influxdb
ockam policy create \
--at influxdb \
--resource tcp-outlet \
--expression '(= subject.component "telegraf")'
ockam tcp-outlet create \
--at influxdb \
--to 127.0.0.1:8086
ockam relay create influxdb \
--to influxdb
There are a few things that have happened in those commands, so let's quickly unpack them:

- We created a new identity and node called influxdb, and enrolled it with Ockam using the token we'd generated earlier. If you look back at the command that generated the token you'll see we also tagged this token with an attribute of component=influxdb.
- We created a policy on the influxdb node, which states that only nodes that have a component attribute with a value of telegraf will be able to connect to a TCP outlet.
- We created a TCP outlet on the influxdb node, which forwards traffic to the local InfluxDB database listening on 127.0.0.1:8086.
- We created a relay so that other enrolled nodes can discover influxdb and route traffic to it.
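If you want to confirm the node came up before continuing, the Ockam CLI can list and inspect nodes; the exact output format varies between versions, so treat this as a quick sketch of a check rather than a required step:

```shell
# List the local Ockam nodes, then inspect the influxdb node we created.
ockam node list
ockam node show influxdb
```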
It's now time to establish the other side of this connection by creating the corresponding client node for Telegraf:
ockam identity create telegraf
ockam project authenticate $OCKAM_TELEGRAF_TOKEN --identity telegraf
ockam node create telegraf --identity telegraf
ockam policy create \
--at telegraf \
--resource tcp-inlet \
--expression '(= subject.component "influxdb")'
ockam tcp-inlet create \
--at telegraf \
--from 127.0.0.1:8087 \
--to /project/default/service/forward_to_influxdb/secure/api/service/outlet
Now we can unpack these three commands and what they've done:

- We created a new identity and node called telegraf, and enrolled it with Ockam using the token we'd tagged with component=telegraf.
- We created a policy on the telegraf node, which states it will only connect to nodes that have an attribute of component with a value of influxdb.
- We created a TCP inlet, which defines the local port this node will listen on (127.0.0.1:8087), and where it should forward that traffic to. This node will forward data through to the relay we created earlier, which will in turn pass it to our influxdb node, which then sends it to the InfluxDB database.
That's it! The listener on localhost port 8087 is now forwarding all traffic to InfluxDB, wherever that is running. If that database was on a different host, running in the cloud, or in a private data center, the enrollment and forwarding would still ensure our communication with 127.0.0.1:8087 is securely connected to wherever that database is running.
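One quick way to convince yourself the tunnel works is to hit the InfluxDB API through the inlet on port 8087; the response should match what influxd returns directly on port 8086. The read-only buckets endpoint is a convenient choice, reusing the environment variables set earlier:

```shell
# Same buckets API call as against influxd directly, but addressed to
# the Ockam inlet on port 8087 instead of port 8086.
INLET_URL="http://127.0.0.1:8087/api/v2/buckets?org=${INFLUX_ORG}"
curl --silent --header "Authorization: Token $INFLUX_TOKEN" "$INLET_URL" \
  || echo "could not reach the inlet on port 8087"
```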
This example created the TCP inlet on port 8087 primarily because the influxd service was running on the same host and already bound to port 8086. In a production deployment where Telegraf and InfluxDB are on separate hosts, the TCP inlet could listen on port 8086 and this default configuration would not need to change.
While this is a simplified example running on a single host, the following instructions are the same irrespective of your deployment. Once the influxdb and telegraf nodes are enrolled and running, the only change you need to make is to update your telegraf.conf to point to the local listening port:
[[outputs.influxdb_v2]]
urls = ["http://127.0.0.1:8087"]
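If you'd prefer to script that edit, a sed substitution over the generated file works. This sketch assumes the urls line still contains the default address from earlier; the `-i.bak` form works with both GNU and BSD sed:

```shell
# Rewrite the urls line in place, keeping the original file as
# telegraf.conf.bak. Run from the directory holding the generated config.
sed -i.bak 's|http://127.0.0.1:8086|http://127.0.0.1:8087|' telegraf.conf \
  || echo "telegraf.conf not found in the current directory"
```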
Restart the Telegraf service, and we can then check that it's still storing data by using the same command we used earlier.
The example here has, in less than 10 minutes, solved our most pressing issue while adding in a number of valuable improvements that aren't immediately obvious because they've been abstracted away:
Private databases stay private: Your InfluxDB hasn't had its port exposed to the public internet, and so there is no route for third-parties to access your server
Encrypted in transit: The secure channel that is established between your nodes is encrypted end-to-end. Each node generates its own cryptographic keys, and is issued its own unique credentials. They then use these to establish a mutually trusted secure channel between each other. By removing the dependency on a third-party service to store or distribute keys you're able to reduce your vulnerability surface area and eliminate single points of failure.
Identity & Attribute Based Access Control: Authorization to even establish a route to the InfluxDB database is tied to the unique identity of the client requesting access. This is more flexible in supporting modern, often dynamic, deployment approaches, and it more clearly aligns with our intentions around access control. If the client can establish trust that they are who they say they are, then they can route their packets to the database. Contrast that with historical approaches, which made permanent access decisions based on assumptions about the remote network (e.g., is this request coming from an IP address we have pre-authorized?). This is in addition to the authentication and authorization controls on the InfluxDB database itself, which will continue to work as they always have.
Secure-by-design: The combination of all of the above means that the only way to access the InfluxDB database is over a secure channel, which means all communication is encrypted in transit irrespective of any misconfiguration at either end (e.g., HTTP/TLS not being enabled). And because each node exchanges credentials with each other rather than sharing a common certificate or shared encryption key you can also be sure that:
No other parties in the supply-chain are able to modify the data in transit. The data you receive at the consumer is exactly what was sent by your clients.
That only authorized clients can write to InfluxDB, ensuring that the data in your bucket is highly trustworthy. If you have even more stringent requirements you can take control of your credential authority and enforce granular authorization policies.
Reduced vulnerability surface area: Ockam simplifies client access concerns by exposing all remote services as local services. We call this