Continuous Monitoring with TICK stack by@mlabouardy


June 23rd 2022

Mohamed Labouardy

Monitoring your system is essential. It helps you detect issues before they cause major downtime that affects your customers and damages your business reputation. It also helps you plan for growth based on real usage of your system. But collecting metrics from different data sources isn’t enough: you need to tailor your monitoring to your own business needs and define the right alerts so that any abnormal change in the system is reported.
In this post, I will show you how to set up a resilient continuous monitoring platform using only open source projects, and how to define an event alert that reports changes in the system.
image
Clone the following GitHub repository:
1 — Terraform & AWS
In the tick-stack/terraform directory, update the variables.tfvars file with your own AWS credentials (make sure you have the right IAM policies):
region = "AWS REGION"
access_key = "YOUR AWS ACCESS KEY ID"
secret_key = "YOUR AWS SECRET KEY"
key_name = "YOUR SSH KEY PAIR"
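A missing variable usually only surfaces once Terraform starts prompting for it, so a quick pre-flight check can save a round trip. The following is a minimal sketch, not part of the original repository: `check_tfvars` is a hypothetical helper name, and it only confirms that each of the four variables listed above is assigned, not that the values are valid.

```shell
#!/bin/sh
# Hypothetical helper: verify that a tfvars file assigns every variable
# the tutorial lists (region, access_key, secret_key, key_name).
check_tfvars() {
  file="$1"
  missing=0
  for key in region access_key secret_key key_name; do
    # Match "key = ..." at the start of a line, allowing leading spaces.
    if ! grep -Eq "^[[:space:]]*${key}[[:space:]]*=" "$file"; then
      echo "missing variable: ${key}" >&2
      missing=1
    fi
  done
  # Exit status 0 when all four are present, 1 otherwise.
  return "$missing"
}
```

One way to use it is `check_tfvars variables.tfvars` before running the Terraform commands that follow. Since the file holds AWS secrets in plain text, it is also worth keeping it out of version control.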
Issue the following command to download the AWS provider plugin:
terraform init
image
Issue the following command to provision the infrastructure:
terraform apply -var-file=variables.tfvars
image
2 — Ansible & Docker
Update the inventory file with your instance DNS name:
[servers]
ec2-52-206-156-244.compute-1.amazonaws.com
Then, install the Ansible custom role:
ansible-galaxy install mlabouardy.tick
Execute the Ansible Playbook:
ansible-playbook --private-key=aws.pem -i inventory playbook.yml
image
Point your browser to http://DNS_NAME:8083, and you should see the InfluxDB Admin Dashboard:
image
Now, create an InfluxDB Data Source in Chronograf (http://DNS_NAME:8888):
image
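The containers can take a little while to come up after the playbook finishes, so the two URLs above may not answer immediately. The sketch below polls a URL until it responds. `wait_for_http` is a hypothetical helper, not something the role provides, and it assumes `curl` is installed on the machine you run it from.

```shell
#!/bin/sh
# Poll a URL until it answers or the attempts run out.
# Usage: wait_for_http URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_http() {
  url="$1"
  attempts="${2:-30}"
  delay="${3:-2}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f: treat HTTP errors (4xx/5xx) as failure; -sS: quiet but show errors.
    if curl -fsS -o /dev/null "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "gave up waiting for ${url}" >&2
  return 1
}
```

For example, `wait_for_http "http://DNS_NAME:8083" && wait_for_http "http://DNS_NAME:8888"` would return once both the InfluxDB admin UI and Chronograf respond, with DNS_NAME replaced by your instance's DNS name.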
Create a new Dashboard as follows:
image
You can create multiple graphs to visualize different types of metrics:
image
Note: For in-depth details on how to create interactive & dynamic dashboards in Chronograf, check my previous tutorial.
Collecting data is not enough on its own; to act on it, for example by alerting, the data has to be processed. So make sure to enable Kapacitor:
image
Define a new alert to send a Slack notification if the CPU utilization is higher than 70%.
image
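The rule above is created through the Chronograf UI, but Kapacitor can also load the same kind of rule from a TICKscript file, which is easier to keep in version control. The following is a rough sketch of an equivalent rule, not the script Chronograf generates. It assumes Telegraf's `cpu` measurement with its `usage_idle` field and `cpu-total` tag value, and that a Slack handler is already configured in Kapacitor. The helper name `write_cpu_alert` and the file name `cpu_alert.tick` are hypothetical.

```shell
#!/bin/sh
# Write a hypothetical TICKscript that mirrors the Chronograf rule:
# notify Slack when CPU utilization exceeds 70%.
write_cpu_alert() {
  out="${1:-cpu_alert.tick}"
  cat > "$out" <<'EOF'
stream
    |from()
        .measurement('cpu')
        // Assumption: use the aggregate series Telegraf tags as cpu-total.
        .where(lambda: "cpu" == 'cpu-total')
    |alert()
        // Utilization above 70% is the same as idle time below 30%.
        .crit(lambda: "usage_idle" < 30)
        .message('CPU utilization above 70% on {{ index .Tags "host" }}')
        // Requires a Slack handler configured in Kapacitor.
        .slack()
EOF
}
```

After calling `write_cpu_alert`, the task could be registered with `kapacitor define cpu_alert -type stream -tick cpu_alert.tick -dbrp telegraf.autogen` and activated with `kapacitor enable cpu_alert`. The database and retention policy `telegraf.autogen` are assumptions based on common defaults; check what your deployment actually uses.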
To test it out, we need to generate some workload. For this case, I used stress:
apt-get install stress
Stressing the CPU:
stress --cpu 4 --timeout 20s
After a few seconds, you should receive a Slack notification.
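If the alert looks at overall CPU utilization, a fixed number of workers may not push an instance with more cores past the 70% threshold. A small sketch that sizes the worker count to the machine, assuming `nproc` (GNU coreutils) is available and falling back to the tutorial's value of 4 otherwise:

```shell
#!/bin/sh
# Build the stress invocation with one worker per CPU core.
# Falls back to 4 workers (the tutorial's value) if nproc is unavailable.
cores="$(nproc 2>/dev/null || echo 4)"
stress_cmd="stress --cpu ${cores} --timeout 20s"
# Print the command so it can be reviewed before running it.
echo "$stress_cmd"
```

The printed command can then be run as-is on the monitored instance.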