
Setting Up MinIO With Quickwit

by MinIO, February 23rd, 2024


MinIO is frequently used to store logs, metrics and traces, whether they come from ElasticSearch, OpenTelemetry, OpenSearch, OpenObserve or any of the other dozen or so great monitoring solutions. MinIO is more efficient when used with storage tiering, which decreases total cost of ownership for the data stored, plus you get the added benefits of writing data to MinIO that is immutable, versioned and protected by erasure coding. In addition, saving data to MinIO object storage makes it available to other cloud native machine learning and analytics applications.


Quickwit and MinIO share a lot of the same principles. Quickwit is designed for sub-second search straight from object storage, allowing truly decoupled compute and storage. This means you can store your data on cheap commodity hardware while MinIO handles the replication and integrity of the data. As your requirements change, you scale out your cluster as needed. Like MinIO, Quickwit has a concept of tenants that are easily isolated and can manage their individual usage.


In today’s post we’ll show you how to set up MinIO and Quickwit, with a specific focus on:


  • Configuring MinIO as a storage provider for Quickwit
  • Setting up MinIO as a metadata store for Quickwit

Installing MinIO

In a previous blog post we discussed how to configure MinIO as a systemd service. We’ll use the same principles here, except that MinIO will be installed as an OS package instead of a standalone binary.


  • Install the MinIO .deb package. If you are using another OS family, other packages are available from the MinIO download page


root@aj-test-1:~# wget https://dl.min.io/server/minio/release/linux-amd64/archive/minio_20231120224007.0.0_amd64.deb -O minio.deb

root@aj-test-1:~# dpkg -i minio.deb


  • Create a system user and a group, both named minio-user


root@aj-test-1:~# groupadd -r minio-user
root@aj-test-1:~# useradd -M -r -g minio-user minio-user


  • Create the data directory for MinIO and set the permissions with the user and group created in the previous step


root@aj-test-1:~# mkdir /opt/minio

root@aj-test-1:~# chown minio-user:minio-user /opt/minio


  • Enable and start the MinIO service


root@aj-test-1:~# systemctl enable minio

root@aj-test-1:~# systemctl start minio
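

If the service does not come up, one thing worth checking is whether it knows where its data directory is. The following is a minimal sketch, assuming the systemd unit installed by the package reads its settings from /etc/default/minio (confirm with `systemctl cat minio` on your system). The `write_minio_env` helper name is illustrative; it writes an environment file that points MinIO at /opt/minio and pins the console to port 9001.

```shell
#!/bin/sh
# Hypothetical helper: write a minimal MinIO environment file.
# Assumption: the packaged systemd unit loads /etc/default/minio
# (verify with `systemctl cat minio`).
write_minio_env() {
    env_file="$1"   # e.g. /etc/default/minio
    data_dir="$2"   # e.g. /opt/minio
    cat > "$env_file" <<EOF
# Directory MinIO stores objects in
MINIO_VOLUMES="$data_dir"
# Serve the web console on port 9001
MINIO_OPTS="--console-address :9001"
EOF
}

# Demo against a scratch file; as root, pass /etc/default/minio instead.
demo_env="$(mktemp)"
write_minio_env "$demo_env" /opt/minio
cat "$demo_env"
```

After writing the real file as root, restart the service with systemctl restart minio so the new settings are picked up.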


  • You can verify MinIO is running either through the console by going to http://localhost:9001 or through mc admin


root@aj-test-1:~# wget https://dl.min.io/client/mc/release/linux-amd64/mc

root@aj-test-1:~# chmod +x mc

root@aj-test-1:~# mv mc /usr/local/bin/mc

root@aj-test-1:~# mc alias set local http://127.0.0.1:9000 minioadmin minioadmin

root@aj-test-1:~# mc admin info local
●  127.0.0.1:9000
   Uptime: 5 minutes
   Version: 2023-11-25T07:17:05Z
   Network: 1/1 OK
   Drives: 1/1 OK
   Pool: 1

Pools:
   1st, Erasure sets: 1, Disks per erasure set: 1

1 drive online, 0 drives offline


If you see messages similar to these, you can be assured that MinIO has started. Now we’ll create a bucket and later some objects using Quickwit.


root@aj-test-1:~# mc mb local/quickwit

Bucket created successfully `local/quickwit`.


Now we are ready to install Quickwit and configure it with MinIO as the backend.

Configure Quickwit

The Quickwit installer automatically picks the correct binary archive for your environment, then downloads and unpacks it in your working directory. In this case we are running Ubuntu, but the installer supports all the popular distributions.


curl -L https://install.quickwit.io | sh

cd ./quickwit-v*/
./quickwit --version


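Download the default configuration file so we can modify it to add the MinIO settings. Note that the raw file URL is used, since the regular GitHub page URL would return HTML instead of YAML.


curl -o quickwit.yaml https://raw.githubusercontent.com/quickwit-oss/quickwit/main/config/quickwit.yaml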


Open the YAML file and first add the MinIO credentials and endpoint:


storage:
   s3:
     flavor: minio
     access_key_id: minioadmin
     secret_access_key: minioadmin
     endpoint: http://127.0.0.1:9000


Next we’ll add the index storage and metastore locations, both pointing at the bucket created earlier:


default_index_root_uri: s3://quickwit/indexes

metastore_uri: s3://quickwit/indexes
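

Put together, the two edits amount to the fragment below. As a sketch, it can also be appended from the shell rather than typed in an editor. The `append_minio_settings` name is illustrative, and the approach assumes none of these keys is already active (uncommented) in the file, since defining a key twice would leave the YAML ambiguous.

```shell
#!/bin/sh
# Hypothetical helper: append the MinIO settings from this post to a
# Quickwit config file.
# Assumption: the file does not already define `storage`,
# `default_index_root_uri` or `metastore_uri` as uncommented keys.
append_minio_settings() {
    config_file="$1"
    cat >> "$config_file" <<'EOF'

storage:
  s3:
    flavor: minio
    access_key_id: minioadmin
    secret_access_key: minioadmin
    endpoint: http://127.0.0.1:9000

default_index_root_uri: s3://quickwit/indexes
metastore_uri: s3://quickwit/indexes
EOF
}

# Demo against a scratch file; pass ./quickwit.yaml to change the real one.
demo_config="$(mktemp)"
append_minio_settings "$demo_config"
grep -E 'minio|s3://' "$demo_config"
```

The minioadmin credentials are the defaults used throughout this post; anything beyond a local test should use a dedicated access key.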


Once the above settings are in the YAML, save and close the file. To use it, point the QW_CONFIG environment variable at the file and run the service:


export QW_CONFIG=./quickwit.yaml

./quickwit run


We can check that it’s working by browsing the UI at http://localhost:7280 or by sending a GET request:

curl http://localhost:7280/api/v1/version
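

When this check is scripted, Quickwit may need a few seconds after ./quickwit run before the REST API answers. A small retry loop is one way to wait for it. The `retry` function below is a hypothetical helper, not part of Quickwit or MinIO; it simply reruns whatever command it is given until that command succeeds or the attempts run out.

```shell
#!/bin/sh
# Hypothetical helper: run a command until it succeeds.
#   $1 = maximum number of attempts
#   $2 = seconds to sleep between attempts
#   remaining arguments = the command to run
retry() {
    attempts="$1"
    delay="$2"
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        # Only sleep when another attempt will follow
        if [ "$i" -le "$attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Usage with the endpoint from this post
# (curl -f turns HTTP error responses into a non-zero exit code):
#   retry 30 1 curl -sf http://localhost:7280/api/v1/version && echo "Quickwit is up"
```

The same helper can wrap mc admin info local to wait for MinIO after systemctl start minio.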


Let’s create an index configured to receive Stack Overflow posts. An index is defined by a YAML config that maps the fields of your input documents to index fields and specifies whether each field should be stored and indexed.


curl -o stackoverflow-index-config.yaml https://raw.githubusercontent.com/quickwit-oss/quickwit/v0.6.4/config/tutorials/stackoverflow/index-config.yaml


Once the index config is downloaded, create the index:


./quickwit index create --index-config ./stackoverflow-index-config.yaml


To hydrate the index we just created, we’ll download a sample of the first 10,000 Stack Overflow posts and feed it into Quickwit, which will store the indexed data on MinIO in the backend.


curl -O https://quickwit-datasets-public.s3.amazonaws.com/stackoverflow.posts.transformed-10000.json


./quickwit index ingest --index stackoverflow --input-path stackoverflow.posts.transformed-10000.json --force
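

To confirm that the ingested data really landed in MinIO, list the bucket with mc. As a rough sketch, the helper below counts Quickwit split files in such a listing. It assumes splits are stored as objects whose names end in `.split` under the index prefix, which is worth checking against your own listing. The `count_splits` name is illustrative.

```shell
#!/bin/sh
# Hypothetical helper: count lines on stdin that name a `.split` object.
# Assumption: Quickwit writes each split as "<id>.split" under the index prefix.
count_splits() {
    # grep exits non-zero when nothing matches; it still prints the 0
    grep -c '\.split$' || true
}

# Usage against the bucket created earlier:
#   mc ls --recursive local/quickwit/indexes/stackoverflow | count_splits
```

A count of zero right after ingest would suggest that Quickwit is not writing to the bucket you expect, so the storage settings in quickwit.yaml are the first place to look.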


As soon as the ingest command finishes, you can start querying the data with the search command:


./quickwit index search --index stackoverflow --query "search AND engine"


You can also use more advanced features such as aggregations. The following query finds the most popular tags used on the questions in this dataset:


curl -XPOST "http://localhost:7280/api/v1/stackoverflow/search" -H 'Content-Type: application/json' -d '{
    "query": "type:question",
    "max_hits": 0,
    "aggs": {
        "foo": {
            "terms":{
                "field":"tags",
                "size": 10
            }
        }
    }
}'
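

The same request shape works for other fields and bucket counts. As a sketch, the body can be generated by a small function so the query, field and size become parameters. The `terms_agg_body` name and the `top_values` aggregation label are made up for this example, and the values are inserted as-is, so they must not contain double quotes.

```shell
#!/bin/sh
# Hypothetical helper: print a terms-aggregation request body.
#   $1 = query string, $2 = field to aggregate on, $3 = number of buckets
# Caveat: values are inserted without escaping, so avoid double quotes in them.
terms_agg_body() {
    printf '{"query":"%s","max_hits":0,"aggs":{"top_values":{"terms":{"field":"%s","size":%s}}}}\n' \
        "$1" "$2" "$3"
}

# Print the body for the tag query used in this post
terms_agg_body 'type:question' tags 10

# Usage with the search endpoint from this post:
#   curl -XPOST "http://localhost:7280/api/v1/stackoverflow/search" \
#        -H 'Content-Type: application/json' \
#        -d "$(terms_agg_body 'type:question' tags 10)"
```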

Final Thoughts

MinIO is the right choice for Quickwit because of its industry-leading performance and scalability. MinIO’s combination of scalability and high performance puts every data-intensive workload, not just Quickwit, within reach. MinIO is capable of tremendous performance: a recent benchmark achieved 325 GiB/s (349 GB/s) on GETs and 165 GiB/s (177 GB/s) on PUTs with just 32 nodes of off-the-shelf NVMe SSDs. This makes managing Quickwit with MinIO seamless for log management, distributed tracing, and immutable data such as conversational data and event-based analytics, among others.


By storing the data in MinIO, Quickwit can be used as a Grafana datasource for achieving fast visibility into the operations of your application. You can see patterns and set alerts in Grafana's graphical interface that would allow you to run historical analysis and act on anomalies based on certain thresholds. For example, you might want to check for trends or bottlenecks and try to identify patterns in workload type during a specific time of the day.


Got questions? Want to get started? Reach out to us on Slack.


Also published here.